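HTML Entities
A way to represent characters as text in HTML. There are three forms: named (&amp;), decimal (&#38;), and hexadecimal (&#x26;). Entities are essential for characters that conflict with HTML syntax.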
What Are HTML Entities?
HTML entities are special text sequences that represent characters in HTML documents. They allow you to display characters that would otherwise be interpreted as HTML markup, characters outside the printable ASCII range, or characters that are difficult to type directly on a keyboard.
An HTML entity begins with an ampersand (&) and ends with a semicolon (;). Between them, you write either a named reference (like &amp;) or a numeric reference (like &#65; or &#x41;).
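As a quick check of the syntax described above, the following sketch decodes the three spellings with Python's standard `html.unescape`. The variable names are illustrative only, and a browser's parser is not involved here.

```python
import html

# Named, decimal, and hexadecimal references for the ampersand.
ampersand_forms = ["&amp;", "&#38;", "&#x26;"]
decoded = [html.unescape(form) for form in ampersand_forms]
print(decoded)  # ['&', '&', '&']

# Decimal and hexadecimal references for the letter A (U+0041).
letter_a = [html.unescape("&#65;"), html.unescape("&#x41;")]
print(letter_a)  # ['A', 'A']
```

All three forms are just different spellings in the source; after decoding they are indistinguishable.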
Why HTML Entities Exist
HTML uses certain characters for its own syntax. The less-than sign (<) opens tags, the greater-than sign (>) closes them, and the ampersand itself starts entity sequences. If you want to display these characters as content rather than markup, you must escape them. Without entities, writing <b> in your content would be parsed as an HTML tag, not displayed as literal text.
Beyond reserved characters, entities also cover the full Unicode range, letting you embed any character — from accented letters and currency symbols to mathematical operators and emoji — using only ASCII source code.
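One way to see the "ASCII-only source" idea in action is Python's built-in `xmlcharrefreplace` error handler, which swaps every non-ASCII character for a decimal numeric reference. This is a minimal sketch with a made-up sample string; note that it does not escape reserved characters such as < or &, which need separate handling.

```python
import html

text = "café €5 ✓"

# Replace every character outside ASCII with a decimal reference.
ascii_html = text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")
print(ascii_html)  # caf&#233; &#8364;5 &#10003;

# Decoding the references restores the original string.
print(html.unescape(ascii_html) == text)  # True
```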
Named vs. Numeric Entities
Named entities use a human-readable keyword: &copy; for ©, &lt; for <, &nbsp; for a non-breaking space. HTML5 defines over 2,000 named references.
Numeric entities reference a character by its Unicode code point. Decimal: &#169; for ©. Hexadecimal: &#xA9; for the same character. Every Unicode character can be written as a numeric entity; named entities exist only for a curated subset.
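The relationship between a character, its code point, and its references can be sketched in a few lines. The helpers `numeric_refs` and `named_ref` are hypothetical names introduced for illustration, and `html.entities.codepoint2name` covers only the older HTML 4 names, so it is smaller than the HTML5 set mentioned above.

```python
from html.entities import codepoint2name


def numeric_refs(ch):
    """Return the (decimal, hexadecimal) references for one character."""
    code_point = ord(ch)
    return f"&#{code_point};", f"&#x{code_point:X};"


def named_ref(ch):
    """Return a named reference if the HTML 4 table has one, else None."""
    name = codepoint2name.get(ord(ch))
    return f"&{name};" if name else None


print(numeric_refs("©"))  # ('&#169;', '&#xA9;')
print(named_ref("©"))     # &copy;
print(named_ref("😀"))    # None
```

The emoji has numeric references like any other character but no name in this table, which matches the point that names exist only for a subset.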
Common Examples
<!-- Reserved characters -->
&lt;      <!-- < -->
&gt;      <!-- > -->
&amp;     <!-- & -->
&quot;    <!-- " -->
&apos;    <!-- ' (HTML5) -->
<!-- Common symbols -->
&copy;    <!-- © -->
&reg;     <!-- ® -->
&trade;   <!-- ™ -->
&nbsp;    <!-- non-breaking space -->
&mdash;   <!-- — em dash -->
&euro;    <!-- € -->
<!-- Math -->
&times;   <!-- × -->
&divide;  <!-- ÷ -->
&plusmn;  <!-- ± -->
<!-- Numeric equivalents -->
&#65;     <!-- A (decimal) -->
&#x41;    <!-- A (hex) -->
&#x1F600; <!-- 😀 emoji -->
Using Entities in Practice
In modern web development, you should always declare your document encoding as UTF-8 in the <meta charset="UTF-8"> tag. With UTF-8, you can type most Unicode characters directly in your source file and avoid entities for non-reserved characters. Entities remain necessary only for the five reserved HTML characters and for generating characters programmatically.
Template engines and UI libraries such as Jinja2 (with autoescaping enabled), Django templates, and React's JSX escape <, >, &, ", and ' when outputting user content, which protects against XSS injection.
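import html
# Escape reserved characters before inserting untrusted text into HTML.
html.escape("<script>alert('xss')</script>")
# "&lt;script&gt;alert(&#x27;xss&#x27;)&lt;/script&gt;"
# Decode named and numeric references back into characters.
html.unescape("&copy; 2024 &mdash; All rights reserved")
# "© 2024 — All rights reserved"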
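When no template engine is doing the escaping, the same protection has to be applied by hand at every point where user content enters markup. The sketch below assumes a hypothetical `render_comment` helper and a double-quoted attribute; it is an illustration of the idea, not a substitute for a real templating layer.

```python
import html


def render_comment(author, body):
    """Build a small HTML fragment from untrusted strings (hypothetical helper)."""
    # quote=True (the default) also escapes " and ', which matters
    # for the attribute value below.
    safe_author = html.escape(author, quote=True)
    safe_body = html.escape(body, quote=True)
    return f'<p title="{safe_author}">{safe_body}</p>'


fragment = render_comment('Eve" onmouseover="x', "<script>alert(1)</script>")
print(fragment)
# <p title="Eve&quot; onmouseover=&quot;x">&lt;script&gt;alert(1)&lt;/script&gt;</p>
```

Because the double quote in the author name is escaped, it cannot close the `title` attribute and introduce a new one.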
Browser Parsing
Browsers decode entities during HTML parsing, before the DOM is built. The decoded text character lives in the DOM; JavaScript accessing element.textContent sees the actual character, not the entity sequence.
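The same decode-while-parsing behaviour can be observed outside a browser. The sketch below uses Python's `html.parser.HTMLParser` as a stand-in, which is an assumption made for illustration: it is not a browser engine, but with `convert_charrefs=True` it likewise hands text to the caller with references already decoded. `TextCollector` is a hypothetical class name.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect the text content of a document, ignoring tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        # Text arrives here after character references are decoded.
        self.chunks.append(data)


parser = TextCollector()
parser.feed("<p>&copy; 2024 &lt;b&gt; is not a tag</p>")
parser.close()

text = "".join(parser.chunks)
print(text)  # © 2024 <b> is not a tag
```

The escaped `&lt;b&gt;` ends up as the literal text `<b>` rather than as an element, which is the effect `element.textContent` shows in a browser.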
Quick Facts
| Property | Value |
|---|---|
| Syntax | &name; or &#decimal; or &#xhex; |
| Minimum entity | &lt; (4 chars) |
| Named entities in HTML5 | 2,231 |
| Must-escape in HTML | < > & (attributes also ") |
| UTF-8 recommendation | Encode source as UTF-8; use entities only for reserved chars |
| JavaScript decoding | element.textContent returns decoded character |
| Case sensitivity | Named entities are case-sensitive: &Amp; is invalid |
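The case-sensitivity row can be checked against Python's copy of the HTML5 named reference table, `html.entities.html5`. This is a small sketch; the table also contains a few legacy names without a trailing semicolon, so its size counts those as separate entries.

```python
import html
from html.entities import html5

# Both the lowercase and all-uppercase names are defined.
print(html.unescape("&amp;"))  # &
print(html.unescape("&AMP;"))  # &

# The mixed-case spelling is not a defined name, so it is left untouched.
print(html.unescape("&Amp;"))  # &Amp;

# Membership checks against the table itself.
print("amp;" in html5, "Amp;" in html5)  # True False

# Number of entries in Python's table of named references.
print(len(html5))
```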