Named Character References
HTML entities that use human-readable names: &copy; → ©, &mdash; → —. HTML5 defines 2,231 named references, and the names are case-sensitive.
What Are Named Character References?
Named character references (also called named HTML entities) are predefined shorthand sequences that represent specific Unicode characters in HTML. They follow the pattern &name; where name is a case-sensitive keyword registered in the HTML specification. Examples include &amp; for the ampersand character, &lt; for the less-than sign, and &copy; for the copyright symbol ©.
The HTML5 specification defines 2,231 named references, covering characters across many Unicode blocks: Latin letters with diacritics, Greek and Cyrillic letters, mathematical symbols, arrows, currency signs, playing card suits, and more.
History and Design
Named references were introduced early in HTML to help authors write documents using only ASCII source files while displaying richer character sets. Before UTF-8 became universal, a document might be saved in Latin-1 but still need to display Greek letters or typographic dashes. Named entities provided a portable solution.
HTML4 borrowed many entities from ISO character sets and SGML. HTML5 dramatically expanded the list by adding all characters from MathML and a large portion of common Unicode symbols.
The Five Essential Entities
Five named references are special because they escape HTML syntax characters:
&lt; → < (less-than, starts tags)
&gt; → > (greater-than, ends tags)
&amp; → & (ampersand, starts entities)
&quot; → " (double quote, used in attributes)
&apos; → ' (apostrophe, HTML5 only; use &#39; for HTML4)
Forgetting to escape these — especially & and < in user-generated content — is the root cause of many XSS vulnerabilities.
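The escaping rule above can be sketched as a small helper. This is a minimal illustration rather than a complete XSS defence; the name `escapeHtml` and the choice to emit `&#39;` for the apostrophe (so the output also works under HTML4) are assumptions made for this example.

```javascript
// Hypothetical helper: replaces the five syntax characters with references.
const ESCAPES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;", // numeric form, since &apos; is not defined in HTML4
};

function escapeHtml(text) {
  // A single pass ensures the "&" in an emitted reference is never re-escaped.
  return String(text).replace(/[&<>"']/g, (ch) => ESCAPES[ch]);
}

console.log(escapeHtml('<a href="x">Tom & Jerry</a>'));
// &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&lt;/a&gt;
```

Doing all replacements in one regular-expression pass avoids the classic ordering bug where `&` is escaped after the other characters and corrupts references that were just written.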
Commonly Used Named References
<!-- Typography -->
&nbsp; → non-breaking space (U+00A0)
&mdash; → — (em dash, U+2014)
&ndash; → – (en dash, U+2013)
&ldquo; → “ (left double quote)
&rdquo; → ” (right double quote)
&lsquo; → ‘ (left single quote)
&rsquo; → ’ (right single quote, apostrophe)
&hellip; → … (ellipsis)
<!-- Currency -->
&euro; → € (U+20AC)
&pound; → £ (U+00A3)
&yen; → ¥ (U+00A5)
&cent; → ¢ (U+00A2)
<!-- Math and science -->
&times; → × (U+00D7)
&divide; → ÷ (U+00F7)
&plusmn; → ± (U+00B1)
&infin; → ∞ (U+221E)
&sum; → ∑ (U+2211)
&radic; → √ (U+221A)
&pi; → π (U+03C0)
<!-- Arrows -->
&larr; → ← (U+2190)
&rarr; → → (U+2192)
&uarr; → ↑ (U+2191)
&darr; → ↓ (U+2193)
Case Sensitivity
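Named references are case-sensitive: &copy; works, but &Copy; does not. HTML5 does define a few all-uppercase legacy aliases, such as &COPY; and &AMP;, but arbitrary case variations are not recognized. Some names differ only in case and refer to different characters: &Alpha; is Α (Greek capital alpha), while &alpha; is α (Greek small alpha).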
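To make the case rule concrete, here is a sketch of a lookup against a tiny hand-picked subset of the table. The `NAMED` map and the `lookupReference` function are hypothetical names for this example; a real decoder would load the full list from the HTML specification.

```javascript
// Hypothetical subset of the named reference table; keys are case-sensitive.
const NAMED = new Map([
  ["copy", "\u00A9"],  // copyright sign
  ["COPY", "\u00A9"],  // all-uppercase legacy alias that HTML5 also defines
  ["Alpha", "\u0391"], // Greek capital alpha
  ["alpha", "\u03B1"], // Greek small alpha
]);

function lookupReference(ref) {
  // Expect the form &name; and compare the name exactly, with no case folding.
  const match = /^&([A-Za-z][A-Za-z0-9]*);$/.exec(ref);
  if (!match) return null;
  return NAMED.get(match[1]) ?? null;
}

console.log(lookupReference("&alpha;")); // α
console.log(lookupReference("&Alpha;")); // Α
console.log(lookupReference("&Copy;"));  // null
```

Because the map is keyed on the exact spelling, `&Copy;` simply misses, which mirrors how a parser treats an unregistered capitalization.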
Verifying Named References
// Decode a character reference by letting the browser's HTML parser do the work.
// A <textarea> is used because its content is parsed as text rather than markup.
function decodeEntity(entity) {
  const el = document.createElement("textarea");
  el.innerHTML = entity;
  return el.value;
}
decodeEntity("&mdash;"); // "—"
decodeEntity("&hellip;"); // "…"
decodeEntity("&bogus;"); // "&bogus;" (unknown names are left as-is)
When to Use Named References Today
With UTF-8 encoding, you can paste ©, —, and ∞ directly into HTML source. Named references are mainly useful for: code samples that must remain plain ASCII, generated HTML in environments where the encoding might shift, and developer readability when the intent matters (&nbsp; is clearer than a bare invisible space character).
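For the ASCII-only case, one possible approach is to rewrite every non-ASCII character as a reference before the HTML is emitted. The sketch below assumes a small, hypothetical reverse table (`NAME_FOR`) and falls back to hexadecimal numeric references for anything it does not know; `toAsciiHtml` is an illustrative name, and the function does not escape syntax characters such as & or <.

```javascript
// Hypothetical reverse table: character -> reference name (small subset).
const NAME_FOR = new Map([
  ["\u00A0", "nbsp"],
  ["\u00A9", "copy"],
  ["\u2014", "mdash"],
  ["\u221E", "infin"],
]);

function toAsciiHtml(text) {
  let out = "";
  // for...of iterates by code point, so characters outside the BMP stay intact.
  for (const ch of text) {
    const code = ch.codePointAt(0);
    if (code < 0x80) {
      out += ch; // ASCII passes through unchanged
    } else if (NAME_FOR.has(ch)) {
      out += `&${NAME_FOR.get(ch)};`; // readable named form
    } else {
      out += `&#x${code.toString(16).toUpperCase()};`; // numeric fallback
    }
  }
  return out;
}

console.log(toAsciiHtml("\u00A9 2024 \u2014 caf\u00E9"));
// &copy; 2024 &mdash; caf&#xE9;
```

The output survives any ASCII-compatible encoding, and the named forms keep intent visible where a name is available.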
Quick Facts
| Property | Value |
|---|---|
| Total defined in HTML5 | 2,231 named references |
| Case sensitivity | Case-sensitive (&amp; valid, &Amp; invalid) |
| Must always escape | & < > " |
| Trailing semicolon | Required in most contexts; optional legacy exceptions exist |
| XML compatibility | Only &amp; &lt; &gt; &quot; &apos; are predefined in XML |
| Browser behavior | Unknown named references displayed as literal text |