Numeric Character Reference

Q: What is an HTML entity?

A textual representation of a character in HTML. It takes three forms: named (&amp;), decimal (&#38;), and hexadecimal (&#x26;). Entities are required for characters that conflict with HTML syntax.

An HTML entity that uses a Unicode code point number: decimal (&#169; → ©) or hexadecimal (&#xA9; → ©). Unlike named references, it works for any Unicode character.

2023-10-30 · Updated 2025-03-10

What Are Numeric Character References?

Numeric character references (NCRs) are HTML escape sequences that represent any Unicode character by its code point number. They take two forms:

Decimal: &#N; where N is a base-10 integer, e.g., &#169; for ©
Hexadecimal: &#xH; where H is a base-16 integer, e.g., &#xA9; for ©

Both forms refer to the Unicode scalar value of the character. Since Unicode covers over 1.1 million code points, NCRs can represent virtually any character ever assigned — from basic Latin letters to rare CJK ideographs and emoji — using only ASCII characters in the source.

Decimal vs. Hexadecimal

Decimal NCRs (&#N;) are straightforward for readers who know the decimal code points of common characters (65 = 'A', 169 = '©'). Hexadecimal NCRs (&#xH;) align with how Unicode code points are conventionally written: U+00A9 maps directly to &#xA9;. When working with Unicode documentation or character tables that list code points in hex, the hex form is easier to use without mental conversion.

<!-- These are identical -->
&#65;     = &#x41;   = A
&#169;    = &#xA9;   = ©
&#8364;   = &#x20AC; = €
&#128512; = &#x1F600; = 😀
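The equivalence of the two forms comes down to parsing the same integer in base 10 or base 16. A minimal sketch of that step follows; the names `NCR_PATTERN` and `decode_ncrs` are hypothetical, and the sketch deliberately skips range validation, so it is an illustration rather than a conforming HTML parser.

```python
import re

# Matches &#N; (decimal) or &#xH; / &#XH; (hexadecimal), semicolon required.
NCR_PATTERN = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")

def decode_ncrs(text: str) -> str:
    """Replace each well-formed NCR in text with the character it names.

    Illustrative only: no check for surrogates or out-of-range values.
    """
    def replace(match):
        hex_digits, dec_digits = match.groups()
        # Same code point either way; only the base of the digits differs.
        code_point = int(hex_digits, 16) if hex_digits else int(dec_digits, 10)
        return chr(code_point)
    return NCR_PATTERN.sub(replace, text)

print(decode_ncrs("&#8364; and &#x20AC;"))  # € and €
```

For real input, the standard library's html.unescape is the safer choice, since it also handles named references and invalid code points.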

Valid Range

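Valid code points for NCRs are 1–55295 (U+0001–U+D7FF) and 57344–1114111 (U+E000–U+10FFFF). The surrogate range U+D800–U+DFFF is invalid and must not be encoded, and U+0000 (NULL) is also excluded. HTML parsers replace references to NULL, to surrogates, and to values above U+10FFFF with the replacement character U+FFFD. References to most control characters (such as U+0001–U+0008) are flagged as parse errors, and most values in the range 0x80–0x9F are remapped to their Windows-1252 equivalents (for example, &#x80; yields €).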
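The range rule can be written as a small predicate. This is a sketch under the ranges stated above; `is_valid_ncr_code_point` is a hypothetical helper name, and it checks only the numeric range, not whether a character is assigned. The last lines show how Python's html.unescape treats three out-of-range references.

```python
import html

def is_valid_ncr_code_point(cp: int) -> bool:
    """True if cp lies in U+0001..U+D7FF or U+E000..U+10FFFF."""
    return 0x0001 <= cp <= 0xD7FF or 0xE000 <= cp <= 0x10FFFF

for cp in (0x0000, 0x00A9, 0xD800, 0xE000, 0x10FFFF, 0x110000):
    print(f"U+{cp:04X}: {is_valid_ncr_code_point(cp)}")

# html.unescape substitutes U+FFFD for NULL, surrogates, and too-large values.
print(html.unescape("&#0;") == "\ufffd")
print(html.unescape("&#xD800;") == "\ufffd")
print(html.unescape("&#x110000;") == "\ufffd")
```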

Supplementary Characters

NCRs fully support Unicode supplementary characters (code points above U+FFFF). In UTF-16 these require surrogate pairs, but in HTML you write a single NCR:

<!-- U+1F4A9 PILE OF POO — supplementary character -->
&#128169;      <!-- decimal -->
&#x1F4A9;      <!-- hex -->

The author never has to compute the surrogate pair by hand, which is one advantage of NCRs in older environments that expose UTF-16 code units directly.
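To make the contrast concrete, the sketch below computes the UTF-16 surrogate pair for U+1F4A9 and prints it next to the single NCR. The function name `utf16_surrogates` is hypothetical; the arithmetic is the standard UTF-16 encoding of a supplementary code point.

```python
def utf16_surrogates(cp: int) -> tuple:
    """Split a supplementary code point (above U+FFFF) into a UTF-16 surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = cp - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits
    return high, low

cp = 0x1F4A9
high, low = utf16_surrogates(cp)
print(f"UTF-16 code units: {high:04X} {low:04X}")  # D83D DCA9
print(f"Single NCR: &#x{cp:X};")                   # &#x1F4A9;
```

Writing the two surrogate values as separate NCRs would be invalid, since surrogates are excluded from the valid range; the single reference to the full code point is the correct form.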

Practical Use

<!-- Escaping in content -->
<p>The formula is a &#60; b &lt; c</p>
<!-- &#60; and &lt; both render as < -->

<!-- Characters outside keyboard reach -->
<p>The currency symbol is &#x20B9; (Indian Rupee)</p>

<!-- In HTML attributes -->
<input placeholder="Enter &#x2764; here">

<!-- Emoji -->
<title>Unicode Guide &#x1F4DA;</title>

# Python: convert character to NCR
char = "©"
f"&#{ord(char)};"    # "&#169;"
f"&#x{ord(char):X};" # "&#xA9;"

# Python: decode NCR
import html
html.unescape("&#169;")   # "©"
html.unescape("&#xA9;")   # "©"

NCRs vs. Named References vs. Direct Characters

Approach	Example	Readability	Coverage
Named reference	`&copy;`	High (for known names)	2,231 characters
Decimal NCR	`&#169;`	Medium	All Unicode
Hex NCR	`&#xA9;`	Medium (for Unicode users)	All Unicode
Direct UTF-8	`©`	Highest	All Unicode

In modern UTF-8 documents, direct characters are preferred. NCRs remain valuable in legacy ASCII environments and when generating HTML programmatically.
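For the legacy ASCII case, Python's built-in `xmlcharrefreplace` error handler produces decimal NCRs for every character that ASCII cannot encode. The sketch below combines it with html.escape; the wrapper name `to_ascii_html` is hypothetical, and the approach assumes the output is destined for HTML text or attribute values.

```python
import html

def to_ascii_html(text: str) -> str:
    """Escape markup characters, then turn every non-ASCII character into a decimal NCR."""
    escaped = html.escape(text)  # handles &, <, >, and quotes
    return escaped.encode("ascii", errors="xmlcharrefreplace").decode("ascii")

print(to_ascii_html("Price < €5 © 2025"))
# Price &lt; &#8364;5 &#169; 2025
```

Escaping before encoding matters: doing it in the other order would re-escape the ampersands of the NCRs themselves.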

Quick Facts

Property	Value
Decimal syntax	`&#N;` (N is base-10 code point)
Hex syntax	`&#xH;` or `&#XH;` (H is base-16 code point)
Valid range	U+0001–U+D7FF and U+E000–U+10FFFF
Covers all Unicode	Yes — any assigned code point
Surrogates allowed	No — invalid in HTML
Case of hex digits	Case-insensitive: `&#xA9;` = `&#xa9;`
Trailing semicolon	Required; parsers still decode an NCR without it but flag a parse error
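Two of these rows can be checked with Python's html.unescape, which follows HTML5 parsing rules. The snippet is a quick illustration, not a conformance test; it also shows why omitting the semicolon is risky when a digit follows the reference.

```python
import html

# Hex digits and the x prefix are case-insensitive.
print(html.unescape("&#xA9;") == html.unescape("&#xa9;") == html.unescape("&#XA9;"))

# A missing semicolon is tolerated by this decoder...
print(html.unescape("&#169") == "©")

# ...but a following digit is then absorbed into the number.
print(html.unescape("&#169;0") == "©0")        # reference ends at the semicolon
print(html.unescape("&#1690") == chr(1690))    # read as code point 1690 instead
```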
