Assigned Character
A code point that has been assigned a character in a given version of Unicode. As of Unicode 16.0, 154,998 of the 1,114,112 possible code points are assigned.
What is an Assigned Character?
An assigned character is a Unicode code point that has been given a formal designation in the Unicode Standard: it represents a specific character, symbol, or abstract entity with an official name, category, and set of properties. The Unicode Consortium adds new assigned characters in each version of the standard; once a code point is assigned, the assignment is permanent and is never revoked or moved to a different character.
As of Unicode 16.0, 154,998 code points are assigned characters out of the 1,114,112 code points in the full code space.
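The two figures above can be checked with a little arithmetic. This is a minimal sketch that assumes only the numbers quoted in this section plus the standard layout of the code space (17 planes of 65,536 code points each); the variable names are illustrative.

```python
# Size of the Unicode code space: 17 planes of 65,536 code points each,
# covering U+0000 through U+10FFFF.
PLANES = 17
CODE_POINTS_PER_PLANE = 0x10000  # 65,536

code_space = PLANES * CODE_POINTS_PER_PLANE
assigned_v16 = 154_998  # figure quoted above for Unicode 16.0

share = assigned_v16 / code_space
print(code_space)      # 1114112
print(f"{share:.1%}")  # 13.9%
```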
What Makes a Code Point "Assigned"
A code point is assigned when the Unicode Consortium:
- Gives it a normative character name (e.g., LATIN SMALL LETTER A, SNOWMAN, GRINNING FACE)
- Assigns it a General Category (letter, number, punctuation, symbol, etc.)
- Defines its relevant character properties in the Unicode Character Database
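The formal assignment appears in UnicodeData.txt, the primary UCD file. Any code point with
its own entry in that file is assigned. Large contiguous sets such as CJK ideographs and Hangul
syllables are not listed line by line; a pair of <..., First> and <..., Last> entries marks each
such range, and every code point inside it is assigned as well.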
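To make the file format concrete, here is a rough sketch of reading UnicodeData.txt-style lines and merging First/Last pairs into ranges. The `SAMPLE` string is a four-line excerpt chosen for illustration, and `parse_ranges` and `lookup_category` are hypothetical helper names, not part of any library.

```python
# A few lines in the UnicodeData.txt format (semicolon-separated fields:
# code point; name; General Category; ...). This is a small excerpt for
# illustration, not the full file.
SAMPLE = """\
0009;<control>;Cc;0;S;;;;;N;CHARACTER TABULATION;;;;
0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;
4E00;<CJK Ideograph, First>;Lo;0;L;;;;;N;;;;;
9FFF;<CJK Ideograph, Last>;Lo;0;L;;;;;N;;;;;
"""

def parse_ranges(text: str) -> list[tuple[int, int, str, str]]:
    """Return (start, end, name, category) tuples, merging First/Last pairs."""
    ranges = []
    pending_start = None
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = line.split(";")
        cp, name, category = int(fields[0], 16), fields[1], fields[2]
        if name.endswith(", First>"):
            pending_start = cp  # remember where the range begins
        elif name.endswith(", Last>"):
            label = name[1:-len(", Last>")]  # "<CJK Ideograph, Last>" -> "CJK Ideograph"
            ranges.append((pending_start, cp, label, category))
            pending_start = None
        else:
            ranges.append((cp, cp, name, category))
    return ranges

def lookup_category(ranges, code_point: int) -> str:
    """General Category from the parsed entries, or Cn if nothing covers it."""
    for start, end, _name, category in ranges:
        if start <= code_point <= end:
            return category
    return "Cn"

ranges = parse_ranges(SAMPLE)
print(lookup_category(ranges, 0x41))    # Lu
print(lookup_category(ranges, 0x4E2D))  # Lo (inside the CJK range)
print(lookup_category(ranges, 0x42))    # Cn here, only because the excerpt omits U+0042
```

With the full file in place of the excerpt, the same lookup would report Cn only for code points that have no entry and fall in no range.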
Categories of Assigned Characters
Assigned characters are not all printable glyphs. The Unicode Standard assigns code points to:
| Category | Examples | General Category Code |
|---|---|---|
| Letters | A, a, α, あ, 字 | Lu, Ll, Lo... |
| Numbers | 0–9, ², ③ | Nd, Nl, No |
| Punctuation | . , ! « » | Po, Ps, Pe... |
| Symbols | €, ©, ★, ☃, 😀 | So, Sm, Sc, Sk |
| Marks | combining acute ◌́ | Mn, Mc, Me |
| Separators | space, line separator | Zs, Zl, Zp |
| Control codes | U+0009 TAB, U+000A LF | Cc |
| Format characters | U+200C ZWNJ, U+FEFF BOM | Cf |
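Even control characters like TAB, LF, and NULL (U+0000) are assigned: they have General
Category Cc (Control). Their Name field in UnicodeData.txt is the placeholder <control>;
familiar names such as CHARACTER TABULATION are provided as aliases.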
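The category codes in the table can be inspected directly. The sketch below uses the standard `unicodedata.category` call; the `MAJOR_CLASS` mapping and the `describe` helper are illustrative names introduced here, keyed on the first letter of the two-letter code.

```python
import unicodedata

# Major classes keyed by the first letter of the two-letter General Category.
MAJOR_CLASS = {
    "L": "Letter", "M": "Mark", "N": "Number", "P": "Punctuation",
    "S": "Symbol", "Z": "Separator", "C": "Other",
}

def describe(char: str) -> tuple[str, str]:
    """Return (General Category code, major class name) for one character."""
    code = unicodedata.category(char)
    return code, MAJOR_CLASS[code[0]]

for ch in ["A", "あ", "²", "«", "€", "\u0301", " ", "\t", "\u200c"]:
    code, major = describe(ch)
    print(f"U+{ord(ch):04X} {code} {major}")
```

Control (Cc) and format (Cf) characters both land in the "Other" major class, alongside Cn, Co, and Cs, which is why the first letter alone does not tell you whether a code point is assigned.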
Checking Assignment Status
```python
import unicodedata

def is_assigned(char: str) -> bool:
    # Cn (Unassigned) covers reserved code points and noncharacters.
    # Co (Private Use) and Cs (Surrogate) have their own categories,
    # so this check reports them as assigned.
    # The answer reflects the Unicode version bundled with Python.
    return unicodedata.category(char) != "Cn"

# Assigned characters
print(is_assigned("A"))       # True: Lu (Uppercase Letter)
print(is_assigned("😀"))      # True: So (Other Symbol)
print(is_assigned("\t"))      # True: Cc (Control)
print(is_assigned("\uE001"))  # True: Co (Private Use; meaning is user-defined)

# Unassigned
print(is_assigned("\u0378"))  # False: Cn (Unassigned)

# Get character name
print(unicodedata.name("A"))   # LATIN CAPITAL LETTER A
print(unicodedata.name("😀"))  # GRINNING FACE
# Control characters have no Name value, so name() raises ValueError
# unless a default is supplied.
print(unicodedata.name("\t", "<control>"))  # <control>
```
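Assignment status is always relative to a Unicode version. As a small illustration, Python's `unicodedata` module also exposes a frozen Unicode 3.2.0 database as `unicodedata.ucd_3_2_0`, so the same code point can be checked against two versions; the choice of U+1F600 as the test character is just an example.

```python
import unicodedata

emoji = "\U0001F600"  # GRINNING FACE, added after Unicode 3.2

# Database bundled with the running Python interpreter
print(unicodedata.unidata_version)
print(unicodedata.category(emoji))  # So

# Frozen Unicode 3.2.0 snapshot that ships in the same module
old = unicodedata.ucd_3_2_0
print(old.unidata_version)          # 3.2.0
print(old.category(emoji))          # Cn: not yet assigned in 3.2.0
print(old.category("A"))            # Lu
```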
Stability of Assignments
The Unicode Stability Policy guarantees that once a code point is assigned:
- Its character name is permanent (corrections become formal aliases, not replacements)
- Its General Category will not change in ways that break normalization or sorting
- Its decomposition mapping will not change
- The code point will never be unassigned or reassigned to a different character
This stability is essential for backward compatibility: text encoded under an early version such as Unicode 2.0, from which the encoding stability guarantee applies, can still be read correctly by Unicode 16.0 implementations.
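The name guarantee can be observed from Python. U+01A2 is a commonly cited case: its original name is retained, and the corrected name exists only as a formal alias. This sketch assumes a Python version whose `unicodedata.lookup` resolves name aliases (documented since Python 3.3).

```python
import unicodedata

ch = "\u01A2"

# The original name is kept, even though it is regarded as a misnomer.
print(unicodedata.name(ch))  # LATIN CAPITAL LETTER OI

# The corrected name is a formal alias and resolves to the same code point.
print(unicodedata.lookup("LATIN CAPITAL LETTER GHA") == ch)  # True

# The original name still resolves too, so older data keeps working.
print(unicodedata.lookup("LATIN CAPITAL LETTER OI") == ch)   # True
```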
Growth Over Versions
| Version | Year | Assigned Characters |
|---|---|---|
| 1.0 | 1991 | 7,129 |
| 3.0 | 1999 | 49,194 |
| 5.0 | 2006 | 99,024 |
| 8.0 | 2015 | 120,737 |
| 12.0 | 2019 | 137,994 |
| 15.0 | 2022 | 149,186 |
| 16.0 | 2024 | 154,998 |
Quick Facts
| Property | Value |
|---|---|
| Total assigned (v16.0) | 154,998 |
| Percentage of code space | ~13.9% |
| Earliest assigned | U+0000–U+007F (ASCII, Unicode 1.0) |
| General category for unassigned | Cn |
| Stability guarantee | Never unassigned or reassigned |
| Primary source file | UnicodeData.txt |
| Includes control characters? | Yes (U+0000–U+001F, U+007F, U+0080–U+009F) |
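The totals in this table can be approximated by scanning the whole code space with Python's `unicodedata`. This is only a sketch: the result reflects whichever Unicode version the interpreter bundles (see `unidata_version`), so it should be expected to line up with the Unicode 16.0 figure only on a runtime that ships that version. It also assumes the headline count covers graphic and format characters and leaves out controls, private use, and surrogates.

```python
import unicodedata
from collections import Counter

# Tally the General Category of every code point in U+0000..U+10FFFF.
counts = Counter(unicodedata.category(chr(cp)) for cp in range(0x110000))

total = sum(counts.values())
not_cn = total - counts["Cn"]
# Graphic and format characters: drop controls, private use, and surrogates.
graphic_and_format = not_cn - counts["Cc"] - counts["Co"] - counts["Cs"]

print(unicodedata.unidata_version)
print(f"code space:        {total:,}")
print(f"control (Cc):      {counts['Cc']:,}")
print(f"surrogate (Cs):    {counts['Cs']:,}")
print(f"private use (Co):  {counts['Co']:,}")
print(f"graphic + format:  {graphic_and_format:,}")
print(f"share of space:    {graphic_and_format / total:.1%}")
```

The scan visits about 1.1 million code points, so it takes a moment to run.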