What is a Unicode version?

Each major version of the Unicode Standard adds new characters, scripts, and properties. Unicode 16.0 was released in September 2024.

What is an assigned character?

A code point that has been assigned a character in a given version of Unicode. As of Unicode 16.0, 154,998 of the 1,114,112 possible code points are assigned.

Properties

Age Property

The Unicode version in which a character was first assigned. Useful for checking whether a character is supported across different systems and software versions.

2022-04-06 · Updated 2024-07-18

What Is the Age Property?

The Age property records the Unicode version in which a character was first assigned a code point. This property allows developers and researchers to determine which version of Unicode introduced a particular character—essential for compatibility testing, font planning, and detecting whether a string contains characters newer than a supported Unicode version.

Age values are version strings like 1.1, 2.0, 3.0, ..., 15.1. A code point that has never been assigned has an implied age of Unassigned. The property is immutable: a character's Age never changes once assigned, even if other properties (such as its general category) are later changed.

Checking Age in Python

Python's unicodedata module does not expose Age directly; its unicodedata.unidata_version string only tells you which Unicode version the module implements. To find the version that introduced a character, use a third-party library such as unicodedataplus:

import sys
import unicodedata

# The Unicode version Python's unicodedata module implements
print(unicodedata.unidata_version)   # e.g., "15.1.0"
print(sys.version)

# For Age lookups, use the third-party 'unicodedataplus' package
# pip install unicodedataplus
try:
    import unicodedataplus as udp
except ImportError:
    print("Install unicodedataplus for Age support")
else:
    for char in ["A", "€", "😀"]:
        age = udp.age(char)
        name = unicodedata.name(char, "<unnamed>")
        print(f"U+{ord(char):04X}  Age={age:6}  {name}")

# Expected output when unicodedataplus is installed:
# U+0041  Age=1.1     LATIN CAPITAL LETTER A
# U+20AC  Age=2.1     EURO SIGN
# U+1F600  Age=6.1     GRINNING FACE
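
If installing a third-party package is not an option, Age can also be read from DerivedAge.txt, the Unicode Character Database file listed as the spec reference for this property. The sketch below parses a few lines in that file's layout and looks characters up by binary search. The embedded excerpt and the helper names parse_derived_age and age_of are illustrative assumptions, not part of any library; a real lookup would load the full file.

```python
import bisect
from typing import List, Tuple

# A few lines in the layout used by DerivedAge.txt: "start[..end] ; age # comment".
# Illustrative excerpt only; a real lookup needs the complete file.
SAMPLE = """\
# comment lines and blank lines are skipped

0041..005A    ; 1.1 #  [26] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z
20AC          ; 2.1 #       EURO SIGN
1F600         ; 6.1 #       GRINNING FACE
"""

Range = Tuple[int, int, str]  # (first code point, last code point, age)


def parse_derived_age(text: str) -> List[Range]:
    """Return sorted (start, end, age) ranges from DerivedAge.txt-style text."""
    ranges = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        span, age = (field.strip() for field in line.split(";"))
        start, _, end = span.partition("..")
        ranges.append((int(start, 16), int(end or start, 16), age))
    return sorted(ranges)


def age_of(char: str, ranges: List[Range]) -> str:
    """Look up the Age of one character; unlisted code points are Unassigned."""
    cp = ord(char)
    # 0x110000 is above every valid code point, so this finds the last
    # range whose start is <= cp.
    i = bisect.bisect_right(ranges, (cp, 0x110000, "")) - 1
    if i >= 0 and ranges[i][0] <= cp <= ranges[i][1]:
        return ranges[i][2]
    return "Unassigned"


ranges = parse_derived_age(SAMPLE)
for ch in ["A", "€", "\U0001F600"]:
    print(f"U+{ord(ch):04X}", age_of(ch, ranges))
```

Storing ranges rather than one entry per code point keeps the table small, because most lines in the file cover long runs of code points that share an age.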

Why Age Matters

Emoji support timelines: Emoji characters have been added across many Unicode versions. Age tells you whether a character requires Unicode 6.0+ (face emoji), 8.0+ (skin tone modifiers), or 13.0+ (newer additions). Mobile operating systems typically lag Unicode releases by 12–24 months.

Legacy system compatibility: A database or file format limited to Unicode 3.0 cannot store CJK Extension B characters (added in 3.1) or any emoji. Age detection lets you validate input against a maximum supported version.

Font auditing: Font engineers use Age to prioritize glyph development—newer characters with high Age values may not yet be covered by commonly deployed fonts.
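
The legacy-compatibility check described above can be sketched as a filter over a string. Age values must be compared as numbers, not as text, because "15.1" sorts before "2.0" as a string. The function names and the small DEMO_AGES table are hypothetical stand-ins; in practice age_of would be backed by a real Age lookup.

```python
from typing import Callable, List, Tuple


def to_version(age: str) -> Tuple[int, int]:
    """Turn an Age string such as "15.1" into a comparable tuple (15, 1)."""
    major, minor = age.split(".")[:2]
    return int(major), int(minor)


def chars_newer_than(
    text: str, max_age: str, age_of: Callable[[str], str]
) -> List[str]:
    """Return the characters whose Age exceeds max_age or is Unassigned."""
    limit = to_version(max_age)
    too_new = []
    for ch in text:
        age = age_of(ch)
        if age == "Unassigned" or to_version(age) > limit:
            too_new.append(ch)
    return too_new


# Hypothetical stand-in for a real Age lookup (assumption for this demo).
DEMO_AGES = {"A": "1.1", "€": "2.1", "\U0001F600": "6.1"}


def demo_age_of(ch: str) -> str:
    return DEMO_AGES.get(ch, "Unassigned")


# Only the emoji is newer than Unicode 3.0.
print(chars_newer_than("A€\U0001F600", "3.0", demo_age_of))  # ['😀']
```

Treating Unassigned as "too new" is a conservative choice: a code point with no Age in the data being consulted may have been assigned in a later Unicode version.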

Notable Age Milestones

Version	Year	Notable Additions
1.1	1993	Latin, Greek, Cyrillic, CJK core, Hebrew, Arabic
2.1	1998	Euro sign € and object replacement character (U+FFFC)
3.1	2001	CJK Extension B (42,711 characters)
6.0	2010	Emoji (first large batch, 722 characters)
8.0	2015	Skin tone modifiers
13.0	2020	55 new emoji, Chorasmian, Yezidi scripts
15.1	2023	CJK Extension I (622 ideographs)
16.0	2024	5,185 new characters, including the Garay and Sunuwar scripts

Quick Facts

Property	Value
Unicode property name	`Age`
Type	Catalog (version string)
Python built-in	No (use `unicodedataplus` or `regex` package)
Immutability	Age never changes after assignment
Unassigned value	`Unassigned` (implied)
Spec reference	Unicode Standard Annex #44, `DerivedAge.txt`

Related Terms

Unicode version, assigned character

