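Unicode Stability Policy
A policy guaranteeing that once a character is assigned, its code point and name will never change. Properties may be refined, but the assignment is permanent.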
What is the Unicode Stability Policy?
The Unicode Stability Policy is a set of formal commitments by the Unicode Consortium guaranteeing that certain properties and assignments in the Unicode Standard will not change in ways that would break existing implementations. These policies protect developers who build software on Unicode: you can rely on the behavior of characters you support today remaining consistent in future Unicode versions.
Stability policies are published on unicode.org and are considered binding. They were formalized over several Unicode versions and have become more comprehensive as the standard matured.
Core Stability Guarantees
Character Assignment Stability
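Once a code point is assigned to a character, that assignment is permanent:
- No character is ever unassigned from its code point
- No code point is ever reassigned to a different character
- Text encoded under an earlier version decodes to the same characters under a later one. This guarantee applies from Unicode 2.0 onward, because a few characters (notably the Hangul syllables) were moved or removed before then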
Character Name Stability
Normative character names are immutable. If a name was published with a typo or error (for example,
U+FE18's name contains the documented misspelling "BRAKCET" instead of "BRACKET"), the erroneous name
cannot be changed. Instead, a formal alias carrying the correction is added in NameAliases.txt, and
applications that match characters by name are expected to accept both the original name and its aliases.
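As a small illustration of how the original name and its alias coexist, the sketch below uses Python's standard `unicodedata` module. It assumes Python 3.3 or later, where `unicodedata.lookup` also resolves formal name aliases; the variable names are illustrative only.

```python
import unicodedata

# The normative name keeps the original misspelling permanently.
original_name = unicodedata.name("\uFE18")

# The corrected spelling exists only as a formal alias (NameAliases.txt).
# Assumption: Python 3.3+, where lookup() also accepts name aliases.
corrected_name = original_name.replace("BRAKCET", "BRACKET")
char_via_alias = unicodedata.lookup(corrected_name)
char_via_name = unicodedata.lookup(original_name)

print(original_name)
print(char_via_alias == char_via_name == "\uFE18")
```

Both lookups resolve to the same code point, which is why name-based matching should treat the alias as equivalent to the published name.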
Normalization Stability (since Unicode 4.1)
The canonical decomposition of any character assigned before Unicode 4.1 is fixed permanently. This means:
- NFC/NFD results for pre-4.1 characters will never change
- Software can normalize text and store it without re-normalizing after Unicode updates
- New characters (post-4.1) may have decompositions, but once published they are also stable
```python
import unicodedata

# This NFC result will be the same in all future Unicode versions
# for characters assigned before Unicode 4.1.
# U+0065 (e) followed by U+0301 (combining acute) composes to U+00E9 (é).
print(unicodedata.normalize("NFC", "caf\u0065\u0301"))  # "café"
```
Identifier Stability (since Unicode 5.1)
Code points in ID_Start and ID_Continue properties do not lose those properties. This ensures
that identifiers valid in one Unicode version remain valid in future versions.
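One way to see identifier rules in practice is Python's own `str.isidentifier()`, which follows the XID_Start/XID_Continue variants of these properties (per PEP 3131). The sample strings below are illustrative choices. The sketch only shows what the running interpreter accepts; it does not compare Unicode versions.

```python
# Sample strings chosen for illustration only.
candidates = ["café", "变量", "_private", "2fast", "a-b"]

# str.isidentifier() applies Python's Unicode-based identifier rules.
results = {name: name.isidentifier() for name in candidates}

for name, ok in results.items():
    print(f"{name!r}: {'valid' if ok else 'invalid'}")
```

Under the stability guarantee, the strings reported as valid here would be expected to stay valid as the interpreter adopts newer Unicode data.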
Case Mapping Stability (partial)
Simple case mappings (uppercase, lowercase, titlecase) do not change for existing characters. Special case mappings (context-dependent) may be updated, but simple cases are stable.
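To make the distinction between simple and special mappings concrete, here is a short sketch using Python's built-in string methods, which apply the full case mappings. The example strings are illustrative choices, not taken from the policy text.

```python
# One-to-one mapping: a single code point maps to a single code point.
simple = "a".upper()                        # "A"

# Special (full) mapping: one code point expands to two.
expanded = "\u00DF".upper()                 # "ß" -> "SS"

# Context-dependent mapping: capital sigma lowercases to the final
# form "ς" at the end of a word and to "σ" elsewhere.
word = "\u039F\u0394\u039F\u03A3".lower()   # "ΟΔΟΣ" -> "οδος"

print(simple, expanded, word)
```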
Bidi Stability (Unicode 6.3+)
The bidi class of assigned characters does not change in ways that would alter the visual presentation of existing text.
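Bidi classes can be inspected with `unicodedata.bidirectional`. The sketch below, using arbitrarily chosen sample characters, prints the class that the bidirectional algorithm relies on for each one.

```python
import unicodedata

# Sample characters picked for illustration.
samples = {
    "LATIN SMALL LETTER A": "a",
    "HEBREW LETTER ALEF": "\u05D0",
    "ARABIC LETTER ALEF": "\u0627",
    "DIGIT ONE": "1",
}

# Map each label to its Bidi_Class abbreviation (L, R, AL, EN, ...).
bidi_classes = {label: unicodedata.bidirectional(ch) for label, ch in samples.items()}

for label, cls in bidi_classes.items():
    print(f"{label}: {cls}")
```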
What Is NOT Stable
The Unicode Stability Policy does not guarantee everything:
| Property | Stable? | Notes |
|---|---|---|
| Character assignment | Yes | Never revoked |
| Character name | Yes | Errors become aliases |
| Normalization (pre-4.1) | Yes | Fixed in Unicode 4.1 |
| General category | Mostly | Won't change in breaking ways |
| Emoji presentation | No | Can change between versions |
| Script property | Mostly | New scripts may reclassify |
| Default ignorable | Partially | May change for unassigned |
Practical Impact for Developers
Text storage: Because character assignments and normalization are stable, you can safely store Unicode text in databases. A code point stored today will mean the same character forever.
Identifier parsing: Programming languages that validate identifiers with ID_Start/ID_Continue
or the closely related XID_Start/XID_Continue (Python, JavaScript, Rust) benefit from identifier
stability: identifiers that are valid today remain valid.
Emoji rendering: Emoji properties are not fully stable. A character can gain or lose the
Emoji_Presentation property between versions, which changes its default rendering. This is why
the appearance of existing text sometimes changes unexpectedly after a platform emoji update.
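Because default presentation can shift between versions, text that needs a particular style can request it explicitly with a variation selector. The following is a minimal sketch, assuming the base character is one for which emoji variation sequences are defined (U+2764 HEAVY BLACK HEART is one). The helper name `with_presentation` is hypothetical, and whether a renderer honors the request depends on the platform and font.

```python
TEXT_SELECTOR = "\uFE0E"   # VARIATION SELECTOR-15: request text presentation
EMOJI_SELECTOR = "\uFE0F"  # VARIATION SELECTOR-16: request emoji presentation

def with_presentation(base: str, emoji: bool = True) -> str:
    """Append a variation selector to one base character (hypothetical helper)."""
    if len(base) != 1:
        raise ValueError("expected exactly one base code point")
    return base + (EMOJI_SELECTOR if emoji else TEXT_SELECTOR)

heart_emoji = with_presentation("\u2764")
heart_text = with_presentation("\u2764", emoji=False)

# Show the code points that make up the emoji-style sequence.
print([f"U+{ord(c):04X}" for c in heart_emoji])
```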
Normalization in indexes: Because NFC/NFD results are stable for assigned characters, you can normalize text before indexing it and rely on those indexes remaining valid after Unicode updates.
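A minimal sketch of this pattern follows. The helper names `index_key` and `only_assigned` are hypothetical, and the in-memory dictionary stands in for a real index. The check for unassigned code points reflects that the guarantee covers characters assigned in the Unicode version used to normalize.

```python
import unicodedata

def index_key(text: str) -> str:
    """Normalize to NFC so canonically equivalent strings share one key."""
    return unicodedata.normalize("NFC", text)

def only_assigned(text: str) -> bool:
    """True if no code point is unassigned (General_Category Cn) in this Unicode version."""
    return all(unicodedata.category(ch) != "Cn" for ch in text)

# Toy index: normalized title -> list of document ids.
index = {}
for doc_id, title in enumerate(["caf\u00E9", "cafe\u0301", "naive"]):
    if only_assigned(title):
        index.setdefault(index_key(title), []).append(doc_id)

print(index)
```

The precomposed and decomposed spellings of "café" land under the same key, so a lookup normalized the same way finds both documents.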
Quick Facts
| Property | Value |
|---|---|
| Published by | Unicode Consortium (unicode.org/policies/) |
| Character assignment stability | Permanent — no changes ever |
| Name stability | Permanent — errors become aliases |
| Normalization stability | Since Unicode 4.1 |
| Identifier stability | Since Unicode 5.1 |
| Bidi stability | Since Unicode 6.3 |
| Emoji presentation stability | Not guaranteed |
| Key beneficiaries | Programming languages, databases, text search |