What is an Unassigned Code Point?

A code point that has not yet been assigned to a character in any Unicode version, categorized as Cn (Unassigned). It may be assigned in future versions.

What is a Noncharacter?

Code points permanently reserved for internal use (66 in total): U+FDD0–U+FDEF and U+nFFFE/U+nFFFF for each plane. They are valid in text but should not be interchanged externally.
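
The total of 66 follows directly from the two ranges listed above. As a minimal sketch (the helper name `noncharacters` is illustrative, not part of any library), the set can be enumerated like this:

```python
def noncharacters():
    """Yield every Unicode noncharacter code point (illustrative helper)."""
    # The contiguous run of 32 noncharacters in the BMP: U+FDD0..U+FDEF
    yield from range(0xFDD0, 0xFDF0)
    # The last two code points of each of the 17 planes: U+nFFFE and U+nFFFF
    for plane in range(17):
        base = plane << 16
        yield base | 0xFFFE
        yield base | 0xFFFF

all_nonchars = list(noncharacters())
print(len(all_nonchars))  # 32 + 17 * 2 = 66
```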

What is a Private Use Area?

Reserved ranges in which organizations can assign their own characters: the BMP PUA (U+E000–U+F8FF) and the Supplementary PUAs in Planes 15 and 16.
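
For illustration, a membership check over these ranges might look like the sketch below. The names `PUA_RANGES` and `is_private_use` are hypothetical. The end points of the two supplementary areas stop at U+nFFFD because the last two code points of each plane are noncharacters.

```python
# Private Use Area ranges (inclusive)
PUA_RANGES = [
    (0xE000, 0xF8FF),      # BMP Private Use Area
    (0xF0000, 0xFFFFD),    # Supplementary Private Use Area-A (Plane 15)
    (0x100000, 0x10FFFD),  # Supplementary Private Use Area-B (Plane 16)
]

def is_private_use(cp: int) -> bool:
    """Return True if cp falls inside one of the private use ranges."""
    return any(lo <= cp <= hi for lo, hi in PUA_RANGES)

pua_total = sum(hi - lo + 1 for lo, hi in PUA_RANGES)
print(pua_total)                # 6,400 + 65,534 + 65,534 = 137,468
print(is_private_use(0xE001))   # True
print(is_private_use(0xFFFFE))  # False: U+FFFFE is a noncharacter
```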

Unicode Standard

Reserved Code Point

A code point reserved for future standardization; it differs from noncharacters, which are permanently reserved, and from private use areas, which users can assign themselves.

2021-11-01 · Updated 2024-09-17

What is a Reserved Code Point?

A reserved code point is a position in the Unicode code space that has not yet been assigned to any character and is not permanently designated for a specific purpose (like noncharacters or private use). The Unicode Consortium holds these positions in reserve for potential future character assignments. As new scripts, symbols, and characters are added in future Unicode versions, they are taken from the pool of reserved code points.

Reserved code points are distinct from:
- Unassigned code points: Often used interchangeably with "reserved," but technically "unassigned" means not yet having a character assignment, while "reserved" may imply more deliberate designation
- Noncharacters: 66 code points permanently reserved and never to be assigned characters
- Private Use Area: Permanently designated for user-defined characters

Current State

As of Unicode 16.0 (154,998 assigned characters), approximately 819,000 code points are unassigned, a large majority of the 1,114,112 positions in the code space. The Unicode Consortium has far more room than it currently needs:

Total code space:       1,114,112
Assigned characters:      154,998  (~13.9%)
Private Use Area:         137,468  (~12.3%)
Surrogates:                 2,048  ( ~0.2%)
Noncharacters:                 66  ( ~0.01%)
Available (unassigned):  ~819,000  (~73.5%)
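
The approximate figure in the last row can be reproduced by subtraction. The sketch below assumes the 154,998 count covers graphic and format characters only, so the 65 control characters (not listed in the table) are subtracted as well; treat that row as an assumption rather than part of the table above.

```python
TOTAL = 0x110000  # 1,114,112 code points: U+0000..U+10FFFF

counts = {
    "assigned (graphic + format)": 154_998,
    "control": 65,  # assumption: not listed in the table above
    "private use": 137_468,
    "surrogates": 2_048,
    "noncharacters": 66,
}

# Whatever is left over is reserved for future assignment
reserved = TOTAL - sum(counts.values())

for label, n in {**counts, "reserved": reserved}.items():
    print(f"{label:<30}{n:>10,}  ({n / TOTAL:.2%})")
```

Under those assumptions the remainder comes to 819,467, which matches the rounded ~819,000 in the table.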

Where Reserved Code Points Appear

Reserved code points are scattered throughout the code space, not concentrated in one region. Some patterns:

Gaps within blocks: A block may have some code points assigned and others reserved (e.g., the Greek and Coptic block leaves positions such as U+0378, U+0379, and U+03A2 unassigned)
Entire sub-ranges: Planes 4–13 (U+40000–U+DFFFF) contain no assigned characters at all
Within the BMP: Scattered positions within named blocks
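
These patterns can be observed directly. The following sketch (the function name `unassigned_runs` is hypothetical) groups consecutive category-Cn code points into runs. Its results depend on the Unicode version bundled with the Python interpreter, and because noncharacters also report Cn, they appear inside the runs.

```python
import unicodedata

def unassigned_runs(start: int, end: int):
    """Return inclusive (first, last) runs of category-Cn code points in start..end."""
    runs, run_start = [], None
    for cp in range(start, end + 1):
        if unicodedata.category(chr(cp)) == "Cn":
            if run_start is None:
                run_start = cp  # a new run begins here
        elif run_start is not None:
            runs.append((run_start, cp - 1))  # the run ended on the previous code point
            run_start = None
    if run_start is not None:
        runs.append((run_start, end))  # the run reaches the end of the range
    return runs

# Gaps inside the Greek and Coptic block (U+0370..U+03FF)
for first, last in unassigned_runs(0x0370, 0x03FF):
    print(f"U+{first:04X}..U+{last:04X}")

# Plane 4 comes back as a single run covering the whole plane
print(unassigned_runs(0x40000, 0x4FFFF))
```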

Handling Reserved Code Points

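Applications should handle reserved code points gracefully rather than failing on them. The general category is enough to tell them apart from private use and surrogate code points: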

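import unicodedata

def classify_code_point(cp: int) -> str:
    """Classify a code point using the Unicode data bundled with Python."""
    # Noncharacters also report category Cn, so check them first:
    # U+FDD0..U+FDEF plus the last two code points of every plane.
    if 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE:
        return "noncharacter"
    category = unicodedata.category(chr(cp))
    if category == "Cn":
        # Unassigned in this Python's Unicode version; may be assigned later
        return "unassigned/reserved"
    elif category == "Co":
        return "private use"
    elif category == "Cs":
        return "surrogate"
    else:
        return f"assigned ({category})"

print(classify_code_point(0x0041))    # assigned (Lu)
print(classify_code_point(0xE001))    # private use
print(classify_code_point(0xD800))    # surrogate
print(classify_code_point(0x0378))    # unassigned/reserved
print(classify_code_point(0xFFFE))    # noncharacter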

Stability Guarantee

Unicode's stability policy allows reserved code points to become assigned in future versions, but guarantees that:
- An assigned code point is never unassigned
- A code point is never reassigned to a different character

The properties of a reserved code point, by contrast, are not fixed and may change when it is assigned.

This means software written today that skips or rejects reserved code points may need updating when those points are assigned in a future Unicode version.
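
One way to reduce that maintenance burden is to flag unassigned code points instead of rejecting the text that contains them. The sketch below is only one possible approach, and `report_unassigned` is a hypothetical helper. What counts as unassigned depends on the Unicode version of the runtime, which `unicodedata.unidata_version` reports.

```python
import unicodedata

def report_unassigned(text: str):
    """Return (index, code point) pairs for characters this runtime sees as Cn."""
    return [
        (i, ord(ch))
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Cn"
    ]

sample = "A\u0378B"
hits = report_unassigned(sample)

print(unicodedata.unidata_version)  # Unicode version bundled with this Python
for index, cp in hits:
    # Log and keep going; the text itself is passed through unchanged
    print(f"index {index}: U+{cp:04X} is unassigned in this Unicode version")
```

Because the text is left intact, a newer runtime that knows the character will simply stop reporting it, with no change to the calling code.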

The U+0378 Example

U+0378 is an example of a reserved code point within the Greek and Coptic block (U+0370–U+03FF). The block contains letters and symbols, but U+0378 and U+0379 have no assigned characters. They were skipped in the original Greek assignments and remain reserved pending any future need.

Quick Facts

Property	Value
General category	Cn (Unassigned)
Approximate count	~819,000 (Unicode 16.0)
Percentage of code space	~73.5%
Can become assigned?	Yes — in future Unicode versions
Ever removed once assigned?	No — stability policy prohibits this
Entirely unassigned planes	Planes 4–13
Can be used privately?	Not recommended — use PUA instead

Related Terms

Unassigned Code Point · Noncharacter · Private Use Area

More in Unicode Standard

Assigned Character

A code point that has been given a character assignment in some Unicode version. As of Unicode 16.0, 1,114,112 …

Basic Multilingual Plane (BMP)

Plane 0 (U+0000–U+FFFF), including Latin, Greek, Cyrillic, CJK, Arabic, and most symbols …

CJK

Chinese, Japanese, and Korean: the unified Han ideograph block in Unicode and related …

Plane

A contiguous block of 65,536 code points. Unicode has 17 planes (0–16): Plane …

Supplementary Plane

Planes 1–16 (U+10000–U+10FFFF), containing emoji, historic scripts, CJK extensions, and musical notation. …

Han Unification

The process of mapping Chinese, Japanese, and Korean ideographs that share a …

Hangul Jamo

The individual consonant and vowel components (jamo) of the Korean Hangul writing …

ISO 10646 / Universal Character Set

Synchronized with Unicode, defining the same character repertoire and code points but …
