Plane
A contiguous block of 65,536 code points. Unicode has 17 planes (0–16): Plane 0 is the BMP, Plane 1 is the SMP (emoji, historic scripts), and Plane 2 is the SIP (CJK extensions).
What is a Unicode Plane?
A plane is a group of 65,536 contiguous code points in the Unicode code space. Unicode divides its 1,114,112 total code points into 17 planes, numbered 0 through 16. Each plane spans exactly 0x10000 (65,536) code points:
- Plane 0: U+0000–U+FFFF
- Plane 1: U+10000–U+1FFFF
- Plane 2: U+20000–U+2FFFF
- ...
- Plane 16: U+100000–U+10FFFF
When a code point is written with six hex digits, the first two digits give the plane number. For example, U+1F600 (U+01F600 when padded) is in Plane 1, U+20001 (U+020001) is in Plane 2, and U+10FFFF is in Plane 16 (hex 10).
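The range arithmetic above can be sketched in a few lines of Python. The helper names `plane_range` and `six_digit` are illustrative assumptions, not part of any standard library.

```python
def plane_range(plane: int) -> tuple[int, int]:
    """Return the first and last code point of a plane (0-16)."""
    if not 0 <= plane <= 16:
        raise ValueError("Unicode has only planes 0 through 16")
    first = plane << 16      # the plane number sits above the low 16 bits
    last = first | 0xFFFF    # every plane ends in ...FFFF
    return first, last


def six_digit(cp: int) -> str:
    """Format a code point padded to six hex digits (hypothetical helper)."""
    return f"{cp:06X}"


first, last = plane_range(16)
print(f"U+{first:04X}-U+{last:04X}")  # U+100000-U+10FFFF
print(six_digit(0x1F600)[:2])         # 01
```

Reading the first two digits of the padded form as a hex number gives the plane, so `10` means Plane 16.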
The 17 Planes
| Plane | Range | Name | Key Contents |
|---|---|---|---|
| 0 | U+0000–U+FFFF | Basic Multilingual Plane (BMP) | Latin, CJK, Arabic, Hangul, most modern scripts |
| 1 | U+10000–U+1FFFF | Supplementary Multilingual Plane (SMP) | Historic scripts, emoji, musical notation, math |
| 2 | U+20000–U+2FFFF | Supplementary Ideographic Plane (SIP) | CJK unified ideograph extensions B–F and I |
| 3 | U+30000–U+3FFFF | Tertiary Ideographic Plane (TIP) | CJK extensions G (added Unicode 13.0) and H (added Unicode 15.0) |
| 4–13 | U+40000–U+DFFFF | Unassigned | No characters assigned |
| 14 | U+E0000–U+EFFFF | Supplementary Special-purpose Plane (SSP) | Tags (U+E0000–U+E007F), variation selectors |
| 15 | U+F0000–U+FFFFF | Private Use Area A (PUA-A) | Application-defined characters |
| 16 | U+100000–U+10FFFF | Private Use Area B (PUA-B) | Application-defined characters |
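As a rough sketch, the table can be turned into a lookup from code point to plane name. The dictionary `PLANE_NAMES` and the function `plane_name` are names made up for this example; the labels are copied from the table above.

```python
# Plane labels taken from the table; planes 4-13 fall through to "Unassigned".
PLANE_NAMES = {
    0: "Basic Multilingual Plane (BMP)",
    1: "Supplementary Multilingual Plane (SMP)",
    2: "Supplementary Ideographic Plane (SIP)",
    3: "Tertiary Ideographic Plane (TIP)",
    14: "Supplementary Special-purpose Plane (SSP)",
    15: "Private Use Area A (PUA-A)",
    16: "Private Use Area B (PUA-B)",
}


def plane_name(cp: int) -> str:
    """Return the name of the plane that contains code point cp."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("code point outside the Unicode code space")
    return PLANE_NAMES.get(cp >> 16, "Unassigned")


print(plane_name(0x1F600))  # Supplementary Multilingual Plane (SMP)
print(plane_name(0x50000))  # Unassigned
```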
Why Planes Exist
The plane structure was introduced when Unicode expanded from a 16-bit code space (65,536 code points) to the current code space of 1,114,112 code points, which needs 21 bits to represent. The original 16-bit encoding was called UCS-2; its successor, UTF-16, reaches code points above the BMP by using surrogate pairs.
The upper bound of U+10FFFF was chosen deliberately to match the maximum value encodable by UTF-16 surrogate pairs: 1,024 high surrogates × 1,024 low surrogates give exactly 1,048,576 supplementary code points, and adding the 65,536 BMP positions yields 1,114,112 in total. The reach of UTF-16 therefore sets the boundary of the code space.
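The arithmetic behind that limit can be checked with a short sketch. The function name `to_surrogate_pair` and the constants are chosen for this example; the computation follows the standard UTF-16 surrogate scheme (subtract 0x10000, then split the 20-bit offset into two 10-bit halves).

```python
HIGH_START = 0xD800          # first high (lead) surrogate
LOW_START = 0xDC00           # first low (trail) surrogate
SURROGATES_PER_HALF = 0x400  # 1,024 values in each half


def to_surrogate_pair(cp: int) -> tuple[int, int]:
    """Split a supplementary code point into its UTF-16 surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only planes 1-16 use surrogate pairs")
    offset = cp - 0x10000                  # 20-bit value
    high = HIGH_START + (offset >> 10)     # top 10 bits
    low = LOW_START + (offset & 0x3FF)     # bottom 10 bits
    return high, low


supplementary = SURROGATES_PER_HALF * SURROGATES_PER_HALF  # 1,048,576
total = supplementary + 0x10000                            # plus the BMP
print(total == 0x10FFFF + 1)                               # True
print([hex(u) for u in to_surrogate_pair(0x1F600)])        # ['0xd83d', '0xde00']
```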
Accessing Code Points in Planes
```python
# Python: the code point value reveals the plane
cp = 0x1F600           # 😀 GRINNING FACE
plane = cp >> 16       # Right-shift by 16 bits to get the plane number
print(plane)           # 1 (Plane 1 / SMP)

# Check whether the code point is in the BMP
is_bmp = cp <= 0xFFFF  # False for 😀
```
```javascript
// JavaScript: detect supplementary plane characters
function isSupplementary(cp) {
  return cp > 0xFFFF;
}
console.log(isSupplementary(0x1F600)); // true
```
Planes in Encoding
The plane determines how many code units a character needs in each Unicode encoding form. UTF-8 and UTF-16 are variable-length; UTF-32 always uses one code unit:
| Plane | UTF-8 bytes | UTF-16 code units | UTF-32 code units |
|---|---|---|---|
| 0 (BMP) | 1–3 | 1 | 1 |
| 1–16 | 4 | 2 (surrogate pair) | 1 |
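The counts in the table can be reproduced with Python's built-in codecs. This is a minimal sketch; `code_unit_counts` is a name invented for the example, and the little-endian codec variants are used only so that no byte order mark is added to the output.

```python
def code_unit_counts(cp: int) -> dict[str, int]:
    """Count the code units needed for one code point in each encoding form."""
    ch = chr(cp)
    return {
        "utf8_bytes": len(ch.encode("utf-8")),
        "utf16_units": len(ch.encode("utf-16-le")) // 2,  # 2 bytes per unit
        "utf32_units": len(ch.encode("utf-32-le")) // 4,  # 4 bytes per unit
    }


print(code_unit_counts(0x41))     # {'utf8_bytes': 1, 'utf16_units': 1, 'utf32_units': 1}
print(code_unit_counts(0x20AC))   # {'utf8_bytes': 3, 'utf16_units': 1, 'utf32_units': 1}
print(code_unit_counts(0x1F600))  # {'utf8_bytes': 4, 'utf16_units': 2, 'utf32_units': 1}
```

Surrogate code points (U+D800–U+DFFF) are not encodable on their own, so passing one to this function raises `UnicodeEncodeError`.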
Planes 4–13 are entirely unassigned, creating a large reserved area for future expansion.
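One way to check how full a plane is uses Python's `unicodedata` module. The sketch below assumes that "assigned" means any general category other than `Cn`, so private-use (`Co`) and surrogate (`Cs`) code points count as assigned; `assigned_in_plane` is a hypothetical helper name. The exact counts for planes 0–3 depend on the Unicode version bundled with the interpreter.

```python
import unicodedata


def assigned_in_plane(plane: int) -> int:
    """Count code points in a plane whose general category is not Cn."""
    start = plane << 16
    return sum(
        1
        for cp in range(start, start + 0x10000)
        if unicodedata.category(chr(cp)) != "Cn"
    )


print("Unicode data version:", unicodedata.unidata_version)
for plane in (0, 1, 2, 4, 15):
    print(plane, assigned_in_plane(plane))
```

Plane 4 should report 0, while Plane 15 reports 65,534 because its last two code points are noncharacters.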
Common Misconceptions
"Planes are always full": Most planes are sparsely populated. Plane 1 has many assigned characters but also large unassigned regions. Planes 4–13 are completely empty.
"Higher plane = less important": Plane 1 contains most modern emoji (some, such as U+2764 HEAVY BLACK HEART, are in the BMP) and many important historic scripts. The plane number reflects allocation history, not character importance.
Quick Facts
| Property | Value |
|---|---|
| Total planes | 17 (planes 0–16) |
| Code points per plane | 65,536 |
| BMP plane number | 0 |
| Emoji primary plane | 1 (SMP) |
| CJK extension plane | 2 (SIP), 3 (TIP) |
| Private use planes | 15 and 16 |
| Completely unassigned | Planes 4–13 |
| Code space extended beyond the BMP | Unicode 2.0 (1996); first supplementary characters assigned in Unicode 3.1 (2001) |