Basic Multilingual Plane (BMP)

Plane 0 (U+0000–U+FFFF), containing the most commonly used characters including Latin, Greek, Cyrillic, CJK, Arabic, and most symbols. Characters here fit in a single UTF-16 code unit.

2021-06-02 · Updated 2024-08-15

What is the Basic Multilingual Plane?

The Basic Multilingual Plane (BMP) is Plane 0 of the Unicode code space, covering code points U+0000 through U+FFFF — a range of exactly 65,536 positions. It was designed to hold all the characters needed for modern text in the world's actively used scripts, and it largely succeeded: the Latin, Greek, Cyrillic, Arabic, Hebrew, Devanagari, CJK, and dozens of other scripts all fit within the BMP.

The BMP's boundaries matter beyond organization. Because every BMP code point fits in a single 16-bit value (0x0000–0xFFFF), each BMP character can be stored as a single code unit in UTF-16. The BMP is also exactly the repertoire that UCS-2, the fixed-width predecessor to UTF-16, could represent.
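The 16-bit boundary can be checked with plain arithmetic. The following is a minimal sketch; the helper names `isBmpCodePoint` and `planeOf` are illustrative and do not come from any standard API.

```javascript
// Hypothetical helper: true when a code point lies in Plane 0 (U+0000–U+FFFF).
function isBmpCodePoint(codePoint) {
  return Number.isInteger(codePoint) && codePoint >= 0x0000 && codePoint <= 0xFFFF;
}

// Hypothetical helper: each plane holds 0x10000 (65,536) code points,
// so the plane number is the code point divided by 0x10000, rounded down.
function planeOf(codePoint) {
  return Math.floor(codePoint / 0x10000);
}

console.log(isBmpCodePoint("A".codePointAt(0)));          // true  (U+0041)
console.log(isBmpCodePoint("\u{1F389}".codePointAt(0)));  // false (U+1F389, 🎉)
console.log(planeOf(0x1F389));                            // 1
```

`codePointAt()` is used rather than `charCodeAt()` because the latter returns a single UTF-16 code unit, which is never above 0xFFFF.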

What Lives in the BMP

The BMP is organized into blocks — contiguous ranges assigned to specific scripts or purposes. Notable regions include:

Range	Contents
U+0000–U+007F	Basic Latin (ASCII)
U+0080–U+00FF	Latin-1 Supplement
U+0370–U+03FF	Greek and Coptic
U+0400–U+04FF	Cyrillic
U+0600–U+06FF	Arabic
U+0900–U+097F	Devanagari
U+3040–U+309F	Hiragana
U+30A0–U+30FF	Katakana
U+4E00–U+9FFF	CJK Unified Ideographs (core)
U+AC00–U+D7AF	Hangul Syllables (11,172 precomposed)
U+D800–U+DFFF	Surrogates (reserved for UTF-16, not characters)
U+E000–U+F8FF	Private Use Area
U+FFF0–U+FFFF	Specials (including U+FFFD replacement character)
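As an illustration of how block ranges are used, the rows above can be turned into a small lookup. This sketch covers only the regions listed in the table, not the full Unicode block list, and `BMP_REGIONS` and `regionOf` are hypothetical names.

```javascript
// Rows copied from the table above; this is NOT the complete set of BMP blocks.
const BMP_REGIONS = [
  [0x0000, 0x007F, "Basic Latin (ASCII)"],
  [0x0080, 0x00FF, "Latin-1 Supplement"],
  [0x0370, 0x03FF, "Greek and Coptic"],
  [0x0400, 0x04FF, "Cyrillic"],
  [0x0600, 0x06FF, "Arabic"],
  [0x0900, 0x097F, "Devanagari"],
  [0x3040, 0x309F, "Hiragana"],
  [0x30A0, 0x30FF, "Katakana"],
  [0x4E00, 0x9FFF, "CJK Unified Ideographs"],
  [0xAC00, 0xD7AF, "Hangul Syllables"],
  [0xD800, 0xDFFF, "Surrogates"],
  [0xE000, 0xF8FF, "Private Use Area"],
  [0xFFF0, 0xFFFF, "Specials"],
];

// Hypothetical helper: returns the region name, or null when the code point
// falls outside every row listed above.
function regionOf(codePoint) {
  const row = BMP_REGIONS.find(([start, end]) => codePoint >= start && codePoint <= end);
  return row ? row[2] : null;
}

console.log(regionOf("\u03A9".codePointAt(0))); // "Greek and Coptic" (U+03A9)
console.log(regionOf("\u3042".codePointAt(0))); // "Hiragana" (U+3042)
console.log(regionOf(0x2603));                  // null (in the BMP, but not in this table)
```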

The Surrogate Hole

One important quirk: the range U+D800–U+DFFF (2,048 code points) is permanently reserved for surrogates, the mechanism UTF-16 uses to encode characters above U+FFFF. These code points can never be assigned to characters. The range is split into high surrogates (U+D800–U+DBFF) and low surrogates (U+DC00–U+DFFF). You will sometimes see the set of values that a single UTF-16 code unit can encode directly described as the "BMP minus surrogates."
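Because surrogates are only meaningful in pairs (a high surrogate in U+D800–U+DBFF followed by a low surrogate in U+DC00–U+DFFF), a surrogate code unit that appears alone makes a string ill-formed. The sketch below scans a JavaScript string for such lone surrogates; `findLoneSurrogates` is a hypothetical helper name.

```javascript
const isHighSurrogate = (unit) => unit >= 0xD800 && unit <= 0xDBFF;
const isLowSurrogate = (unit) => unit >= 0xDC00 && unit <= 0xDFFF;

// Hypothetical helper: returns the indices of surrogate code units that are
// not part of a valid high+low pair.
function findLoneSurrogates(str) {
  const lone = [];
  for (let i = 0; i < str.length; i++) {
    const unit = str.charCodeAt(i);
    if (isHighSurrogate(unit) && i + 1 < str.length && isLowSurrogate(str.charCodeAt(i + 1))) {
      i++; // valid pair: skip the low surrogate as well
    } else if (isHighSurrogate(unit) || isLowSurrogate(unit)) {
      lone.push(i);
    }
  }
  return lone;
}

console.log(0xDFFF - 0xD800 + 1);             // 2048 reserved code points
console.log(findLoneSurrogates("\u{1F389}")); // []  (a proper pair)
console.log(findLoneSurrogates("a\uD800b"));  // [1] (high surrogate with no partner)
```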

BMP vs Supplementary Characters

Any character with a code point above U+FFFF is a supplementary character and requires special handling in encodings optimized for the BMP:

Encoding	BMP character	Supplementary character
UTF-8	1–3 bytes	4 bytes
UTF-16	1 code unit (2 bytes)	2 code units (4 bytes, surrogate pair)
UTF-32	1 code unit (4 bytes)	1 code unit (4 bytes, no difference)
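The sizes in the table follow directly from the code point value. A minimal sketch, assuming the input is a valid Unicode scalar value (not a surrogate); `encodedSizes` is a hypothetical helper, not a built-in.

```javascript
// Hypothetical helper: encoded size of one code point in each encoding form.
function encodedSizes(codePoint) {
  let utf8Bytes;
  if (codePoint <= 0x7F) utf8Bytes = 1;        // ASCII
  else if (codePoint <= 0x7FF) utf8Bytes = 2;
  else if (codePoint <= 0xFFFF) utf8Bytes = 3; // rest of the BMP
  else utf8Bytes = 4;                          // supplementary planes

  const utf16Units = codePoint <= 0xFFFF ? 1 : 2; // 2 means a surrogate pair
  return { utf8Bytes, utf16Units, utf16Bytes: utf16Units * 2, utf32Bytes: 4 };
}

console.log(encodedSizes(0x0041));  // { utf8Bytes: 1, utf16Units: 1, utf16Bytes: 2, utf32Bytes: 4 }
console.log(encodedSizes(0x4E2D));  // { utf8Bytes: 3, utf16Units: 1, utf16Bytes: 2, utf32Bytes: 4 }
console.log(encodedSizes(0x1F389)); // { utf8Bytes: 4, utf16Units: 2, utf16Bytes: 4, utf32Bytes: 4 }
```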

In UTF-16, supplementary characters require a surrogate pair — two 16-bit code units working together. Most emoji fall into Plane 1 (U+1F000+) and are therefore supplementary.
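The surrogate pair arithmetic can be sketched as follows: subtract 0x10000 to get a 20-bit offset, put the top 10 bits into the high surrogate and the bottom 10 bits into the low surrogate. The function names are illustrative only.

```javascript
// Hypothetical helper: split a supplementary code point into [high, low].
function toSurrogatePair(codePoint) {
  if (codePoint < 0x10000 || codePoint > 0x10FFFF) {
    throw new RangeError("not a supplementary code point");
  }
  const offset = codePoint - 0x10000;     // 20-bit value
  const high = 0xD800 + (offset >> 10);   // top 10 bits
  const low = 0xDC00 + (offset & 0x3FF);  // bottom 10 bits
  return [high, low];
}

// Hypothetical helper: recombine a valid pair into the original code point.
function fromSurrogatePair(high, low) {
  return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00);
}

const [high, low] = toSurrogatePair(0x1F389); // 🎉
console.log(high.toString(16), low.toString(16));      // d83c df89
console.log(fromSurrogatePair(high, low).toString(16)); // 1f389
console.log(String.fromCharCode(high, low) === "\u{1F389}"); // true
```

In everyday code, `String.fromCodePoint()` and `codePointAt()` do this conversion for you; the manual version is shown only to make the bit layout visible.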

Historical Significance

Early Unicode architects hoped that 65,536 code points would be enough for all the world's languages. They were wrong. By Unicode 2.0 (1996), it was clear that CJK ideographs alone would eventually overflow the BMP, and the code space was extended to 17 planes through the surrogate mechanism. This is why legacy systems built on UCS-2 (a fixed-width 16-bit encoding) fall short: they can represent only BMP characters, so supplementary characters such as most emoji are lost or corrupted.

Common Pitfalls

UCS-2 vs UTF-16: UCS-2 encodes only the BMP using fixed 2-byte units. UTF-16 extends UCS-2 with surrogate pairs for supplementary characters. Many old systems claiming "Unicode support" actually only support UCS-2 (BMP-only).
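One practical consequence: before handing text to a BMP-only system, it can help to check whether the text contains any supplementary characters. A minimal sketch, with `fitsInUcs2` as a hypothetical helper name:

```javascript
// Hypothetical helper: true when every code point in the string is in the BMP,
// i.e. the text could be stored by a UCS-2-only (BMP-only) system.
function fitsInUcs2(str) {
  for (const ch of str) {            // for...of iterates by code point
    if (ch.codePointAt(0) > 0xFFFF) return false;
  }
  return true;
}

console.log(fitsInUcs2("h\u00E9llo \u4E2D\u6587")); // true  (all BMP)
console.log(fitsInUcs2("party \u{1F389}"));         // false (U+1F389 is in Plane 1)
```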

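Emoji in JavaScript: JavaScript strings are sequences of UTF-16 code units, so a Plane 1 emoji such as 🎉 has a .length of 2, not 1. Spreading the string or calling Array.from() iterates by code point instead, which gives 1 for a single supplementary character. Emoji built from several code points (skin-tone modifiers, ZWJ sequences) still count as more than one.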

"🎉".length      // 2 (two UTF-16 code units)
[..."🎉"].length // 1 (one Unicode code point)
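The same distinction matters when slicing. The sketch below shows how a code-unit slice can cut a surrogate pair in half, and one way to truncate by code points instead; `codePointLength` and `truncateByCodePoints` are hypothetical helper names.

```javascript
// Hypothetical helper: number of Unicode code points in a string.
function codePointLength(str) {
  return Array.from(str).length;
}

// Hypothetical helper: keep at most `max` code points, never splitting a pair.
function truncateByCodePoints(str, max) {
  return Array.from(str).slice(0, max).join("");
}

const text = "a\u{1F389}b"; // "a🎉b"
console.log(text.length);                     // 4 (code units)
console.log(codePointLength(text));           // 3 (code points)
console.log(text.slice(0, 2) === "a\uD83C");  // true: slice left a lone high surrogate
console.log(truncateByCodePoints(text, 2) === "a\u{1F389}"); // true

// Code points are still not user-perceived characters:
// thumbs up (U+1F44D) plus a skin-tone modifier (U+1F3FD) renders as one emoji.
console.log(codePointLength("\u{1F44D}\u{1F3FD}")); // 2
```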

Quick Facts

Property	Value
Code point range	U+0000–U+FFFF
Total positions	65,536
Plane number	0
Also known as	Plane 0, BMP
UTF-16 code units needed	1 (for all non-surrogate BMP chars)
Surrogate range (excluded)	U+D800–U+DFFF (2,048 points)
Characters assigned (approx.)	~55,000
Predecessor encoding	UCS-2 (BMP-only, no surrogates)

Related Terms

Plane, Supplementary Plane / Astral Plane, Code Point
