Emoji Blocks Overview

Emoji in Unicode span multiple blocks across the Basic Multilingual Plane and the Supplementary Multilingual Plane, including Miscellaneous Symbols, Dingbats, Emoticons, Miscellaneous Symbols and Pictographs, and Transport and Map Symbols. Many emoji are not single characters but sequences built with ZWJ and variation selectors. This guide maps out where emoji live in Unicode, how sequences work, and how the emoji set grows with each Unicode version.

Emoji have exploded from a single Japanese carrier's quirky addition into a global visual language encoded across multiple Unicode blocks. What started as 176 pictographs from NTT Docomo in 1999 now encompasses thousands of characters spread across several blocks, with new additions ratified annually by the Unicode Consortium. Understanding how emoji are organized in Unicode — and the complex mechanisms that make sequences and modifiers work — is essential for any developer handling modern text.

The Core Emoji Blocks

Emoji are not confined to a single block. They are distributed across several ranges in the Unicode codespace:

| Block | Range | Notable Contents |
| --- | --- | --- |
| Miscellaneous Symbols | U+2600–U+26FF | ☀️ ☁️ ⛄ ♻️ |
| Dingbats | U+2700–U+27BF | ✂️ ✈️ ✉️ |
| Miscellaneous Symbols and Pictographs | U+1F300–U+1F5FF | 🌍 🎉 🔥 💎 |
| Emoticons | U+1F600–U+1F64F | 😀 😂 😍 😭 |
| Transport and Map Symbols | U+1F680–U+1F6FF | 🚀 🚗 🚲 🛸 |
| Supplemental Symbols and Pictographs | U+1F900–U+1F9FF | 🤯 🦊 🧠 🧲 |
| Symbols and Pictographs Extended-A | U+1FA70–U+1FAFF | 🪄 🪐 🪀 |
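
To make the block layout concrete, here is a minimal Python sketch that looks up which of these blocks a single code point belongs to. The names `EMOJI_BLOCKS` and `block_of` are illustrative, not part of any library, and the table is a hand-copied subset of the Unicode block list (with Symbols and Pictographs Extended-A taken as U+1FA70–U+1FAFF), so treat it as an assumption rather than an authoritative data source.

```python
# Illustrative subset of Unicode blocks that contain emoji (not exhaustive).
EMOJI_BLOCKS = [
    (0x2600, 0x26FF, "Miscellaneous Symbols"),
    (0x2700, 0x27BF, "Dingbats"),
    (0x1F300, 0x1F5FF, "Miscellaneous Symbols and Pictographs"),
    (0x1F600, 0x1F64F, "Emoticons"),
    (0x1F680, 0x1F6FF, "Transport and Map Symbols"),
    (0x1F900, 0x1F9FF, "Supplemental Symbols and Pictographs"),
    (0x1FA70, 0x1FAFF, "Symbols and Pictographs Extended-A"),
]


def block_of(char):
    """Return the block name for a single code point, or None if not listed."""
    cp = ord(char)
    for start, end, name in EMOJI_BLOCKS:
        if start <= cp <= end:
            return name
    return None


print(block_of("\U0001F600"))  # Emoticons
print(block_of("\U0001F97A"))  # Supplemental Symbols and Pictographs
print(block_of("A"))           # None
```

Note that a smiley such as 🥺 (U+1F97A) resolves to Supplemental Symbols and Pictographs: a character's theme does not determine its block.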

Emoticons Block (U+1F600–U+1F64F)

This is the "classic" emoji block in most people's minds — the smiley faces. It begins with 😀 GRINNING FACE at U+1F600 and covers the full range of facial expressions, from joy to sadness, surprise to skepticism.

Notable characters:

  • 😀 U+1F600 — Grinning Face
  • 😂 U+1F602 — Face With Tears of Joy (consistently one of the most-used emoji globally)
  • 😶‍🌫️ — Face in Clouds (a ZWJ sequence built on 😶 U+1F636, not a single code point)
  • 🥺 U+1F97A — Pleading Face (added in Unicode 11.0, 2018; a face emoji whose code point actually lies in Supplemental Symbols and Pictographs, not in this block)

Miscellaneous Symbols and Pictographs (U+1F300–U+1F5FF)

This large block covers nature, weather, food, activities, objects, and more. It is one of the densest and most varied emoji blocks:

  • 🌍 U+1F30D — Earth Globe Europe-Africa
  • 🎃 U+1F383 — Jack-O-Lantern
  • 🔥 U+1F525 — Fire ("lit", trending, hot takes)
  • 💬 U+1F4AC — Speech Bubble
  • 🖥️ U+1F5A5 — Desktop Computer

Transport and Map Symbols (U+1F680–U+1F6FF)

Originally designed to represent transportation modes and map features, this block also contains many everyday objects:

  • 🚀 U+1F680 — Rocket
  • 🚗 U+1F697 — Automobile
  • 🛸 U+1F6F8 — Flying Saucer (added Unicode 10.0)
  • 🛑 U+1F6D1 — Stop Sign
  • 🧳 U+1F9F3 — Luggage (travel-themed, but actually encoded in Supplemental Symbols and Pictographs, not in this block)

Supplemental Symbols and Pictographs (U+1F900–U+1F9FF)

Added to accommodate the rapidly growing emoji vocabulary, this block covers animals, food, activities, and abstract concepts:

  • 🤯 U+1F92F — Exploding Head
  • 🦊 U+1F98A — Fox Face
  • 🧠 U+1F9E0 — Brain
  • 🧲 U+1F9F2 — Magnet

Emoji Presentation vs Text Presentation

Many characters in Unicode can appear either as emoji (colorful, graphical) or as plain text (monochrome, symbolic), depending on context. This is controlled by variation selectors:

  • U+FE0F (Variation Selector-16): Forces emoji presentation
  • U+FE0E (Variation Selector-15): Forces text presentation

For example:

  • ☀ U+2600 alone → text presentation (a simple black sun)
  • ☀️ U+2600 + U+FE0F → emoji presentation (colorful sun)

Developers must be aware that a "single emoji" may actually be two code points, affecting string length calculations.
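
The effect on string length can be checked directly. The following Python sketch builds both presentations of the sun character and shows a helper for removing the selectors before comparing strings. `strip_presentation_selectors` is a hypothetical helper name, and stripping selectors is only an assumption about what a comparison step might want; it changes how the text may render.

```python
import unicodedata

VS15 = "\uFE0E"  # VARIATION SELECTOR-15: requests text presentation
VS16 = "\uFE0F"  # VARIATION SELECTOR-16: requests emoji presentation

sun = "\u2600"          # BLACK SUN WITH RAYS, no selector
sun_emoji = sun + VS16  # same base character, emoji presentation requested
sun_text = sun + VS15   # same base character, text presentation requested


def strip_presentation_selectors(text):
    """Remove VS15/VS16 so that strings compare equal regardless of presentation."""
    return text.replace(VS15, "").replace(VS16, "")


print(len(sun), len(sun_emoji))  # 1 2
print([unicodedata.name(c) for c in sun_emoji])
print(strip_presentation_selectors(sun_emoji) == sun)  # True
```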

Skin Tone Modifiers

Unicode 8.0 introduced five skin tone modifiers based on the Fitzpatrick scale:

| Modifier | Code Point | Tone |
| --- | --- | --- |
| 🏻 | U+1F3FB | Light |
| 🏼 | U+1F3FC | Medium-Light |
| 🏽 | U+1F3FD | Medium |
| 🏾 | U+1F3FE | Medium-Dark |
| 🏿 | U+1F3FF | Dark |

These modifiers follow a base emoji (typically a person or hand gesture) to modify skin color. For example, 👋🏽 is U+1F44B followed by U+1F3FD. The string length is 2 code points despite appearing as a single visual unit.
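
As a rough illustration, a modifier can be appended or removed with plain string operations. This Python sketch uses hypothetical helper names (`with_skin_tone`, `strip_skin_tones`) and does not check that the base character actually accepts a modifier, which a real implementation would need to do against the Unicode emoji data.

```python
# The five Fitzpatrick modifiers occupy the contiguous range U+1F3FB..U+1F3FF.
SKIN_TONES = {
    "light": "\U0001F3FB",
    "medium-light": "\U0001F3FC",
    "medium": "\U0001F3FD",
    "medium-dark": "\U0001F3FE",
    "dark": "\U0001F3FF",
}


def with_skin_tone(base, tone):
    """Append a skin tone modifier to a base emoji (no validity check on the base)."""
    return base + SKIN_TONES[tone]


def strip_skin_tones(text):
    """Drop every skin tone modifier, leaving the default-toned emoji."""
    return "".join(ch for ch in text if not 0x1F3FB <= ord(ch) <= 0x1F3FF)


wave = "\U0001F44B"                    # WAVING HAND SIGN
toned = with_skin_tone(wave, "medium")
print(len(wave), len(toned))           # 1 2
print(strip_skin_tones(toned) == wave) # True
```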

ZWJ Sequences: Building Complex Emoji

Zero Width Joiner (U+200D, ZWJ) is used to create multi-character emoji sequences that render as a single glyph when supported. These sequences enable:

  • Family emoji: 👨‍👩‍👧‍👦 = Man + ZWJ + Woman + ZWJ + Girl + ZWJ + Boy (7 code points, 1 visual unit)
  • Profession emoji: 👩‍💻 = Woman + ZWJ + Laptop (3 code points)
  • Gendered and gender-neutral forms: variants are built by changing the base person or adding a gender sign, for example 🧑‍💻 = Person + ZWJ + Laptop and 🏃‍♀️ = Runner + ZWJ + Female Sign + U+FE0F

If a platform does not support a ZWJ sequence, it typically renders each component separately — so 👨‍👩‍👧 might display as three separate emoji.
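
The composition and the fallback behaviour can both be sketched in a few lines of Python. `zwj_join` and `zwj_split` are illustrative names chosen for this example; splitting on ZWJ mimics what an unsupporting renderer effectively shows, not how any particular platform is implemented.

```python
ZWJ = "\u200D"  # ZERO WIDTH JOINER

MAN = "\U0001F468"
WOMAN = "\U0001F469"
GIRL = "\U0001F467"
BOY = "\U0001F466"
LAPTOP = "\U0001F4BB"


def zwj_join(*parts):
    """Join emoji components into one ZWJ sequence."""
    return ZWJ.join(parts)


def zwj_split(sequence):
    """Split a ZWJ sequence into the components a fallback renderer would show."""
    return sequence.split(ZWJ)


family = zwj_join(MAN, WOMAN, GIRL, BOY)
technologist = zwj_join(WOMAN, LAPTOP)

print(len(family))             # 7 (4 emoji + 3 joiners)
print(len(technologist))       # 3
print(len(zwj_split(family)))  # 4 separate emoji on fallback
```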

Regional Indicator Symbols and Flag Emoji

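Country flags are encoded as sequences of two Regional Indicator Symbols (U+1F1E6–U+1F1FF), which sit at the end of the Enclosed Alphanumeric Supplement block (U+1F100–U+1F1FF). Each letter A–Z has a corresponding regional indicator, and a flag is the pair that spells the region's two-letter ISO 3166-1 code: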

  • 🇺🇸 = U+1F1FA (Regional Indicator U) + U+1F1F8 (Regional Indicator S)
  • 🇯🇵 = U+1F1EF (Regional Indicator J) + U+1F1F5 (Regional Indicator P)

Platforms that do not support flag rendering will display the two letters (US, JP) as fallback.
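
Because the regional indicators are laid out in alphabetical order starting at U+1F1E6 (letter A), the mapping from a two-letter code to a flag sequence is simple arithmetic. The Python sketch below uses hypothetical function names (`flag`, `flag_to_letters`) and only validates that the input is two ASCII letters; it does not check that the code is an assigned region.

```python
REGIONAL_INDICATOR_A = 0x1F1E6  # REGIONAL INDICATOR SYMBOL LETTER A


def flag(country_code):
    """Convert a two-letter region code such as "US" into its flag sequence."""
    if len(country_code) != 2 or not (country_code.isascii() and country_code.isalpha()):
        raise ValueError("expected exactly two ASCII letters")
    return "".join(
        chr(REGIONAL_INDICATOR_A + ord(c) - ord("A")) for c in country_code.upper()
    )


def flag_to_letters(flag_sequence):
    """Recover the plain letters from regional indicators, as a text fallback."""
    return "".join(
        chr(ord(c) - REGIONAL_INDICATOR_A + ord("A")) for c in flag_sequence
    )


us = flag("US")
print([f"U+{ord(c):X}" for c in us])  # ['U+1F1FA', 'U+1F1F8']
print(flag_to_letters(flag("jp")))    # JP
```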

Emoji Version History and Compatibility

Each Unicode version adds new emoji. Key milestones:

| Unicode Version | Year | Notable Additions |
| --- | --- | --- |
| 6.0 | 2010 | First official emoji standardization |
| 8.0 | 2015 | Skin tone modifiers |
| 11.0 | 2018 | Supervillain, lobster, infinity symbol |
| 13.0 | 2020 | Transgender flag, bubble tea |
| 15.0 | 2022 | Moose, jellyfish, shaking face |

Older operating systems will display replacement boxes (□) for emoji added after the OS was released. This is a persistent cross-platform compatibility challenge.
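
A related check is available in code: a runtime ships its own copy of the Unicode Character Database, and characters newer than that copy look unassigned to it. The Python sketch below uses the standard `unicodedata` module. `is_assigned` is an illustrative helper name, and the check only reflects the interpreter's Unicode data, not whether any installed font can draw the character.

```python
import unicodedata


def is_assigned(char):
    """True if this Python build's Unicode data assigns the code point.

    Unassigned code points (and noncharacters) have general category "Cn".
    """
    return unicodedata.category(char) != "Cn"


# Version of the Unicode Character Database bundled with this interpreter.
print(unicodedata.unidata_version)

print(is_assigned("\U0001F600"))  # True: GRINNING FACE is long established
print(is_assigned("\uFFFF"))      # False: a permanent noncharacter
```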

Developer Considerations

Working with emoji in code requires care:

  1. String length: len("😀") returns 1 in Python 3 (code points), but the UTF-16 encoding uses 2 units. In JavaScript, "😀".length returns 2.
  2. Grapheme clusters: A ZWJ family emoji might be 7+ code points but 1 grapheme cluster. Use a grapheme segmentation library for accurate character counts.
  3. Sorting: Emoji sort by code point value unless you use a locale-aware collator.
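  4. Regular expressions: Match emoji with \p{Emoji} in Unicode-aware regex engines. Note that \p{Emoji} also matches the digits 0–9, # and *, so \p{Extended_Pictographic} or \p{Emoji_Presentation} is often a better fit.
  5. Database storage: Use the utf8mb4 character set in MySQL (not the legacy utf8, which is an alias for the 3-byte utf8mb3) to store supplementary plane characters, which include most emoji.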
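
The length differences in points 1, 2, and 5 can be measured with the standard library alone. In this Python sketch, `measure` is a hypothetical helper. It reports code points, UTF-16 code units, and UTF-8 bytes, but not grapheme clusters, since counting those requires a segmentation library that is outside the standard library.

```python
def measure(text):
    """Report the size of a string in three different units."""
    return {
        "code_points": len(text),                           # what Python's len() counts
        "utf16_units": len(text.encode("utf-16-le")) // 2,  # what JavaScript's .length counts
        "utf8_bytes": len(text.encode("utf-8")),            # what a UTF-8 database column stores
    }


grinning = "\U0001F600"
family = "\U0001F468\u200D\U0001F469\u200D\U0001F467\u200D\U0001F466"

print(measure(grinning))  # {'code_points': 1, 'utf16_units': 2, 'utf8_bytes': 4}
print(measure(family))    # {'code_points': 7, 'utf16_units': 11, 'utf8_bytes': 25}

# Default sorting compares code points: U+1F600 sorts before U+1F680.
print(sorted(["\U0001F680", "\U0001F600"]) == ["\U0001F600", "\U0001F680"])  # True
```

The 4-byte UTF-8 size of a single supplementary-plane emoji is the reason a 3-byte-per-character charset cannot store it.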
