Hangul Jamo
The individual consonant and vowel components (jamo) of the Korean Hangul writing system. Unicode encodes both precomposed Hangul syllables (U+AC00–U+D7A3) and decomposed jamo (U+1100–U+11FF).
What is Hangul Jamo?
Hangul is the alphabetic writing system of the Korean language, invented in 1443 by King Sejong the Great. Unlike logographic Chinese characters, Hangul is fully phonetic: each syllable is composed of individual phonetic units called jamo. A jamo is a single consonant or vowel element — similar in concept to a Latin letter — but Hangul syllables are written as compact two-dimensional blocks rather than as linear sequences.
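Unicode encodes Hangul across three main blocks, each serving a different purpose. (Smaller supplementary blocks, such as Hangul Jamo Extended-A and Extended-B for archaic jamo, are not covered here.)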
The Three Unicode Hangul Blocks
1. Hangul Jamo (U+1100–U+11FF)
This block contains the individual jamo in their conjoining (often loosely called "combining") form. For modern Korean these are 19 initial consonants (choseong, U+1100–U+1112), 21 vowels (jungseong, U+1161–U+1175), and 27 final consonants (jongseong, U+11A8–U+11C2). The count of 28 finals used in the syllable arithmetic adds the case of no final consonant, which has no code point of its own. The rest of the block holds archaic jamo and two filler characters. These code points are the raw building blocks: they are seldom displayed in isolation, and a sequence of choseong, jungseong, and optional jongseong is meant to be rendered as a single syllable block, provided the font and text-shaping engine support conjoining jamo.
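As a small illustration of this conjoining behavior, the sketch below uses Python's standard `unicodedata` module to print the names of three jamo from this block and to check that NFC normalization composes them into one precomposed syllable. The sample syllable (한) and the variable names are illustrative choices, not part of any standard API.

```python
import unicodedata

# Conjoining jamo for the syllable 한: lead HIEUH, vowel A, trail NIEUN.
jamo_sequence = "\u1112\u1161\u11AB"

for ch in jamo_sequence:
    # The character name shows each jamo's role:
    # CHOSEONG (initial), JUNGSEONG (vowel), or JONGSEONG (final).
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# NFC normalization composes the three jamo into one precomposed syllable.
composed = unicodedata.normalize("NFC", jamo_sequence)
print(len(jamo_sequence), len(composed), f"U+{ord(composed):04X}")
```

The three code points and the single composed code point represent the same text. Whether the uncomposed sequence is drawn as one block depends on the font and rendering engine.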
2. Hangul Compatibility Jamo (U+3130–U+318F)
This block provides jamo in their standalone "compatibility" form, suitable for display as individual characters, for example in alphabetical lists, keyboard labels, or dictionary entries. These are distinct from the conjoining jamo in U+1100–U+11FF. They do not conjoin, and canonical normalization (NFC) never composes them into syllable blocks; only the compatibility forms (NFKC/NFKD) map them to conjoining jamo. They are intended for display contexts only, and mixing them with conjoining jamo can cause unexpected rendering.
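The difference between the two kinds of jamo can be observed with Python's standard `unicodedata` module. The following is a minimal sketch, with illustrative variable names, that normalizes a compatibility pair (ㄱ, ㅏ) and the corresponding conjoining pair.

```python
import unicodedata

compat = "\u3131\u314F"      # ㄱ ㅏ from Hangul Compatibility Jamo
conjoining = "\u1100\u1161"  # the corresponding conjoining jamo

# Canonical composition (NFC) leaves the compatibility jamo untouched ...
nfc_compat = unicodedata.normalize("NFC", compat)
# ... but composes the conjoining pair into the syllable 가 (U+AC00).
nfc_conjoining = unicodedata.normalize("NFC", conjoining)

# Compatibility decomposition (NFKD) maps ㄱ to a conjoining jamo.
nfkd_first = unicodedata.normalize("NFKD", "\u3131")

print(nfc_compat == compat)
print(nfc_conjoining == "\uAC00")
print(f"U+{ord(nfkd_first):04X}")
```

Because NFKC and NFKD rewrite compatibility jamo, text that must keep standalone jamo as typed is usually better served by NFC or NFD.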
3. Hangul Syllables (U+AC00–U+D7AF)
This is the largest of the three blocks, containing all 11,172 precomposed modern Hangul syllable blocks. The syllables occupy U+AC00–U+D7A3; the remaining code points up to U+D7AF are unassigned. Every legal combination of initial consonant, vowel, and optional final consonant has a dedicated code point. The block is algorithmically structured: given a syllable's code point S, you can compute its components exactly.
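The block's size and boundaries follow directly from the jamo counts. The short check below is a sketch with illustrative constant names; it multiplies the counts, derives the last syllable's code point, and uses the standard `unicodedata` module to confirm that the next code point is unassigned.

```python
import unicodedata

HANGUL_BASE = 0xAC00
LEADS, VOWELS, TRAILS = 19, 21, 28  # 28 = 27 final consonants + "no final"

# Each lead can pair with every vowel and every (possibly absent) final.
syllable_count = LEADS * VOWELS * TRAILS
last_syllable = HANGUL_BASE + syllable_count - 1

print(syllable_count)             # 11172
print(f"U+{last_syllable:04X}")   # U+D7A3

# General categories: "Lo" is an assigned letter, "Cn" is unassigned.
last_category = unicodedata.category(chr(last_syllable))
next_category = unicodedata.category(chr(last_syllable + 1))
print(last_category, next_category)
```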
Algorithmic Composition and Decomposition
Unicode defines a precise arithmetic mapping between precomposed syllables (U+AC00 range) and their jamo components (U+1100 range). The code below works with 0-based jamo indices rather than jamo code points:
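    # Hangul syllable composition and decomposition
    HANGUL_BASE = 0xAC00
    CHOSEONG_COUNT = 19   # initial consonants
    JUNGSEONG_COUNT = 21  # vowels
    JONGSEONG_COUNT = 28  # 27 final consonants plus "no final"
    SYLLABLE_COUNT = CHOSEONG_COUNT * JUNGSEONG_COUNT * JONGSEONG_COUNT  # 11,172


    def compose_hangul(lead: int, vowel: int, trail: int = 0) -> str:
        """Compose a Hangul syllable from 0-based jamo indices."""
        if not (0 <= lead < CHOSEONG_COUNT
                and 0 <= vowel < JUNGSEONG_COUNT
                and 0 <= trail < JONGSEONG_COUNT):
            raise ValueError("jamo index out of range")
        code_point = (
            HANGUL_BASE
            + (lead * JUNGSEONG_COUNT + vowel) * JONGSEONG_COUNT
            + trail
        )
        return chr(code_point)


    def decompose_hangul(syllable: str) -> tuple[int, int, int]:
        """Decompose a Hangul syllable into 0-based (lead, vowel, trail) indices."""
        index = ord(syllable) - HANGUL_BASE
        if not 0 <= index < SYLLABLE_COUNT:
            raise ValueError("not a precomposed Hangul syllable")
        trail = index % JONGSEONG_COUNT  # 0 means no final consonant
        vowel = (index // JONGSEONG_COUNT) % JUNGSEONG_COUNT
        lead = index // (JUNGSEONG_COUNT * JONGSEONG_COUNT)
        return lead, vowel, trail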
This algorithm underpins NFD/NFC normalization for Korean text and enables efficient Korean text processing without exhaustive lookup tables.
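To connect the index arithmetic to actual jamo code points, the sketch below adds the base offsets for leads (U+1100), vowels (U+1161), and trails (U+11A7, one below the first final consonant so that index 0 can mean "no final"). The helper name `to_jamo` is hypothetical. Its output is compared against NFD from Python's standard `unicodedata` module for a few sample syllables.

```python
import unicodedata

S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28


def to_jamo(syllable: str) -> str:
    """Return the conjoining-jamo sequence for one precomposed syllable."""
    index = ord(syllable) - S_BASE
    if not 0 <= index < L_COUNT * V_COUNT * T_COUNT:
        raise ValueError("not a precomposed Hangul syllable")
    lead, rest = divmod(index, V_COUNT * T_COUNT)
    vowel, trail = divmod(rest, T_COUNT)
    jamo = chr(L_BASE + lead) + chr(V_BASE + vowel)
    if trail:  # trail index 0 means there is no final consonant
        jamo += chr(T_BASE + trail)
    return jamo


for syllable in "한글가":
    arithmetic = to_jamo(syllable)
    # The arithmetic result should match the library's NFD decomposition.
    matches = arithmetic == unicodedata.normalize("NFD", syllable)
    print(syllable, [f"U+{ord(c):04X}" for c in arithmetic], matches)
```

Syllables without a final consonant, such as 가, decompose to two jamo rather than three, which is why the trail is appended only when its index is nonzero.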
Quick Facts
| Property | Value |
|---|---|
| Invented | 1443, King Sejong the Great |
| Hangul Jamo block | U+1100–U+11FF (conjoining jamo) |
| Compatibility Jamo block | U+3130–U+318F (standalone display) |
| Hangul Syllables block | U+AC00–U+D7AF (11,172 precomposed syllables at U+AC00–U+D7A3) |
| Components | 19 initial × 21 vowel × 28 final (incl. null) = 11,172 syllables |
| Normalization | NFD decomposes U+AC00 syllables to U+1100 jamo |
| Unicode algorithm | Defined in Chapter 3 of the Unicode Standard |