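What is CJK (Chinese, Japanese, Korean)?

CJK (Chinese, Japanese, Korean): a collective term in Unicode for the unified Han ideograph blocks and the related writing systems. The basic CJK Unified Ideographs block (U+4E00 to U+9FFF) alone spans 20,992 code points, and the extension blocks add many more.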
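The 20,992 figure matches the size of the basic CJK Unified Ideographs block, U+4E00 through U+9FFF. The sketch below checks membership in that one block only; the helper name `in_basic_cjk_block` is made up for illustration, and the extension blocks are deliberately left out.

```python
# Code point range of the basic CJK Unified Ideographs block.
CJK_UNIFIED_START = 0x4E00
CJK_UNIFIED_END = 0x9FFF

# Size of the block: 0x9FFF - 0x4E00 + 1 = 20,992 code points.
BLOCK_SIZE = CJK_UNIFIED_END - CJK_UNIFIED_START + 1


def in_basic_cjk_block(char: str) -> bool:
    """Return True if char lies in the basic CJK Unified Ideographs block.

    Hypothetical helper: extension blocks (A, B, ...) are not covered.
    """
    return CJK_UNIFIED_START <= ord(char) <= CJK_UNIFIED_END


print(BLOCK_SIZE)               # 20992
print(in_basic_cjk_block("漢"))  # True  (U+6F22)
print(in_basic_cjk_block("A"))   # False (U+0041)
```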

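What is General Category?

A system that assigns every code point to exactly one of 30 categories (Lu, Ll, Nd, So, and so on), which are grouped into 7 major classes: Letter, Mark, Number, Punctuation, Symbol, Separator, and Other.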
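As a rough illustration, Python's standard `unicodedata.category()` returns the two-letter General Category code, whose first letter identifies the major class. The `MAJOR_CLASSES` table and the `describe_category` helper are names invented for this sketch.

```python
import unicodedata
from typing import Tuple

# First letter of the two-letter code -> major class (illustrative table).
MAJOR_CLASSES = {
    "L": "Letter",
    "M": "Mark",
    "N": "Number",
    "P": "Punctuation",
    "S": "Symbol",
    "Z": "Separator",
    "C": "Other",
}


def describe_category(char: str) -> Tuple[str, str]:
    """Return (general category code, major class name) for one character."""
    code = unicodedata.category(char)
    return code, MAJOR_CLASSES[code[0]]


print(describe_category("A"))       # ('Lu', 'Letter')
print(describe_category("5"))       # ('Nd', 'Number')
print(describe_category("\u0301"))  # ('Mn', 'Mark'), combining acute accent
```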

What is Unicode Standard Annex (UAX)?

Normative or informative documents that are integral parts of the Unicode Standard. UAX#9 (Bidi Algorithm), UAX#11 (East Asian Width), UAX#15 (Normalization Forms) are key examples.

Properties

East Asian Width

Unicode property (UAX#11) classifying characters as Narrow, Wide, Fullwidth, Halfwidth, Ambiguous, or Neutral. Wide characters (CJK ideographs, katakana) occupy two columns in terminal emulators.

What is East Asian Width?

East Asian Width is a Unicode character property, defined in UAX #11 (East Asian Width), that classifies characters by the display width they should occupy in fixed-width (monospaced) rendering environments, particularly traditional East Asian terminals and text layouts. The property answers the question: "Does this character occupy one column or two columns when displayed in a terminal or monospaced layout?"

The property was introduced because East Asian scripts — Chinese, Japanese, Korean — were historically displayed at twice the width of ASCII characters on fixed-pitch terminals. Mixing ASCII and CJK text in a single terminal line required a consistent model for how much horizontal space each character would consume.

The Six Width Categories

Category	Property Value	Description	Examples
Narrow	Na	ASCII and a few other characters that have a fullwidth counterpart; one column	A, a, 1, @
Wide	W	Most CJK ideographs, Hangul syllables — two columns	漢, 가, ア
Fullwidth	F	ASCII-range characters in their fullwidth CJK form — two columns	Ａ, １, ！
Halfwidth	H	Katakana and Hangul in their halfwidth (legacy) form — one column	ｱ, ｦ
Ambiguous	A	Characters that are narrow in Western contexts but wide in some East Asian contexts	®, ☆, α
Neutral	N	Characters that do not occur in legacy East Asian character sets; typically narrow	Hebrew, Arabic, Devanagari letters
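The categories can be inspected directly with `unicodedata.east_asian_width()`, which reports Narrow as "Na" and Neutral as "N". The following is a minimal sketch; the `EAW_NAMES` mapping and the `width_category` helper are illustrative names, not part of any library.

```python
import unicodedata

# Property value returned by unicodedata -> category name.
EAW_NAMES = {
    "Na": "Narrow",
    "W": "Wide",
    "F": "Fullwidth",
    "H": "Halfwidth",
    "A": "Ambiguous",
    "N": "Neutral",
}


def width_category(char: str) -> str:
    """Return the East Asian Width category name for one character."""
    return EAW_NAMES[unicodedata.east_asian_width(char)]


for ch in ("A", "漢", "Ａ", "ｱ", "α", "\u05d0"):
    print(f"U+{ord(ch):04X}", width_category(ch))
# U+0041 Narrow
# U+6F22 Wide
# U+FF21 Fullwidth
# U+FF71 Halfwidth
# U+03B1 Ambiguous
# U+05D0 Neutral
```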

Terminal Implications

In a terminal emulator, the renderer must know the East Asian Width of every character to correctly advance the cursor. If a Wide or Fullwidth character is assumed to be narrow, subsequent characters will overwrite existing content, causing display corruption.

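The POSIX function wcwidth() (declared in <wchar.h>) returns the number of columns a wide character occupies. Typical implementations return 0 for combining characters, 1 for narrow characters, 2 for wide characters, and -1 for non-printable characters. The width tables behind wcwidth() and most modern terminal emulators are derived largely from UAX #11 data, although individual implementations differ in detail.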

# Python: get East Asian Width property
import unicodedata

def display_width(char: str) -> int:
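    # Return terminal display width of a single character.
    # Simplification: anything that is not Wide or Fullwidth counts as one
    # column, including combining marks, which terminals give zero columns.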
    eaw = unicodedata.east_asian_width(char)
    return 2 if eaw in ("W", "F") else 1

# Examples
display_width("A")   # → 1 (Narrow)
display_width("漢")  # → 2 (Wide)
display_width("Ａ")  # → 2 (Fullwidth)
display_width("ｱ")  # → 1 (Halfwidth)

The third-party wcwidth package on PyPI provides wcwidth() and wcswidth() functions backed by width tables generated from Unicode data, and it tracks new Unicode releases.
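For whole strings, per-character widths can be summed and the result used to pad text to a fixed number of columns. The sketch below uses only the standard library and treats characters with General Category Mn or Me as zero-width, which is an approximation of what real wcwidth() tables do. The names `char_width`, `text_width`, and `pad_to_width` are hypothetical.

```python
import unicodedata


def char_width(char: str) -> int:
    """Approximate column width of one character (0, 1, or 2)."""
    # Assumption: nonspacing and enclosing marks take no column of their own.
    if unicodedata.category(char) in ("Mn", "Me"):
        return 0
    # Wide and Fullwidth characters take two columns; everything else one.
    return 2 if unicodedata.east_asian_width(char) in ("W", "F") else 1


def text_width(text: str) -> int:
    """Sum the approximate column widths of all characters in text."""
    return sum(char_width(c) for c in text)


def pad_to_width(text: str, columns: int) -> str:
    """Right-pad text with spaces until it fills the given column count."""
    return text + " " * max(0, columns - text_width(text))


print(text_width("abc"))       # 3
print(text_width("漢字"))       # 4
print(text_width("e\u0301"))   # 1 (e plus combining acute accent)
print(repr(pad_to_width("漢字", 6)))  # '漢字  '
```

Padding with `str.ljust()` would count code points rather than columns, which is why the width is computed separately here.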

The Ambiguous Category

Characters with Ambiguous (A) width are the most problematic in practice. Their width is context-dependent:

In an East Asian context (a terminal set to a CJK locale), they display as Wide (2 columns)
In a Western context, they display as Narrow (1 column)

This affects many common symbols: degree sign (°), registered sign (®), Greek letters (α, β), and many box-drawing characters. Terminal emulators that serve mixed-locale user bases must make a policy choice, and mismatches between the application and terminal settings cause visible misalignment.
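One way to make that policy choice explicit is to pass it as a parameter. The following is a minimal sketch, assuming a hypothetical `ambiguous_wide` flag that the caller sets to match the terminal's configuration; it is not the API of any particular library.

```python
import unicodedata


def display_width_with_policy(char: str, ambiguous_wide: bool = False) -> int:
    """Return 1 or 2 columns, resolving Ambiguous width by caller policy.

    ambiguous_wide is a hypothetical flag: True mimics an East Asian
    context, False mimics a Western context.
    """
    eaw = unicodedata.east_asian_width(char)
    if eaw in ("W", "F"):
        return 2
    if eaw == "A":
        return 2 if ambiguous_wide else 1
    return 1


print(display_width_with_policy("°"))                       # 1
print(display_width_with_policy("°", ambiguous_wide=True))  # 2
print(display_width_with_policy("A", ambiguous_wide=True))  # 1 (Narrow is unaffected)
```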

Quick Facts

Property	Value
Defined in	UAX #11 — East Asian Width
Number of categories	6 (Narrow, Wide, Fullwidth, Halfwidth, Ambiguous, Neutral)
Two-column characters	Wide (W) and Fullwidth (F)
Most problematic category	Ambiguous (A) — context-dependent
POSIX function	`wcwidth()`: typically returns 0, 1, or 2 (-1 for non-printable characters)
Python stdlib	`unicodedata.east_asian_width(char)`
Python package	`wcwidth` (third-party, tracks Unicode releases)