Zero Width Non-Joiner (ZWNJ)
U+200C. Prevents adjacent characters from joining. Essential for correct letter forms in Persian and Arabic, and used in Devanagari to prevent conjunct formation.
What is ZWNJ (Zero Width Non-Joiner)?
ZWNJ stands for Zero Width Non-Joiner, encoded at U+200C. Like its counterpart ZWJ (Zero Width Joiner, U+200D), ZWNJ is an invisible formatting character with zero visual width. Where ZWJ instructs rendering engines to join or ligate adjacent characters, ZWNJ does the opposite: it prevents joining, ligature formation, or cursive connection between characters that would otherwise be combined by default.
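As a quick illustration, the two characters can be inspected with Python's standard `unicodedata` module. This is a minimal sketch; the helper name `describe` is made up for this example.

```python
import unicodedata

ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER
ZWJ = "\u200d"   # ZERO WIDTH JOINER

def describe(ch: str) -> str:
    """Return 'U+XXXX NAME (category)' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)} ({unicodedata.category(ch)})"

print(describe(ZWNJ))  # U+200C ZERO WIDTH NON-JOINER (Cf)
print(describe(ZWJ))   # U+200D ZERO WIDTH JOINER (Cf)
```

Both characters report the general category `Cf` (Format), which matches their role as invisible formatting controls.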
ZWNJ is primarily used in scripts with cursive joining or conjunct formation, most notably the Arabic script (used for Arabic, Persian (Farsi), and Urdu) and Devanagari and other Brahmic scripts, as well as for ligature control in Latin typography.
How ZWNJ Works in Arabic and Persian
In Arabic script, most letters connect to their neighbors in a cursive flow, changing their shape based on position. Persian (Farsi) is written in Arabic script with a few additional characters. In both languages, there are grammatical and typographic situations where a morpheme boundary inside a word should visually break the cursive connection, even though the characters on both sides belong to the same word.
For example, the Persian word می‌رود (miravad, "goes") is written with a ZWNJ between می and رود to show that these are two morphemes, a prefix and a verb stem, without inserting a full space. The ZWNJ breaks the cursive connection and creates a slight visual gap (a "half-space") while keeping the word as a single unit for justification and line breaking. This usage is standard and required for correct Persian typography.
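The half-space can be made visible at the code point level. The following is a small sketch that builds the word with and without ZWNJ; the variable and helper names are chosen for illustration only.

```python
ZWNJ = "\u200c"

prefix = "\u0645\u06cc"      # می  (ARABIC LETTER MEEM, ARABIC LETTER FARSI YEH)
stem = "\u0631\u0648\u062f"  # رود (ARABIC LETTER REH, WAW, DAL)

joined = prefix + stem               # letters connect across the boundary
half_spaced = prefix + ZWNJ + stem   # connection broken at the boundary

def code_points(text: str) -> list[str]:
    """List the code points of a string in U+XXXX notation."""
    return [f"U+{ord(c):04X}" for c in text]

print(len(joined), len(half_spaced))  # 5 6
print(code_points(half_spaced))
# ZWNJ is not whitespace, so the word stays a single token.
print(len(half_spaced.split()))       # 1
```

Because ZWNJ is a format character rather than a space, whitespace-based tokenization still treats the result as one word.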
How ZWNJ Works in Devanagari and Brahmic Scripts
In Devanagari (used for Hindi, Sanskrit, Marathi, Nepali), ZWNJ prevents the formation of conjunct consonants. When two consonants are linked by a virama (halant, U+094D), the rendering system normally forms a conjunct: a combined glyph or a half-form of the first consonant. Inserting ZWNJ after the virama keeps the first consonant in its full form with a visible virama and leaves the second consonant in its full form as well.
For example, क + ् + ष normally renders as the conjunct क्ष (ksha). Inserting ZWNJ after the virama (क + ् + ZWNJ + ष) renders as क्‌ष: ka with a visible halant, followed by the full form of ष.
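The two sequences differ by a single code point. The sketch below builds both and prints the character names using the standard `unicodedata` module; how each string is drawn still depends on the font and shaping engine.

```python
import unicodedata

KA = "\u0915"      # क
VIRAMA = "\u094d"  # ्
SSA = "\u0937"     # ष
ZWNJ = "\u200c"

conjunct = KA + VIRAMA + SSA                 # usually shaped as the conjunct
explicit_halant = KA + VIRAMA + ZWNJ + SSA   # conjunct formation suppressed

for label, text in [("conjunct", conjunct), ("explicit halant", explicit_halant)]:
    names = [unicodedata.name(c) for c in text]
    print(label, names)
```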
ZWNJ in Latin Typography
In Latin script, ZWNJ can prevent automatic ligature formation. High-quality typography systems (TeX, OpenType) automatically ligate character sequences such as fi, fl, ff, ffi, and ffl. In some contexts, such as German compound words where the ligature would cross a morpheme boundary (for example Auflage, from auf + Lage), the ligature is considered typographically incorrect. A ZWNJ inserted between the two letters requests that they not be ligated; whether the request is honored depends on the rendering engine and font.
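A minimal sketch of marking such a boundary in text is shown below. The helper `break_ligature` is hypothetical, and the boundary index is supplied by the caller because finding morpheme boundaries automatically is a separate problem.

```python
ZWNJ = "\u200c"

def break_ligature(word: str, boundary: int) -> str:
    """Insert ZWNJ at the given index to request that no ligature spans it."""
    return word[:boundary] + ZWNJ + word[boundary:]

word = "Auflage"
marked = break_ligature(word, 3)  # "Auf" + ZWNJ + "lage"

print(len(word), len(marked))             # 7 8
print(marked.replace(ZWNJ, "") == word)   # True
```

The marked string looks the same in most plain-text views but is one code point longer, which matters wherever strings are compared or indexed.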
ZWNJ and ZWJ Comparison
| Property | ZWNJ (U+200C) | ZWJ (U+200D) |
|---|---|---|
| Full name | Zero Width Non-Joiner | Zero Width Joiner |
| Effect on joining | Prevents joining/ligature | Encourages joining/ligature |
| Primary use | Persian half-space, Devanagari conjunct prevention | Emoji sequences, Arabic joining |
| Visual width | Zero | Zero |
| Unicode category | Cf (Format) | Cf (Format) |
Security and Data Considerations
ZWNJ, like all invisible characters, can be used to insert hidden content into text. Two visually identical strings may differ because one contains ZWNJ characters. This matters for:
- Password comparison: A password with embedded ZWNJ is technically different from one without
- Text search: Many search systems ignore ZWNJ when matching, but a plain substring comparison does not, so a query without ZWNJ can miss text that contains it
- Data normalization: Applications processing user input should define a policy for stripping or preserving ZWNJ
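The points above can be sketched in a few lines of Python. The helper names `strip_zwnj` and `contains_ignoring_zwnj` are assumptions made for this example, and stripping is only one possible policy.

```python
import unicodedata

ZWNJ = "\u200c"

def strip_zwnj(text: str) -> str:
    """Remove every ZWNJ from the text (one possible normalization policy)."""
    return text.replace(ZWNJ, "")

def contains_ignoring_zwnj(haystack: str, needle: str) -> bool:
    """Substring test that ignores ZWNJ on both sides."""
    return strip_zwnj(needle) in strip_zwnj(haystack)

plain = "password"
hidden = "pass" + ZWNJ + "word"

print(plain == hidden)                                      # False
# Unicode normalization forms do not remove ZWNJ.
print(unicodedata.normalize("NFKC", hidden) == plain)       # False
print(strip_zwnj(hidden) == plain)                          # True
print(plain in "my " + hidden)                              # False
print(contains_ignoring_zwnj("my " + hidden, plain))        # True
```

Stripping ZWNJ unconditionally would damage Persian and Devanagari text where the character is meaningful, so a policy like this probably belongs only on fields such as identifiers or search keys, not on stored user content.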
Quick Facts
| Property | Value |
|---|---|
| Code point | U+200C |
| Name | ZERO WIDTH NON-JOINER |
| Unicode category | Cf (Format character) |
| Visual width | Zero (no visible glyph of its own) |
| Primary script use | Persian (Farsi), Arabic, Devanagari, Brahmic scripts |
| Persian function | Half-space at a morpheme boundary (می‌رود) |
| Latin use | Prevents automatic fi/fl ligatures |
| Introduced | Unicode 1.1 (1993) |
Related Terms
More Terms in Security
Exploiting Unicode bidirectional control characters to disguise malicious code or filenames. The …
An attack that impersonates a legitimate site by using visually similar Unicode characters in a domain name. аpple.com (Cyrillic …
Exploiting Unicode normalization to bypass security filters. Input validated before normalization may …
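U+200D. Requests that adjacent characters join. Essential for emoji sequences (👩+ZWJ+💻=👩‍💻). In Indic scripts, ligature formation …
Characters from different scripts that look identical or very similar. Example: Latin 'a' and …
An attack that uses Unicode bidirectional override characters (U+202A to U+202E, U+2066 to U+2069) to disguise malicious filenames or code. …
Using Unicode features to deceive users: homoglyphs for fake domains, fake file …
The official Unicode term for pairs of characters that can be visually confused, as defined in confusables.txt (UCD). …
Identifies text that mixes characters from different scripts (e.g., Latin + Cyrillic). Homoglyph …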