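Diacritical Mark / Diacritic
A mark added to a letter to change its pronunciation or meaning. It can be represented in precomposed form (é, U+00E9) or combining form (e + ◌́, U+0065 + U+0301). Examples include accents, umlauts, cedillas, and tildes.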
What is a Diacritical Mark?
A diacritical mark (also called a diacritic) is a small sign or symbol added to a letter to modify its pronunciation, indicate stress, distinguish between words that would otherwise be spelled identically, or mark grammatical features. Diacritical marks appear in many writing systems, including those based on the Latin, Greek, Cyrillic, Arabic, and Hebrew scripts.
Common examples in Latin-script languages include the acute accent (é), grave accent (è), circumflex (ê), umlaut (ü), tilde (ñ), cedilla (ç), and the ring above (å). These are not decorations — they represent distinct sounds and often change the meaning of a word entirely.
Precomposed vs. Combining Forms
Unicode encodes diacritical characters in two ways:
Precomposed characters are single code points that combine a base letter and its diacritic. For example, é is U+00E9 (a single code point). These exist for compatibility with legacy encodings and convenience.
Combining characters are separate diacritical marks (U+0300–U+036F) that attach to the preceding base character. The same é can be represented as U+0065 (e) followed by U+0301 (combining acute accent).
Both representations are canonically equivalent — Unicode Normalization Form C (NFC) prefers precomposed forms, while NFD decomposes them into base + combining sequences.
| Diacritic | Precomposed | Base + Combining |
|---|---|---|
| é (e acute) | U+00E9 | U+0065 + U+0301 |
| ü (u umlaut) | U+00FC | U+0075 + U+0308 |
| ñ (n tilde) | U+00F1 | U+006E + U+0303 |
| ç (c cedilla) | U+00E7 | U+0063 + U+0327 |
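The equivalence between the two columns can be checked with Python's standard `unicodedata` module. This is a minimal sketch; `same_text` is a hypothetical helper name introduced here for illustration, not something defined by Unicode or the standard library.

```python
import unicodedata

precomposed = "\u00e9"   # é as a single code point
combining = "e\u0301"    # e followed by COMBINING ACUTE ACCENT

# The two strings render identically but differ code point by code point.
print(precomposed == combining)          # False
print(len(precomposed), len(combining))  # 1 2

# Normalizing to a common form makes them comparable.
nfc = unicodedata.normalize("NFC", combining)    # composes to U+00E9
nfd = unicodedata.normalize("NFD", precomposed)  # decomposes to U+0065 U+0301
print(nfc == precomposed)  # True
print(nfd == combining)    # True


def same_text(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings under canonical equivalence."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)
```

Normalizing both operands before comparing (rather than assuming input arrives in one form) is the safer choice when text comes from mixed sources such as keyboards, files, and web forms.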
Common Diacritical Marks
| Mark | Name | Example | Used In |
|---|---|---|---|
| ´ | Acute accent | é, á, ó | French, Spanish, Portuguese, many others |
| ` | Grave accent | è, à, ù | French, Italian |
| ^ | Circumflex | ê, â, ô | French, Romanian |
| ¨ | Diaeresis/Umlaut | ü, ö, ä | German, French, Swedish |
| ~ | Tilde | ñ, ã, õ | Spanish, Portuguese |
| ¸ | Cedilla | ç, ş | French, Portuguese, Turkish |
| ° | Ring above | å, ů | Swedish, Norwegian, Czech |
| ˇ | Caron (háček) | č, š, ž | Czech, Slovak, Slovenian |
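To see which mark a given letter carries, one option is to decompose it and look up the Unicode name of each combining mark. The sketch below assumes Python's `unicodedata` module; `describe_marks` is a hypothetical helper name used only for this example.

```python
import unicodedata


def describe_marks(ch: str) -> list:
    """Hypothetical helper: names of the marks found in the NFD form of ch."""
    decomposed = unicodedata.normalize("NFD", ch)
    # General categories starting with "M" (Mn, Mc, Me) are marks.
    return [unicodedata.name(c) for c in decomposed
            if unicodedata.category(c).startswith("M")]


# One sample letter per row of the table above.
for ch in "\u00e9\u00e8\u00ea\u00fc\u00f1\u00e7\u00e5\u010d":  # é è ê ü ñ ç å č
    print(ch, describe_marks(ch))
```

Note that Unicode uses a single mark, COMBINING DIAERESIS (U+0308), for both the diaeresis and the umlaut; the distinction is linguistic, not encoded.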
Typing Diacritical Marks
macOS: Hold a key to see a popover (e.g., hold e to choose é, è, ê). Or use Option key combos: Option+E then E = é.
Windows: Use Alt codes (e.g., Alt+0233 = é), the Character Map app, or switch to a language-specific keyboard layout.
HTML entities:
&eacute; <!-- é -->
&Uuml; <!-- Ü -->
&ntilde; <!-- ñ -->
&ccedil; <!-- ç -->
Unicode escape:
"\u00e9" # é in Python
"\u00fc" # ü
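HTML character references and Python escapes name the same code points, which can be confirmed with the standard library's `html.unescape`. The following is a small sketch, not an exhaustive list of escape forms.

```python
import html

# Named and numeric character references decode to the same code point
# that the \u escape produces.
assert html.unescape("&eacute;") == "\u00e9"
assert html.unescape("&#233;") == "\u00e9"   # decimal reference
assert html.unescape("&#xE9;") == "\u00e9"   # hexadecimal reference
assert html.unescape("&Uuml;") == "\u00dc"

# \N{...} spells out the character name, which can be easier to read.
assert "\N{LATIN SMALL LETTER E WITH ACUTE}" == "\u00e9"

decoded = html.unescape("&ntilde; &ccedil;")
print(decoded)  # ñ ç
```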
Quick Facts
| Property | Value |
|---|---|
| Unicode block (combining) | Combining Diacritical Marks: U+0300–U+036F (112 characters) |
| Unicode block (extended) | Combining Diacritical Marks Extended: U+1AB0–U+1AFF |
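| Precomposed Latin ranges | Latin-1 Supplement U+00C0–U+00FF; more in Latin Extended-A (U+0100–U+017F) and Latin Extended Additional (U+1E00–U+1EFF) |
| Normalization preference | NFC (precomposed) is the usual choice for storage and interchange; NFD is convenient when marks need to be processed separately |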
| Languages with most diacritics | Vietnamese (5 tone marks + vowel marks), Czech, Polish |
| Nonspacing marks | Combining marks in U+0300–U+036F have no advance width of their own; they are drawn on the preceding base character |
| Stacking | Multiple combining marks can stack on one base character |
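Stacking shows up clearly when a Vietnamese letter is decomposed: one base letter is followed by two combining marks. The sketch below assumes Python's `unicodedata` module; `count_base_characters` is a hypothetical helper that only approximates user-perceived characters and is not a full grapheme cluster segmenter.

```python
import unicodedata

# Vietnamese ệ (U+1EC7) carries two marks: a circumflex and a dot below.
stacked = unicodedata.normalize("NFD", "\u1ec7")
print([f"U+{ord(c):04X}" for c in stacked])  # ['U+0065', 'U+0323', 'U+0302']


def count_base_characters(text: str) -> int:
    """Hypothetical helper: count code points that are not marks (category M*)."""
    return sum(1 for c in unicodedata.normalize("NFD", text)
               if not unicodedata.category(c).startswith("M"))


print(len(stacked))                    # 3 code points
print(count_base_characters(stacked))  # 1 base character
```

The dot below (U+0323) precedes the circumflex (U+0302) in the output because normalization puts marks into canonical order by combining class, regardless of the order in which they were typed.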