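Variation Selector
A character (U+FE00–U+FE0F or U+E0100–U+E01EF) that selects a specific glyph variant of the character before it. VS15 (U+FE0E) requests text presentation and VS16 (U+FE0F) requests emoji presentation.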
What Are Variation Selectors?
Variation selectors are invisible Unicode characters appended immediately after a base character to select among alternate glyph presentations. The base character and its variation selector together form a variation sequence. Only sequences defined in the Unicode Standard or registered in the Ideographic Variation Database (IVD) have a specified effect; a renderer ignores the selector in any other combination.
Unicode allocates 256 variation selectors across two ranges:
- VS1–VS16 (U+FE00–U+FE0F): General use, including emoji/text presentation toggle.
- VS17–VS256 (U+E0100–U+E01EF): Primarily for CJK ideographic variants.
The most commonly encountered variation selectors are VS15 (U+FE0E, text presentation) and VS16 (U+FE0F, emoji presentation).
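Because the numbering runs continuously across the two ranges, a code point can be mapped to its VS number with simple arithmetic. The following is a minimal sketch; the function name `vs_number` is hypothetical and not part of any library.

```python
from typing import Optional

def vs_number(ch: str) -> Optional[int]:
    """Return the VS number (1-256) of a variation selector, or None for any other character."""
    cp = ord(ch)
    if 0xFE00 <= cp <= 0xFE0F:      # VS1-VS16
        return cp - 0xFE00 + 1
    if 0xE0100 <= cp <= 0xE01EF:    # VS17-VS256
        return cp - 0xE0100 + 17
    return None

print(vs_number("\uFE0F"))      # VS16, the emoji presentation selector
print(vs_number("\U000E0100"))  # VS17, the first supplementary selector
```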
Emoji and Text Variation Sequences
VS16 and VS15 form the fundamental mechanism for controlling whether a character renders as a colorful emoji or a monochrome text symbol:
U+2665 ♥ (BLACK HEART SUIT — text or emoji, platform-dependent)
U+2665 U+FE0F ♥️ (emoji presentation: colorful heart)
U+2665 U+FE0E ♥ (text presentation: monochrome suit)
U+26A0 ⚠ (WARNING SIGN)
U+26A0 U+FE0F ⚠️ (emoji warning)
U+26A0 U+FE0E ⚠ (text warning)
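The sequences above can be built programmatically by appending the appropriate selector to the base character. This sketch assumes the input is a single base character, optionally followed by VS15 or VS16. The helper `set_presentation` is a hypothetical name, and it does not check whether the base is listed in Unicode's emoji-variation-sequences.txt, so a renderer may ignore the result for unlisted characters.

```python
VS15 = "\uFE0E"  # text presentation selector
VS16 = "\uFE0F"  # emoji presentation selector

def set_presentation(s: str, emoji: bool) -> str:
    """Return s with a trailing VS15/VS16 replaced by the requested selector."""
    bare = s.rstrip(VS15 + VS16)            # drop any existing presentation selector
    return bare + (VS16 if emoji else VS15)

heart_emoji = set_presentation("\u2665", emoji=True)     # U+2665 U+FE0F
heart_text = set_presentation(heart_emoji, emoji=False)  # U+2665 U+FE0E
print([f"U+{ord(c):04X}" for c in heart_text])
```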
CJK Ideographic Variation Sequences
For Chinese, Japanese, and Korean characters, variation selectors select specific glyph forms from registered IVD collections. Different glyph forms for the same character matter in contexts like personal names, legal documents, and historical texts where the precise stroke form has legal or cultural significance.
U+8FBA 辺 (base character)
U+8FBA U+E0100 辺 (Adobe-Japan1 variant glyph 1)
U+8FBA U+E0101 辺 (Adobe-Japan1 variant glyph 2)
These sequences are registered in the IVD (Ideographic Variation Database) maintained by Unicode.
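When processing text that contains ideographic variation sequences, it is often useful to keep each base character together with its selector. The sketch below pairs them up; the function names are hypothetical, and the code only groups code points without checking whether a sequence is actually registered in the IVD.

```python
from typing import List, Tuple

def is_variation_selector(ch: str) -> bool:
    cp = ord(ch)
    return 0xFE00 <= cp <= 0xFE0F or 0xE0100 <= cp <= 0xE01EF

def split_variation_sequences(s: str) -> List[Tuple[str, str]]:
    """Pair each character with the variation selector that follows it ('' if none)."""
    pairs: List[Tuple[str, str]] = []
    for ch in s:
        if is_variation_selector(ch) and pairs and pairs[-1][1] == "":
            base, _ = pairs[-1]
            pairs[-1] = (base, ch)   # attach the selector to the preceding character
        else:
            pairs.append((ch, ""))   # a base character, or a stray selector kept as-is
    return pairs

text = "\u8FBA\U000E0100\u8FBA\U000E0101\u8FBA"
for base, vs in split_variation_sequences(text):
    print(f"U+{ord(base):04X}", f"U+{ord(vs):04X}" if vs else "(no selector)")
```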
Working with Variation Selectors in Code
# Variation selectors are invisible — string length can surprise you
base = "\u2665" # ♥
vs16 = "\u2665\uFE0F" # ♥️ (emoji)
vs15 = "\u2665\uFE0E" # ♥ (text)
len(base) # 1
len(vs16) # 2
len(vs15) # 2
base == vs16 # False — different sequences
# Check for variation selector
def has_variation_selector(s):
    return any(
        "\uFE00" <= c <= "\uFE0F" or "\U000E0100" <= c <= "\U000E01EF"
        for c in s
    )
# Escapes make the invisible selector explicit in the source
has_variation_selector("\u2665\uFE0F") # True
has_variation_selector("\u2665") # False (no VS)
# Strip all variation selectors
def strip_variation_selectors(s):
    return "".join(
        c for c in s
        if not ("\uFE00" <= c <= "\uFE0F" or "\U000E0100" <= c <= "\U000E01EF")
    )
strip_variation_selectors("\u26A0\uFE0F") # "⚠" (U+26A0 only)
// JavaScript: a variation selector is a separate code point (VS1–VS16 take one UTF-16 code unit each)
const warning = "\u26A0\uFE0F"; // ⚠️
warning.length; // 2
[...warning]; // ["\u26A0", "\uFE0F"]
// A character class with the u flag can match both variation selector ranges
const vsPattern = /[\uFE00-\uFE0F\u{E0100}-\u{E01EF}]/u;
vsPattern.test(warning[1]); // true
In HTML and CSS
<!-- Explicit emoji presentation in HTML, written as numeric character references -->
<span>&#x26A0;&#xFE0F;</span> <!-- emoji presentation -->
<span>&#x26A0;&#xFE0E;</span> <!-- text presentation -->
/* VS16 in CSS content property */
.warning::before {
  content: "\26A0\FE0F"; /* ⚠️ emoji */
}
.warning-text::before {
  content: "\26A0\FE0E"; /* ⚠ text */
}
Why They Are Invisible
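Variation selectors have no visible rendering of their own. Their General Category is Mn (Nonspacing Mark), and they carry the Default_Ignorable_Code_Point property, so a renderer that does not support a given sequence shows the base character alone rather than a placeholder glyph. Unicode normalization (NFC, NFD, NFKC, NFKD) leaves them in place; databases and search engines may or may not strip them, which affects string matching.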
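The category and the effect on string matching can be checked with Python's standard `unicodedata` module. The sketch below also shows one possible way to compare strings while ignoring selectors; `vs_insensitive_equal` is a hypothetical helper, and whether such a comparison is appropriate depends on the application (for personal names the selector may be significant).

```python
import unicodedata

# General Category of VS15, VS16, and VS17 as reported by unicodedata
categories = [unicodedata.category(c) for c in ("\uFE0E", "\uFE0F", "\U000E0100")]
print(categories)

# Normalization keeps the selector, so the two sequences still differ afterwards
emoji_heart = "\u2665\uFE0F"
still_differs = (
    unicodedata.normalize("NFKC", emoji_heart) != unicodedata.normalize("NFKC", "\u2665")
)
print(still_differs)

def vs_insensitive_equal(a: str, b: str) -> bool:
    """Compare two strings after removing all variation selectors."""
    def strip(s: str) -> str:
        return "".join(
            c for c in s
            if not (0xFE00 <= ord(c) <= 0xFE0F or 0xE0100 <= ord(c) <= 0xE01EF)
        )
    return strip(a) == strip(b)

print(vs_insensitive_equal(emoji_heart, "\u2665"))
```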
Quick Facts
| Property | Value |
|---|---|
| VS1–VS16 range | U+FE00–U+FE0F |
| VS17–VS256 range | U+E0100–U+E01EF |
| VS15 (U+FE0E) | Requests text (monochrome) presentation |
| VS16 (U+FE0F) | Requests emoji (colorful) presentation |
| Rendering | Invisible — applied to preceding base character |
| Unicode category | Mn (Nonspacing Mark); Default_Ignorable_Code_Point |
| IVD | Ideographic Variation Database for CJK glyph variants |