Roman Numeral Symbols

Unicode includes a Number Forms block with precomposed Roman numeral characters such as Ⅰ Ⅱ Ⅲ Ⅳ, distinct from the Latin letters I, V, X, and L that are commonly used as substitutes. This guide explains the Unicode Roman numeral characters and when to use them, and provides copy-paste tables.

Roman numerals have been in continuous use for over two thousand years, from the inscriptions of the Roman Republic to the copyright dates on modern films. Unicode provides two approaches to representing Roman numerals: using ordinary Latin letters (I, V, X, L, C, D, M) or using dedicated precomposed Roman numeral characters from the Number Forms block. This guide explains both approaches, catalogs every precomposed Roman numeral character in Unicode, and helps developers choose the right representation for their context.

Quick Copy-Paste Table: Precomposed Roman Numerals

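Symbol | Name | Code Point | HTML Entity | Value
Ⅰ | Roman Numeral One | U+2160 | &#x2160; | 1
Ⅱ | Roman Numeral Two | U+2161 | &#x2161; | 2
Ⅲ | Roman Numeral Three | U+2162 | &#x2162; | 3
Ⅳ | Roman Numeral Four | U+2163 | &#x2163; | 4
Ⅴ | Roman Numeral Five | U+2164 | &#x2164; | 5
Ⅵ | Roman Numeral Six | U+2165 | &#x2165; | 6
Ⅶ | Roman Numeral Seven | U+2166 | &#x2166; | 7
Ⅷ | Roman Numeral Eight | U+2167 | &#x2167; | 8
Ⅸ | Roman Numeral Nine | U+2168 | &#x2168; | 9
Ⅹ | Roman Numeral Ten | U+2169 | &#x2169; | 10
Ⅺ | Roman Numeral Eleven | U+216A | &#x216A; | 11
Ⅻ | Roman Numeral Twelve | U+216B | &#x216B; | 12
Ⅼ | Roman Numeral Fifty | U+216C | &#x216C; | 50
Ⅽ | Roman Numeral One Hundred | U+216D | &#x216D; | 100
Ⅾ | Roman Numeral Five Hundred | U+216E | &#x216E; | 500
Ⅿ | Roman Numeral One Thousand | U+216F | &#x216F; | 1000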

Lowercase Precomposed Roman Numerals

Symbol | Name | Code Point | Value
ⅰ | Small Roman Numeral One | U+2170 | 1
ⅱ | Small Roman Numeral Two | U+2171 | 2
ⅲ | Small Roman Numeral Three | U+2172 | 3
ⅳ | Small Roman Numeral Four | U+2173 | 4
ⅴ | Small Roman Numeral Five | U+2174 | 5
ⅵ | Small Roman Numeral Six | U+2175 | 6
ⅶ | Small Roman Numeral Seven | U+2176 | 7
ⅷ | Small Roman Numeral Eight | U+2177 | 8
ⅸ | Small Roman Numeral Nine | U+2178 | 9
ⅹ | Small Roman Numeral Ten | U+2179 | 10
ⅺ | Small Roman Numeral Eleven | U+217A | 11
ⅻ | Small Roman Numeral Twelve | U+217B | 12
ⅼ | Small Roman Numeral Fifty | U+217C | 50
ⅽ | Small Roman Numeral One Hundred | U+217D | 100
ⅾ | Small Roman Numeral Five Hundred | U+217E | 500
ⅿ | Small Roman Numeral One Thousand | U+217F | 1000

The Number Forms Block (U+2150–U+218F)

All precomposed Roman numerals live in the Number Forms block (U+2150–U+218F), which also contains vulgar fractions (like ⅓, ⅔, ⅛). The block provides:

  • Uppercase I–XII (U+2160–U+216B): Composite characters for 1–12
  • Uppercase L, C, D, M (U+216C–U+216F): Single-value characters for 50, 100, 500, 1000
  • Lowercase i–xii (U+2170–U+217B): Small versions of 1–12
  • Lowercase l, c, d, m (U+217C–U+217F): Small versions of 50, 100, 500, 1000
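
The four ranges above can be listed straight from Python's standard unicodedata module. The following is a minimal sketch; the helper name roman_numeral_rows and the row layout are illustrative choices, not part of any standard API.

```python
import unicodedata

def roman_numeral_rows(start=0x2160, end=0x217F):
    """Yield (symbol, name, code point, HTML entity, value) per code point."""
    for cp in range(start, end + 1):
        ch = chr(cp)
        yield (
            ch,
            unicodedata.name(ch),
            f"U+{cp:04X}",
            f"&#x{cp:04X};",               # numeric character reference
            int(unicodedata.numeric(ch)),  # numeric() returns a float
        )

rows = list(roman_numeral_rows())
for row in rows:
    print(*row, sep="\t")
```

Passing a narrower range, for example roman_numeral_rows(0x2170, 0x217B), would list only the lowercase forms for 1–12.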

Why 1–12 Are Special

Unicode provides precomposed characters for 1 through 12 because the East Asian character sets they were inherited from included them, and because these values cover the most common typographic uses: clock faces (I–XII), book chapter numbers, outline numbering, and list items. A precomposed form like Ⅲ (U+2162) is a single character drawn as one glyph, so the spacing of its three vertical strokes is fixed by the font designer rather than by kerning between separate letters.

For numbers beyond 12, you combine the individual characters. For example, "14" would be ⅩⅣ (U+2169 + U+2163) — two precomposed characters — or simply written as XIV using Latin letters.
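
One way to build such a sequence is sketched below. It assumes a particular convention: values 1–12 use the single composite character, and larger values spell the thousands, hundreds, and tens with single-value characters and finish with one composite character for the units digit. The function name int_to_precomposed is hypothetical, and other groupings are equally valid.

```python
# Precomposed single-value characters keyed by their Latin letter
SINGLE = {"I": "\u2160", "V": "\u2164", "X": "\u2169", "L": "\u216C",
          "C": "\u216D", "D": "\u216E", "M": "\u216F"}

# Values from ten upward; the units digit is handled separately
TENS_AND_UP = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
               (100, "C"), (90, "XC"), (50, "L"), (40, "XL"), (10, "X")]

def int_to_precomposed(num: int) -> str:
    """Spell num with precomposed Roman numeral characters (assumed convention)."""
    if num < 1:
        raise ValueError("Roman numerals start at 1")
    if num <= 12:
        return chr(0x215F + num)          # U+2160..U+216B
    units = num % 10
    rest = num - units
    letters = ""
    for value, sym in TENS_AND_UP:
        while rest >= value:
            letters += sym
            rest -= value
    head = "".join(SINGLE[ch] for ch in letters)
    tail = chr(0x215F + units) if units else ""
    return head + tail

print(int_to_precomposed(14))    # U+2169 U+2163
print(int_to_precomposed(2024))
```

A naive longest-match replacement on the Latin spelling would turn "XIV" into Ⅺ followed by Ⅴ, which reads as 11 and 5. Working from the integer avoids that.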

Precomposed vs Latin Letters: Which to Use

The fundamental question: should you write Roman numerals using precomposed Unicode characters (Ⅳ, U+2163) or ordinary Latin letters (IV)?

Comparison

Aspect | Precomposed (Ⅳ) | Latin Letters (IV)
Character count | 1 code point | 2 code points
Searchability | Poor (rare encoding) | Excellent
Font support | Variable | Universal
Copy-paste | May paste as single char | Predictable behavior
Sorting | Has numeric value | Sorts as letters
Screen readers | May read as "Roman numeral four" | Reads as "I V"
Compatibility | Older systems may not support | Works everywhere
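
The sorting row can be illustrated with a short sketch. The sample lists are made up for the example; the only library call used is unicodedata.numeric from the standard library.

```python
import unicodedata

latin = ["IX", "V", "X", "II"]
precomposed = ["\u2168", "\u2164", "\u2169", "\u2161"]  # Ⅸ Ⅴ Ⅹ Ⅱ

# A plain string sort compares letters, so "IX" (9) lands before "V" (5)
latin_sorted = sorted(latin)

# Sorting on the Unicode numeric value orders the numerals by magnitude
precomposed_sorted = sorted(precomposed, key=unicodedata.numeric)

print(latin_sorted)
print(precomposed_sorted)
```

This only works for single precomposed characters. A multi-character string such as ⅩⅣ has no numeric value of its own and would need to be parsed first.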

The Unicode Consortium's Recommendation

The Unicode Standard itself states that these characters exist primarily for compatibility with East Asian encoding standards (such as KS X 1001 and GB 2312), which included precomposed Roman numerals that fit a single character cell in vertical text layout. The Standard recommends using ordinary Latin letters for Roman numerals in most contexts.

From the Unicode Standard, Chapter 22:

"For most purposes, it is preferable to compose the Roman numerals from sequences of the appropriate Latin letters."

When Precomposed Characters Make Sense

Despite the general recommendation, precomposed Roman numerals are useful in:

  1. CJK vertical text: In Japanese, Chinese, and Korean vertical writing, a precomposed Ⅲ occupies a single character cell and rotates correctly. Writing "III" with three Latin I characters in vertical text creates three separate rotated letters.

  2. Clock faces: The sequence Ⅰ through Ⅻ represents the 12 positions on an analog clock. Using precomposed characters ensures consistent glyph design.

  3. Semantic markup: When you need software to recognize that a character is a Roman numeral (not a Latin letter), the precomposed form carries that semantic information in its Unicode properties.

Unicode Properties of Roman Numerals

Each precomposed Roman numeral carries a numeric value in Unicode's character database, making programmatic conversion straightforward:

import unicodedata

# Precomposed Roman numeral has numeric value
char = "\u2163"  # Ⅳ
name = unicodedata.name(char)       # "ROMAN NUMERAL FOUR"
value = unicodedata.numeric(char)    # 4.0
category = unicodedata.category(char)  # "Nl" (Number, letter)

# Latin letter "I" has NO numeric value for Roman numeral
latin_i = "I"
category_i = unicodedata.category(latin_i)  # "Lu" (Letter, uppercase)
# unicodedata.numeric(latin_i) raises ValueError

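The General Category for precomposed Roman numerals is Nl (Number, letter), while ordinary Latin letters used as Roman numerals have category Lu (Letter, uppercase) or Ll (Letter, lowercase). A program can use this distinction to tell precomposed Roman numerals apart from Latin letters. Nl also covers other letter-like numerals, so a category check alone does not prove that a character is a Roman numeral.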
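
A detection helper might combine the category with the character name. This is a sketch under that assumption; is_precomposed_roman and find_roman_chars are hypothetical names.

```python
import unicodedata

def is_precomposed_roman(ch: str) -> bool:
    """True for an Nl character whose Unicode name marks it as a Roman numeral."""
    if unicodedata.category(ch) != "Nl":
        return False
    # name() takes a default, so unnamed code points do not raise
    return "ROMAN NUMERAL" in unicodedata.name(ch, "")

def find_roman_chars(text: str) -> list:
    """Collect the precomposed Roman numeral characters that appear in text."""
    return [ch for ch in text if is_precomposed_roman(ch)]

print(find_roman_chars("Chapter \u2163, part II"))
```

The name check filters out other Nl characters such as U+3007 IDEOGRAPHIC NUMBER ZERO. Latin letters are rejected by the category check, which is why the "II" in the sample string is not reported.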

Case Mapping

Precomposed Roman numerals support case conversion:

upper = "\u2160"  # Ⅰ (uppercase)
lower = upper.lower()  # ⅰ (U+2170, lowercase)
back = lower.upper()   # Ⅰ (U+2160, uppercase)

# Works for all 16 pairs
roman_12 = "\u216B"  # Ⅻ
roman_12_lower = roman_12.lower()  # ⅻ (U+217B)

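These are simple one-to-one case mappings defined in Unicode's UnicodeData.txt, with matching case-folding entries in CaseFolding.txt. SpecialCasing.txt is not involved, because no Roman numeral needs a context-dependent or multi-character mapping.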
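
The claim about all 16 pairs can be checked with a short loop. This is a sketch; roman_case_pairs is a hypothetical helper that relies on the lowercase block sitting exactly 0x10 above the uppercase block.

```python
def roman_case_pairs():
    """Return (uppercase, lowercase) pairs for U+2160-U+216F and U+2170-U+217F."""
    return [(chr(0x2160 + i), chr(0x2170 + i)) for i in range(16)]

for upper, lower in roman_case_pairs():
    # One-to-one mappings, so the round trip is lossless
    assert upper.lower() == lower
    assert lower.upper() == upper
    # casefold() gives a key for case-insensitive comparison
    assert upper.casefold() == lower.casefold()

print("Part \u2166".lower())  # the numeral becomes U+2176
```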

Compatibility Decomposition

Each precomposed Roman numeral has a compatibility decomposition to its constituent Latin letters. Under NFKD (Compatibility Decomposition) or NFKC (Compatibility Composition) normalization:

Precomposed | Decomposes To | Normalization
Ⅰ (U+2160) | I (U+0049) | NFKD/NFKC
Ⅱ (U+2161) | II (U+0049 U+0049) | NFKD/NFKC
Ⅲ (U+2162) | III (U+0049 U+0049 U+0049) | NFKD/NFKC
Ⅳ (U+2163) | IV (U+0049 U+0056) | NFKD/NFKC
Ⅻ (U+216B) | XII (U+0058 U+0049 U+0049) | NFKD/NFKC

import unicodedata

roman = "\u2162"  # Ⅲ
decomposed = unicodedata.normalize("NFKD", roman)
print(decomposed)  # "III" (three Latin I characters)
print(len(roman), len(decomposed))  # prints "1 3"

This means that NFKC/NFKD normalization will destroy the distinction between precomposed Roman numerals and Latin letters. If your application applies NFKC normalization (common in search indexing), all precomposed Roman numerals will be converted to their Latin letter equivalents.
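
For search, that loss of distinction is often the desired behavior. The sketch below folds both the query and the text before matching; search_key and contains are hypothetical helper names, and real indexers usually apply further steps such as tokenization.

```python
import unicodedata

def search_key(text: str) -> str:
    """Fold text for matching: NFKC maps Ⅳ to "IV", casefold() ignores case."""
    return unicodedata.normalize("NFKC", text).casefold()

def contains(haystack: str, needle: str) -> bool:
    """Substring match that treats precomposed and Latin numerals alike."""
    return search_key(needle) in search_key(haystack)

title = "Rocky \u2163"            # ends with the precomposed numeral Ⅳ

print("IV" in title)              # naive match misses the precomposed form
print(contains(title, "iv"))      # folded match finds it
```

The fold is one-way. Once the text is stored in NFKC form there is no way to tell which numerals were originally precomposed, so keep the original string if that matters.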

Roman Numeral Values Beyond the Basic Set

Standard Roman numeral values and their Unicode representations:

Value | Uppercase | Lowercase | Precomposed?
1 | Ⅰ (U+2160) | ⅰ (U+2170) | Yes
2 | Ⅱ (U+2161) | ⅱ (U+2171) | Yes
3 | Ⅲ (U+2162) | ⅲ (U+2172) | Yes
4 | Ⅳ (U+2163) | ⅳ (U+2173) | Yes
5 | Ⅴ (U+2164) | ⅴ (U+2174) | Yes
6 | Ⅵ (U+2165) | ⅵ (U+2175) | Yes
7 | Ⅶ (U+2166) | ⅶ (U+2176) | Yes
8 | Ⅷ (U+2167) | ⅷ (U+2177) | Yes
9 | Ⅸ (U+2168) | ⅸ (U+2178) | Yes
10 | Ⅹ (U+2169) | ⅹ (U+2179) | Yes
11 | Ⅺ (U+216A) | ⅺ (U+217A) | Yes
12 | Ⅻ (U+216B) | ⅻ (U+217B) | Yes
13 | XIII | xiii | No (use Latin letters)
14 | XIV | xiv | No (use Latin letters)
50 | Ⅼ (U+216C) | ⅼ (U+217C) | Yes (single value)
100 | Ⅽ (U+216D) | ⅽ (U+217D) | Yes (single value)
500 | Ⅾ (U+216E) | ⅾ (U+217E) | Yes (single value)
1000 | Ⅿ (U+216F) | ⅿ (U+217F) | Yes (single value)

For composite values like 14 (XIV), 27 (XXVII), or 2024 (MMXXIV), you can either use Latin letters or combine precomposed characters:

# Using Latin letters (recommended)
year = "MMXXIV"  # 2024

# Using precomposed characters (CJK/vertical text)
year_precomposed = "\u216F\u216F\u2169\u2169\u2163"  # Ⅿ Ⅿ Ⅹ Ⅹ Ⅳ = MMXXIV

Apostrophic and Vinculum Notation

Classical and medieval Roman numeral notation used additional conventions for large numbers, only some of which have dedicated Unicode characters:

  • Vinculum (overline): A bar above a numeral multiplies it by 1,000. V-with-overline = 5,000. Unicode has no precomposed "Roman numeral with vinculum," so you must use combining characters: V + U+0305 (COMBINING OVERLINE) = V̅.

  • Apostrophic notation: CIↃ for 1,000, CCIↃↃ for 10,000. The reverse C character Ↄ is encoded at U+2183 (ROMAN NUMERAL REVERSED ONE HUNDRED).

# Vinculum (overline) for large Roman numerals
five_thousand = "V\u0305"       # V̅  = 5,000
ten_thousand = "X\u0305"        # X̅  = 10,000
fifty_thousand = "L\u0305"      # L̅  = 50,000
one_million = "M\u0305"         # M̅  = 1,000,000
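
A general converter for vinculum notation could look like the sketch below. It assumes one common convention: values below 4,000 are written normally, and from 4,000 upward the thousands part is written as an overlined numeral followed by the remainder. The names overline, to_roman, and int_to_roman_vinculum are hypothetical, and historical usage varies.

```python
COMBINING_OVERLINE = "\u0305"

PAIRS = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"), (100, "C"),
         (90, "XC"), (50, "L"), (40, "XL"), (10, "X"), (9, "IX"),
         (5, "V"), (4, "IV"), (1, "I")]

def to_roman(num: int) -> str:
    """Plain Latin-letter Roman numeral; returns "" for 0."""
    out = ""
    for value, sym in PAIRS:
        while num >= value:
            out += sym
            num -= value
    return out

def overline(text: str) -> str:
    """Attach a combining overline to every letter."""
    return "".join(ch + COMBINING_OVERLINE for ch in text)

def int_to_roman_vinculum(num: int) -> str:
    """Write num with an overlined thousands part (assumed convention)."""
    if not 1 <= num < 4_000_000:
        raise ValueError("supported range is 1 to 3,999,999")
    if num < 4000:
        return to_roman(num)
    thousands, rest = divmod(num, 1000)
    return overline(to_roman(thousands)) + to_roman(rest)

print(int_to_roman_vinculum(5500))  # overlined V followed by D
```

How well the overline joins across adjacent letters depends on the font, so the result may look uneven in some environments.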

Practical Conversion Code

def int_to_roman(num: int) -> str:
    """Standard Roman numeral conversion using Latin letters."""
    if num < 1:
        raise ValueError("Roman numerals start at 1")
    val = [1000, 900, 500, 400, 100, 90, 50, 40, 10, 9, 5, 4, 1]
    syms = ["M", "CM", "D", "CD", "C", "XC", "L", "XL", "X", "IX", "V", "IV", "I"]
    result = ""
    # Greedily subtract the largest value that still fits
    for v, sym in zip(val, syms):
        while num >= v:
            result += sym
            num -= v
    return result

def int_to_roman_unicode(num: int) -> str:
    """Use a precomposed Unicode character for 1-12, Latin letters otherwise."""
    if 1 <= num <= 12:
        # U+2160 is Ⅰ, so 0x215F + num lands on the matching character
        return chr(0x215F + num)
    # Fallback to Latin letters for larger values
    return int_to_roman(num)

print(int_to_roman(2024))           # "MMXXIV"
print(int_to_roman_unicode(7))      # "Ⅶ"
print(int_to_roman_unicode(14))     # "XIV" (fallback)
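
The reverse direction can accept both representations by normalizing first. The following is a minimal sketch; roman_to_int is a hypothetical name, and the parser is deliberately lenient, so a non-canonical spelling such as "IIII" is accepted rather than rejected.

```python
import unicodedata

VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

def roman_to_int(text: str) -> int:
    """Parse Latin-letter or precomposed Roman numerals, in either case."""
    # NFKC turns precomposed characters into their Latin-letter spelling
    latin = unicodedata.normalize("NFKC", text).upper()
    if not latin:
        raise ValueError("empty numeral")
    total = 0
    for i, ch in enumerate(latin):
        if ch not in VALUES:
            raise ValueError(f"not a Roman numeral letter: {ch!r}")
        value = VALUES[ch]
        following = VALUES.get(latin[i + 1], 0) if i + 1 < len(latin) else 0
        # A smaller value before a larger one is subtracted (IV = 5 - 1)
        total += -value if value < following else value
    return total

print(roman_to_int("MMXXIV"))
print(roman_to_int("\u216F\u216F\u2169\u2169\u2163"))
```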

Key Takeaways

  • Unicode provides precomposed Roman numerals Ⅰ–Ⅻ (1–12) plus L, C, D, M in both uppercase and lowercase — 32 characters total in the Number Forms block (U+2150–U+218F).
  • These exist primarily for CJK compatibility (vertical text, fixed-width cells). The Unicode Consortium recommends using Latin letters (I, V, X) for most contexts.
  • Precomposed characters have General Category Nl (Number, letter) and carry numeric values accessible via unicodedata.numeric().
  • NFKC/NFKD normalization decomposes precomposed Roman numerals into Latin letters, which can break applications that depend on the distinction.
  • For values 13 and above, there are no precomposed multi-value characters — combine individual characters or use Latin letters.
  • The combining overline (U+0305) can be used with Latin letters to represent vinculum notation for large numbers (V̅ = 5,000).
