Unicode for the Modern Web · Chapter 6

Fonts and Rendering: Making It Look Right

Font fallback chains, system-ui, variable fonts, and missing glyph detection — this chapter covers everything you need to know about rendering Unicode characters correctly in web browsers.

A character exists in your database, travels correctly through your API, arrives in the browser's JavaScript engine with the right code points — and then the browser draws a small square. That square is "tofu": the visual placeholder rendered when the font has no glyph for a character. Understanding how browsers select fonts, how fonts encode Unicode coverage, and how to build robust fallback chains is the last mile of Unicode correctness: making it look right.

Unicode Coverage in Fonts

A font file contains glyphs — visual representations of characters — plus a cmap table that maps Unicode code points to glyph indices. No font covers all of Unicode's 149,813 assigned characters (as of Unicode 15.1). Even the comprehensive Noto project — Google's effort to eliminate tofu — distributes coverage across dozens of separate font files.

Typical coverage for common fonts:

Font                           Coverage
Arial                          ~2,700 characters
Times New Roman                ~3,300 characters
DejaVu Sans                    ~6,250 characters
Noto Sans (single file)        ~2,000–3,500 characters
All Noto Sans files combined   ~110,000+ characters

You can inspect a font's Unicode coverage with tools:

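from fontTools.ttLib import TTFont

# Helvetica.ttc is a TrueType Collection, so select one face with fontNumber
font = TTFont('/System/Library/Fonts/Helvetica.ttc', fontNumber=0)
cmap = font.getBestCmap()  # {code point: glyph name} from the best Unicode subtable
print(f"Coverage: {len(cmap)} code points")

# Check a specific character
print(0x1F600 in cmap)  # Is 😀 covered?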
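
The same membership test extends to whole strings. The sketch below is only an illustration: `missing_code_points` is a hypothetical helper, and the `basic_latin` set is an assumed stand-in for the keys of `font.getBestCmap()` so that the example runs without a font file.

```python
def missing_code_points(text, covered):
    """Return the distinct code points in `text` that `covered` lacks, in first-seen order."""
    seen = set()
    missing = []
    for ch in text:
        cp = ord(ch)
        if cp not in covered and cp not in seen:
            seen.add(cp)
            missing.append(cp)
    return missing


# Stand-in for set(font.getBestCmap()): printable Basic Latin only (hypothetical coverage).
basic_latin = set(range(0x20, 0x7F))

report = missing_code_points("Hi 😀 नमस्ते", basic_latin)
print([f"U+{cp:04X}" for cp in report])
```

With a real font, passing `set(font.getBestCmap())` as `covered` should report the characters that font alone cannot draw.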

The Font Fallback Chain

When a browser renders text, it walks through the font stack for each character, using the first font in the list that has a glyph for that character:

body {
  font-family:
    'Inter',           /* 1. Primary web font — Latin, some symbols */
    system-ui,         /* 2. Platform UI font — decent multilingual coverage */
    'Apple Color Emoji',  /* 3. macOS/iOS emoji (color, SBIX) */
    'Segoe UI Emoji',     /* 4. Windows emoji (color, COLR/CPAL) */
    'Noto Color Emoji',   /* 5. Cross-platform fallback emoji */
    'Noto Sans',          /* 6. Google's universal coverage */
    sans-serif;           /* 7. Browser last resort */
}

The browser checks each font for each character, not each word or element. A sentence mixing Latin and Arabic might use Inter for the Latin portions and an Arabic-capable font for the Arabic portions (the platform font behind system-ui, or Noto Sans Arabic if it is installed or loaded), switching fonts character by character.
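
The per-character lookup can be modeled in a few lines. This is a simplified sketch, not how any browser engine is implemented: the function names and the coverage sets in `stack` are assumptions made for illustration, and real engines also consider grapheme clusters, script, and language.

```python
from itertools import groupby


def pick_font(ch, font_stack):
    """Return the name of the first font whose coverage includes `ch`, else None."""
    cp = ord(ch)
    for name, covered in font_stack:
        if cp in covered:
            return name
    return None  # no font has a glyph: the browser would draw tofu


def font_runs(text, font_stack):
    """Split `text` into (font_name, substring) runs of consecutive same-font characters."""
    return [
        (name, "".join(chars))
        for name, chars in groupby(text, key=lambda ch: pick_font(ch, font_stack))
    ]


# Hypothetical coverage sets; real fonts cover far more than these ranges.
stack = [
    ("Inter", set(range(0x0020, 0x0250))),             # Latin
    ("Noto Sans Arabic", set(range(0x0600, 0x0700))),  # Arabic block
]

print(font_runs("Hello مرحبا", stack))
```

A run whose font name is `None` corresponds to text that no font in the stack can draw.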

Tofu (□) appears when no font in the chain, and no system fallback font the browser tries afterwards, has a glyph for the character. Common causes:

- CJK characters without a CJK font in the stack
- Newly assigned emoji not yet in any installed font
- Private Use Area (PUA) characters without the proprietary font loaded
- Rare scripts (Tirhuta, Nüshu, Hanifi Rohingya) with no system font
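
Two of these causes can be flagged before any font is consulted, using only the character's general category. The following is a rough heuristic under stated assumptions: `tofu_risks` is a hypothetical helper, and "unassigned" means unassigned in the Unicode database bundled with the running Python, which may be older than the text it is checking.

```python
import unicodedata


def tofu_risks(text):
    """Map each risky character in `text` to the reason it may render as tofu."""
    risks = {}
    for ch in text:
        category = unicodedata.category(ch)
        if category == "Co":
            risks[ch] = "private use: needs the proprietary font"
        elif category == "Cn":
            risks[ch] = "unassigned in this Python's Unicode database"
    return risks


sample = "ok \ue000 \u0378"
for ch, reason in tofu_risks(sample).items():
    print(f"U+{ord(ch):04X}: {reason}")
```

Characters that pass this check can still render as tofu; confirming that requires inspecting the cmap of the fonts actually in use.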

@font-face and unicode-range Subsetting

Loading the entire Noto Sans collection upfront is impractical — it totals hundreds of megabytes. The solution is unicode-range in @font-face, which triggers conditional font loading:

/* Only download this font if the page contains Devanagari */
@font-face {
  font-family: 'NotoSans';
  font-style: normal;
  font-weight: 400;
  src: url('/fonts/noto-sans-devanagari.woff2') format('woff2');
  unicode-range: U+0900-097F,  /* Devanagari */
                 U+1CD0-1CFF,  /* Vedic Extensions */
                 U+A8E0-A8FF;  /* Devanagari Extended */
  font-display: swap;
}

/* Latin subset: downloaded whenever the page uses any of these code points (in practice, almost always) */
@font-face {
  font-family: 'NotoSans';
  font-style: normal;
  font-weight: 400;
  src: url('/fonts/noto-sans-latin.woff2') format('woff2');
  unicode-range: U+0000-00FF, U+0131, U+0152-0153, U+02BB-02BC,
                 U+02C6, U+02DA, U+02DC, U+2000-206F, U+2074,
                 U+20AC, U+2122, U+2191, U+2193, U+2212, U+2215,
                 U+FEFF, U+FFFD;
  font-display: swap;
}

This is exactly how Google Fonts works. When you embed a Google Fonts URL, the CSS response contains 20–40 @font-face blocks, each covering a Unicode subset, each with a conditional unicode-range. Only the subsets needed by the page's text are downloaded.
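
To see which subsets a given text would pull in, the matching rule can be approximated offline. This is a minimal sketch, assuming the descriptor value has already had any CSS comments stripped; `parse_unicode_range` and `triggers_download` are hypothetical names, not part of any library.

```python
def parse_unicode_range(descriptor):
    """Parse a CSS unicode-range value into a list of inclusive (start, end) tuples."""
    ranges = []
    for token in descriptor.split(","):
        token = token.strip()
        if not token:
            continue
        body = token[2:]  # drop the leading "U+"
        if "?" in body:   # wildcard form, e.g. U+4?? means U+400-4FF
            start, end = body.replace("?", "0"), body.replace("?", "F")
        elif "-" in body:  # interval form, e.g. U+0900-097F
            start, end = body.split("-")
        else:              # single code point, e.g. U+20AC
            start = end = body
        ranges.append((int(start, 16), int(end, 16)))
    return ranges


def triggers_download(text, ranges):
    """True if any character of `text` falls inside one of the ranges."""
    return any(lo <= ord(ch) <= hi for ch in text for lo, hi in ranges)


devanagari = parse_unicode_range("U+0900-097F, U+1CD0-1CFF, U+A8E0-A8FF")
print(triggers_download("Hello", devanagari))
print(triggers_download("नमस्ते", devanagari))
```

Running this over a page's text for each @font-face block gives an estimate of which subset files a browser would request.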

Variable Fonts

Variable fonts (OpenType 1.8, 2016) pack multiple weights, widths, and styles into a single font file using interpolatable design axes:

@font-face {
  font-family: 'InterVariable';
  src: url('/fonts/Inter-Variable.woff2') format('woff2');
  font-weight: 100 900;  /* entire weight range from one file */
}

h1 {
  font-family: 'InterVariable';
  font-weight: 750;  /* any value between 100–900 */
  font-variation-settings: 'wght' 750, 'slnt' -5;
}

For Unicode coverage, variable fonts generally cover the same code points as their static counterparts — the variability is about design axes, not character coverage. But one variable font file replacing 6–12 static weight/style files significantly reduces HTTP requests.

Color Fonts for Emoji

Emoji rendering uses four competing color font formats, all embedded in standard .ttf/.otf/.woff2 containers:

Format      Table         Used by                                        Notes
COLR/CPAL   COLR + CPAL   Microsoft (Windows), Google (Android/Chrome)   Vector, compact, v1 supports gradients
SVG         SVG           Mozilla Firefox (legacy), Adobe                SVG glyphs, large files
SBIX        sbix          Apple (macOS, iOS)                             PNG bitmaps at multiple sizes
CBDT/CBLC   CBDT + CBLC   Google (older Android)                         PNG bitmaps, like SBIX

Modern emoji fonts often include multiple tables for cross-platform compatibility, and the browser or OS picks the format it supports. COLR v1 (OpenType 1.9) is the newest of these formats. It shipped in Chrome 98 and Firefox 107, while Safari support has lagged, so check current compatibility data before relying on it.
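
Which color tables a font carries can be read straight from the sfnt table directory at the start of the file. The sketch below assumes an uncompressed .ttf/.otf byte string (not a .ttc collection and not WOFF/WOFF2, which wrap the directory differently); `color_formats` and `fake_sfnt` are hypothetical helpers, and `fake_sfnt` builds a header-only stand-in rather than a usable font.

```python
import struct

COLOR_TABLES = {b"COLR": "COLR/CPAL", b"SVG ": "SVG", b"sbix": "SBIX", b"CBDT": "CBDT/CBLC"}


def color_formats(sfnt_bytes):
    """List the color formats advertised in an sfnt (.ttf/.otf) table directory."""
    # Header: sfntVersion (uint32), numTables (uint16), then three uint16 search fields.
    _version, num_tables = struct.unpack_from(">IH", sfnt_bytes, 0)
    found = []
    for i in range(num_tables):
        # Each 16-byte record holds: tag, checksum, offset, length.
        tag, _checksum, _offset, _length = struct.unpack_from(">4sIII", sfnt_bytes, 12 + 16 * i)
        if tag in COLOR_TABLES:
            found.append(COLOR_TABLES[tag])
    return found


def fake_sfnt(tags):
    """Build a header-only byte string for demonstration; not a usable font."""
    header = struct.pack(">IHHHH", 0x00010000, len(tags), 0, 0, 0)
    records = b"".join(struct.pack(">4sIII", tag, 0, 0, 0) for tag in tags)
    return header + records


print(color_formats(fake_sfnt([b"cmap", b"COLR", b"CPAL", b"glyf"])))
```

For a real file, reading its bytes with `open(path, "rb").read()` and passing them to `color_formats` should work for plain .ttf and .otf fonts.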

/* Listing color emoji fonts first makes the browser prefer their
   color glyphs for any character those fonts cover */
.emoji {
  font-family: 'Apple Color Emoji', 'Segoe UI Emoji', 'Noto Color Emoji';
}

/* Set the emoji size explicitly: vector formats (COLR, SVG) scale cleanly, bitmap formats (sbix, CBDT) are resampled */
.large-emoji {
  font-size: 3rem;
  line-height: 1;
}

OpenType Features

OpenType features are identified by four-character tags. Each tag switches on a typographic behavior stored in the font's GSUB (substitution) and GPOS (positioning) tables:

body {
  /* Enable contextual alternates and standard ligatures */
  font-feature-settings: 'calt' 1, 'liga' 1;

  /* Or use the higher-level property */
  font-variant-ligatures: common-ligatures contextual;
}

/* Old-style figures in body text */
.body-text {
  font-variant-numeric: oldstyle-nums proportional-nums;
}

/* Tabular numbers for data tables */
.data-table {
  font-variant-numeric: tabular-nums;
}

/* Small caps */
.small-caps {
  font-variant-caps: small-caps;
}

Key features for Unicode correctness:

- kern: kerning adjustments between specific glyph pairs
- mark/mkmk: positioning combining marks (accents) over base letters and over other marks
- curs: cursive attachment, which positions each glyph relative to its connected neighbor (used by some Arabic-script styles)
- init/medi/fina/isol: positional letter forms (Arabic, Syriac)
- rtla/rtlm: right-to-left alternates and mirrored forms

font-display Strategy

Web fonts create a Flash of Invisible Text (FOIT) or Flash of Unstyled Text (FOUT). The font-display descriptor controls the trade-off:

@font-face {
  font-family: 'Inter';
  src: url('/fonts/Inter.woff2') format('woff2');
  font-display: swap;  /* show fallback immediately, swap when loaded */
}

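Values:

- block: invisible text for up to about 3s, then the fallback until the web font arrives (close to the old default behavior, worst UX)
- swap: show fallback immediately, swap when loaded (CLS risk)
- fallback: invisible for about 100ms, then the fallback; the web font is swapped in only if it arrives within roughly 3s, otherwise the fallback stays
- optional: about 100ms block, then the web font is used only if it is already available, for example from cache (best for performance)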

For body text on content sites: swap. For headlines where the font is distinctive: optional, which in practice shows the web font only once it is cached, that is, on repeat visits.
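
The trade-off between the four values is easier to compare as a timeline. The model below is a sketch: the block and swap periods in `PERIODS` are approximations of the recommended values (browsers may choose others, and the tiny block period of swap is modeled as zero), and `outcome` is a hypothetical helper.

```python
import math

# Approximate (block_ms, swap_ms) periods; browsers are free to pick other values.
PERIODS = {
    "block": (3000, math.inf),
    "swap": (0, math.inf),
    "fallback": (100, 3000),
    "optional": (100, 0),
}


def outcome(display, load_ms):
    """Describe what the reader sees if the web font finishes loading at `load_ms`."""
    block_ms, swap_ms = PERIODS[display]
    if load_ms <= block_ms:
        return "invisible until the web font arrives, no fallback shown"
    if load_ms <= block_ms + swap_ms:
        return "fallback first, then swap to web font"
    return "fallback for the rest of the page view"


for display in PERIODS:
    print(f"{display:9} {outcome(display, 1500)}")
```

Varying `load_ms` shows where each strategy changes behavior, for example around 100ms for optional and around 3s for fallback.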

Noto Fonts: The Universal Fallback

The Noto project (the name means "no tofu") is Google's effort to create fonts covering all of Unicode. Each Noto font file covers one or several scripts:

Noto Sans              — Latin, Cyrillic, Greek, and more
Noto Sans CJK SC/TC/JP/KR — Simplified/Traditional Chinese, Japanese, Korean
Noto Sans Arabic       — Arabic
Noto Sans Hebrew       — Hebrew
Noto Sans Devanagari   — Devanagari (Hindi, Marathi, Sanskrit)
Noto Color Emoji       — Color emoji (CBDT and COLR v1)
Noto Serif *           — Serif variants of all scripts

For a web application that handles user-generated content in unknown languages, including Noto as a fallback via Google Fonts or self-hosting greatly reduces tofu for every script whose Noto family you include:

<link rel="preconnect" href="https://fonts.googleapis.com">
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
<link href="https://fonts.googleapis.com/css2?family=Noto+Sans:wght@400;700&family=Noto+Sans+SC&family=Noto+Color+Emoji&display=swap" rel="stylesheet">

The conditional unicode-range loading in the Google Fonts response ensures users only download the Noto subset files that contain characters actually present on the page.
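
When the list of Noto families depends on configuration, the stylesheet URL can be assembled programmatically. This is a small sketch that covers only the weight axis and the display parameter; `google_fonts_css_url` is a hypothetical helper, not an official client.

```python
def google_fonts_css_url(families, display="swap"):
    """Build a css2 URL from (family_name, weights) pairs; weights may be empty."""
    parts = []
    for name, weights in families:
        spec = name.replace(" ", "+")
        if weights:
            # The css2 API expects weights in ascending order, separated by ";".
            spec += ":wght@" + ";".join(str(w) for w in sorted(weights))
        parts.append("family=" + spec)
    parts.append("display=" + display)
    return "https://fonts.googleapis.com/css2?" + "&".join(parts)


url = google_fonts_css_url([
    ("Noto Sans", [400, 700]),
    ("Noto Sans SC", []),
    ("Noto Color Emoji", []),
])
print(url)
```

The result matches the stylesheet URL used in the link element above, so the helper can generate that tag from a list of required scripts.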