Private Use Area

Reserved ranges in which organizations can define their own characters: the BMP PUA (U+E000–U+F8FF) and two supplementary PUAs in planes 15 and 16.

2021-06-21 · Updated 2024-10-08

What is the Private Use Area?

The Private Use Area (PUA) consists of three ranges of Unicode code points that are permanently reserved for applications to define their own characters. Unlike most of the Unicode code space, PUA code points will never be assigned official characters by the Unicode Consortium. Instead, any organization can use them for proprietary characters: custom icons, corporate logos, game symbols, or glyphs not yet in Unicode.

There are three PUA regions in Unicode:

Name	Range	Size
BMP Private Use Area	U+E000–U+F8FF	6,400 code points
Supplementary Private Use Area A	U+F0000–U+FFFFD	65,534 code points
Supplementary Private Use Area B	U+100000–U+10FFFD	65,534 code points

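Total: 137,468 code points, far more than the 2,048 surrogates or the 66 noncharacters. Each supplementary range holds 65,534 rather than 65,536 code points because the last two code points of every plane (U+nFFFE and U+nFFFF) are noncharacters, not private-use characters.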
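The sizes can be checked with a little arithmetic. The sketch below assumes inclusive endpoints and takes the supplementary ranges to end at U+FFFFD and U+10FFFD, since the final two code points of each plane are noncharacters; the names `PUA_RANGES` and `range_size` are illustrative, not part of any library.

```python
# The three private-use ranges, written with inclusive endpoints.
# Each supplementary range stops at xFFFD because the last two code
# points of every plane (xFFFE, xFFFF) are noncharacters.
PUA_RANGES = {
    "BMP Private Use Area": (0xE000, 0xF8FF),
    "Supplementary Private Use Area A": (0xF0000, 0xFFFFD),
    "Supplementary Private Use Area B": (0x100000, 0x10FFFD),
}


def range_size(first: int, last: int) -> int:
    """Number of code points in an inclusive range."""
    return last - first + 1


sizes = {name: range_size(*bounds) for name, bounds in PUA_RANGES.items()}
total = sum(sizes.values())

for name, size in sizes.items():
    print(f"{name}: {size:,}")
print(f"Total: {total:,}")  # Total: 137,468
```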

How the PUA is Used

Because PUA code points have no standard meaning, their interpretation is entirely up to the parties exchanging the text. This requires both sides to agree on a mapping — typically through a custom font that maps PUA code points to specific glyphs.

Common use cases:

Icon fonts: Font Awesome, Material Icons, and similar libraries map their icons to PUA code points (e.g., U+F000 and up for Font Awesome). The font renders the PUA code point as the intended icon.
Corporate logo characters: Companies sometimes use PUA slots for brand marks in specialized documents.
Pre-standardization characters: Klingon, Tengwar (Tolkien's Elvish script), and other scripts not in Unicode have community-defined PUA assignments, coordinated through the ConScript Unicode Registry (CSUR).
Regional/historic writing systems: Script communities waiting for official Unicode approval use the PUA for interoperability within their community.
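In code, such an agreement amounts to a lookup table that both parties share. The following is a minimal sketch with a made-up table: the icon names, the code points chosen, and the helpers `encode_icon` and `describe` are all hypothetical. In practice the table usually lives inside a shared font rather than in application code.

```python
# Hypothetical private agreement: sender and receiver both use this table.
ICON_TO_CODE_POINT = {
    "home": 0xE000,
    "search": 0xE001,
    "settings": 0xE002,
}
CODE_POINT_TO_ICON = {cp: name for name, cp in ICON_TO_CODE_POINT.items()}


def encode_icon(name: str) -> str:
    """Return the PUA character agreed on for the given icon name."""
    return chr(ICON_TO_CODE_POINT[name])


def describe(text: str) -> str:
    """Replace each mapped PUA character with a readable :name: tag."""
    return "".join(
        f":{CODE_POINT_TO_ICON[ord(ch)]}:" if ord(ch) in CODE_POINT_TO_ICON else ch
        for ch in text
    )


message = encode_icon("search") + " Find files"
print(describe(message))  # :search: Find files
```

A receiver without this table has no way to recover the word "search" from U+E001, which is the core limitation of private-use text.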

The Interoperability Problem

PUA usage is inherently non-interoperable across different applications or organizations unless both use the same font and the same mapping. The code point U+E001 might be a "thumbs up" icon in one font and a currency symbol in another. When text with PUA characters is exchanged between systems using different fonts, the reader sees the wrong glyph or a missing-glyph box.

# PUA code points have no official name
import unicodedata

cp = 0xE001  # PUA code point
try:
    name = unicodedata.name(chr(cp))
except ValueError as e:
    print(e)  # no such name

category = unicodedata.category(chr(cp))
print(category)  # "Co" (Private Use)
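The clash between fonts can be modeled as two glyph tables that disagree about the same code point. This is only an illustration: the tables `FONT_A_GLYPHS` and `FONT_B_GLYPHS` and the helper `rendered_as` are invented for the example and do not describe any real font.

```python
# Hypothetical glyph tables for two unrelated fonts (illustrative only).
FONT_A_GLYPHS = {0xE001: "thumbs-up icon"}
FONT_B_GLYPHS = {0xE001: "currency symbol"}


def rendered_as(char: str, glyphs: dict[int, str]) -> str:
    """Describe what a font with this glyph table would draw for char."""
    return glyphs.get(ord(char), "missing-glyph box")


char = "\uE001"
print(rendered_as(char, FONT_A_GLYPHS))  # thumbs-up icon
print(rendered_as(char, FONT_B_GLYPHS))  # currency symbol
print(rendered_as(char, {}))             # missing-glyph box
```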

PUA in Emoji History

Before emoji were standardized in Unicode 6.0 (2010), Japanese mobile carriers (DoCoMo, KDDI, SoftBank) each used their own PUA encodings for emoji. DoCoMo used the range U+E63E–U+E757; SoftBank used a different range. This is why early cross-carrier emoji were garbled — each carrier had a different PUA mapping. Unicode 6.0 unified these into standardized code points.

Detecting PUA Characters

import unicodedata

def is_pua(char: str) -> bool:
    return unicodedata.category(char) == "Co"

print(is_pua("\uE001"))     # True (BMP PUA)
print(is_pua("\U000F0001")) # True (Supplementary PUA A)
print(is_pua("A"))          # False
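For whole strings it is often more useful to know where the private-use characters are. The sketch below applies the same general-category test to every character; `find_pua` is a hypothetical helper name, and the sample string is made up.

```python
import unicodedata


def find_pua(text: str) -> list[tuple[int, str]]:
    """Return (index, 'U+XXXX') for every private-use character in text."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Co"
    ]


sample = "Save \uF0C7 or share \U000F0001"
print(find_pua(sample))  # [(5, 'U+F0C7'), (16, 'U+F0001')]
```

Indices count code points, not bytes or UTF-16 code units, because Python strings are sequences of code points.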

Common Pitfalls

Assuming PUA characters are portable: Never embed PUA characters in data exchanged with external systems without documenting the required font/mapping.

Font Awesome characters in databases: Storing Font Awesome PUA icons in a database works only if the rendering system also uses Font Awesome. On different systems, PUA values appear as blank boxes or unrelated glyphs.
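One way to guard against both pitfalls is to check text at the point where it leaves your system. The following is a sketch under the assumption that undocumented PUA characters should be rejected outright; `PrivateUseError` and `check_exportable` are hypothetical names, and a real system might instead substitute placeholders or attach mapping metadata.

```python
import unicodedata


class PrivateUseError(ValueError):
    """Raised when text bound for an external system contains PUA characters."""


def check_exportable(text: str) -> str:
    """Return text unchanged, or raise if it contains private-use characters."""
    offenders = sorted(
        {f"U+{ord(ch):04X}" for ch in text if unicodedata.category(ch) == "Co"}
    )
    if offenders:
        raise PrivateUseError(
            "private-use code points need an agreed mapping: " + ", ".join(offenders)
        )
    return text


print(check_exportable("Quarterly report"))  # Quarterly report
try:
    check_exportable("Like \uE001")
except PrivateUseError as err:
    print(err)  # private-use code points need an agreed mapping: U+E001
```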

Quick Facts

Property	Value
BMP PUA range	U+E000–U+F8FF
Supplementary PUA A	U+F0000–U+FFFFD
Supplementary PUA B	U+100000–U+10FFFD
Total PUA code points	137,468
General category	Co (Private Use)
Official character assignment	Never — permanently private
Common use	Icon fonts (Font Awesome, Material Icons)
Registry for scripts	CSUR (ConScript Unicode Registry)
