Emoji: When Characters Became Culture — The Encoding Wars

On a winter morning in 1999, a 25-year-old designer named Shigetaka Kurita sat in his small office at NTT DoCoMo in Tokyo with a specific, constrained problem. NTT DoCoMo was about to launch i-mode, Japan's first mobile internet service, and Kurita was responsible for designing the user interface for messages. The service had a strict limitation: each message could be at most 250 characters. Japanese already packs a good deal of meaning into each character, but a limit counted in characters was still a limit, and context and nuance were hard to fit into 250 of them.

Kurita's insight was that certain kinds of information could be communicated more efficiently with a picture than with words. Weather forecasts embedded in messages could use a sun icon instead of the word "sunny." A phone number could be preceded by a telephone icon. An expression of embarrassment could be conveyed with a blushing face rather than the phrase "I'm so embarrassed." And because pictures would take only one character's worth of encoding space, they were just as space-efficient as any other character.

Kurita drew 176 tiny 12x12 pixel images. Each one was a character — a single code point in NTT DoCoMo's internal Shift_JIS extension, displayed as a picture. He called them 絵文字 (emoji): 絵 (e) meaning "picture" and 文字 (moji) meaning "character." They were not the first pictographic characters in digital systems — various dingbat symbols had existed in fonts like Zapf Dingbats and Wingdings for years — but they were the first to achieve mass adoption and cultural significance in the context of person-to-person communication.

The 176 Originals and Their Context

Kurita's original 176 emoji are now in the permanent collection of the Museum of Modern Art in New York, exhibited as examples of interface design that shaped culture. Looking at them today, they seem primitive — 12x12 pixels, monochrome in their original form, rendered on the tiny screens of late-1990s feature phones. But they established the vocabulary that would evolve into today's thousands of Unicode emoji.

The original set included weather icons (sun, cloud, rain, snow, lightning, typhoon), communications icons (email envelope, telephone, fax machine, alarm clock), transportation icons (car, train, airplane, ship, bicycle), activity icons (running figure, tennis, baseball, skiing), food and drink icons (beer, coffee, rice ball, ramen), and emotion indicators including a basic smiley face. Notably absent were the elaborate emotional expression emoji that would define the format's cultural impact in later years. The originals were primarily informational and transactional — designed to supplement functional communication, not to express complex emotional states.

The cultural context matters enormously. Japanese communication, particularly in written form, places high value on indirectness and on conveying emotional nuance through subtle cues. The absence of facial expression, tone of voice, and body language in written messages creates ambiguity that is particularly uncomfortable in Japanese social contexts. Kurita's emoji, even in their limited original form, provided a way to reduce that ambiguity — to clarify "I'm saying this lightly, not seriously" or "this is embarrassing but I'm okay with it" in ways that mattered deeply to users.

The 176 emoji were encoded as values in NTT DoCoMo's proprietary extension of Shift_JIS: two-byte sequences in the ranges 0xF89F to 0xF8FC and 0xF940 to 0xF9FC. This meant they existed entirely outside any international standard. They worked perfectly on DoCoMo handsets and in DoCoMo's i-mode service. On any device not running DoCoMo software they were unreadable, rendered as question marks, boxes, or garbage. This was acceptable in 1999, when i-mode was a closed ecosystem. It became a problem very quickly.
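To make the boundary problem concrete, here is a minimal Python sketch that scans a byte string for two-byte values in the ranges quoted above and then shows what a receiver without DoCoMo's table would see. The function names and the sample message are illustrative assumptions, and the scanner is a simplified walk over Shift_JIS-style lead bytes, not DoCoMo's actual software.

```python
# Two-byte ranges quoted in the text for DoCoMo's Shift_JIS extension.
DOCOMO_EMOJI_RANGES = [(0xF89F, 0xF8FC), (0xF940, 0xF9FC)]


def is_lead_byte(b: int) -> bool:
    """True if b starts a two-byte sequence in Shift_JIS-style encodings."""
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC


def find_docomo_emoji(data: bytes) -> list[int]:
    """Return the two-byte values in data that fall in the quoted ranges."""
    found = []
    i = 0
    while i < len(data):
        if is_lead_byte(data[i]) and i + 1 < len(data):
            code = (data[i] << 8) | data[i + 1]
            if any(lo <= code <= hi for lo, hi in DOCOMO_EMOJI_RANGES):
                found.append(code)
            i += 2  # consume both bytes of the pair
        else:
            i += 1  # single-byte character (ASCII or half-width kana)
    return found


# "晴れ" (sunny) in standard Shift_JIS, followed by one value from the
# proprietary range (hypothetical message, chosen for illustration).
message = "晴れ".encode("shift_jis") + bytes([0xF8, 0x9F])

emoji_codes = find_docomo_emoji(message)
print([hex(code) for code in emoji_codes])

# Python's standard shift_jis codec has no mapping for the proprietary
# bytes, so they decode to U+FFFD replacement characters.
decoded = message.decode("shift_jis", errors="replace")
print(decoded)
```

The ordinary Japanese text survives the round trip; only the carrier-specific bytes are lost, which is exactly the failure users saw at network boundaries.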

The Carrier Wars

NTT DoCoMo's success with emoji sparked immediate imitation from competing carriers: i-mode had roughly 30 million subscribers by the end of 2001, an extraordinary adoption rate. KDDI (then DDI Cellular) and SoftBank (then J-Phone) both launched their own emoji sets for their competing mobile internet services within a year of i-mode's launch. The competition was good for emoji quantity: each carrier tried to outdo the others with more expressive, more colorful, and more culturally resonant emoji sets. SoftBank's emoji set became particularly beloved for its expressiveness, with a wider range of facial expressions and emotions than the DoCoMo original.

But the competition was terrible for interoperability. A face emoji on DoCoMo might map to a completely different character code than the same face emoji on SoftBank, and when you sent a message from one carrier's network to another, the emoji either disappeared, appeared as an error character, or — most confusingly — appeared as a completely different emoji. Sending a "happy face" from your DoCoMo phone to a friend on SoftBank might result in them receiving an "angry face" or a random character. The meaning of messages could be inverted.

Japan's three major carriers each maintained conversion tables that attempted to map their proprietary emoji codes to each other's codes, but these mappings were imperfect and the visual differences between carriers' interpretations of "the same" emoji were significant. The carrier emoji war was a microcosm of the broader encoding chaos problem: a clever local solution that worked perfectly within its domain and created havoc at its boundaries.

Japan's mobile internet was, in the late 1990s and early 2000s, approximately a decade ahead of the rest of the world. The features that the iPhone would introduce to Western markets in 2007 — mobile email, mobile web browsing, mobile commerce, downloadable ringtones and wallpapers, social messaging — were all mature and widely used in Japan by 2003. But the three-carrier emoji incompatibility was a constant friction in this otherwise advanced ecosystem, and the scale of that friction grew as emoji became more culturally central.

The Google and Apple Petitions

When Apple brought the iPhone to Japan in July 2008 (the iPhone 3G, sold through SoftBank; the original 2007 model was never released there), it was entering a market where consumers expected emoji support, not as a nice feature but because emoji had become part of the daily language of written communication. If the iPhone couldn't send and receive emoji properly, it would be unusable for many Japanese users.

Apple quietly added an emoji keyboard for the Japanese market in iPhone OS 2.2 (November 2008). The emoji were encoded in the Private Use Area of Unicode: the range U+E000 to U+F8FF, which the standard sets aside for private agreements and to which it will never assign characters of its own. This allowed emoji to be encoded as Unicode code points without requiring standardization, but it meant that the encoding was still proprietary: an emoji sent from a Japanese iPhone would still be unintelligible to a Windows PC or a competing carrier's handset unless they had implemented the same private-use mapping.
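The practical meaning of "private use" can be seen from Python's standard unicodedata module. The sketch below is illustrative only: the helper name is an assumption, and U+E001 is an arbitrary code point in the range, not a claim about which emoji any carrier or vendor assigned to it.

```python
import unicodedata

# The Basic Multilingual Plane's Private Use Area, as given in the text.
BMP_PRIVATE_USE = range(0xE000, 0xF8FF + 1)


def is_bmp_private_use(ch: str) -> bool:
    """True if the single character ch lies in U+E000..U+F8FF."""
    return ord(ch) in BMP_PRIVATE_USE


private = "\ue001"          # arbitrary private-use code point
standard = "\U0001F602"     # FACE WITH TEARS OF JOY, a standardized emoji

for ch in (private, standard):
    print(
        f"U+{ord(ch):04X}",
        is_bmp_private_use(ch),
        unicodedata.category(ch),             # "Co" marks private use
        unicodedata.name(ch, "<no name>"),    # the standard assigns no name
    )
```

A private-use code point carries no name and no agreed meaning; everything depends on both ends sharing the same out-of-band table, which is why this approach only moved the interoperability problem rather than solving it.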

Google, developing Android and preparing to launch in Japan in the same period, faced the same challenge. In 2007, a team of Google engineers including Mark Davis (one of the founders of the Unicode Consortium and a co-author of the original Unicode design documents), Kat Momoi, and Markus Scherer began petitioning the Unicode Technical Committee to encode emoji formally as Unicode characters. Apple engineers Yasuo Kida and Peter Edberg joined the effort, and the joint proposal was formally submitted in January 2009.

The proposal was carefully designed to address the anticipated objections. It included a comprehensive mapping of all three Japanese carriers' emoji sets, proposed Unicode code points for each emoji, and argued that emoji met the Unicode criteria for character encoding: they were discrete symbols, widely used in written communication, with established meanings, not reducible to existing Unicode characters. The proposal also included a survey of emoji usage frequency showing that several hundred emoji were in extremely widespread use, comparable to commonly used punctuation marks.

Unicode 6.0: Emoji Enter the Standard

In October 2010, Unicode 6.0 was published, containing 722 emoji characters. These were distributed across several new and existing Unicode blocks: the Miscellaneous Symbols block, the Dingbats block, the new Emoticons block (U+1F600 to U+1F64F), the Miscellaneous Symbols and Pictographs block (U+1F300 to U+1F5FF), and the Transport and Map Symbols block (U+1F680 to U+1F6FF). The encoding resolved the three-carrier incompatibility by providing authoritative code points that all carriers and device manufacturers could implement.
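As a rough illustration of what "authoritative code points" buys, the following Python sketch looks up which of the blocks named above a character belongs to. The three supplementary-plane ranges come from the text; the ranges for Miscellaneous Symbols (U+2600 to U+26FF) and Dingbats (U+2700 to U+27BF) are filled in from the Unicode block charts. The table and function names are assumptions made for this example.

```python
# (first code point, last code point, block name)
EMOJI_BLOCKS = [
    (0x2600, 0x26FF, "Miscellaneous Symbols"),
    (0x2700, 0x27BF, "Dingbats"),
    (0x1F300, 0x1F5FF, "Miscellaneous Symbols and Pictographs"),
    (0x1F600, 0x1F64F, "Emoticons"),
    (0x1F680, 0x1F6FF, "Transport and Map Symbols"),
]


def block_of(ch: str):
    """Return the name of the listed block containing ch, or None."""
    cp = ord(ch)
    for first, last, name in EMOJI_BLOCKS:
        if first <= cp <= last:
            return name
    return None


for ch in ("\u2600", "\u2702", "\U0001F363", "\U0001F602", "\U0001F680", "A"):
    print(f"U+{ord(ch):04X}", block_of(ch))
```

Because every platform agrees on these numbers, a lookup like this gives the same answer on any carrier or device, which the proprietary Shift_JIS extensions could never guarantee. Note that membership in one of these blocks does not by itself make a character an emoji; the blocks also contain ordinary symbols.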

iOS 5 (October 2011) made the emoji keyboard available worldwide, not just in Japan. Android added system-level emoji support during its Jelly Bean releases (2012 to 2013) and full-color emoji in Android 4.4 (late 2013). Microsoft added emoji to Windows 8 (October 2012). The global emoji era had begun.

The visual rendering question proved more consequential than expected. Apple's iOS emoji design — rounded, detailed, and using a distinctive visual vocabulary that mixed photorealism with cartoon sensibility — became the de facto visual reference for emoji globally. When people mentally pictured "the heart emoji" or "the thumbs-up emoji," they pictured Apple's versions. Microsoft's Windows 8 emoji were notably different in style — more angular, sometimes in ways that seemed expressively different from Apple's versions. Google's early Android emoji were minimalist to the point of ambiguity.

The divergence in visual rendering created real communication problems. Research documented cases where the same emoji displayed on iOS was interpreted as friendly and on Android as hostile, because the facial expression was rendered differently enough to convey different moods. Samsung's versions of some emoji were so visually distinct from Apple's versions that the "same" emoji conveyed materially different emotions. This was a new kind of encoding problem: the code point was standardized, but the glyph was not, and the glyph conveyed the meaning.

Skin Tone Modifiers and Diversity

The original Unicode emoji depicted human figures with a generic yellow skin tone — a design choice intended to avoid specifying any particular ethnicity. But as emoji became cultural artifacts that people used to represent themselves in digital communication, the yellow default was increasingly read as a racially unmarked "other" that didn't represent most of the world's population.

In 2014, a petition signed by over 30,000 people demanded more diverse emoji skin tones. Unicode 8.0 (June 2015) introduced five emoji modifier characters based on the Fitzpatrick scale, a dermatological classification of human skin tones developed by American dermatologist Thomas B. Fitzpatrick in 1975. The modifiers range from U+1F3FB (light skin tone, Fitzpatrick Type I-II) to U+1F3FF (dark skin tone, Fitzpatrick Type VI); the scale's six types map onto five modifiers because Types I and II share one. Any emoji depicting a human figure can be followed by one of these modifier characters to change the displayed skin tone; the base emoji without a modifier defaults to the generic yellow.

The technical implementation was elegant for a genuinely hard problem. A skin-tone-modified emoji consists of two Unicode code points — the base emoji and the modifier — but is rendered as a single glyph on platforms that support modifiers. On platforms that don't, the base emoji appears followed by a colored square, which degrades more gracefully than a complete rendering failure. The system extended naturally to ZWJ (Zero Width Joiner) sequences involving skin tones: family emoji with mixed skin tones, handshake emoji between two people with different skin tones, couple emoji with individually selectable skin tones.
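The two-code-point structure is easy to see in code. This is a minimal Python sketch: the dictionary keys and helper names are assumptions made for illustration, and the code does not check whether the base character is actually one that accepts a modifier.

```python
# The five modifiers, U+1F3FB through U+1F3FF, in order from light to dark.
SKIN_TONE_MODIFIERS = {
    "light": "\U0001F3FB",
    "medium-light": "\U0001F3FC",
    "medium": "\U0001F3FD",
    "medium-dark": "\U0001F3FE",
    "dark": "\U0001F3FF",
}


def with_skin_tone(base: str, tone: str) -> str:
    """Append a modifier to a base emoji (no validity check on the base)."""
    return base + SKIN_TONE_MODIFIERS[tone]


def strip_skin_tones(text: str) -> str:
    """Drop every modifier, leaving the generic default rendering."""
    return "".join(ch for ch in text if not 0x1F3FB <= ord(ch) <= 0x1F3FF)


thumbs_up = "\U0001F44D"
toned = with_skin_tone(thumbs_up, "medium")

print(toned, [f"U+{ord(ch):04X}" for ch in toned])
print(len(toned))                 # code points, not visible glyphs
print(strip_skin_tones(toned) == thumbs_up)
```

A platform that understands modifiers draws one glyph for the pair; one that does not simply draws the two code points side by side, which is the graceful fallback described above.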

ZWJ Sequences and the Emoji Grammar

The Zero Width Joiner (ZWJ, U+200D) became the mechanism through which emoji complexity grew from individual pictures to compound expressions. A ZWJ sequence asks a rendering engine to combine adjacent emoji into a single glyph. The character was borrowed from complex scripts such as Arabic and Devanagari, where it requests a particular joined form of the neighboring letters (in Devanagari, for example, a half-form in place of a full conjunct ligature); in the emoji context, it requests that the whole sequence be drawn as one picture.

The family emoji "Man, Woman, Girl, Boy" (👨‍👩‍👧‍👦) is encoded as U+1F468 (Man) + U+200D + U+1F469 (Woman) + U+200D + U+1F467 (Girl) + U+200D + U+1F466 (Boy): seven code points, drawn as one glyph where the platform supports the sequence and as four separate emoji where it does not. A person of a specific profession and gender is encoded as a man or woman emoji plus a ZWJ plus a tool or symbol associated with the profession. A couple of any gender combination is two person emoji linked through ZWJs, with a heart between them in the "couple with heart" sequences. The rainbow flag (🏳️‍🌈) is the white flag (U+1F3F3) plus a variation selector (U+FE0F) plus ZWJ plus the rainbow (U+1F308).
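The sequences described here can be assembled and taken apart with ordinary string operations. The Python sketch below builds the family and rainbow flag sequences from their code points; the variable and function names are assumptions for this example, and whether the result displays as one glyph depends entirely on the font and platform.

```python
import unicodedata

ZWJ = "\u200D"    # ZERO WIDTH JOINER
VS16 = "\uFE0F"   # VARIATION SELECTOR-16, requests emoji presentation

# Man + ZWJ + Woman + ZWJ + Girl + ZWJ + Boy
family = ZWJ.join(["\U0001F468", "\U0001F469", "\U0001F467", "\U0001F466"])

# White flag + variation selector + ZWJ + rainbow
rainbow_flag = "\U0001F3F3" + VS16 + ZWJ + "\U0001F308"


def describe(sequence: str) -> list[str]:
    """List each code point in the sequence with its Unicode name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in sequence]


for line in describe(family):
    print(line)

print(len(family), "code points in the family sequence")
print(len(rainbow_flag), "code points in the rainbow flag sequence")

# Splitting on ZWJ recovers the component emoji a fallback renderer shows.
family_parts = family.split(ZWJ)
print([unicodedata.name(part) for part in family_parts])
```

Splitting on the joiner is also a fair model of the fallback behavior: a platform that does not know the sequence ignores the invisible joiners and draws the components one after another.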

This system made the emoji repertoire combinatorially extensible without requiring each combination to receive its own code point. It also made emoji processing significantly more complex: algorithms that count emoji, split emoji at boundaries, or manipulate emoji text must understand ZWJ sequences and treat their component code points as a unit. The Unicode Standard's rules for "extended grapheme clusters" — the sequences of code points that together constitute a single user-visible character — include ZWJ sequences as a specifically handled case.
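The gap between code points and user-visible characters shows up as soon as one tries to count emoji. What follows is a deliberately simplified Python sketch, not the Unicode segmentation algorithm: it only glues together characters linked by ZWJ and attaches variation selector U+FE0F and the five skin tone modifiers to the preceding character. Flags built from regional indicators, keycaps, and combining marks are not handled, and the function names are assumptions for this example. Production code would normally rely on a library that implements the full extended grapheme cluster rules.

```python
ZWJ = 0x200D
VS16 = 0xFE0F


def _attaches_to_previous(cp: int) -> bool:
    """Variation selector-16 or a skin tone modifier (simplified rule)."""
    return cp == VS16 or 0x1F3FB <= cp <= 0x1F3FF


def split_emoji_clusters(text: str) -> list[str]:
    """Group code points into approximate user-visible units."""
    clusters: list[str] = []
    join_next = False  # True right after a ZWJ
    for ch in text:
        cp = ord(ch)
        if clusters and (join_next or cp == ZWJ or _attaches_to_previous(cp)):
            clusters[-1] += ch      # extend the current cluster
        else:
            clusters.append(ch)     # start a new cluster
        join_next = cp == ZWJ
    return clusters


family = "\u200D".join(["\U0001F468", "\U0001F469", "\U0001F467", "\U0001F466"])
thumbs_up_medium = "\U0001F44D\U0001F3FD"
text = family + thumbs_up_medium + "a"

clusters = split_emoji_clusters(text)
print(len(text), "code points")
print(len(clusters), "approximate user-visible characters")
```

Naive operations such as len(text), slicing, or reversing work on code points and can cut a family or a skin-toned hand in half, which is why text-processing code has to treat each cluster as a unit.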

Emoji as Cultural Phenomenon

By the mid-2010s, emoji had become something unprecedented in the history of writing: a globally shared pictographic communication system that required no linguistic knowledge to produce or interpret. The face with tears of joy (U+1F602), which Oxford Dictionaries named its Word of the Year for 2015, was understood from Tokyo to São Paulo to Lagos without translation.

Shigetaka Kurita, watching this global phenomenon from his office in Tokyo, said in interviews that he found it remarkable but also somewhat removed from his original intent. He had designed a functional communication aid for a mobile internet service with a 250-character limit. What emerged was a new expressive register in written language, used by billions of people daily, formalized in an international standard, and the subject of anthropological research and cultural commentary. His 176 tiny pixel drawings had become a chapter in the history of writing.

The annual Unicode emoji proposal process — open to anyone, evaluated by the Emoji Subcommittee, voted on by the full Unicode Technical Committee — became a minor cultural event. Proposals for new emoji circulated on social media. Approved emoji were announced with press coverage that no character encoding decision had ever previously received. The governance of a universal writing standard had become participatory democracy, commerce, and cultural negotiation simultaneously. This was not what the Unicode Consortium had designed for, and it was precisely what the universality of writing requires: a living standard that evolves with the people who use it.