Greek and Coptic

Greek is one of the oldest alphabetic writing systems and gave Unicode many of its mathematical symbols, with the Greek and Coptic block serving both modern Greek text and ancient Coptic liturgical use. This guide explores the Greek and Coptic Unicode block, the history of the script, and how Greek letters are used in mathematics and science.

Published 2023-06-01 · Updated 2025-01-14

Greek is one of the oldest writing systems in continuous use. For over 2,700 years, the Greek alphabet has served as the script for one of the world's foundational literary and philosophical traditions, and its influence extends far beyond the Greek language. Greek letters are the standard notation of mathematics, physics, and engineering worldwide. The alphabet also gave rise to the Latin and Cyrillic scripts, making it the ancestor of writing systems used by billions. In Unicode, Greek shares a block with Coptic, the script of the last stage of the Egyptian language, so a single block spans both ancient and modern use. This guide explores the Greek and Coptic Unicode block, the extended Greek blocks, and the many roles Greek characters play in modern computing.

A Brief History

The Greek alphabet emerged around 800 BCE, adapted from the Phoenician consonantal script. The Greeks' crucial innovation was the systematic introduction of vowel letters — they repurposed Phoenician consonants that had no equivalent in Greek to represent vowel sounds. This made Greek the first true alphabet (as opposed to an abjad or abugida), where both consonants and vowels have dedicated letters.

The word "alphabet" itself comes from the first two Greek letters: alpha (α) and beta (β).

Over the centuries, Greek script evolved through several stages:

Period	Script Form	Key Feature
800–400 BCE	Archaic Greek	Multiple local variants
403 BCE	Ionic alphabet adopted	Athens standardizes on 24 letters
4th c. BCE – 8th c. CE	Greek majuscule (uncial)	All uppercase, no spaces
9th c. CE onwards	Greek minuscule	Lowercase develops, accents added
1982	Monotonic reform	Greece simplifies to single accent

The Greek Alphabet

Modern Greek uses 24 letters:

Upper	Lower	Name	Unicode (Upper)	Unicode (Lower)
Α	α	Alpha	U+0391	U+03B1
Β	β	Beta	U+0392	U+03B2
Γ	γ	Gamma	U+0393	U+03B3
Δ	δ	Delta	U+0394	U+03B4
Ε	ε	Epsilon	U+0395	U+03B5
Ζ	ζ	Zeta	U+0396	U+03B6
Η	η	Eta	U+0397	U+03B7
Θ	θ	Theta	U+0398	U+03B8
Ι	ι	Iota	U+0399	U+03B9
Κ	κ	Kappa	U+039A	U+03BA
Λ	λ	Lambda	U+039B	U+03BB
Μ	μ	Mu	U+039C	U+03BC
Ν	ν	Nu	U+039D	U+03BD
Ξ	ξ	Xi	U+039E	U+03BE
Ο	ο	Omicron	U+039F	U+03BF
Π	π	Pi	U+03A0	U+03C0
Ρ	ρ	Rho	U+03A1	U+03C1
Σ	σ/ς	Sigma	U+03A3	U+03C3/U+03C2
Τ	τ	Tau	U+03A4	U+03C4
Υ	υ	Upsilon	U+03A5	U+03C5
Φ	φ	Phi	U+03A6	U+03C6
Χ	χ	Chi	U+03A7	U+03C7
Ψ	ψ	Psi	U+03A8	U+03C8
Ω	ω	Omega	U+03A9	U+03C9

Final Sigma

Greek lowercase sigma has two forms: medial sigma (σ, U+03C3) used within words, and final sigma (ς, U+03C2) used at the end of words. Unicode encodes these as separate characters. Case conversion must account for this:

# Python applies the final-sigma rule when lowercasing
word = "\u03BB\u03CC\u03B3\u03BF\u03C2"  # λόγος
print(word.upper())          # ΛΌΓΟΣ (final sigma becomes Σ; the tonos is kept)
print(word.upper().lower())  # λόγος (word-final Σ lowercases to ς)
print(word.casefold())       # λόγοσ (casefold maps ς to σ for comparison)
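
Because ς and σ are separate code points, a plain equality test can miss a match that differs only in case or sigma form. The following is a minimal sketch of a caseless comparison built on `casefold()`; `same_word` is a hypothetical helper name used only for illustration, and the sketch assumes both inputs are already in the same normalization form.

```python
def same_word(a: str, b: str) -> bool:
    """Compare two strings caselessly; casefold() maps both ς and Σ to σ."""
    return a.casefold() == b.casefold()


lower = "\u03BB\u03CC\u03B3\u03BF\u03C2"  # λόγος (ends in final sigma)
upper = "\u039B\u038C\u0393\u039F\u03A3"  # ΛΌΓΟΣ
medial = "\u03BB\u03CC\u03B3\u03BF\u03C3"  # λόγοσ (medial sigma at the end)

print(lower == medial)            # False: different code points
print(same_word(lower, upper))    # True
print(same_word(lower, medial))   # True
```

Accents still matter here: a form without the tonos would not compare equal, since case folding does not remove diacritics.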

Unicode Blocks for Greek

Block	Range	Characters	Purpose
Greek and Coptic	U+0370–U+03FF	135	Modern Greek letters + Demotic-derived Coptic letters
Greek Extended	U+1F00–U+1FFF	233	Polytonic Greek (ancient accents)
Coptic	U+2C80–U+2CFF	123	Dedicated Coptic characters
Coptic Epact Numbers	U+102E0–U+102FF	28	Coptic calendar numbers
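
The ranges in this table can be turned into a small lookup. This is a minimal sketch that assumes only the four ranges listed above; `GREEK_COPTIC_BLOCKS` and `block_of` are hypothetical names chosen for illustration, not part of any library.

```python
from typing import Optional

# Block ranges copied from the table above (inclusive bounds).
GREEK_COPTIC_BLOCKS = [
    (0x0370, 0x03FF, "Greek and Coptic"),
    (0x1F00, 0x1FFF, "Greek Extended"),
    (0x2C80, 0x2CFF, "Coptic"),
    (0x102E0, 0x102FF, "Coptic Epact Numbers"),
]


def block_of(char: str) -> Optional[str]:
    """Return the name of the listed block containing char, or None."""
    cp = ord(char)
    for start, end, name in GREEK_COPTIC_BLOCKS:
        if start <= cp <= end:
            return name
    return None


for ch in "\u03B1\u1F00\u2C81A":  # α, ἀ, ⲁ, Latin A
    print(f"U+{ord(ch):04X} {block_of(ch)}")
```

A block is only a contiguous code point range. It says nothing about whether a given code point is assigned, or which script it belongs to.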

Greek and Coptic Block (U+0370–U+03FF)

This primary block contains:

24 modern Greek uppercase and lowercase letters
Accented letters for monotonic Greek (ά, έ, ή, ί, ό, ύ, ώ)
Diacritics: tonos (accent), dialytika (dieresis)
The final sigma (ς)
Archaic letters: digamma (Ϝ), koppa (Ϟ), sampi (Ϡ), stigma (Ϛ)
Coptic letters derived from Demotic, which have no Greek counterpart (U+03E2–U+03EF, e.g., U+03E2 Ϣ)

Greek Extended Block (U+1F00–U+1FFF)

This block supports polytonic Greek — the traditional accent system used in Ancient Greek and in formal Greek writing before the 1982 reform. Polytonic Greek uses three accent marks, two breathing marks, and the iota subscript:

Diacritic	Name	Example	Purpose
´	Oxia (acute)	ά	Rising pitch
`	Varia (grave)	ὰ	Falling pitch
˜	Perispomeni (circumflex)	ᾶ	Rising-falling pitch
ʽ	Dasia (rough breathing)	ἁ	Initial /h/ sound
ʼ	Psili (smooth breathing)	ἀ	No initial /h/
ͅ	Ypogegrammeni (iota subscript)	ᾳ	Historical diphthong

The Greek Extended block provides precomposed characters for all combinations of these diacritics on vowels:

U+1F00  ἀ  GREEK SMALL LETTER ALPHA WITH PSILI
U+1F01  ἁ  GREEK SMALL LETTER ALPHA WITH DASIA
U+1F04  ἄ  GREEK SMALL LETTER ALPHA WITH PSILI AND OXIA
U+1F05  ἅ  GREEK SMALL LETTER ALPHA WITH DASIA AND OXIA
U+1F80  ᾀ  GREEK SMALL LETTER ALPHA WITH PSILI AND YPOGEGRAMMENI
U+1F86  ᾆ  GREEK SMALL LETTER ALPHA WITH PSILI AND PERISPOMENI AND YPOGEGRAMMENI
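
Each precomposed character is canonically equivalent to a base letter followed by combining marks. As a rough illustration, the sketch below uses Python's standard `unicodedata` module to list the pieces of one character from the listing above; `describe_decomposition` is a hypothetical helper name.

```python
import unicodedata


def describe_decomposition(char: str) -> list:
    """Return the Unicode names of the code points in the NFD form of char."""
    return [unicodedata.name(c) for c in unicodedata.normalize("NFD", char)]


for name in describe_decomposition("\u1F86"):  # ᾆ
    print(name)
# GREEK SMALL LETTER ALPHA
# COMBINING COMMA ABOVE
# COMBINING GREEK PERISPOMENI
# COMBINING GREEK YPOGEGRAMMENI

# NFC recombines the sequence into the single precomposed code point.
recomposed = unicodedata.normalize("NFC", "\u03B1\u0313\u0342\u0345")
print(recomposed == "\u1F86")  # True
```

The combining psili appears under its generic name, COMBINING COMMA ABOVE (U+0313).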

Greek and Coptic: Why One Block?

When Unicode was first designed, Coptic characters were "unified" with Greek — Coptic letters that looked similar to Greek letters were given the same code points. This was a practical decision but created problems:

Coptic and Greek are different scripts used by different communities
Font selection broke — a Coptic text would render with Greek fonts
Sorting and collation rules differ between the two scripts

Unicode 4.1 (2005) resolved this by adding a dedicated Coptic block (U+2C80–U+2CFF) with separate code points for the Greek-derived Coptic letters. The Demotic-derived Coptic letters at U+03E2–U+03EF stay in the Greek and Coptic block. They are not deprecated and are used together with the Coptic block to write Coptic text.

What is Coptic?

Coptic is the latest stage of the ancient Egyptian language, written with a script derived from the Greek alphabet plus six or seven additional letters from Demotic Egyptian. Coptic ceased to be a spoken vernacular language around the 17th century but remains the liturgical language of the Coptic Orthodox Church, used by approximately 15–20 million Coptic Christians in Egypt.

# Sample letters from the dedicated Coptic block
U+2C80  Ⲁ  COPTIC CAPITAL LETTER ALFA
U+2C81  ⲁ  COPTIC SMALL LETTER ALFA
U+2CA0  Ⲡ  COPTIC CAPITAL LETTER PI
U+2CA2  Ⲣ  COPTIC CAPITAL LETTER RO
U+2CB6  Ⲷ  COPTIC CAPITAL LETTER CRYPTOGRAMMIC EIE
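
Coptic characters sit in more than one block, so a range check on U+2C80–U+2CFF alone misses letters such as U+03E2. The sketch below is a heuristic, not an official script lookup: it treats a character as Coptic when its Unicode name starts with "COPTIC". `is_coptic` is a hypothetical helper name.

```python
import unicodedata


def is_coptic(char: str) -> bool:
    """Heuristic: a character counts as Coptic if its Unicode name says so."""
    return unicodedata.name(char, "").startswith("COPTIC ")


samples = ["\u03E2", "\u2C80", "\u0391"]  # Ϣ (shei), Ⲁ (alfa), Greek Α
for ch in samples:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} coptic={is_coptic(ch)}")
```

Python's standard library does not expose the Unicode Script property, which is why the name prefix stands in for it here.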

Greek in Mathematics and Science

Greek letters are the lingua franca of mathematical and scientific notation. Unicode provides these characters in multiple contexts:

From the Greek Block (Plain Text)

These are the standard Greek letters used in running text:

Symbol	Code Point	Common Use
α	U+03B1	Angles, alpha particles, significance level
β	U+03B2	Beta coefficients, beta particles
γ	U+03B3	Gamma rays, Euler–Mascheroni constant
δ	U+03B4	Small changes (calculus), Kronecker delta
ε	U+03B5	Arbitrarily small quantities (analysis)
θ	U+03B8	Angles (trigonometry)
λ	U+03BB	Wavelength, lambda calculus, eigenvalues
μ	U+03BC	Micro- prefix, mean (statistics)
π	U+03C0	Pi (3.14159...)
σ	U+03C3	Standard deviation, summation (upper: Σ)
φ	U+03C6	Golden ratio, phase angle, Euler's totient
ω	U+03C9	Angular frequency
Δ	U+0394	Change/difference
Σ	U+03A3	Summation
Π	U+03A0	Product
Ω	U+03A9	Ohm (also U+2126 OHM SIGN for compatibility)
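
Some symbols have compatibility twins outside the Greek block, so the same visible character can arrive as different code points. This sketch checks how normalization treats U+2126 OHM SIGN. It also checks U+00B5 MICRO SIGN, which the table does not list and which is included here as an extra example.

```python
import unicodedata

ohm_sign = "\u2126"    # Ω OHM SIGN
micro_sign = "\u00B5"  # µ MICRO SIGN

# OHM SIGN is canonically equivalent to GREEK CAPITAL LETTER OMEGA.
print(unicodedata.normalize("NFC", ohm_sign) == "\u03A9")     # True

# MICRO SIGN is only compatibility-equivalent to GREEK SMALL LETTER MU.
print(unicodedata.normalize("NFC", micro_sign) == "\u03BC")   # False
print(unicodedata.normalize("NFKC", micro_sign) == "\u03BC")  # True
```

In this sketch NFC is enough to unify the ohm sign with omega, while the micro sign needs NFKC.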

Mathematical Alphanumeric Symbols

For mathematical typography that requires distinct styles, Unicode provides styled variants in the Mathematical Alphanumeric Symbols block (U+1D400–U+1D7FF):

Style	Example	Range (lowercase α–ω)
Bold	𝛂 𝛃 𝛄	U+1D6C2–U+1D6DA
Italic	𝛼 𝛽 𝛾	U+1D6FC–U+1D714
Bold Italic	𝜶 𝜷 𝜸	U+1D736–U+1D74E

These are used in formal mathematical typesetting to distinguish between different uses of the same letter.
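
Styled letters are distinct code points, so a plain-text search for α does not find 𝛂. One possible approach, sketched here, is to fold them to ordinary Greek letters with NFKC before indexing or comparing; `fold_math_greek` is a hypothetical helper name.

```python
import unicodedata


def fold_math_greek(text: str) -> str:
    """Fold styled mathematical letters to plain letters via NFKC."""
    return unicodedata.normalize("NFKC", text)


styled = "\U0001D6C2 + \U0001D6FD"  # bold alpha + italic beta
print(fold_math_greek(styled))      # α + β
print("\u03B1" in styled)                   # False
print("\u03B1" in fold_math_greek(styled))  # True
```

NFKC also folds many other compatibility characters, such as superscript digits and ligatures, so this is better suited to building search keys than to rewriting stored text.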

Confusable Characters

Greek letters are a major source of homoglyph attacks because many look identical to Latin letters:

Greek	Latin	Identical?
Α (U+0391)	A (U+0041)	Visually identical
Β (U+0392)	B (U+0042)	Visually identical
Ε (U+0395)	E (U+0045)	Visually identical
Η (U+0397)	H (U+0048)	Visually identical
Ι (U+0399)	I (U+0049)	Visually identical
Κ (U+039A)	K (U+004B)	Visually identical
Μ (U+039C)	M (U+004D)	Visually identical
Ν (U+039D)	N (U+004E)	Visually identical
Ο (U+039F)	O (U+004F)	Visually identical
Ρ (U+03A1)	P (U+0050)	Visually identical
Τ (U+03A4)	T (U+0054)	Visually identical
Χ (U+03A7)	X (U+0058)	Visually identical
ο (U+03BF)	o (U+006F)	Visually identical
ν (U+03BD)	v (U+0076)	Very similar

This is why the Unicode Consortium publishes the confusables.txt data file, and why browsers and domain registries restrict mixing Greek and Latin characters within a single label of an internationalized domain name (IDN).

# Detecting mixed scripts (potential homoglyph attack)
def get_script(char: str) -> str:
    # Simplified range check; production code should use the Unicode Script property
    cp = ord(char)
    if 0x0370 <= cp <= 0x03FF or 0x1F00 <= cp <= 0x1FFF:
        return "Greek"
    elif 0x0041 <= cp <= 0x024F:
        return "Latin"
    return "Other"

text = "\u0391pple"  # Greek Alpha + "pple"
# Sort the set so the printed order is deterministic
scripts = sorted({get_script(c) for c in text if c.isalpha()})
if len(scripts) > 1:
    print(f"Mixed scripts detected: {scripts}")
    # Mixed scripts detected: ['Greek', 'Latin']
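
Script mixing is one signal. A complementary check is to map lookalikes to a common form and compare the results, which is the idea behind the confusables data. The sketch below uses only the fourteen pairs from the table in this section, not the full confusables.txt data set; `GREEK_TO_LATIN`, `skeleton`, and `looks_like` are hypothetical names.

```python
# Greek-to-Latin lookalikes copied from the table in this section.
GREEK_TO_LATIN = str.maketrans(
    "\u0391\u0392\u0395\u0397\u0399\u039A\u039C"
    "\u039D\u039F\u03A1\u03A4\u03A7\u03BF\u03BD",
    "ABEHIKMNOPTXov",
)


def skeleton(text: str) -> str:
    """Replace the listed Greek lookalikes with their Latin counterparts."""
    return text.translate(GREEK_TO_LATIN)


def looks_like(a: str, b: str) -> bool:
    """True when two different strings collapse to the same skeleton."""
    return a != b and skeleton(a) == skeleton(b)


print(looks_like("\u0391pple", "Apple"))  # True: Greek Alpha imitates Latin A
print(looks_like("Apple", "Apple"))       # False: identical strings
print(looks_like("Apple", "Maple"))       # False: different skeletons
```

A real deployment would load the complete mapping rather than a hand-copied subset, since many other scripts contain Latin lookalikes as well.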

Working with Greek Text in Code

Python

import unicodedata

# Modern Greek (monotonic)
text = "\u039A\u03B1\u03BB\u03B7\u03BC\u03AD\u03C1\u03B1"  # Καλημέρα (Good morning)
print(text.upper())  # ΚΑΛΗΜΈΡΑ (Python keeps the tonos when uppercasing)
print(text.lower())  # καλημέρα

# Print the code point and Unicode name of each character
for ch in text:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# Ancient Greek (polytonic)
ancient = "\u1F08\u03BD\u03B8\u03C1\u03CE\u03C0\u03BF\u03C5"  # Ἀνθρώπου
print(unicodedata.name(ancient[0]))  # GREEK CAPITAL LETTER ALPHA WITH PSILI

JavaScript

// Greek regex matching
const greekPattern = /\p{Script=Greek}/u;
const text = "\u039A\u03B1\u03BB\u03B7\u03BC\u03AD\u03C1\u03B1";
console.log(greekPattern.test(text)); // true

// Convert polytonic to monotonic (approximate)
const polytonic = "\u1F08\u03BD\u03B8\u03C1\u03CE\u03C0\u03BF\u03C5"; // Ἀνθρώπου
const monotonic = polytonic
  .normalize("NFD")
  .replace(/[\u0300\u0342]/g, "\u0301") // varia and perispomeni become the single accent
  .replace(/[\u0313\u0314\u0345]/g, "") // drop breathings and iota subscript
  .normalize("NFC");
console.log(monotonic); // Ανθρώπου

Summary

Greek is far more than a modern language script — it is a cornerstone of global scientific and mathematical notation, the ancestor of Latin and Cyrillic, and a writing system with nearly three millennia of continuous history. Key takeaways for developers:

Greek and Coptic share a Unicode block but are separate scripts: write Coptic text with the dedicated Coptic block (U+2C80–U+2CFF) plus the Demotic-derived letters at U+03E2–U+03EF
Final sigma (ς, U+03C2) must be handled correctly in case conversion and text processing
Polytonic Greek uses the Greek Extended block (U+1F00–U+1FFF) with complex combinations of breathing marks and accents
Greek–Latin confusables are a security concern for domain names, usernames, and any mixed-script context
Mathematical Greek uses standard Greek code points in plain text; use the Mathematical Alphanumeric Symbols block only for styled variants
Normalize polytonic text carefully — NFC and NFD produce different code point sequences that must be handled consistently