Armenian Script
The Armenian alphabet was created in 405 CE by the monk Mesrop Mashtots and has remained largely unchanged for over 1,600 years, used primarily to write the Armenian language. This guide explores the Armenian Unicode block, the script's structure including its capital and lowercase forms, and the history of this ancient alphabet.
Armenian is one of the few scripts in the world whose invention is documented in historical records with a named creator and a precise date. In 405 CE, the scholar and monk Mesrop Mashtots created the Armenian alphabet to translate the Bible into the Armenian language, giving an entire nation a written identity that has endured for over 1,600 years. Today the Armenian script is encoded in Unicode as a distinct block and is used by roughly 6 million speakers in Armenia and in diaspora communities worldwide, as well as in scholarly contexts. This guide explores the history, structure, and Unicode encoding of Armenian, along with practical guidance for developers working with Armenian text.
The Invention of the Armenian Alphabet
Mesrop Mashtots and the Year 405 CE
Before 405 CE, Armenian was primarily a spoken language. Administrative and religious texts in Armenia were written in Greek or Syriac. King Vramshapuh and Catholicos Sahak Partev tasked Mesrop Mashtots, a soldier-turned-monk and scholar, with creating a writing system for Armenian. According to the historian Koryun (a student of Mashtots), the alphabet was divinely inspired and completed around 405 CE.
Mashtots designed an alphabet with 36 original letters that mapped to the sounds of 5th-century Armenian with remarkable phonetic precision. Two additional letters were added in the Middle Ages, bringing the total to 38 letters used in Modern Armenian.
Why a New Script?
Several factors motivated the creation:
| Factor | Explanation |
|---|---|
| Religious independence | A native script enabled direct Bible translation, reducing dependence on Greek/Syriac clergy |
| National identity | A unique alphabet became a powerful symbol of Armenian cultural distinctness |
| Phonetic precision | Existing scripts (Greek, Syriac) could not accurately represent Armenian sounds |
| Literacy expansion | A script tailored to the language made education more accessible |
The first sentence written in the new alphabet is traditionally cited as the opening of Proverbs 1:2:
Ճանաչել զիմաստութիւն եւ զխրատ... (To know wisdom and instruction...)
Structure of the Armenian Alphabet
The 38 Letters
Armenian has 38 letters: 36 original plus two medieval additions (Օ and Ֆ). The alphabet has distinct uppercase and lowercase forms, a feature it shares with Latin, Greek, and a few other scripts.
| Range | Letters | Count |
|---|---|---|
| Uppercase Ա–Ֆ | Ա Բ Գ Դ Ե Զ Է Ը Թ Ժ Ի Լ Խ Ծ Կ Հ Ձ Ղ Ճ Մ Յ Ն Շ Ո Չ Պ Ջ Ռ Ս Վ Տ Ր Ց Ւ Փ Ք Օ Ֆ | 38 |
| Lowercase ա–ֆ | ա բ գ դ ե զ է ը թ ժ ի լ խ ծ կ հ ձ ղ ճ մ յ ն շ ո չ պ ջ ռ ս վ տ ր ց ւ փ ք օ ֆ | 38 |
The alphabet is ordered, and this order has been stable since Mashtots's time. Each of the 36 original letters also has a numerical value (Ա = 1, Բ = 2, ... up to Ք = 9000), following the same pattern as Greek and Hebrew alphabetic numerals. The medieval additions Օ and Ֆ have no traditional numeric value.
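The numeral scheme can be sketched in a few lines of Python. This is a minimal illustration, assuming the traditional assignment in which the 36 original letters (Ա through Ք) cover 1 to 9, 10 to 90, 100 to 900, and 1000 to 9000 in alphabetical order. The names `letter_value` and `numeral_to_int` are hypothetical helpers, not part of any library.

```python
# The 36 original letters, Ա (U+0531) through Ք (U+0554), in alphabetical order.
ORIGINAL_LETTERS = [chr(c) for c in range(0x0531, 0x0555)]


def letter_value(letter: str) -> int:
    """Return the traditional numeric value of an original Armenian letter."""
    index = ORIGINAL_LETTERS.index(letter.upper())
    # Each run of nine letters covers one decimal place:
    # units, tens, hundreds, thousands.
    return (index % 9 + 1) * 10 ** (index // 9)


def numeral_to_int(numeral: str) -> int:
    """Decode an additive Armenian numeral by summing its letter values."""
    return sum(letter_value(ch) for ch in numeral)


print(letter_value("Ա"))       # 1
print(letter_value("Ք"))       # 9000
print(numeral_to_int("ՌՋՂԸ"))  # 1000 + 900 + 90 + 8 = 1998
```

Passing Օ, Ֆ, or a non-Armenian character raises `ValueError` in this sketch, which mirrors the fact that those letters fall outside the traditional numeral system.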
Vowels and Consonants
Modern Eastern Armenian has six vowel sounds, written with seven vowel letters (Ա, Ե, Է, Ը, Ի, Ո, Օ) plus the digraph ՈՒ for /u/. The letter Ո represents /vo/ at the beginning of words and /o/ elsewhere. The letter Ւ, now mostly written as part of the digraph ՈՒ, functions as a semi-vowel.
| Category | Letters | Count |
|---|---|---|
| Vowels | Ա Ե Է Ը Ի Ո Օ | 7 |
| Semi-vowels | Յ Ւ | 2 |
| Consonants | Remaining 29 | 29 |
Eastern vs. Western Armenian
Two major dialects exist, and they differ in how certain consonant pairs are pronounced:
| Letter | Eastern Armenian | Western Armenian |
|---|---|---|
| Բ (ben) | /b/ | /p/ |
| Գ (gim) | /g/ | /k/ |
| Դ (da) | /d/ | /t/ |
| Պ (pe) | /p/ | /b/ |
| Կ (ken) | /k/ | /g/ |
| Տ (tiwn) | /t/ | /d/ |
This means the same Unicode text can be read with different pronunciations depending on the reader's dialect — a purely linguistic distinction with no impact on encoding.
Armenian in Unicode
The Armenian Block (U+0530–U+058F)
The Armenian block occupies 96 code points:
| Range | Content |
|---|---|
| U+0531–U+0556 | Uppercase letters (38) |
| U+0559 | Armenian modifier letter left half ring |
| U+055A–U+055F | Armenian punctuation (6 marks) |
| U+0560 | Small letter turned ayb (phonetic use; added in Unicode 11.0) |
| U+0561–U+0586 | Lowercase letters (38) |
| U+0587 | Ligature ech yiwn և |
| U+0588 | Small letter yi with stroke (phonetic use; added in Unicode 11.0) |
| U+0589 | Armenian full stop ։ |
| U+058A | Armenian hyphen ֊ |
| U+058D–U+058E | Right-facing and left-facing Armenian eternity signs |
| U+058F | Armenian dram sign ֏ |
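To see what a given runtime actually knows about the block, one can walk it with Python's `unicodedata` module. The following is a minimal sketch; the helper name `armenian_block_inventory` is hypothetical, and the exact set of assigned code points depends on the Unicode version bundled with the interpreter.

```python
import unicodedata
from collections import Counter

BLOCK_START, BLOCK_END = 0x0530, 0x058F  # Armenian block, inclusive


def armenian_block_inventory() -> Counter:
    """Count assigned code points in the Armenian block by general category."""
    counts = Counter()
    for cp in range(BLOCK_START, BLOCK_END + 1):
        ch = chr(cp)
        # unicodedata.name() returns the default for code points without a name,
        # which is how unassigned positions are skipped here.
        if unicodedata.name(ch, None) is not None:
            counts[unicodedata.category(ch)] += 1
    return counts


inventory = armenian_block_inventory()
print(inventory["Lu"])  # uppercase letters: 38
for category, count in sorted(inventory.items()):
    print(category, count)
```

Only the uppercase count is annotated because it has been stable across Unicode versions; the lowercase and punctuation totals vary with the interpreter's Unicode data.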
Armenian Punctuation
Armenian uses several unique punctuation marks that are not interchangeable with Latin punctuation:
| Character | Code Point | Name | Function |
|---|---|---|---|
| ։ | U+0589 | Armenian full stop | Sentence-ending period |
| ՝ | U+055D | Armenian comma | Comma equivalent |
| ՜ | U+055C | Armenian exclamation mark | Placed over the stressed vowel |
| ՞ | U+055E | Armenian question mark | Placed over the last vowel of the question word |
| ՛ | U+055B | Armenian emphasis mark | Marks emphasis |
| « » | U+00AB, U+00BB | Guillemets | Used as quotation marks |
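Note that the Armenian exclamation mark and question mark are placed above a vowel within the word, not at the end of the sentence, which is a significant difference from Latin punctuation. In encoded text each mark is a separate character that directly follows the vowel it belongs to.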
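Because these marks sit inside the word in the character stream, a plain substring search for a word can miss its marked form. One possible workaround, sketched here in Python, strips the intonation marks before comparison. The function name `strip_intonation_marks` and the choice of which marks to remove are assumptions made for illustration, not an established convention.

```python
# Marks that attach to a vowel inside a word rather than ending a sentence.
INTONATION_MARKS = {
    "\u055B",  # ՛ ARMENIAN EMPHASIS MARK
    "\u055C",  # ՜ ARMENIAN EXCLAMATION MARK
    "\u055E",  # ՞ ARMENIAN QUESTION MARK
}


def strip_intonation_marks(text: str) -> str:
    """Remove word-internal intonation marks so words compare equal."""
    return "".join(ch for ch in text if ch not in INTONATION_MARKS)


question = "Ինչպե՞ս ես"  # "How are you?"
print(strip_intonation_marks(question))              # Ինչպես ես
print("Ինչպես" in question)                          # False: the mark splits the word
print("Ինչպես" in strip_intonation_marks(question))  # True
```

The stripped form is only suitable for matching and indexing; the original text should be kept for display, since the marks carry meaning.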
Armenian Ligature
U+0587 (և) is the ech yiwn ligature, which combines the letters ե (ech) and ւ (yiwn). In modern Armenian orthography this combination is extremely common (it represents the /ev/ sound) and is usually rendered as a single glyph. Unicode provides it as a dedicated code point with a compatibility decomposition, so text can also spell it as the two-character sequence U+0565 U+0582.
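Since the ligature and the two-letter spelling represent the same text, comparisons are more robust after normalization. The sketch below uses Python's standard `unicodedata.normalize` with the NFKC form, which applies the compatibility decomposition of U+0587; the variable names are illustrative.

```python
import unicodedata

ligature = "\u0587"        # և, ARMENIAN SMALL LIGATURE ECH YIWN
sequence = "\u0565\u0582"  # ե + ւ, the two-letter spelling

# The two spellings are different strings until normalized.
print(ligature == sequence)                                 # False
print(unicodedata.normalize("NFKC", ligature) == sequence)  # True

# Canonical normalization (NFC) leaves the ligature untouched.
print(unicodedata.normalize("NFC", ligature) == ligature)   # True

# Uppercasing expands the ligature to two capital letters (ԵՒ).
print(ligature.upper() == "\u0535\u0552")                   # True
```

Note that uppercasing changes the string length, so code that assumes one output character per input character can break on Armenian text containing the ligature.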
Working with Armenian Text in Code
Python
```python
import unicodedata

# Armenian alphabet, uppercase: U+0531 (Ա) through U+0556 (Ֆ)
armenian_upper = "".join(chr(c) for c in range(0x0531, 0x0557))
print(armenian_upper)
# ԱԲԳԴԵԶԷԸԹԺԻԼԽԾԿՀՁՂՃՄՅՆՇՈՉՊՋՌՍՎՏՐՑՒՓՔՕՖ

# Case conversion works correctly
text = "ՀԱՅԱՍՏԱՆ"
print(text.lower())  # հայաստան
print("armenian" in unicodedata.name(text[0]).lower())  # True


# Detect Armenian script
def is_armenian(ch):
    """Return True if the character's Unicode name contains ARMENIAN."""
    try:
        return "ARMENIAN" in unicodedata.name(ch)
    except ValueError:
        # Unassigned or unnamed code points have no name
        return False
```
JavaScript
```javascript
// Regex for Armenian characters (Armenian block, U+0530–U+058F)
const armenianPattern = /[\u0530-\u058F]/;

// Test if string contains Armenian
function containsArmenian(text) {
  return armenianPattern.test(text);
}

console.log(containsArmenian("Բարև")); // true
console.log(containsArmenian("Hello")); // false

// Unicode property escapes (ES2018+)
const armenianWord = /\p{Script=Armenian}+/u;
```
HTML/CSS
```html
<!-- Declare Armenian language -->
<html lang="hy">
<p lang="hy">Բարև...</p>

<!-- Recommended fonts -->
<style>
  :lang(hy) {
    font-family: "Noto Sans Armenian", "DejaVu Sans", sans-serif;
  }
</style>
```
Armenian Digital Typography
Font Support
Armenian is well-supported in major system fonts and the Google Noto family:
| Font Family | Platform | Quality |
|---|---|---|
| Noto Sans Armenian | Cross-platform | Excellent |
| Noto Serif Armenian | Cross-platform | Excellent |
| Mshtakan | macOS/iOS | Good |
| Sylfaen | Windows | Good |
| DejaVu Sans | Linux | Good |
Sorting (Collation)
Armenian alphabetical order follows the original sequence established by Mashtots. The Unicode Default Collation Element Table (DUCET) respects this order, so standard Unicode sorting produces correct Armenian alphabetical order without special locale tailoring.
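Full DUCET-based sorting requires a collation library, which Python's standard library does not include. As a rough approximation, the Armenian letters are encoded in alphabetical order, so comparing case-folded strings by code point yields alphabetical order for ordinary words. The sketch below assumes this simplification; `armenian_sort_key` is a hypothetical helper and does not implement the Unicode Collation Algorithm.

```python
def armenian_sort_key(word: str) -> str:
    """Approximate alphabetical ordering via case-folded code points.

    casefold() lowercases the text and expands the ligature և to ե + ւ,
    so it sorts alongside the two-letter spelling.
    """
    return word.casefold()


cities = ["Վանաձոր", "Երևան", "Արարատ", "Գյումրի"]
print(sorted(cities, key=armenian_sort_key))
# ['Արարատ', 'Գյումրի', 'Երևան', 'Վանաձոր']
```

This key ignores punctuation handling and mixed-script input, so a proper collation implementation remains the safer choice for user-facing sorted lists.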
Key Takeaways
- Armenian is one of the few scripts with a documented inventor (Mesrop Mashtots, 405 CE) and was created specifically for Bible translation, giving the Armenian nation a written identity that has endured for over 1,600 years.
- The alphabet has 38 letters (36 original + 2 medieval additions) with distinct uppercase and lowercase forms, encoded in the Unicode Armenian block (U+0530–U+058F).
- Armenian punctuation marks (question mark, exclamation mark) are placed above vowels within words, not at sentence boundaries — developers must not substitute Latin marks.
- The ligature ech yiwn (U+0587) represents the extremely common /ev/ sound and exists both as a precomposed code point and as a two-character sequence (U+0565 + U+0582).
- Eastern and Western Armenian dialects pronounce several consonant pairs differently but use identical Unicode encoding — the distinction is purely phonological.
- Armenian text is well-supported in modern systems through the Noto font family and standard Unicode collation produces correct Armenian alphabetical order.