Language Carbon Dating: The Science of Glottochronology

Coined from the Greek roots glōtta (tongue, language) and chronos (time), glottochronology is a method for estimating the time depth of two related languages, that is, how long ago they diverged from a common ancestor. It’s an idea that is both brilliantly simple and maddeningly complex, offering a tantalizing glimpse into a quantitative approach to language history.

What Is Glottochronology?

The central idea behind glottochronology was pioneered by American linguist Morris Swadesh in the 1950s. He started with a fundamental assumption inspired by radioactive decay: that the most basic, core part of a language’s vocabulary is replaced at a more or less constant rate over time. Just as Carbon-14 decays with a predictable half-life, Swadesh hypothesized that core words (like “I,” “water,” and “hand”) are lost and replaced at a stable, measurable pace across all languages and cultures.

If this assumption holds true, then by comparing the core vocabularies of two related languages and counting the number of words they still share from their common ancestor, we could theoretically calculate how much time has passed since they went their separate ways. It was a revolutionary proposal that promised to turn the relative timelines of historical linguistics (“Language A is older than Language B”) into absolute, numerical dates.

The Core of the Matter: The Swadesh List

The engine of glottochronology is the Swadesh list. Swadesh understood that not all words are created equal. Words for technology (“computer”), specific cultural items (“sushi”), or abstract ideas (“democracy”) are often borrowed from other languages or are recent inventions. To find a stable vocabulary, he sought out words for concepts thought to be universal and essential to human experience.

These words are less likely to be borrowed because every language already has a term for them. The original list contained 200 words, later refined to a more stable 100-word list. This list includes:

Pronouns (I, you, we)
Basic body parts (eye, nose, hand, tooth)
Natural elements (sun, moon, star, water, fire)
Essential verbs (to eat, to drink, to die, to see)
Simple adjectives (big, small, long)
Numerals (one, two)

The process involves comparing the Swadesh lists of two languages and identifying cognates, words that derive from the same ancestral word. For example, the English word “one” and the German “eins” are cognates, both stemming from the Proto-Germanic *ainaz. By contrast, English “dog” and German “Hund” are not cognates, even though they mean the same thing; the English cognate of “Hund” is “hound.”
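The counting step can be sketched in a few lines of Python. This is only an illustration: the four-item sample, the name SAMPLE_JUDGMENTS, and the function cognate_share are hypothetical, and the True/False cognate judgments are entered by hand, since deciding cognacy is the expert's job, not the code's.

```python
# Toy sample (hypothetical): concept -> (English, German, judged cognate?)
# The cognate judgments are hand-coded; the code only counts them.
SAMPLE_JUDGMENTS = {
    "one": ("one", "eins", True),
    "water": ("water", "Wasser", True),
    "hand": ("hand", "Hand", True),
    "dog": ("dog", "Hund", False),
}


def cognate_share(judgments):
    """Return the proportion of compared concepts judged to be cognate."""
    if not judgments:
        raise ValueError("need at least one compared concept")
    shared = sum(1 for _, _, is_cognate in judgments.values() if is_cognate)
    return shared / len(judgments)


c = cognate_share(SAMPLE_JUDGMENTS)
print(f"shared cognates: {c:.0%}")
```

A real comparison would run over the full 100- or 200-item list; four items are far too few to say anything about English and German.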

The Glottochronological Formula

Once you have the proportion of shared cognates (c), you can plug it into a formula to get the time of divergence (t). The basic formula Swadesh proposed is:

t = log(c) / (2 * log(r))

Let’s break that down:

t is the time of divergence, usually measured in millennia.
c is the proportion of core-list items that are cognate between the two languages, written as a decimal (e.g., 0.70 for 70%).
r is the proposed retention constant: the proportion of core words a language is assumed to retain over a thousand years. Swadesh estimated this at about 86% (or 0.86) for his 100-word list.
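The factor of 2 in the denominator follows from the decay model itself. Here is a short derivation, sketched under the assumption built into the method that both daughter languages lose core words independently and at the same rate r per millennium:

```latex
\begin{align*}
\text{share of the ancestral list kept by one language after } t \text{ millennia} &= r^{t} \\
c &= r^{t} \cdot r^{t} = r^{2t} \\
\log c &= 2t \log r \\
t &= \frac{\log c}{2 \log r}
\end{align*}
```

Because the result is a ratio of two logarithms, the base of the logarithm does not matter.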

For instance, if English and German share roughly 60% of their core vocabulary as cognates, the formula with r = 0.86 gives t ≈ 1.7, or about 1,700 years ago, a date broadly in line with historical estimates for the breakup of West Germanic languages.
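The formula is easy to try out in code. The following is a minimal Python sketch, not a reference implementation: the function name divergence_time and its input checks are assumptions made for illustration, and the default r = 0.86 is simply the 100-word-list figure quoted above.

```python
import math


def divergence_time(c, r=0.86):
    """Estimate divergence time in millennia from cognate share c.

    c: proportion of shared cognates, 0 < c <= 1
    r: assumed retention rate per millennium, 0 < r < 1
    """
    if not 0.0 < c <= 1.0:
        raise ValueError("c must be in (0, 1]")
    if not 0.0 < r < 1.0:
        raise ValueError("r must be in (0, 1)")
    # t = log(c) / (2 * log(r)); the log base cancels in the ratio
    return math.log(c) / (2.0 * math.log(r))


# Example input: the 70% cognate share used in the definition of c
t = divergence_time(0.70)
print(f"{t:.2f} millennia")
```

With a 70% cognate share and r = 0.86, this comes out to roughly 1.2 millennia.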

The Storm of Controversy: Why Linguists Argue

While the theory is elegant, its core assumptions came under immediate and intense scrutiny. The analogy to radioactive decay, it turns out, is deeply flawed. The linguistic community raised several major objections that continue to fuel debate today.

The Rate of Change Isn’t Constant: This is the most damaging criticism. Unlike atoms, humans and their languages are subject to social and historical pressures. Factors like intense cultural contact (e.g., the Norman conquest of England, which flooded English with French words), nationalistic purism, literary traditions, and social taboos can dramatically speed up or slow down the rate of vocabulary replacement. Icelandic, due to its geographic isolation and strong literary tradition, has changed far less in 1,000 years than English has.
The Swadesh List Isn’t Truly Universal or Stable: While the concepts are mostly universal, the words themselves are not immune to borrowing or change. For instance, “mountain” names a core concept, but the English word was borrowed from Old French, replacing the native Old English word beorg (related to the modern “barrow”). Similarly, a word like “liver” might be replaced due to a taboo on mentioning internal organs.
Identifying Cognates Is Subjective: Determining whether two words are true cognates or simply chance resemblances can be difficult and requires expert historical linguistic analysis. A non-expert might think English “much” and Spanish “mucho” are cognates, but they have unrelated origins: “much” descends from Old English micel, while “mucho” comes from Latin multus. This subjectivity introduces potential errors into the initial data.
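The first objection is easy to see numerically. The sketch below holds the cognate share fixed and varies only the retention rate. The alternative rates 0.80 and 0.90 are hypothetical values chosen for illustration, not measured figures, and divergence_time is an illustrative helper name.

```python
import math


def divergence_time(c, r):
    """Swadesh-style estimate in millennia: log(c) / (2 * log(r))."""
    return math.log(c) / (2.0 * math.log(r))


C_SHARED = 0.70  # same cognate share in every run

# Hypothetical retention rates around Swadesh's 0.86
for r in (0.80, 0.86, 0.90):
    years = 1000 * divergence_time(C_SHARED, r)
    print(f"r = {r:.2f} -> about {years:.0f} years")
```

The same 70% cognate share yields anything from roughly 800 to roughly 1,700 years, depending on which rate is assumed. If real languages differ in their retention rates by even this much, the resulting date is only as good as the guess for r.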

The Verdict: A Flawed Tool or a Useful Heuristic?

Today, classic glottochronology as a precise dating method is largely rejected by mainstream historical linguists. The assumption of a constant rate of change has proven unreliable. No serious linguist would publish a divergence date calculated with the original Swadesh formula and present it as fact.

However, the legacy of glottochronology is not one of complete failure. Its revolutionary premise forced the field to engage with quantitative methods and statistical analysis. It highlighted the important concept that some parts of a language are more stable than others, a principle that remains fundamental to historical reconstruction.

Modern approaches, often using sophisticated Bayesian statistical models, are direct descendants of Swadesh’s work. These newer methods don’t assume a single, constant rate of change. Instead, they allow the rate to vary across different branches of a language family, and they can integrate linguistic data with archaeological and genetic evidence. They are far more powerful and nuanced, but they owe a conceptual debt to the simple, bold idea of a “word clock.”

In the end, it’s best to think of glottochronology not as a precision instrument on par with radiocarbon dating, but as a rough, weathered sundial. You wouldn’t use it to set your watch, but on a clear day, it can still give you a fascinating, if approximate, idea of the time.