Category Archives: Language

The birthplace of English?

Turkish countryside near Hierapolis

I speak English (no surprise). I can also speak German, though more poorly than when I was in college. And I can understand a handful of Spanish, Italian, and Chinese words. But I am completely hopeless when it comes to Russian, Hindi, Tagalog, and hundreds of other languages. It’s a shame, but I know I’m not alone.

Language differences are both a hurdle to common understanding and a window into cultural differences. Which is why linguists, sociologists, and archaeologists are so intent on shedding light on the question of where our languages originated. Research published today in the journal Science tries to trace the origin of the Indo-European language family, the largest in terms of native speakers and geographic distribution.

There are nearly 450 languages and dialects in the family, including English, German, the Romance languages, Hindi, Russian, and many others. If you haven’t studied them, most of these languages may seem like Greek to you.¹ But many of their words are surprisingly similar, even to the layperson. Take “three”, for example. In German, it’s drei, in Spanish it’s tres, in Greek it’s tria, and in Urdu (spoken in Pakistan and parts of India) it’s pronounced theen.

There are two prevailing theories about the origin of the Indo-European language family. The first proposes that the Kurgan people, a culture common to nomadic herders living on the steppe between and north of the Black and Caspian seas, first started spreading their language some 5,000 years ago. Recent archaeological and linguistic data provide evidence for this hypothesis. The other theory says farmers in Anatolia, or present-day Turkey, spread their tongue along with their farming techniques about 8,000 to 9,000 years ago.

It’s the nomad hypothesis versus the farmer hypothesis. Under the nomadic theory, it’s easy to imagine a more violent expansion of the culture and language, though the Kurgan diffusion may have been peaceful, too. The Anatolian expansion, though, was almost certainly more peaceful, with the language following the adoption of technologically advanced farming techniques.

Both sides have their staunch supporters in academia, but today’s Science paper gives new ammunition to Anatolian advocates. Quentin Atkinson, a senior lecturer at the University of Auckland in New Zealand, and his co-authors started by borrowing from the toolkits of epidemiologists and conservation biologists, using computer models that were first developed to trace the origin of diseases and the geographic ancestry of different species. They fed the models geographic and linguistic data from 103 ancient and modern-day Indo-European languages. Their results supported the Anatolian hypothesis regardless of which model variables they tuned.

Indo-European language origin map

Cladogram and map of the diversification of various Indo-European languages. From Bouckaert et al. 2012 (cited and linked below).

The spread of farming was probably a big driving force behind the geographic expansion of the Indo-European family, Atkinson and his colleagues said, but it wasn’t the only one. They point out that languages continued to spread and diversify well after agriculture was established in many areas, suggesting other factors. While their findings probably won’t settle the debate between the Kurgan and Anatolian camps any time soon, they do provide an intriguing look into the common ancestry shared by so many of our native tongues.

¹ Yes, Greek is an Indo-European language. Sorry for the bad pun.


Remco Bouckaert, Philippe Lemey, Michael Dunn, Simon J. Greenhill, Alexander V. Alekseyenko, Alexei J. Drummond, Russell D. Gray, Marc A. Suchard, and Quentin D. Atkinson. 2012. “Mapping the Origins and Expansion of the Indo-European Language Family”. Science 337:957-960. DOI: 10.1126/science.1219669

Photo by Ian W. Scott.


Southern regions nurtured languages

La Florida, by Abraham Ortelius

In the last few years, I’ve had the good fortune of befriending a pair of Italians. Before meeting them, I admit I knew relatively little about Italian culture apart from the typical American stereotypes. I grew up in an area with strong German roots, and the college I attended maintains close ties with Norway. Needless to say, I was not well acquainted with southern European cultures.

But thanks to my friends, that’s been changing. Among other things, I’ve been picking up bits of Italian, both the standard tongue and the Veneto dialect. Italy, I’ve learned, is a country defined by a common language which many Italians don’t speak at home. There doesn’t seem to be much agreement on the exact number of dialects, but estimates range from around a dozen to over 50.

That Italy has so many dialects shouldn’t surprise an astute student of history. The region was heavily balkanized prior to unification in the mid-1800s. But Italy’s dialectal diversity may also be the product of another quirk of geography. A study done in the mid-1990s by two British professors—an evolutionary anthropologist and an evolutionary biologist—revealed a distinct trend in the languages of North American native peoples at the time of European contact. More languages were spoken in southern latitudes and the range over which those languages were spoken was smaller. In other words, language density increased closer to the equator.

The scientists discovered this trend when analyzing the first comprehensive map of the world’s languages, Atlas of the World’s Languages, which was initially published in 1993.¹ Focusing on languages spoken by native peoples when Europeans first arrived, they counted the number of tongues that a line of latitude crossed as it ran east-west across the continent. Their survey started at 8 ˚N and ended at 70 ˚N, the furthest north an entire latitudinal span was inhabited by humans.
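
The counting procedure can be sketched in a few lines of Python. The language names and latitude ranges below are invented for illustration and are not taken from the atlas or the study. Treating each language's territory as a simple south-to-north interval is also a simplifying assumption.

```python
# Hypothetical language ranges as (southern limit, northern limit) in degrees north.
# These values are made up for illustration only.
language_ranges = {
    "A": (8, 12),
    "B": (10, 15),
    "C": (14, 30),
    "D": (35, 45),
    "E": (38, 42),
    "F": (50, 70),
}


def languages_crossed(latitude, ranges):
    """Count the languages whose north-south range includes the given parallel."""
    return sum(1 for south, north in ranges.values() if south <= latitude <= north)


def latitudinal_extent(ranges):
    """Return how many whole-degree parallels each language's range intersects."""
    return {name: north - south + 1 for name, (south, north) in ranges.items()}


# Tally languages along each whole-degree parallel from 8 to 70 degrees north.
counts = {lat: languages_crossed(lat, language_ranges) for lat in range(8, 71)}
extents = latitudinal_extent(language_ranges)

for lat in (10, 40, 60):
    print(f"{lat} N: {counts[lat]} language(s)")
```

The first tally corresponds to the number of languages along each parallel, and the second to the number of parallels each language intersects. With real range data, both could be compared against latitude.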

Upon tallying their results, a few things stood out. First, the number of languages peaked at 40 ˚N—the parallel that runs approximately through Philadelphia, Denver, and Reno.² Perhaps coincidentally—or perhaps not—this northing is also where the number of mammal species peaks in North America.³ They also discovered the number of languages per square kilometer rises exponentially as you head south. Further, the number of parallels each language intersected increased as they moved north, a function of both language density and the non-overlapping nature of native peoples’ languages at the time. Finally, the number of languages increased with habitat diversity.

The authors speculate that greater habitat diversity at southern latitudes was responsible in part for the greater density of languages. More habitat diversity tends to increase resource abundance, which would allow smaller groups of people to survive in those areas. After groups divided or a new group formed, cultural or geographic barriers may have fostered linguistic diversification.

With the advent of global communications networks, many languages and dialects are slowly dying out. That’s partially driven by the need to communicate with ever more people in ever more places. But what’s pushing in that direction? One answer could be the world’s population. Earth is a planet of finite resources, and perhaps efficient use requires more interaction. People learned long ago that we need to cooperate to survive. Language is an amazingly efficient vehicle for that. Today, the need to cooperate, and to communicate, is greater than ever.

¹ I’d love to get my hands on it, but it sells for over $700. Time to hit the library.

² The top of Italy’s boot heel is at about 40 ˚N. That’s not to imply any correlation, just to provide a frame of reference.

³ Expect more on the species-latitude relationship in a later post.


Mace, R., & Pagel, M. (1995). A Latitudinal Gradient in the Density of Human Languages in North America. Proceedings of the Royal Society B: Biological Sciences, 261 (1360), 117-121. DOI: 10.1098/rspb.1995.0125

Map scanned by Norman B. Leventhal Map Center at the BPL.


Which reads faster, Chinese or English?

Guangfu Rd., Jiali District, Tainan County, Taiwan

If there’s one thing that can dazzle my Western eyes, it’s the main drag of any Taiwanese town. On my recent trip to Taiwan, I saw billboards and signs for local shops that dripped from buildings with so many hues Benjamin Moore would blush. Once my mind had adjusted to the mishmash of colors, I noticed the Chinese characters, or rather their number. On each sign, there were strikingly few.

Compared with English, Chinese is a dense language. Its complex characters can convey considerable information in a very small amount of space or, where space isn’t a concern, convey that information more boldly. Given Chinese’s compact written form, I wondered how language density affected the speed at which people read.

My intuition told me that of two native speakers—one Chinese, one English—the Chinese speaker could zip through an equivalent passage in less time because each character says more. But information density can also work against a reader. Chinese’s trade-off is its complexity, both in terms of the immense number of characters—tens of thousands according to some dictionaries, though only about 4,600 are commonly used today—and the fact that nearly all of them are more baroque than any letter in the alphabet. This means someone reading Chinese must dig into the structure of each character to decipher its meaning.

Chinese characters aren’t all unique, though. Similar to English words, there are some repeating themes among them. Each character, or hanzi, consists of strokes and radicals. Strokes are single lines or curves, of which there are about 20. Radicals are constructed from several strokes, and there are about 200 of them. Characters are built by varying the presence and number of strokes and radicals. This has its advantages: proficient readers can decipher both the meaning and pronunciation of an unfamiliar character by deconstructing it. While some characters constitute an entire word, other words consist of multiple characters strung together, much like words in English. Still, Chinese words tend to be short on average: only 1.5 characters per word, compared with 5.1 letters per word for English.

Dragon, in Traditional Chinese and English

So which is more quickly read, English or Chinese? Chinese’s high information density could work for it (more complexity could impart more meaning per glance) or against it (each character could require a longer stare to decipher). The answer is neither.

English and Chinese are, by and large, read at the same speeds. In one study, both languages were read at approximately the same rate—English at 382 words per minute and Chinese at the equivalent of 386 words per minute. A statistical tie. Another study found the percentage of times a person moves backward in a text—a sign the person is having trouble processing the words—to be about the same for English and Chinese.

What simple statistics on reading speed don’t convey is how dramatically different the experience of reading is for each language. When reading English, our eyes take in 7–8 letters at a time, whereas with Chinese we take in only 2.6 characters at once. The eyes cover this span with each jump, known as a saccade, and the span divided by how long we fixate on it gives reading speed. Since readers of English and Chinese tend to fixate for about the same amount of time, naïve arithmetic would lead you to believe that Chinese is read more slowly. After all, a reader of Chinese processes fewer characters per fixation than an English reader processes letters. But that’s only if you ignore information density. Written Chinese is dense, so though characters are taken in more slowly than letters, meaning is conveyed at the same rate as in English.
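
A rough back-of-the-envelope version of that arithmetic, in Python, shows how the gap narrows once symbols are converted to words. The letters-per-glance and characters-per-word figures are the ones quoted in this post. The fixation time of 0.25 seconds is an assumed placeholder, not a value from the cited studies, and the sketch ignores spaces, punctuation, and backward eye movements.

```python
# Figures quoted in this post.
ENGLISH_LETTERS_PER_FIXATION = 7.5  # midpoint of the 7-8 letters quoted above
CHINESE_CHARS_PER_FIXATION = 2.6
ENGLISH_LETTERS_PER_WORD = 5.1
CHINESE_CHARS_PER_WORD = 1.5

# Assumption: a placeholder fixation time, identical for both languages.
FIXATION_SECONDS = 0.25


def words_per_minute(units_per_fixation, units_per_word, fixation_seconds):
    """Convert symbols taken in per fixation into an approximate words-per-minute rate."""
    words_per_fixation = units_per_fixation / units_per_word
    fixations_per_minute = 60 / fixation_seconds
    return words_per_fixation * fixations_per_minute


english_wpm = words_per_minute(
    ENGLISH_LETTERS_PER_FIXATION, ENGLISH_LETTERS_PER_WORD, FIXATION_SECONDS
)
chinese_wpm = words_per_minute(
    CHINESE_CHARS_PER_FIXATION, CHINESE_CHARS_PER_WORD, FIXATION_SECONDS
)

# Ratio of raw symbols per fixation versus ratio of words per minute.
symbol_ratio = ENGLISH_LETTERS_PER_FIXATION / CHINESE_CHARS_PER_FIXATION
word_ratio = chinese_wpm / english_wpm

print(f"English: about {english_wpm:.0f} words per minute")
print(f"Chinese: about {chinese_wpm:.0f} word-equivalents per minute")
print(f"Symbols per fixation, English vs. Chinese: {symbol_ratio:.2f}x")
print(f"Words per minute, Chinese vs. English: {word_ratio:.2f}x")
```

With these inputs, a nearly threefold gap in raw symbols per fixation shrinks to under 20 percent once symbols are converted to words. The absolute rates depend entirely on the assumed fixation time and should not be read as measurements.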

This jibes with the gist of a recent study on spoken language speed, which found that while some languages like Spanish sound faster than others, the amount of information imparted is the same. That’s because each syllable in a fast-sounding language like Spanish carries less meaning than a syllable in a slower-sounding one like English or Chinese. Spanish speakers have to run through more syllables to get the same point across, thus sounding faster.

Earlier linguists had suggested that Chinese might be faster to read because of a physiological quirk of our eyes: they thought the square shape of Chinese characters fit the most acute region of our retina (the fovea) better than long, string-like English words. But the authors of the first written language study I mentioned, the one that measured words read per minute, speculated that reading speed is instead limited by a cognitive bottleneck. The fact that both reading and speaking seem to follow the same rules suggests they were right. Cognition, not language, appears to control the rate at which we communicate.


Sun, F., Morita, M., & Stark, L. W. (1985). Comparative patterns of reading eye movement in Chinese and English. Perception & Psychophysics, 37 (6), 502-506. PMID: 4059005

Sun, F., & Feng, D. (2010). Eye movements in reading Chinese and English text. In Reading Chinese Script: A cognitive analysis, Eds. Jian Wang, Albrecht W. Inhoff, Hsuan-Chih Chen, 189-205. ISBN: 9780805824780

Yan, G., Tian, H., Bai, X., & Rayner, K. (2006). The effect of word and character frequency on the eye movements of Chinese readers. British Journal of Psychology, 97 (2), 259-268. DOI: 10.1348/000712605X70066

Photo by Tim De Chant.
