Structure of Transeurasian language family revealed by computational linguistic methods

Researchers applied Bayesian phylolinguistic methods to the Transeurasian language family for the first time to address the long-standing debate about its internal structure.

A new study by researchers at the Max Planck Institute for the Science of Human History combines classical historical-comparative linguistics and computational Bayesian phylogenetic methods to reveal the internal structure of the Transeurasian language family. The study, published in the Journal of Language Evolution, is the first to apply Bayesian methods to this long-debated language family.

The term Transeurasian refers to a proposed family of languages theorized to have evolved from a single ancestral language in Asia. It comprises up to five existing language families: Turkic, Mongolic, Tungusic, Koreanic and Japonic. There is long-standing debate as to whether these language families did indeed evolve from a single ancestor, and even among those who support the Transeurasian hypothesis, opinions differ on where the language families split and which are most closely related. The researchers used two approaches to determine the most likely internal structure for the Transeurasian languages: they created proto-Transeurasian reconstructions using contemporary and historical lexical data, and then applied Bayesian phylogenetic methods to infer the internal structure that best fit the data.

The researchers found that the best-fitting structure groups Japonic with Koreanic, and Tungusic with Mongolic and Turkic. Within the latter group, Tungusic branches off first, leaving Mongolic and Turkic as each other's closest relatives. “We get a strong historical signal with 98.3% support for Japano-Koreanic, 90.3% for Tungusic-Mongolic-Turkic, and 100% for Mongolic-Turkic,” states Remco Bouckaert of the University of Auckland and the Max Planck Institute for the Science of Human History.
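
Support percentages of this kind are commonly read as the share of trees in a Bayesian posterior sample that contain a given grouping. The sketch below illustrates that calculation only. It assumes trees are encoded as nested tuples, the function names are hypothetical, and the four-tree sample is made up for illustration; it is not the study's data or software.

```python
from collections import Counter


def clades(tree):
    """Return (leaf set, set of clades) for a tree of nested tuples of names."""
    if isinstance(tree, str):
        # A single language is a leaf; it contributes no multi-member clade.
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found |= child_clades
    # Every internal node defines one clade: the set of leaves below it.
    found.add(leaves)
    return leaves, found


def clade_support(sample):
    """Fraction of sampled trees in which each clade appears."""
    counts = Counter()
    for tree in sample:
        _, found = clades(tree)
        counts.update(found)
    return {clade: n / len(sample) for clade, n in counts.items()}


# Made-up toy sample (NOT the study's posterior): three trees pair
# Mongolic with Turkic, one pairs Tungusic with Mongolic instead.
T1 = (("Japonic", "Koreanic"), ("Tungusic", ("Mongolic", "Turkic")))
T2 = (("Japonic", "Koreanic"), (("Tungusic", "Mongolic"), "Turkic"))
SAMPLE = [T1, T1, T1, T2]

support = clade_support(SAMPLE)
print(support[frozenset({"Mongolic", "Turkic"})])
print(support[frozenset({"Japonic", "Koreanic"})])
```

In this toy sample the Mongolic-Turkic grouping appears in three of four trees and the Japonic-Koreanic grouping in all four. A real analysis would draw thousands of trees from the posterior distribution.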

“For the first time in the history of linguistics, we integrated classical historical-comparative linguistics and computational Bayesian phylolinguistics to infer a phylogeny of the Transeurasian languages,” explains Martine Robbeets, head of the ERC-funded Eurasia3angle research group at the Max Planck Institute for the Science of Human History. “Our results solve a long-standing question about the exact shape of the Transeurasian tree by providing a quantitative basis to test various competing hypotheses.”
