Phylogenetics in Linguistics: Why and How?

Phylogenetic analyses play a key role in comparative linguistics. They not only reveal how different languages are related, but also make it possible to test hypotheses about the ages of language families. A typical data set contains grammatical features such as "has productive plural marking on nouns", or records of which cognates occur in each language. Each language is then encoded as a binary string in which every position marks the presence or absence of one such feature. From this encoding, a tree showing the evolutionary relationships can be reconstructed.
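The binary encoding described above can be sketched in a few lines of Python. This is only an illustration: the language names, the feature list, and the helper `encode` are hypothetical and not taken from any real data set or from Babel.

```python
# Hypothetical toy data: feature names and languages are invented for illustration.
FEATURES = ["plural_marking_on_nouns", "definite_article", "cognate_set_17"]

# For each language, the set of features that are present.
observations = {
    "LanguageA": {"plural_marking_on_nouns", "definite_article"},
    "LanguageB": {"plural_marking_on_nouns"},
    "LanguageC": {"cognate_set_17"},
}


def encode(present, features=FEATURES):
    """Return a string with '1' where a feature is present and '0' where it is absent."""
    return "".join("1" if feature in present else "0" for feature in features)


# One binary string per language; all strings have the same length.
alignment = {language: encode(present) for language, present in observations.items()}

for language, sequence in alignment.items():
    print(language, sequence)
```

Keeping the feature order fixed matters here, because position i must refer to the same feature in every language for the strings to be comparable.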

Model parameters and the phylogenetic tree are estimated jointly, so that the uncertainty in both is taken into account. Based on the data set and on prior information, for example about the diversification process or the timing of certain events, a sample from the posterior distribution is generated. This sample can be summarized, for instance, as a maximum-clade-credibility tree. The computation can be done with the software tool BEAST2 and its package Babel, which includes models and templates made specifically for linguistic purposes. A detailed tutorial on how to create a language phylogeny with BEAST2 and Babel can be found on the teaching platform Taming the BEAST.
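To make the idea of a maximum-clade-credibility summary concrete, here is a minimal sketch under simplifying assumptions. Each sampled tree is reduced to its set of non-trivial clades, a clade's credibility is the fraction of sampled trees containing it, and the summary is the sampled tree with the highest product of clade credibilities. The toy sample and the function names are hypothetical; this does not reproduce the BEAST2 implementation and ignores branch lengths and node ages.

```python
import math
from collections import Counter

# Hypothetical toy posterior sample over four languages A, B, C, D.
# Each tree is represented only by its non-trivial clades (sets of tips).
tree_1 = frozenset({frozenset({"A", "B"}), frozenset({"A", "B", "C"})})
tree_2 = frozenset({frozenset({"A", "B"}), frozenset({"C", "D"})})
tree_3 = frozenset({frozenset({"A", "C"}), frozenset({"A", "C", "D"})})
posterior_sample = [tree_1, tree_1, tree_2, tree_3]


def clade_credibilities(trees):
    """Fraction of sampled trees in which each clade occurs."""
    counts = Counter(clade for tree in trees for clade in tree)
    return {clade: count / len(trees) for clade, count in counts.items()}


def mcc_tree(trees):
    """Sampled tree with the highest product of clade credibilities."""
    credibility = clade_credibilities(trees)
    # Summing logs is equivalent to multiplying credibilities and avoids underflow.
    return max(trees, key=lambda tree: sum(math.log(credibility[c]) for c in tree))


summary = mcc_tree(posterior_sample)
print(sorted(sorted(clade) for clade in summary))
```

In this toy sample the clade {A, B} appears in three of the four trees and {A, B, C} in two, so the first topology receives the highest score.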