Analyzing Genetic Connections between Languages by Matching Consonant Classes

Peter Turchin (University of Connecticut, Peter.Turchin@Uconn.edu); Ilia Peiros (Santa Fe Institute (New Mexico, USA), ilia@santafe.edu); Murray Gell-Mann (Santa Fe Institute (New Mexico, USA), mgm@santafe.edu)

Journal of Language Relationship, No. 3, 2010, pp. 117-126

Abstract: The idea that the Turkic, Mongolian, Tungusic, Korean, and Japanese languages are genetically related (the “Altaic hypothesis”) remains controversial within the linguistic community. In an effort to resolve such controversies, we propose a simple approach to analyzing genetic connections between languages. The Consonant Class Matching (CCM) method uses strict phonological identification and permits no changes in meaning. This allows us to estimate the probability that the observed similarities between two or more languages occurred by chance alone. The CCM procedure yields reliable statistical inferences about historical connections between languages: it classifies languages correctly for well-known families (Indo-European and Semitic) and does not appear to yield false positives. The quantitative patterns of similarity that we document for languages within the Altaic family are similar to those in the non-controversial Indo-European family. Thus, if the Indo-European family is accepted as real, the same conclusion should also apply to the Altaic family.
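
The general idea of matching consonant classes and asking how often the same number of matches would arise by chance can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than the procedure of the paper: the class table `CONSONANT_CLASSES` is a simplified, hypothetical grouping, the choice of the first two consonants as the comparison unit is assumed, and the permutation test in `chance_probability` is one common way to estimate chance similarity that may differ from the authors' own calculation.

```python
import random

# Hypothetical, simplified consonant classes (an assumption for illustration;
# not the class table used in the paper).
CONSONANT_CLASSES = {
    "P": "pbfv",
    "T": "td",
    "S": "sz",
    "K": "kgx",
    "M": "m",
    "N": "n",
    "R": "rl",
    "W": "w",
    "J": "j",
}
CLASS_OF = {ch: cls for cls, chars in CONSONANT_CLASSES.items() for ch in chars}


def skeleton(word, length=2):
    """Return the classes of the first `length` classified consonants of `word`."""
    classes = [CLASS_OF[ch] for ch in word.lower() if ch in CLASS_OF]
    return tuple(classes[:length])


def count_matches(list_a, list_b):
    """Count meanings whose two words have identical two-class skeletons.

    The lists are aligned by meaning: list_a[i] and list_b[i] express the
    same concept, so no change of meaning is allowed in a match.
    """
    return sum(
        1
        for a, b in zip(list_a, list_b)
        if len(skeleton(a)) == 2 and skeleton(a) == skeleton(b)
    )


def chance_probability(list_a, list_b, trials=2000, seed=0):
    """Estimate P(matches >= observed) when meanings are randomly reassigned."""
    rng = random.Random(seed)
    observed = count_matches(list_a, list_b)
    shuffled = list(list_b)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)  # break the meaning-to-meaning alignment
        if count_matches(list_a, shuffled) >= observed:
            hits += 1
    return observed, hits / trials


# Toy data: two meanings only, far too few for any real inference.
latin = ["pater", "mater"]
english = ["father", "mother"]
observed, p_value = chance_probability(latin, english)
```

With word lists this short the estimated probability carries no weight; the sketch only shows the shape of the computation. A meaningful test would need a full basic-vocabulary list per language and a carefully justified class table.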

Keywords: distant language relationship, comparative linguistics, phonetic similarity, Altaic languages, quantitative methods in linguistics
