An experimental study comparing linguistic phylogenetic reconstruction methods
The estimation of linguistic evolution has intrigued many researchers for centuries, and in just the last few years, several new methods for constructing phylogenies from languages have been developed and used to analyze a number of language families. These analyses have led to a great deal of excitement, both within the field of historical linguistics and in related fields such as archaeology and human genetics. They have also been controversial, since the analyses have not always been consistent with each other, and the differences between reconstructions have been potentially critical to the claims made by the different groups. In this paper, we report on a simulation study, performed to help resolve this controversy, that compares some of the main phylogeny reconstruction methods currently being used in linguistic cladistics. Our simulated datasets varied in the number of contact edges, the degree of homoplasy, the deviation from a lexical clock, and the deviation from the rates-across-sites assumption. We find the relative accuracy of the unweighted methods (maximum parsimony, neighbor joining, lexico-statistics, and the method of Gray \& Atkinson) to be remarkably consistent across all the model conditions we studied, with maximum parsimony being the best, followed (often closely) by Gray \& Atkinson's method, then neighbor joining, and finally lexico-statistics (UPGMA). The accuracy of the two weighted methods (weighted maximum parsimony and weighted maximum compatibility) depends upon the appropriateness of the weighting scheme, and therefore upon the homoplasy levels produced by the model conditions. At low homoplasy levels, the weighted methods generally produce the most accurate results of all methods, whereas under moderate to high homoplasy levels, inappropriate weighting schemes can yield poorer results than maximum parsimony and Gray \& Atkinson's method.