Friday, March 11, 2016

Creation Moment 3/12/2016 - Evolutionary Trees Cut Down

For the wisdom of this world is foolishness with God.
1 Corinthians 3:19
"According to an article in PLoS Genetics, there is a fundamental flaw in the way species trees are inferred from gene trees using molecular genetics that is guaranteed to produce erroneous results:
Because of the stochastic way in which lineages sort during speciation, gene trees may differ in topology from each other and from species trees.  Surprisingly, assuming that genetic lineages follow a coalescent model of within-species evolution, we find that for any species tree topology with five or more species, there exist branch lengths for which gene tree discordance is so common that the most likely gene tree topology to evolve along the branches of a species tree differs from the species phylogeny.  This counterintuitive result implies that in combining data on multiple loci, the straightforward procedure of using the most frequently observed gene tree topology as an estimate of the species tree topology can be asymptotically guaranteed to produce an incorrect estimate.
Their paper proves that “the ‘democratic vote’ procedure of using the most common gene tree as the estimate of the species tree is statistically inconsistent for phylogenetic inference.”  In fact, it is “positively misleading,” they claim.  Common methods used in phylogenetic studies do not take into account the “anomalous gene trees” (AGTs) that result from a flawed assumption: “the implicit premise that makes it sensible to estimate a species tree using a single gene tree or the most common among several gene trees—has remained unquestioned.”  They show that “discordance can occur between the species tree and the most likely gene tree” and that the data can converge on a wrong answer as the number of genes increases.  This is not just a theoretical problem, they say, and provide an example:
It is noteworthy that our theoretical results apply to known—rather than estimated—gene trees, and do not consider the effect of mutations on inference of gene trees.  This issue is important, as mutational history is a key factor in determining when an empirical study might actually be misled by AGTs.  As an illustration, in one human-chimp-gorilla study, a substantial fraction of loci—six of 45 considered—had no informative substitutions that could provide support to any particular phylogenetic grouping.  That this many loci would not have any phylogenetic information in the human-chimp-gorilla clade suggests that for the smaller branch lengths typical of the anomaly zone, the fraction of uninformative loci could be much greater.
[Image: Cutting down the evolutionary trees]
Adding more genes to a study does not improve the statistics, nor does including other types of data, such as genomic inversions or rearrangements.  Their best advice is to include samples with multiple individuals per species.  That, however, is unlikely due to the difficulty and expense of sequencing.  “Different algorithms for combining data on multiple loci will have different degrees of susceptibility to the occurrence of AGTs, and a challenge for phylogenetics is to identify those procedures that are best able to overcome this new obstacle to accurate inference of species trees.”"
CEH
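
For readers who want to see what the "democratic vote" procedure described above looks like in practice, here is a minimal Python sketch. It is not taken from the PLoS Genetics paper. The function names and the sample list of gene trees are hypothetical, chosen only for illustration. The probability formula is the standard coalescent result for three species, where the gene tree matching the species tree ((A,B),C) has probability 1 - (2/3)e^(-t) and t is the internal branch length in coalescent units. Note that with only three species the matching tree is never less likely than the others, so this sketch shows gene tree discordance but does not reproduce the anomalous gene trees, which the quoted abstract says arise for trees with five or more species.

```python
import math
from collections import Counter


def three_taxon_topology_probs(t):
    """Gene tree topology probabilities under the standard coalescent
    model for the species tree ((A,B),C).

    t is the internal branch length in coalescent units.
    (Function name is hypothetical, for illustration only.)
    """
    # Each of the two discordant topologies has probability e^(-t) / 3.
    discordant = math.exp(-t) / 3.0
    return {
        "((A,B),C)": 1.0 - 2.0 * discordant,  # matches the species tree
        "((A,C),B)": discordant,
        "((B,C),A)": discordant,
    }


def democratic_vote(gene_trees):
    """Return the most frequently observed gene tree topology."""
    return Counter(gene_trees).most_common(1)[0][0]


# Hypothetical sample of gene tree topologies from five loci.
observed = ["((A,B),C)", "((A,C),B)", "((A,B),C)", "((B,C),A)", "((A,B),C)"]
estimate = democratic_vote(observed)

# A short internal branch makes discordant gene trees common.
probs_short = three_taxon_topology_probs(0.1)
```

With a short branch such as t = 0.1, the matching topology is only slightly more likely than each discordant one, which gives a feel for why lineage sorting makes gene trees disagree with one another.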