Open Access Highly Accessed Research article

Looking for trees in the forest: summary tree from posterior samples

Joseph Heled1,2* and Remco R Bouckaert1,2

Author Affiliations

1 Department of Computer Science, University of Auckland, Auckland, New Zealand

2 Computational Evolution Group, University of Auckland, Auckland, New Zealand

BMC Evolutionary Biology 2013, 13:221 doi:10.1186/1471-2148-13-221

Published: 4 October 2013

Abstract

Background

Bayesian phylogenetic analysis generates a set of trees which are often condensed into a single tree representing the whole set. Many methods exist for selecting a representative topology for a set of unrooted trees, few exist for assigning branch lengths to a fixed topology, and even fewer for simultaneously setting the topology and branch lengths. However, there is very little research into locating a good representative for a set of rooted time trees like the ones obtained from a BEAST analysis.

Results

We empirically compare new and known methods for generating a summary tree. Some new methods are motivated by mathematical constructions such as tree metrics, while the rest employ tree concepts which work well in practice. These use more of the posterior than existing methods, which discard information not directly mapped to the chosen topology. Using results from a large number of simulations we assess the quality of a summary tree, measuring (a) how well it explains the sequence data under the model and (b) how close it is to the “truth”, i.e., to the tree used to generate the sequences.

Conclusions

Our simulations indicate that no single method is “best”. Methods producing good divergence time estimates have poor branch lengths and lower model fit, and vice versa. Using the results presented here, a user can choose the appropriate method based on the purpose of the summary tree.