Open Access | Highly Accessed | Research article

Can comprehensive background knowledge be incorporated into substitution models to improve phylogenetic analyses? A case study on major arthropod relationships

Björn M von Reumont1, Karen Meusemann1, Nikolaus U Szucsich2, Emiliano Dell'Ampio2, Vivek Gowri-Shankar, Daniela Bartel2, Sabrina Simon3, Harald O Letsch1, Roman R Stocsits1, Yun-xia Luan4, Johann Wolfgang Wägele1, Günther Pass2, Heike Hadrys3,5 and Bernhard Misof6

1 Molecular Lab, Zoologisches Forschungsmuseum A. Koenig, Bonn, Germany

2 Department of Evolutionary Biology, University Vienna, Vienna, Austria

3 ITZ, Ecology & Evolution, Stiftung Tieraerztliche Hochschule Hannover, Hannover, Germany

4 Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, PR China

5 Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, USA

6 UHH Biozentrum Grindel und Zoologisches Museum, University of Hamburg, Hamburg, Germany


BMC Evolutionary Biology 2009, 9:119 doi:10.1186/1471-2148-9-119

Published: 27 May 2009

Abstract

Background

Whenever different data sets arrive at conflicting phylogenetic hypotheses, only testable causal explanations of the sources of error in at least one of the data sets allow us to choose critically among the conflicting hypotheses of relationships. The large (28S) and small (18S) subunit rRNAs are among the most popular markers for studies of deep phylogenies. However, some nodes supported by these data are suspected of being artifacts caused by peculiarities in the evolution of these molecules. Arthropod phylogeny is an especially controversial subject, dotted with conflicting hypotheses that depend on the data set and the method of reconstruction. We assume that phylogenetic analyses based on these genes can be improved further by i) enlarging the taxon sample, ii) employing more realistic models of sequence evolution that incorporate non-stationary substitution processes, and iii) considering covariation and pairing of sites in rRNA genes.

Results

We analyzed a large set of arthropod sequences, applied new tools for quality control of the data prior to tree reconstruction, and increased the biological realism of the substitution models. Although the split-decomposition network indicated a high noise content in the data set, our measures both improved the analyses and gave causal explanations for some incongruities reported in earlier analyses of rRNA sequences. However, misleading effects did not disappear completely.

Conclusion

Analyses of data sets that result in ambiguous phylogenetic hypotheses demand methods that not only filter stochastic noise but also distinguish phylogenetic signal from systematic biases. Such methods can rely only on our findings regarding the evolution of the analyzed data. Analyses of independent data sets are then crucial to test the plausibility of the results. Our approach can also be extended easily to genomic data, setting up layers of quality assessment that are applicable to phylogenetic reconstructions in general.

