Can comprehensive background knowledge be incorporated into substitution models to improve phylogenetic analyses? A case study on major arthropod relationships
* Corresponding author: Björn M von Reumont bmvr@arcor.de
1 Molecular Lab, Zoologisches Forschungsmuseum A. Koenig, Bonn, Germany
2 Department of Evolutionary Biology, University of Vienna, Vienna, Austria
3 ITZ, Ecology & Evolution, Stiftung Tieraerztliche Hochschule Hannover, Hannover, Germany
4 Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, PR China
5 Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, USA
6 UHH Biozentrum Grindel und Zoologisches Museum, University of Hamburg, Hamburg, Germany
BMC Evolutionary Biology 2009, 9:119 doi:10.1186/1471-2148-9-119
Published: 27 May 2009
Additional files
Additional file 1:
Taxa list. List of taxa for the sampled sequences. * indicates concatenated 18S and 28S rRNA sequences from different species. For the combinations of genes used to construct concatenated sequences of chimeric taxa, see Table S1. ** indicates sequences contributed in the present study (with the author of the sequences).
Format: XLS Size: 123KB
Additional file 2:
LogDet-corrected network of the concatenated 18S and 28S rRNA alignment. LogDet-corrected network plus invariant-sites model (30.79% invariant sites), computed with SplitsTree4 from the concatenated 18S and 28S rRNA alignment after exclusion of randomly similar sections evaluated with ALISCORE.
Format: PDF Size: 53KB
Additional file 3:
Bayesian support values for selected clades. List of Bayesian support values (posterior probability, pP) for selected clades of the time-heterogeneous and time-homogeneous trees.
Format: XLS Size: 79KB
Additional file 4:
Detailed flow of the analysis procedure in the software package PHASE-2.0. Options used in PHASE-2.0 are italicized above the arrows and are followed by input files. Black arrows represent the general flow of the analysis procedure; green arrows show that results or parameter values from a single step were inserted or accessed in a later process. Red block arrows mark the final run of the time-heterogeneous and time-homogeneous approach with 16 chains each (2 × 118,000,000 generations).
First row: I.) We prepared three control files (control.mcmc) for mcmcphase using three different mixed models. This "pre-run" was used for a first model selection (500,000 generations for each setting). We excluded model (C) because its parameter values did not converge. II.) We repeated step I.) with 3,000,000 generations for the two remaining model settings, using similar control files (different number of generations and random seeds). The ln likelihood values calculated for both chains were compared in a Bayes factor test (BFT), resulting in the exclusion of mixed model (A). Parameter values of the remaining model (B) were implemented in the time-heterogeneous setting. III.) We started the final analysis (final run) using sixteen chains each for the time-homogeneous and the time-heterogeneous approach. In the final time-homogeneous approach, the control files were similar to those of step II.) except for a different number of generations and random seeds.
Second row: Additional steps were necessary prior to the computation of the final time-heterogeneous chains. We applied mcmcsummarize for the selected mixed model (B) to calculate a consensus tree. Optimizer was executed to conduct an ML estimation for each parameter value (opt.mod) based on the inferred consensus tree and the optimized parameter values (mcmc-best.mod), a data file delivered by mcmcphase. Estimated values were implemented in an initial.mod file. The initial.mod file and its parameter values were accessed by the control files of the final time-heterogeneous chains (only topology and base frequencies estimated).
Third row: Trees were reconstructed separately for the time-homogeneous and time-heterogeneous setting. All chains of each approach were tested in a BFT against the chain with the best lnL. We only included chains with a 2lnB10-value > 10. From these chains we constructed a metachain for each setting using Perl and applied mcmcsummarize to infer the consensus topology. To estimate branch lengths properly, we ran mcmcphase; the resulting branch lengths were implemented in the consensus trees. Finally, both trees were optimized using graphics programs (Dendroscope, Adobe Illustrator CS II).
Format: PDF Size: 447KB
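The description of Additional file 4 compares chains by their ln likelihoods in a Bayes factor test, and Additional files 15 and 16 report harmonic means (lnL) and 2lnB10-values. The following is a minimal Python sketch of how such quantities are commonly computed, assuming the usual harmonic mean estimator and the definition 2lnB10 = 2(lnL1 - lnL0). It is not taken from PHASE-2.0 or from the authors' scripts; the function names are hypothetical, and which chain counts as "1" and which as "0" is an assumption left to the caller.

```python
import math


def log_harmonic_mean(log_likelihoods):
    """Return ln of the harmonic mean of the likelihoods, given their logs.

    Hypothetical helper: ln HM = ln(n) - ln(sum_i exp(-lnL_i)),
    evaluated with a log-sum-exp shift to avoid overflow.
    """
    n = len(log_likelihoods)
    if n == 0:
        raise ValueError("need at least one sampled ln likelihood")
    negated = [-value for value in log_likelihoods]
    shift = max(negated)
    log_sum = shift + math.log(sum(math.exp(v - shift) for v in negated))
    return math.log(n) - log_sum


def two_ln_bayes_factor(lnl_1, lnl_0):
    """Return 2lnB10 for two ln marginal likelihood estimates (assumed order)."""
    return 2.0 * (lnl_1 - lnl_0)
```

In use, `log_harmonic_mean` would be applied to the post-burn-in ln likelihood samples of each chain, and `two_ln_bayes_factor` to the resulting pair of estimates. The threshold applied to the result is the one stated in the file description and is deliberately not hard-coded here.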
Additional file 5:
List of chimeric species for concatenated 18S and 28S rRNA sequences
Format: XLS Size: 117KB
Additional file 6:
Primer list 18S rRNA
Format: XLS Size: 104KB
Additional file 7:
Primer list 28S rRNA
Format: XLS Size: 110KB
Additional file 8:
Primercard of the 18S rRNA gene for hexapods, myriapods and crustaceans. Primers used for hexapods or myriapods are shown in the upper part, primers for crustaceans in the lower part. Positions of forward primers are marked with green arrows, those of reverse primers with red arrows. Where different primers with an identical position were used, all primer labels are given at the single arrow for that position. Primers and their combinations are given in Additional files 6 and 11.
Format: PDF Size: 529KB
Additional file 9:
Primercard of the 28S rRNA gene for crustaceans, hexapods and myriapods. Positions of forward primers are marked with green arrows, those of reverse primers with red arrows. Where different primers with an identical position were used, all primer labels are given at the single arrow for that position. Primers and their combinations are given in Additional files 7 and 11.
Format: PDF Size: 577KB
Additional file 10:
Primercard of the 28S rRNA gene for pterygots. Positions of forward primers are marked with green arrows, those of reverse primers with red arrows. Where different primers with an identical position were used, all primer labels are given at the single arrow for that position. Primers and their combinations are given in Additional files 7 and 11.
Format: PDF Size: 565KB
Additional file 11:
Supplementary Information. Supplementary information for lab work (amplification, purification and sequencing of PCR products).
Format: PDF Size: 83KB
Additional file 12:
PCR temperature-profiles
Format: XLS Size: 75KB
Additional file 13:
PCR chemicals
Format: XLS Size: 86KB
Additional file 14:
Settings of exchangeability parameters used in pre-runs. List of the exchangeability parameter settings used in pre-runs in PHASE-2.0.
Format: XLS Size: 48KB
Additional file 15:
Chains included to infer the time-heterogeneous consensus tree. Number of chains, generations per chain, harmonic means (lnL) and 2lnB10-values of the chains included to infer the time-heterogeneous consensus tree.
Format: XLS Size: 55KB
Additional file 16:
Chains included to infer the time-homogeneous consensus tree. Number of chains, generations per chain, harmonic means (lnL) and 2lnB10-values of the chains included to infer the time-homogeneous consensus tree.
Format: XLS Size: 55KB
Additional file 17:
Localities of sampled taxa
Format: XLS Size: 127KB
