Open Access Technical Note

Treetrimmer: a method for phylogenetic dataset size reduction

Shinichiro Maruyama1,2,3, Robert JM Eveleigh1,2,3,4 and John M Archibald1,2,3*

Author Affiliations

1 Department of Biochemistry & Molecular Biology, Dalhousie University, Halifax, NS, Canada

2 Centre for Comparative Genomics and Evolutionary Bioinformatics, Dalhousie University, Halifax, NS, Canada

3 Integrated Microbial Biodiversity Program, Canadian Institute for Advanced Research, Montreal, QC H3A 1A4, Canada

4 McGill University and Génome Québec, 740 Docteur-Penfield Ave, Montreal, QC H3A 1A4, Canada

BMC Research Notes 2013, 6:145  doi:10.1186/1756-0500-6-145

Published: 12 April 2013

Additional files

Additional file 1: Figure S1:

Phylogeny of Cytochrome c oxidase subunit 2 proteins. (A) Phylogenetic tree of the Cytochrome c oxidase subunit 2 proteins used in Figure 1A, with full organism names and accession numbers. Parameter input files (B) and (C) were used to generate the trees shown in Figure 1B and C, respectively, together with the Newick format input tree file (D) and the reference list of OTU names and taxonomic information (E).

Figure S2. PsbO protein phylogeny with the query from Arabidopsis thaliana using various settings. Settings were as follows: (A) Maximum number of BLASTP hits retrieved, 2000; BLASTP cutoff value, 1e-5. Font colors represent taxonomic categories. (B) Dereplication cutoff, 0.8; the numbers of OTUs retained are 2 for Alveolata, 2 for Stramenopiles, 2 for Euglenozoa, 2 for Viridiplantae, and 2 for each genus not included in these taxonomic categories. Note that the sequences were re-collected based on the TreeTrimmer output and re-aligned prior to constructing the tree. (C) Maximum number of BLASTP hits retrieved, 2000; BLASTP cutoff value, 1e-100. (D) Maximum number of BLASTP hits retrieved, 100; BLASTP cutoff value, 1e-5.

Figure S3. Protein phylogeny of Myb-domain containing transcription factors from green plants (Viridiplantae). (A) Sequences homologous to GenBank accession BAA23337 (Oryza sativa OSMYB1) were collected by BLASTP from Arabidopsis thaliana (Tracheophyta), Oryza sativa (Tracheophyta), Zea mays (Tracheophyta), Brachypodium distachyon (Tracheophyta), Vitis vinifera (Tracheophyta), Physcomitrella patens (Bryophyta), and Cyanidioschyzon merolae (red alga, outgroup), with a maximum of 5000 hits and an e-value cutoff of 1e-5. Species names are colored as follows: green, Tracheophyta; blue, Bryophyta; magenta, outgroup (red algal) OTUs. The support value for the whole Viridiplantae clade is shown in bold with an asterisk. (B) The tree was reconstructed using the TreeTrimmer output with the following settings: support value cutoff, 0.8; number of OTUs retained, 5 for Viridiplantae. (C) The tree was built in the same manner as in (B), but the numbers of OTUs retained are 2 for Bryophyta and 2 for Tracheophyta.
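
The per-category retention settings quoted in the legends (for example, 2 OTUs for Alveolata and 2 for each genus outside the listed categories) can be pictured with a short Python sketch. This is only an illustration of the capping rule, not TreeTrimmer's implementation: the function name `retain_otus`, the `(name, lineage)` input format, and the keep-in-input-order policy are all assumptions made here, and the sketch ignores the input tree and the support value cutoff that the legends also mention.

```python
from collections import defaultdict

# Hypothetical caps mirroring the Figure S2(B) legend; not a TreeTrimmer file format.
CAPS = {"Alveolata": 2, "Stramenopiles": 2, "Euglenozoa": 2, "Viridiplantae": 2}
DEFAULT_GENUS_CAP = 2  # cap per genus when no listed category applies


def retain_otus(otus, caps=CAPS, genus_cap=DEFAULT_GENUS_CAP):
    """Keep at most `cap` OTUs per taxonomic category, in input order.

    `otus` is a list of (name, lineage) pairs; `lineage` is a list of
    taxonomic names ending in the genus (an assumed input format).
    """
    kept = []
    counts = defaultdict(int)
    for name, lineage in otus:
        # Use the first listed category found in the lineage, else fall back to the genus.
        category = next((taxon for taxon in lineage if taxon in caps), None)
        if category is not None:
            key, cap = category, caps[category]
        else:
            key, cap = ("genus", lineage[-1]), genus_cap
        if counts[key] < cap:
            counts[key] += 1
            kept.append(name)
    return kept
```

With three Alveolata OTUs in the input, only the first two would be kept under these assumed caps; OTUs outside the listed categories are limited per genus instead.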
