Open Access Research article

Evolutionary dynamics of protein domain architecture in plants

Xue-Cheng Zhang1,6*, Zheng Wang2, Xinyan Zhang3,7, Mi Ha Le1, Jianguo Sun3, Dong Xu2,4, Jianlin Cheng2,4 and Gary Stacey1,5

Author Affiliations

1 Division of Plant Sciences, University of Missouri, Columbia, MO 65211 USA

2 Department of Computer Sciences, University of Missouri, Columbia, MO 65211 USA

3 Department of Statistics, University of Missouri, Columbia, MO 65211 USA

4 Informatics Institute, University of Missouri, Columbia, MO 65211 USA

5 Center for Sustainable Energy, National Center for Soybean Biotechnology, Division of Biochemistry, University of Missouri, Columbia, MO 65211 USA

6 Department of Molecular Biology, Massachusetts General Hospital; Department of Genetics, Harvard Medical School, Boston, MA 02114 USA

7 School of Public Health, Harvard University, Boston, MA 02115


BMC Evolutionary Biology 2012, 12:6  doi:10.1186/1471-2148-12-6

Published: 17 January 2012

Additional files

Additional file 1:

Overall and species-specific domain architectures in green plants. This file contains all Pfam-predicted domain architectures. The "overall" tab lists 11545 domain architectures predicted from all 14 plant genomes included in this study. The remaining tabs list the species-specific domain architectures.

Format: XLSX Size: 972KB

Open Data

Additional file 2:

Distributions of overall domain architectures in plant lineages. Pfam-predicted domain architectures (listed in Additional file 1, Table S1) were sorted into different plant lineages or lineage combinations. Pattern A represents algal architectures; B, bryophyte and lycophyte architectures or early diverging architectures; ABCD, universal architectures; BCD, land architectures; CD, angiosperm architectures; C, monocot architectures; and D, dicot architectures.
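The lineage patterns described above can be illustrated with a minimal sketch. It assumes each architecture comes with the set of species in which it was predicted; the function name, the lineage keys, and the placeholder species labels (`alga_1`, `monocot_1`, `dicot_1`) are hypothetical and are not taken from the additional file.

```python
# Lineage letters as defined in the file description:
# A = algae, B = bryophyte/lycophyte (early diverging), C = monocot, D = dicot.
LINEAGE_CODES = {"algae": "A", "early_diverging": "B", "monocot": "C", "dicot": "D"}


def lineage_pattern(species_with_arch, species_to_lineage):
    """Return the pattern string (e.g. 'BCD') for one domain architecture.

    species_with_arch: set of species in which the architecture was predicted.
    species_to_lineage: dict mapping each species to a key of LINEAGE_CODES.
    """
    codes = {LINEAGE_CODES[species_to_lineage[s]] for s in species_with_arch}
    return "".join(sorted(codes))


# Hypothetical species-to-lineage map; only P. patens and S. moellendorffii
# are named in the source, the other labels are placeholders.
species_to_lineage = {
    "alga_1": "algae",
    "P. patens": "early_diverging",
    "S. moellendorffii": "early_diverging",
    "monocot_1": "monocot",
    "dicot_1": "dicot",
}

# An architecture found in a moss, a monocot and a dicot is a "land" architecture.
print(lineage_pattern({"P. patens", "monocot_1", "dicot_1"}, species_to_lineage))
```

Under these assumptions, an architecture present in every lineage maps to ABCD (universal), and one found only in algae maps to A.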

Format: XLSX Size: 809KB

Additional file 3:

Distribution of WD40-containing domain architectures in plant lineages. Domain architectures containing the WD40 domain are presented as an example of the distribution across plant lineages or lineage combinations.

Format: XLSX Size: 72KB

Additional file 4:

Distribution of prevalent domain architectures in plant lineages. This file lists all architectures present in the majority of the species in each lineage, i.e. architectures present in at least three out of five algal species, in both P. patens and S. moellendorffii, in two out of three monocot species, or in three out of four dicot species.
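The majority rule stated above can be sketched as a simple threshold check. The thresholds come from the description; the function name, the lineage keys, and the placeholder algal species labels are hypothetical illustrations, not the authors' implementation.

```python
# Minimum number of species per lineage that must carry an architecture for it
# to count as prevalent: 3 of 5 algae, 2 of 2 early diverging species,
# 2 of 3 monocots, 3 of 4 dicots.
MIN_SPECIES = {"algae": 3, "early_diverging": 2, "monocot": 2, "dicot": 3}


def is_prevalent(species_with_arch, lineage_species, lineage):
    """Return True if the architecture is present in enough species of `lineage`.

    species_with_arch: set of species in which the architecture was predicted.
    lineage_species: dict mapping each lineage key to the list of its species.
    """
    present = sum(1 for s in lineage_species[lineage] if s in species_with_arch)
    return present >= MIN_SPECIES[lineage]


# Hypothetical species lists; the algal names are placeholders.
lineage_species = {
    "algae": ["alga_1", "alga_2", "alga_3", "alga_4", "alga_5"],
    "early_diverging": ["P. patens", "S. moellendorffii"],
}

found_in = {"alga_1", "alga_2", "alga_4", "P. patens"}
print(is_prevalent(found_in, lineage_species, "algae"))
print(is_prevalent(found_in, lineage_species, "early_diverging"))
```

In this example the architecture is prevalent in algae (three of five species) but not in the early diverging lineage, where both species are required.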

Format: XLSX Size: 262KB

Additional file 5:

Genetic origins of early diverging domain architectures. BLASTp searches using protein sequences of early diverging-specific domain architectures show the presence of highly homologous sequences in non-plant species, including bacteria, fungi, ancient marine animals, and metazoans. This suggests that these sequences may share a common ancestor that predates the split of bacteria, fungi, plants and animals.

Format: XLSX Size: 44KB

Additional file 6:

Genetic origin of a P. patens WD40 protein. A WD40 architecture in P. patens is homologous to bacterial sequences, as supported by a BLASTp search against the NCBI database using the P. patens WD40 protein as the query (the top hits are all bacterial and fungal sequences) and by a majority-rule parsimony tree with maximum-likelihood branch lengths (the P. patens WD40 protein clusters together with bacterial sequences).

Format: PDF Size: 684KB

Additional file 7:

Example domain architectures illustrating architecture expansion in green plants. Expansion of domain architectures in plants is illustrated by representative architectures, including Myb_DNA-binding (2), F-box (1), and TIR (1). The six rightmost columns give the pairwise comparisons between lineages of the probability of domain architecture expansion during plant genome evolution.

Format: PDF Size: 27KB

Additional file 8:

Expansion of universal and land domain architectures. Expansion of universal (ABCD) and land (BCD) domain architectures is shown in individual tabs. Domain architectures that have not undergone expansion are also shown.

Format: XLSX Size: 1.1MB

Additional file 9:

Categorical distribution of domain architectures in TAIR8 and presumed TAIR6, 7, and 9 annotations. The effect of genome annotation errors on genome-wide domain architecture content was examined by analyzing the domain architecture content built on different versions of the Arabidopsis annotations, which are thought to be well annotated. In general, we did not observe significant differences in domain architecture content between the TAIR6, 7, 8 and 9 annotations.

Format: XLSX Size: 53KB