Open Access Research article

Evolutionary dynamics of protein domain architecture in plants

Xue-Cheng Zhang1,6*, Zheng Wang2, Xinyan Zhang3,7, Mi Ha Le1, Jianguo Sun3, Dong Xu2,4, Jianlin Cheng2,4 and Gary Stacey1,5

Author Affiliations

1 Division of Plant Sciences, University of Missouri, Columbia, MO 65211 USA

2 Department of Computer Sciences, University of Missouri, Columbia, MO 65211 USA

3 Department of Statistics, University of Missouri, Columbia, MO 65211 USA

4 Informatics Institute, University of Missouri, Columbia, MO 65211 USA

5 Center for Sustainable Energy, National Center for Soybean Biotechnology, Division of Biochemistry, University of Missouri, Columbia, MO 65211 USA

6 Department of Molecular Biology, Massachusetts General Hospital; Department of Genetics, Harvard Medical School, Boston, MA 02114 USA

7 School of Public Health, Harvard University, Boston, MA 02115 USA


BMC Evolutionary Biology 2012, 12:6  doi:10.1186/1471-2148-12-6

Published: 17 January 2012



Background

Protein domains are the structural, functional and evolutionary units of proteins. A protein's domain architecture is the linear arrangement of the domain(s) it contains. Although the evolutionary history of protein domain architecture has been extensively studied in microorganisms, the evolutionary dynamics of domain architecture in the plant kingdom remain largely undefined. To address this gap, we analyzed the lineage-based protein domain architecture content of 14 completed green plant genomes.


Results

Our analyses show that all 14 plant genomes maintain similar distributions of species-specific, single-domain, and multi-domain architectures. Approximately 65% of plant domain architectures are universally present in all plant lineages, while the remaining architectures are lineage-specific. There are clear examples of both the loss and the gain of specific protein architectures in higher plants. Domain architectures have expanded dynamically, lineage by lineage, during plant evolution. The data suggest that this expansion can be largely explained by changes in nuclear ploidy resulting from rounds of whole genome duplication. Indeed, the number of unique domain architectures decreased when the genomes were normalized to a presumed ancestral genome that had not undergone whole genome duplication.
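
The distinction between universal and lineage-specific architectures can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes each genome has already been reduced to a set of architectures, represents each architecture as a tuple of domain names in N- to C-terminal order, and treats each genome as its own lineage. The genome names, the domain labels and the function names `classify_architectures` and `multi_domain_fraction` are all hypothetical, and the toy data are invented for illustration.

```python
from collections import Counter

# Toy, made-up input (an assumption for illustration, not data from the paper):
# genome name -> set of domain architectures, each a tuple of domain labels
# listed in N- to C-terminal order.
genomes = {
    "genome_A": {("Pkinase",), ("LRR", "Pkinase"), ("F-box",)},
    "genome_B": {("Pkinase",), ("LRR", "Pkinase"), ("NB-ARC", "LRR")},
    "genome_C": {("Pkinase",), ("LRR", "Pkinase"), ("F-box",), ("LysM", "Pkinase")},
}


def classify_architectures(genomes):
    """Split architectures by how many genomes contain them.

    Returns (universal, partial, species_specific):
    present in every genome, in some but not all, or in exactly one.
    """
    # Count in how many genomes each architecture occurs.
    counts = Counter(arch for archs in genomes.values() for arch in archs)
    n_genomes = len(genomes)
    universal = {a for a, c in counts.items() if c == n_genomes}
    species_specific = {a for a, c in counts.items() if c == 1}
    partial = set(counts) - universal - species_specific
    return universal, partial, species_specific


def multi_domain_fraction(archs):
    """Fraction of architectures that contain more than one domain."""
    return sum(len(a) > 1 for a in archs) / len(archs)


universal, partial, specific = classify_architectures(genomes)
all_archs = set().union(*genomes.values())
print(f"universal: {len(universal)} of {len(all_archs)} architectures")
for name, archs in genomes.items():
    print(f"{name}: multi-domain fraction = {multi_domain_fraction(archs):.2f}")
```

A real analysis would group genomes into lineages before classifying, and would derive the architecture sets from domain annotations rather than hand-written tuples.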


Conclusions

Our data show the conservation of universal domain architectures in all available plant genomes, indicating the presence of an evolutionarily conserved, core set of protein components. However, the occurrence of lineage-specific domain architectures indicates that domain architecture diversity has been maintained beyond these core components in plant genomes. Although several features of genome-wide domain architecture content are conserved in plants, the data clearly demonstrate lineage-wise, progressive changes and expansions of individual protein domain architectures, reinforcing the notion that plant genomes have undergone dynamic evolution.

Keywords: domain architecture; evolutionary dynamics; plant lineage; genetic origin