Insertions and the emergence of novel protein structure: a structure-based phylogenetic study of insertions
1 Faculty of Computer Science, Dalhousie University, Halifax, Nova Scotia, B3H 1W5, Canada
2 Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia, B3H 1X5, Canada
BMC Bioinformatics 2007, 8:444 doi:10.1186/1471-2105-8-444
Published: 15 November 2007
In protein evolution, the mechanism by which novel protein domains emerge is still an open question. The incremental growth of protein variable regions, produced by stochastic insertions, has the potential to generate large and complex sub-structures. In this study, we propose a deterministic methodology to reconstruct phylogenies from protein structures and to infer insertion events in protein evolution. The analysis was performed on a broad range of SCOP domain families.
Phylogenies were reconstructed from protein 3D structural data, and the resulting trees were used to infer ancestral structures with a consensus method. In these ancestral reconstructions, 42.7% of the observed insertions are nested insertions, that is, insertions located within earlier insert regions. The average insert size tends to increase with the insert rank or the total number of insertions in the variable region. The structures of some nested inserts show complex or even domain-like fold patterns with helices, strands and loops. Furthermore, a basal level of structural innovation was found in inserts that displayed significant structural similarity only to themselves. The β-Lactamase/D-ala carboxypeptidase domain family is provided as an example to illustrate the inference of insertion events and to show how the incremental growth of a variable region can generate novel structural patterns.
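To make the notion of a nested insertion concrete, the following is a minimal sketch, not the method used in this study. It assumes that each inferred insert along one lineage can be summarised as a half-open span of alignment columns and that the inserts are listed oldest first. The `Insert` class, the `classify_inserts` function, the `depth` value and the example coordinates are all hypothetical; in particular, `depth` is only an illustrative stand-in and is not claimed to match the insert rank used in the analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    """Hypothetical record of one inferred insertion event."""
    name: str
    start: int  # first alignment column of the insert span (inclusive)
    end: int    # one past the last alignment column (exclusive)


def classify_inserts(events):
    """Label each insert as nested or not.

    `events` must be ordered oldest first along a single lineage.
    An insert counts as nested when its span lies entirely inside the
    span of at least one earlier insert (an assumption of this sketch).
    Returns a list of (name, is_nested, depth) tuples, where depth is
    the number of earlier inserts that enclose the insert.
    """
    seen = []
    labelled = []
    for ev in events:
        enclosing = [p for p in seen if p.start <= ev.start and ev.end <= p.end]
        labelled.append((ev.name, len(enclosing) > 0, len(enclosing)))
        seen.append(ev)
    return labelled


# Made-up coordinates, for illustration only.
events = [
    Insert("A", 10, 30),  # first insert in a variable region
    Insert("B", 15, 22),  # falls inside A
    Insert("C", 40, 45),  # separate region
    Insert("D", 17, 20),  # falls inside both A and B
]
labelled = classify_inserts(events)
nested_fraction = sum(1 for _, nested, _ in labelled if nested) / len(labelled)
```

With these made-up spans, B and D are labelled as nested and A and C are not, so `nested_fraction` is 0.5. A real analysis would derive the spans and their order from the reconstructed ancestral structures rather than from hand-written coordinates.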
Using 3D structural data, we proposed a method to reconstruct phylogenies and applied it to reconstruct the sequences of insertion events leading to the emergence of potentially novel structural elements within existing protein domains. The results suggest that structural innovation is possible via the stochastic process of insertion and rapid evolution within variable regions, where inserts tend to be nested. We also demonstrate that structure-based phylogeny enables the study of new questions relating to the evolution of protein domains and biological function.