Revealing divergent evolution, identifying circular permutations and detecting active-sites by protein structure comparison
1 Institute of Systems Biology, Shanghai University, Shanghai 200444, China
2 Osaka Sangyo University, Nakagaito 3-1-1, Daito, Osaka 574-8530, Japan
3 Institute of Applied Mathematics, Academy of Mathematics and Systems Science, CAS, Beijing 100080, China
4 Graduate School of the Chinese Academy of Sciences, Beijing 100049, China
BMC Structural Biology 2006, 6:18 doi:10.1186/1472-6807-6-18
Published: 2 September 2006
Protein structure comparison is one of the most important problems in computational biology and plays a key role in protein structure prediction, fold family classification, motif finding, phylogenetic tree reconstruction and protein docking.
We propose a novel method to compare protein structures accurately and efficiently. The method can be used not only to reveal divergent evolution, but also to identify circular permutations and to detect active-sites. Specifically, we define structure alignment as a multi-objective optimization problem, i.e., maximizing the number of aligned atoms while minimizing their root mean square distance. By controlling a single distance-related parameter, we can in theory obtain a variety of optimal alignments corresponding to different optimal matching patterns, ranging from a large matching portion to a small one. The number of variables in our algorithm grows almost linearly with the number of atoms in the protein pair. Beyond its solid theoretical background, numerical experiments demonstrate that our approach significantly improves on existing methods in both quality and efficiency. In particular, we show that divergent evolution, circular permutations and active-sites (or structural motifs) can be identified by our method. The software SAMO is available upon request from the authors, or from http://zhangroup.aporc.org/bioinfo/samo/ and http://intelligent.eic.osaka-sandai.ac.jp/chenen/samo.htm.
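The two competing objectives can be written down schematically. The notation below is an assumption made for illustration and is not taken from the authors: M is a set of matched atom pairs (i, j), x_i and y_j are atom coordinates of the two proteins, and R and t are a rigid rotation and translation applied to the second structure.

```latex
\max_{M,\,R,\,t} \; |M|
\qquad\text{and}\qquad
\min_{M,\,R,\,t} \; \mathrm{RMSD}(M,R,t)
  = \sqrt{\frac{1}{|M|}\sum_{(i,j)\in M}
      \bigl\lVert x_i - (R\,y_j + t) \bigr\rVert^{2}}
```

The two goals pull against each other: enlarging M with poorly fitting pairs raises the RMSD, while insisting on a very small RMSD shrinks M. This is why a single distance-related parameter is enough to select among different trade-offs.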
A novel formulation is proposed to accurately align protein structures in the framework of multi-objective optimization, based on a sequence order-independent strategy. A fast and accurate algorithm based on bipartite matching is developed by exploiting the special features of this formulation. Convergence of the computation is demonstrated experimentally and also proven theoretically.
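To make the role of bipartite matching and of the distance-related parameter concrete, here is a minimal Python sketch. It is not the SAMO algorithm: it assumes the two structures are already superimposed, uses a plain maximum-cardinality matching (Kuhn's augmenting paths) rather than the authors' formulation, and all names (`max_matching_within`, `rmsd`, `d0`, the toy coordinates `A` and `B`) are hypothetical.

```python
import math

def max_matching_within(coords_a, coords_b, d0):
    """Maximum-cardinality bipartite matching between two point sets,
    allowing only pairs whose distance is at most d0.
    Sequence order is ignored: any atom of A may pair with any atom of B."""
    # Candidate partners in B for each atom of A.
    adj = [[j for j, q in enumerate(coords_b) if math.dist(p, q) <= d0]
           for p in coords_a]
    match_b = [-1] * len(coords_b)  # match_b[j] = partner in A, or -1

    def try_augment(i, seen):
        # Depth-first search for an augmenting path starting at atom i of A.
        for j in adj[i]:
            if j in seen:
                continue
            seen.add(j)
            if match_b[j] == -1 or try_augment(match_b[j], seen):
                match_b[j] = i
                return True
        return False

    for i in range(len(coords_a)):
        try_augment(i, set())
    return sorted((i, j) for j, i in enumerate(match_b) if i != -1)

def rmsd(coords_a, coords_b, pairs):
    """Root mean square distance over the matched pairs (0.0 if none)."""
    if not pairs:
        return 0.0
    sq = sum(math.dist(coords_a[i], coords_b[j]) ** 2 for i, j in pairs)
    return math.sqrt(sq / len(pairs))

# Toy data (hypothetical): B holds roughly the same points as A,
# but listed in a different order, as after a circular permutation.
A = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
B = [(2.0, 0.1, 0.0), (3.0, 0.5, 0.0), (0.0, 0.0, 0.0), (1.0, 2.0, 0.0)]

if __name__ == "__main__":
    for d0 in (0.2, 0.6, 2.5):
        pairs = max_matching_within(A, B, d0)
        print(d0, len(pairs), round(rmsd(A, B, pairs), 3))
```

Raising `d0` admits more candidate pairs, so the matching grows from a small, tight core to a larger, looser alignment. Because pairing ignores residue order, the reordered points in `B` are still matched to their counterparts in `A`. A fuller method would alternate this matching step with re-superposition of the structures, which the sketch omits.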