Protein-segment universe exhibiting transitions at intermediate segment length in conformational subspaces
1 Computational Biology Research Center (CBRC), National Institute of Advanced Industrial Science and Technology (AIST), 2-42 Aomi, Koto-ku, Tokyo 135-0064, Japan
2 School of Life Science, Tokyo University of Pharmacy and Life Science, 1432-1 Horinouchi, Hachioji, Tokyo 192-0392, Japan
3 Institute for Bioinformatics Research and Development (BIRD), Japan Science and Technology Agency, Chiyoda-ku, Tokyo 102-8666, Japan
4 PharmaDesign, Inc., 2-19-8 Hacchobori, Chuo-ku, Tokyo 104-0032, Japan
5 461 Koshland Hall, University of California, Berkeley, CA 94720-3102, USA
BMC Structural Biology 2008, 8:37 doi:10.1186/1472-6807-8-37
Published: 13 August 2008
Many studies have examined the rules governing two levels of protein structure: short segments and structural domains. However, the organization and nature of the conformational space of segments of intermediate length, between short segments and domains, remain unclear. The conformational spaces of intermediate-length segments probably differ from those of short segments. We sought to identify and characterize the boundary or boundaries between peptide-like (short-segment) and protein-like (long-segment) distributions. We generated ensembles of segments 10–50 residues long embedded in globular proteins. Using principal component analysis based on intra-segment Cα-Cα atomic distances, we explored how the conformational distribution of segments relates to segment length and to protein structural class.
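The analysis described above can be illustrated with a minimal sketch, assuming NumPy is available. It turns each fixed-length segment into a vector of its pairwise Cα-Cα distances and runs principal component analysis on those vectors. The function names, the overlapping sliding window, the SVD-based PCA, and the random-walk toy chain are all assumptions made for illustration; they are not taken from the paper, and the toy coordinates are not real protein data.

```python
import numpy as np


def distance_features(coords):
    """Flatten the upper triangle of a segment's pairwise distance matrix."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    i, j = np.triu_indices(len(coords), k=1)
    return dist[i, j]


def sliding_segments(chain, length):
    """Cut a chain into all overlapping segments of a given length (assumed scheme)."""
    chain = np.asarray(chain, dtype=float)
    return [chain[s:s + length] for s in range(len(chain) - length + 1)]


def pca(features, n_components=3):
    """PCA via SVD of the mean-centred feature matrix."""
    X = np.asarray(features, dtype=float)
    Xc = X - X.mean(axis=0)
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:n_components].T          # coordinates of each segment on the PCs
    explained = (S ** 2) / (S ** 2).sum()      # fraction of variance per component
    return scores, Vt[:n_components], explained[:n_components]


# Toy input: a 60-point random walk standing in for Cα coordinates (hypothetical data).
rng = np.random.default_rng(0)
chain = np.cumsum(rng.normal(size=(60, 3)), axis=0)

segments = sliding_segments(chain, 10)
features = np.array([distance_features(s) for s in segments])
scores, components, explained = pca(features)
```

Because the distance vector has n(n-1)/2 entries for a segment of n residues, a sketch like this has to be run separately for each segment length; the per-length score distributions can then be compared.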
Our statistical analyses of segment conformation and length revealed two critical transitions in the conformational distribution, observed for segments derived from all four structural classes. These dual transitions delimit an intermediate phase between short segments and domains. Consequently, the protein segment universes fall into three categories. i) Short segments (10–22 residues) showed a distribution with a high frequency of secondary structure clusters. ii) Medium segments (23–26 residues) showed a distribution corresponding to an intermediate state between the transitions. iii) Long segments (27–50 residues) showed a distribution converging on one huge cluster containing compact conformations with a smaller radius of gyration. This distribution reflects the organization of protein structures and the origin of protein domains. Three major conformational components (radius of gyration, structural symmetry with respect to the N-terminal and C-terminal halves, and single-turn/two-turn structure) describe most of the segment universes well. Furthermore, we identified several conformational components that were unique to each structural class. These characteristics suggest that protein segment conformation is described by a combination of the three common structural variables, which make large contributions, and class-specific structural variables, which make small contributions.
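As a small illustration of two quantities named above, the following sketch computes a radius of gyration from Cα coordinates and assigns a segment to one of the three length categories. It treats every Cα atom as having equal mass, which is an assumption, and the function names and toy coordinates are hypothetical; only the residue ranges come from the text.

```python
import math


def radius_of_gyration(coords):
    """Root-mean-square distance of the points from their centroid (equal masses assumed)."""
    n = len(coords)
    center = [sum(p[k] for p in coords) / n for k in range(3)]
    mean_sq = sum(math.dist(p, center) ** 2 for p in coords) / n
    return math.sqrt(mean_sq)


def length_class(n_residues):
    """Map a segment length to the short/medium/long categories given in the text."""
    if 10 <= n_residues <= 22:
        return "short"
    if 23 <= n_residues <= 26:
        return "medium"
    if 27 <= n_residues <= 50:
        return "long"
    raise ValueError("segment length outside the 10-50 residue range studied")


# Toy input: a fully extended 10-point chain with 3.8 Å spacing (hypothetical data).
extended = [(3.8 * i, 0.0, 0.0) for i in range(10)]
rg_extended = radius_of_gyration(extended)
category = length_class(len(extended))
```

A compact conformation of the same length would give a smaller value than the extended chain, which is the sense in which the long-segment cluster is described as having a smaller radius of gyration.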
The present analyses of four protein structural classes show that three major components play a universal role as descriptors of segment conformation. The changes in distribution with segment length, viewed through these three key components, suggest both that the prediction strategies used in recent de novo structure-prediction methods are adequate and that they can be improved further.