Predicting gene ontology functions from protein's regional surface structures
- Equal contributors
1 Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100080, China
2 Graduate University of Chinese Academy of Sciences, Beijing 100049, China
3 Department of Electronics, Information and Communication Engineering, Osaka Sangyo University, Osaka 574-8530, Japan
4 Institute of Systems Biology, Shanghai University, Shanghai 200444, China
5 ERATO Aihara Complexity Modelling Project, JST, Tokyo 151-0064, Japan
6 Institute of Industrial Science, The University of Tokyo, Tokyo 153-8505, Japan
BMC Bioinformatics 2007, 8:475 doi:10.1186/1471-2105-8-475
Published: 11 December 2007
Annotation of protein functions is an important task in the post-genomic era. Most early approaches to this task exploit only sequence or global structure information. However, protein surfaces are believed to be crucial to protein function because they are the main interfaces that facilitate biological interactions. Recently, several databases of structural surface features, such as pockets and cavities, have been constructed, providing comprehensive libraries of identified surface structures. For example, CASTp provides identification and measurements of surface-accessible pockets as well as inaccessible interior cavities.
A novel method was proposed to predict the Gene Ontology (GO) functions of proteins from a pocket similarity network, which is constructed according to the structural similarities of pockets. Statistics of the network were presented to explore the relationship between similar pockets and the GO functions of proteins. Cross-validation experiments were conducted to evaluate the performance of the proposed method. Results and code are available at: http://zhangroup.aporc.org/bioinfo/PSN/
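As a rough illustration of the idea of predicting GO functions from a pocket similarity network, the following Python sketch links pockets whose similarity score reaches a threshold, ranks GO terms for a pocket by a simple vote among its annotated neighbours, and estimates accuracy by leave-one-out cross-validation. It is not the authors' implementation: the similarity scores, the threshold, the voting rule, the pocket identifiers (p1 to p5) and the function names are all assumptions made for this toy example, and the GO terms are used only as sample labels.

```python
from collections import Counter, defaultdict


def build_network(similarities, threshold):
    """Link two pockets when their similarity score reaches the threshold."""
    network = defaultdict(set)
    for (a, b), score in similarities.items():
        if score >= threshold:
            network[a].add(b)
            network[b].add(a)
    return network


def predict_go(pocket, network, annotations, top_k=1):
    """Rank GO terms by the number of annotated neighbours carrying them."""
    votes = Counter()
    for neighbour in network.get(pocket, ()):
        votes.update(annotations.get(neighbour, ()))
    # Sort by descending vote count, breaking ties by term ID for determinism.
    ranked = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_k]]


def leave_one_out_accuracy(network, annotations, top_k=1):
    """Hide each pocket's annotation in turn and try to recover it."""
    hits = 0
    for pocket, true_terms in annotations.items():
        held_out = {p: t for p, t in annotations.items() if p != pocket}
        predicted = predict_go(pocket, network, held_out, top_k)
        # A pocket with no prediction counts as a miss.
        if any(term in true_terms for term in predicted):
            hits += 1
    return hits / len(annotations)


# Toy data (hypothetical): pairwise pocket similarity scores in [0, 1].
similarities = {
    ("p1", "p2"): 0.90,
    ("p2", "p3"): 0.80,
    ("p1", "p3"): 0.85,
    ("p3", "p4"): 0.20,
    ("p4", "p5"): 0.95,
}

# Toy data (hypothetical): GO terms assigned to the protein owning each pocket.
annotations = {
    "p1": {"GO:0003824"},
    "p2": {"GO:0003824"},
    "p3": {"GO:0003824", "GO:0005488"},
    "p4": {"GO:0005488"},
    "p5": {"GO:0005488"},
}

network = build_network(similarities, threshold=0.5)
accuracy = leave_one_out_accuracy(network, annotations)
print(f"leave-one-out accuracy: {accuracy:.2f}")
```

In this sketch the threshold controls how dense the network is: a lower value adds weaker links and more votes per pocket, while a higher value leaves some pockets without neighbours and therefore without a prediction.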
The computational results demonstrate that the proposed method based on the pocket similarity network is effective and efficient for predicting GO functions of proteins, in terms of both computational complexity and prediction accuracy. The proposed method revealed a strong relationship between small surface patterns (or pockets) and GO functions, which can be further used to identify active sites or functional motifs. The high-quality performance of the prediction method, together with the network statistics, also indicates that pockets play essential roles in biological interactions and GO functions. Moreover, beyond pockets, the proposed network framework can also be adapted to other spatial surface patterns of proteins to predict protein functions.