This article is part of the supplement: Proceedings of the Great Lakes Bioinformatics Conference 2012
Evaluation of function predictions by PFP, ESG, and PSI-BLAST for moonlighting proteins
1 Department of Computer Science, Purdue University, 305 N. University Street, West Lafayette, Indiana 47907, USA
2 Department of Biological Sciences, Purdue University, 915 W. State Street, West Lafayette, Indiana 47907, USA
3 EA3900-BIOPI Biologie des Plantes et Innovation, Université de Picardie Jules Verne, 33 Rue St Leu, 80039 Amiens, France
BMC Proceedings 2012, 6(Suppl 7):S5 doi:10.1186/1753-6561-6-S7-S5
Published: 13 November 2012
Advances in function prediction algorithms are enabling large-scale computational annotation of newly sequenced genomes. As the number of functionally well-characterized proteins has grown, it has been observed that many proteins are involved in more than one function. These proteins, known as moonlighting proteins, show varied functional behavior depending on factors such as cell type, localization within the cell, oligomerization, and multiple binding sites. This functional diversity may have a significant impact on traditional sequence-based function prediction methods. Here we investigate how well the diverse functions of moonlighting proteins can be predicted by existing function prediction methods.
We analyzed the performance of three major sequence-based function prediction methods, PSI-BLAST, Protein Function Prediction (PFP), and Extended Similarity Group (ESG), in predicting the diverse functions of moonlighting proteins. In predicting the discrete functions of a set of 19 experimentally identified moonlighting proteins, PFP showed the highest overall recall of the three methods. Although ESG showed the highest precision, its recall was lower than that of PSI-BLAST. The recall of PSI-BLAST improved greatly when the BLOSUM45 substitution matrix was used instead of BLOSUM62.
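To make the precision and recall comparison concrete, the following is a minimal sketch of how the two measures could be computed for a single protein when predicted and annotated functions are treated as plain sets of terms. It assumes exact term matching, which is a simplification and not necessarily the scoring procedure used in this study. The function name `precision_recall` and all term labels are hypothetical and do not come from the evaluated data set.

```python
def precision_recall(predicted, annotated):
    """Return (precision, recall) for one protein.

    Assumption: a prediction counts as correct only if the predicted term
    exactly matches an annotated term. No ontology-aware matching is done.
    """
    predicted, annotated = set(predicted), set(annotated)
    true_positives = len(predicted & annotated)
    # Guard against empty sets so the ratios are always defined.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(annotated) if annotated else 0.0
    return precision, recall


# Hypothetical annotation of a moonlighting protein with two functions,
# each described by two terms (placeholder labels, not real GO terms).
annotated = {"primary_A", "primary_B", "moonlight_C", "moonlight_D"}

# A permissive method that also draws on weakly similar sequences:
# more terms recovered, but more false positives as well.
broad_prediction = {"primary_A", "primary_B", "moonlight_C",
                    "other_X", "other_Y", "other_Z"}

# A conservative method: few predictions, all correct, but the
# moonlighting function is missed entirely.
strict_prediction = {"primary_A", "primary_B"}

for name, prediction in [("broad", broad_prediction),
                         ("strict", strict_prediction)]:
    p, r = precision_recall(prediction, annotated)
    print(f"{name}: precision={p:.2f} recall={r:.2f}")
```

In this toy case the conservative predictor reaches perfect precision while recovering only half of the annotated terms, whereas the permissive predictor trades precision for higher recall. This is the same kind of trade-off described above, although the numbers here are purely illustrative.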
We have analyzed the performance of PFP, ESG, and PSI-BLAST in predicting the functional diversity of moonlighting proteins. PFP shows better overall performance than PSI-BLAST and ESG in predicting diverse moonlighting functions. The recall of PSI-BLAST improved greatly when BLOSUM45 was used instead of BLOSUM62. This analysis indicates that considering weakly similar sequences enhances the performance of sequence-based automated function prediction (AFP) methods in predicting the functional diversity of moonlighting proteins. The current study should also motivate the development of novel computational frameworks for the automatic identification of such proteins.