This article is part of the supplement: Ninth International Conference on Bioinformatics (InCoB2010): Computational Biology
Proceedings
Predicting RNA-binding residues from evolutionary information and sequence conservation
- Equal contributors
Author affiliations
1 Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan, Republic of China
2 Department of Engineering Science and Oceanic Engineering, National Taiwan University, Taipei, Taiwan, Republic of China
Citation and License
BMC Genomics 2010, 11(Suppl 4):S2 doi:10.1186/1471-2164-11-S4-S2
Published: 2 December 2010
Abstract
Background
RNA-binding proteins (RBPs) play crucial roles in the post-transcriptional control of RNA. RBPs efficiently recognize specific RNA sequences after the RNA has been transcribed from DNA. To satisfy diverse functional requirements, RBPs are composed of multiple RNA-binding domains (RBDs) presented in various structural arrangements. The ability to computationally predict the RNA-binding residues in an RNA-binding protein can help biologists choose target residues for site-directed mutagenesis in wet-lab experiments.
Results
The proposed prediction framework, named “ProteRNA”, combines an SVM-based classifier with conserved-residue discovery by WildSpan to identify the residues that interact with RNA in an RNA-binding protein. Although these conserved residues may be either functionally or structurally conserved, they provide clues about which residues in a protein sequence are important. On the independent testing dataset, ProteRNA achieved an overall accuracy of 89.78%, a Matthews correlation coefficient (MCC) of 0.2628, an F-score of 0.3075, and an F0.5-score of 0.3546.
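The four metrics reported above all derive from a per-residue confusion matrix (binding versus non-binding). The sketch below shows their standard definitions and is not the authors' evaluation code. The function name `binary_metrics` and the counts in the usage example are hypothetical and are not taken from the ProteRNA experiments. A high accuracy combined with a modest MCC is typical when one class is rare, which is presumably the case here since binding residues are usually a small fraction of a protein. The F0.5-score weights precision more heavily than recall.

```python
import math


def binary_metrics(tp, fp, tn, fn, beta=1.0):
    """Return (accuracy, MCC, F-beta) for a binary confusion matrix.

    tp, fp, tn, fn are counts of true positives, false positives,
    true negatives and false negatives (here, per-residue predictions).
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total

    # Matthews correlation coefficient; defined as 0 when any marginal is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0

    # F-beta: beta < 1 favours precision, beta > 1 favours recall.
    b2 = beta * beta
    fb_denom = b2 * precision + recall
    f_beta = (1 + b2) * precision * recall / fb_denom if fb_denom else 0.0
    return accuracy, mcc, f_beta


# Hypothetical, imbalanced counts (illustration only, not the paper's data).
acc, mcc, f1 = binary_metrics(tp=30, fp=40, tn=900, fn=30)
_, _, f05 = binary_metrics(tp=30, fp=40, tn=900, fn=30, beta=0.5)
print(f"accuracy={acc:.4f} MCC={mcc:.4f} F1={f1:.4f} F0.5={f05:.4f}")
```

With counts like these, accuracy is dominated by the many correctly rejected non-binding residues, whereas MCC and the F-scores reflect how well the minority (binding) class is recovered.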
Conclusions
This article presents the design of a sequence-based predictor that aims to identify the RNA-binding residues in an RNA-binding protein by combining machine learning and pattern mining approaches. RNA-binding proteins perform diverse functions when interacting with different categories of RNA because they are composed of multiple copies of RNA-binding domains presented in various structural arrangements, which expands their functional repertoire. Furthermore, predicting the RNA-binding residues in an RNA-binding protein can help biologists choose target residues for site-directed mutagenesis in wet-lab experiments.


