Email updates

Keep up to date with the latest news and content from BMC Bioinformatics and BioMed Central.

This article is part of the supplement: Proceedings of the 21st International Conference on Genome Informatics (GIW2010)

Open Access Research

MiRenSVM: towards better prediction of microRNA precursors using an ensemble SVM classifier with multi-loop features

Jiandong Ding1, Shuigeng Zhou1* and Jihong Guan2*

Author Affiliations

1 Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, Shanghai 200433, China

2 Department of Computer Science and Technology, Tongji University, Shanghai 201804, China

For all author emails, please log on.

BMC Bioinformatics 2010, 11(Suppl 11):S11  doi:10.1186/1471-2105-11-S11-S11

Published: 14 December 2010

Abstract

Background

MicroRNAs (simply miRNAs) are derived from larger hairpin RNA precursors and play essential regular roles in both animals and plants. A number of computational methods for miRNA genes finding have been proposed in the past decade, yet the problem is far from being tackled, especially when considering the imbalance issue of known miRNAs and unidentified miRNAs, and the pre-miRNAs with multi-loops or higher minimum free energy (MFE). This paper presents a new computational approach, miRenSVM, for finding miRNA genes. Aiming at better prediction performance, an ensemble support vector machine (SVM) classifier is established to deal with the imbalance issue, and multi-loop features are included for identifying those pre-miRNAs with multi-loops.

Results

We collected a representative dataset, which contains 697 real miRNA precursors identified by experimental procedure and other computational methods, and 5428 pseudo ones from several datasets. Experiments showed that our miRenSVM achieved a 96.5% specificity and a 93.05% sensitivity on the dataset. Compared with the state-of-the-art approaches, miRenSVM obtained better prediction results. We also applied our method to predict 14 Homo sapiens pre-miRNAs and 13 Anopheles gambiae pre-miRNAs that first appeared in miRBase13.0, MiRenSVM got a 100% prediction rate. Furthermore, performance evaluation was conducted over 27 additional species in miRBase13.0, and 92.84% (4863/5238) animal pre-miRNAs were correctly identified by miRenSVM.

Conclusion

MiRenSVM is an ensemble support vector machine (SVM) classification system for better detecting miRNA genes, especially those with multi-loop secondary structure.