BMC Bioinformatics Volume 6
Methodology article

Classification of real and pseudo microRNA precursors using local structure-sequence features and support vector machine

Chenghai Xue*2,1, Fei Li*1, Tao He1, Guo-Ping Liu2,3, Yanda Li1 and Xuegong Zhang1

1 MOE Key Laboratory of Bioinformatics / Department of Automation, Tsinghua University, Beijing 100084, China
2 Laboratory of Complex Systems and Intelligence Science, Institute of Automation, Chinese Academy of Sciences, Beijing 100080, China
3 School of Electronics, University of Glamorgan, Pontypridd CF37 1DL, UK

* Contributed equally
BMC Bioinformatics 2005, 6:310 doi:10.1186/1471-2105-6-310
Published: 29 December 2005
Abstract
Background
MicroRNAs (miRNAs) are a group of short (~22 nt) non-coding RNAs that play important regulatory roles. MiRNA precursors (pre-miRNAs) are characterized by their hairpin structures. However, a large number of similar hairpins can be folded in many genomes. Almost all current methods for computational prediction of miRNAs use comparative genomic approaches to identify putative pre-miRNAs from candidate hairpins. An ab initio method for distinguishing pre-miRNAs from sequence segments with pre-miRNA-like hairpin structures is lacking. Being able to classify real vs. pseudo pre-miRNAs is important both for understanding the nature of miRNAs and for developing ab initio prediction methods that can discover new miRNAs without known homology.
Results
A set of novel features of local contiguous structure-sequence information is proposed for distinguishing the hairpins of real pre-miRNAs from those of pseudo pre-miRNAs. A support vector machine (SVM) is applied to these features to classify real vs. pseudo pre-miRNAs, achieving about 90% accuracy on human data. Remarkably, the SVM classifier built on human data can correctly identify up to 90% of the pre-miRNAs from other species, including plants and viruses, without utilizing any comparative genomics information.
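To make the idea of local contiguous structure-sequence features concrete, the following is a minimal sketch, not the authors' implementation. It assumes the hairpin's secondary structure is given in dot-bracket notation, and that each feature pairs the middle nucleotide of three contiguous positions with their paired/unpaired pattern, which gives 4 x 8 = 32 features. The function name `triplet_features`, the normalization by total count, and the use of every position in the hairpin (rather than a selected region) are illustrative assumptions that the abstract does not specify.

```python
from collections import Counter
from itertools import product

NUCLEOTIDES = "ACGU"
# Assumption: "(" and ")" are collapsed into one "paired" symbol; "." is unpaired.
STRUCT_STATES = "(."

# 4 middle nucleotides x 2^3 pairing patterns = 32 feature names, e.g. "A(((".
FEATURE_NAMES = [
    nt + "".join(pattern)
    for nt in NUCLEOTIDES
    for pattern in product(STRUCT_STATES, repeat=3)
]


def triplet_features(sequence, structure):
    """Return normalized frequencies of the 32 structure-sequence triplets.

    sequence  -- RNA (or DNA) string over A, C, G, U/T
    structure -- dot-bracket string of the same length
    """
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have equal length")
    if set(structure) - set("()."):
        raise ValueError("structure must use only '(', ')' and '.'")

    seq = sequence.upper().replace("T", "U")
    paired = structure.replace(")", "(")  # direction of pairing is ignored

    counts = Counter()
    # Slide a window of three contiguous positions along the hairpin.
    for i in range(1, len(seq) - 1):
        if seq[i] in NUCLEOTIDES:  # skip ambiguous bases such as N
            counts[seq[i] + paired[i - 1:i + 2]] += 1

    total = sum(counts.values())
    return [counts[name] / total if total else 0.0 for name in FEATURE_NAMES]


if __name__ == "__main__":
    # Toy hairpin, made up for illustration only.
    vector = triplet_features("GGGAAACCC", "(((...)))")
    for name, value in zip(FEATURE_NAMES, vector):
        if value:
            print(name, round(value, 3))
```

Vectors of this kind, computed for real and pseudo hairpins, could then be passed to any standard SVM implementation for training and classification; the choice of kernel and parameters is left open here because the abstract does not state them.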
Conclusion
The local structure-sequence features reflect discriminative and conserved characteristics of miRNAs, and the successful ab initio classification of real and pseudo pre-miRNAs opens a new approach for discovering new miRNAs.