This article is part of the supplement: Selected papers from the Seventh Asia-Pacific Bioinformatics Conference (APBC 2009)

Open Access Research

HHMMiR: efficient de novo prediction of microRNAs using hierarchical hidden Markov models

Sabah Kadri1*, Veronica Hinman2 and Panayiotis V Benos3*

Author Affiliations

1 Lane Center for Computational Biology, Carnegie Mellon University, Pittsburgh, PA 15213, USA

2 Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA

3 Department of Computational Biology, University of Pittsburgh, Pittsburgh, PA 15260, USA

BMC Bioinformatics 2009, 10(Suppl 1):S35  doi:10.1186/1471-2105-10-S1-S35

Published: 30 January 2009

Abstract

Background

MicroRNAs (miRNAs) are small non-coding single-stranded RNAs (20–23 nts) that act as post-transcriptional and translational regulators of gene expression. Although they were initially overlooked, their role in many important biological processes, such as development, cell differentiation, and cancer, has since been established. Despite their biological significance, the identification of miRNA genes in newly sequenced organisms still relies, to a large degree, on evolutionary conservation, which is not always available.

Results

We have developed HHMMiR, a novel approach for de novo miRNA hairpin prediction in the absence of evolutionary conservation. Our method implements a Hierarchical Hidden Markov Model (HHMM) that utilizes region-based structural as well as sequence information of miRNA precursors. We first established a template for the structure of a typical miRNA hairpin by summarizing data from publicly available databases. We then used this template to develop the HHMM topology.

Conclusion

Our algorithm achieved an average sensitivity of 84% and specificity of 88% on 10-fold cross-validation of human miRNA precursor data. We also show that this model, trained on human sequences, works well on hairpins from other vertebrate as well as invertebrate species. Furthermore, the human-trained model correctly classified ~97% of plant miRNA precursors. The success of this approach in such a diverse set of species indicates that sequence conservation is not necessary for miRNA prediction. This may lead to efficient prediction of miRNA genes in virtually any organism.
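
For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how average sensitivity and specificity are conventionally computed across cross-validation folds. It is not the authors' evaluation code; the function names and the per-fold counts in the example are hypothetical and chosen only for illustration.

```python
from statistics import mean


def sensitivity(tp, fn):
    """Fraction of true miRNA hairpins that were predicted as miRNA."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of non-miRNA hairpins that were predicted as non-miRNA."""
    return tn / (tn + fp)


def average_over_folds(folds):
    """Average both metrics over a list of per-fold confusion counts.

    Each fold is a dict with keys "tp", "fn", "tn", "fp".
    """
    sens = mean(sensitivity(f["tp"], f["fn"]) for f in folds)
    spec = mean(specificity(f["tn"], f["fp"]) for f in folds)
    return sens, spec


# Hypothetical counts for two folds (NOT data from the paper).
example_folds = [
    {"tp": 8, "fn": 2, "tn": 9, "fp": 1},
    {"tp": 9, "fn": 1, "tn": 8, "fp": 2},
]
avg_sens, avg_spec = average_over_folds(example_folds)
print(f"average sensitivity: {avg_sens:.2f}, average specificity: {avg_spec:.2f}")
```

In a 10-fold setting the list would hold ten such entries, one per held-out fold.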