MiRFinder: an improved approach and software implementation for genome-wide fast microRNA precursor scans
1 Key Lab of Agricultural Animal Genetics, Breeding, and Reproduction of Ministry of Education & Key Laboratory of Swine Genetics and Breeding of Ministry of Agriculture, Huazhong Agricultural University, Wuhan, 430070, P R China
2 Department of Animal Science and Center for Integrated Animal Genomics, Iowa State University, Ames, IA, 50011, USA
3 Department of Gene and Cell Engineering, Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing 100094, P R China
BMC Bioinformatics 2007, 8:341 doi:10.1186/1471-2105-8-341
Published: 17 September 2007
MicroRNAs (miRNAs) are recognized as one of the most important families of non-coding RNAs, serving as sequence-specific post-transcriptional regulators of gene expression. Identifying miRNAs is essential for understanding the mechanisms of post-transcriptional regulation. Hundreds of miRNAs have been identified in several species by direct cloning and computational approaches. However, many miRNAs remain unidentified because either distinguishing sequence features or robust algorithms to detect them efficiently are lacking.
We have evaluated features valuable for pre-miRNA prediction, such as the local secondary structure differences between the stem regions of miRNA and non-miRNA hairpins. We have also established correlations between different types of mutations and the secondary structures of pre-miRNAs. Using these features, together with several improvements on current pre-miRNA prediction methods, we applied a support vector machine (SVM), a machine learning method, to build a high-throughput pre-miRNA prediction tool with good performance, called MiRFinder. The tool was designed for genome-wide, pairwise sequences from two related species. The method built into the tool consists of two major steps: 1) a genome-wide search for hairpin candidates and 2) exclusion of the non-robust structures based on analysis of 18 parameters by the SVM. Results from applying the tool to chicken/human and D. melanogaster/D. pseudoobscura pairwise genome alignments showed that it can be used for genome-wide pre-miRNA predictions.
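To make the first of the two steps concrete, the following is a minimal illustrative sketch of a sliding-window hairpin candidate scan. It is not MiRFinder's actual algorithm: it assumes a fixed window, a fixed central loop, and a stem without bulges or internal loops, and the names `count_stem_pairs` and `scan_hairpins` as well as all thresholds are hypothetical choices made for illustration. The second step (SVM filtering on 18 parameters) is not shown.

```python
# Illustrative sketch only; all names and thresholds are assumptions,
# not taken from MiRFinder.

# Watson-Crick pairs plus the G-U wobble pair allowed in RNA stems.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def count_stem_pairs(window, loop_len):
    """Count paired positions when the window is folded as a simple hairpin
    with a central loop of loop_len bases and no bulges."""
    arm = (len(window) - loop_len) // 2
    five_prime = window[:arm]
    # Reverse the 3' arm so position i faces position i of the 5' arm.
    three_prime = window[len(window) - arm:][::-1]
    return sum((a, b) in PAIRS for a, b in zip(five_prime, three_prime))


def scan_hairpins(seq, window=20, loop_len=4, min_pairs=7):
    """Slide a fixed window along seq and report (start, pairs) for every
    window whose simple hairpin fold has at least min_pairs paired bases."""
    seq = seq.upper().replace("T", "U")  # accept DNA or RNA input
    hits = []
    for start in range(len(seq) - window + 1):
        pairs = count_stem_pairs(seq[start:start + window], loop_len)
        if pairs >= min_pairs:
            hits.append((start, pairs))
    return hits


if __name__ == "__main__":
    # 8-base 5' arm, 4-base loop, 8-base complementary 3' arm.
    print(scan_hairpins("GGGGAAAAUUUUUUUUCCCC"))
```

A genome-scale scan would use a proper folding step and much longer windows; the sketch only shows the idea of keeping windows whose two arms are largely complementary before any classifier is applied.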