This article is part of the supplement: Seventh International Conference on Bioinformatics (InCoB2008)
MiRTif: a support vector machine-based microRNA target interaction filter
1 Institute of Molecular and Cell Biology, 61 Biopolis Drive, 138673, Singapore
2 Center for Systems and Synthetic Biology, National Yang-Ming University, Taipei, 11221, Taiwan
BMC Bioinformatics 2008, 9(Suppl 12):S4 doi:10.1186/1471-2105-9-S12-S4Published: 12 December 2008
MicroRNAs (miRNAs) are a set of small non-coding RNAs serving as important negative gene regulators. In animals, miRNAs turn down protein translation by binding to the 3' UTR regions of target genes with imperfect complementary pairing. The identification of microRNA targets has become one of the major challenges of miRNA research. Bioinformatics investigations on miRNA target have resulted in a number of target prediction tools. Although these tools are capable of predicting hundreds of targets for a given miRNA, many of them suffer from high false positive rates, indicating the need for a post-processing filter for the predicted targets. Once trained with experimentally validated true and false targets, machine learning methods appear to be ideal approaches to distinguish the true targets from the false ones.
We present a miRNA target filtering system named MiRTif (miRNA:target interaction filter). The system is a support vector machine (SVM) classifier trained with 195 positive and 38 negative miRNA:target interaction pairs, all experimentally validated. Each miRNA:target interaction pair is divided into a seed and a non-seed region. The encoded feature vector contains various k-gram frequencies in the seed, the non-seed and the entire regions. Informative features are selected based on their discriminating abilities. Prediction accuracies are assessed using 10-fold cross-validation experiments. Our system achieves AUC (area under the ROC curve) of 0.86, sensitivity of 83.59%, and specificity of 73.68%. More importantly, the system correctly predicts majority of the false positive miRNA:target interactions (28 out of 38). The possibility of over-fitting due to the relatively small negative sample set has also been investigated using a set of non-validated and randomly selected targets (from miRBase).
MiRTif is designed as a post-processing filter that takes miRNA:target interactions predicted by other target prediction softwares such as TargetScanS, PicTar and miRanda as inputs, and determines how likely the given interaction is a real or a pseudo one. MiRTif can be accessed from http://bsal.ym.edu.tw/mirtif webcite.