This article is part of the supplement: Ninth International Conference on Bioinformatics (InCoB2010): Computational Biology
Discovery and characterization of medaka miRNA genes by next generation sequencing platform
1 Institute of Biomedical Informatics, National Yang-Ming University, Taipei, Taiwan
2 Bioinformatics Program, Taiwan International Graduate Program, Academia Sinica, Taipei, Taiwan
3 Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
4 Institute of Information Sciences, Academia Sinica, Taipei, Taiwan
5 Information Sciences Institute, University of Southern California, Marina del Rey, CA 90292, USA
6 Institute of Cellular and Organismic Biology, Academia Sinica, Taiwan
BMC Genomics 2010, 11(Suppl 4):S8 doi:10.1186/1471-2164-11-S4-S8
Published: 2 December 2010
MicroRNAs (miRNAs) are endogenous non-protein-coding RNA genes that exist in a wide variety of organisms, including animals, plants, viruses and even unicellular organisms. Medaka (Oryzias latipes) is a useful model organism among vertebrate animals. However, medaka miRNAs have not yet been investigated systematically. It is therefore beneficial to conduct a genome-wide miRNA discovery study using next-generation sequencing (NGS) technology, which has emerged as a powerful tool for high-throughput analysis.
In this study, we used the ABI SOLiD platform to generate small RNA sequence reads from medaka tissues and then mapped these reads back to the medaka genome. The mapped genomic loci were treated as candidate miRNAs and further processed by a support vector machine (SVM) classifier. As a result, we identified 599 novel medaka pre-miRNAs, many of which were found to encode more than one isomiR. In addition, minor miRNAs (also called miRNA star) could be detected as sequencing depth improved. These quantifiable isomiRs and minor miRNAs enabled us to further characterize medaka miRNA genes in several respects. First, many medaka candidate pre-miRNAs are positioned close to each other, forming miRNA clusters, some of which are also conserved in other vertebrates. Second, during miRNA maturation, there is an arm selection preference for mature miRNAs within precursors. We observed differences in arm selection preference between our candidate pre-miRNAs and their orthologs, and classified these differences into three categories based on the distribution of NGS reads. Finally, we investigated the relationship between the conservation status and expression level of miRNA genes. We concluded that evolutionarily conserved miRNAs were usually the most abundant ones.
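As a rough illustration of two of the characterizations described above, the following Python sketch groups candidate pre-miRNA loci into clusters by genomic proximity and summarizes arm usage from read counts. It is not the pipeline used in this study: the `Locus` record, the function names, the 10 kb `max_gap` default and the simple 5' read fraction are all assumptions made for illustration, and the three arm-preference categories reported here are not reproduced.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Locus:
    """A candidate pre-miRNA locus (hypothetical record, for illustration only)."""
    name: str
    chrom: str
    strand: str
    start: int
    end: int


def cluster_loci(loci: List[Locus], max_gap: int = 10_000) -> List[List[Locus]]:
    """Group loci on the same chromosome and strand into clusters.

    A locus joins the current cluster when the gap between its start and the
    furthest end seen in that cluster is at most max_gap bases. The 10 kb
    default is an assumed threshold, not a value taken from the study.
    Only groups with two or more members are returned.
    """
    ordered = sorted(loci, key=lambda l: (l.chrom, l.strand, l.start))
    groups: List[List[Locus]] = []
    for locus in ordered:
        if groups:
            current = groups[-1]
            same_region = (current[-1].chrom == locus.chrom
                           and current[-1].strand == locus.strand)
            cluster_end = max(l.end for l in current)
            if same_region and locus.start - cluster_end <= max_gap:
                current.append(locus)
                continue
        groups.append([locus])
    return [g for g in groups if len(g) > 1]


def arm_preference(reads_5p: int, reads_3p: int) -> Optional[float]:
    """Return the fraction of reads mapped to the 5' arm, or None if no reads."""
    total = reads_5p + reads_3p
    if total == 0:
        return None
    return reads_5p / total


if __name__ == "__main__":
    # Toy coordinates invented for the example, not real medaka loci.
    example = [
        Locus("cand-1", "chr1", "+", 100, 180),
        Locus("cand-2", "chr1", "+", 5_000, 5_080),
        Locus("cand-3", "chr1", "+", 50_000, 50_080),
        Locus("cand-4", "chr2", "+", 100, 180),
    ]
    for group in cluster_loci(example):
        print([l.name for l in group])
    print(arm_preference(90, 10))
```

Comparing the 5' read fraction of a candidate with that of its ortholog is one simple way to flag a shift in arm preference. Any cut-offs used to call such a shift would need to be chosen and justified separately.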
Medaka is a widely used model animal in biomedical research, including studies of developmental biology. Identifying and characterizing medaka miRNA genes will benefit studies that use medaka as a model organism.