Comparative analysis of structured RNAs in S. cerevisiae indicates a multitude of different functions
1 Wilhelm-Schickard-Institut für Informatik, ZBIT-Center for Bioinformatics Tübingen, University of Tübingen, Sand 14, D-72076 Tübingen, Germany
2 Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics (IZBI), University of Leipzig, Härtelstraße 16-18, D-04107 Leipzig, Germany
3 EMBL Outstation Hinxton, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
4 Department of Theoretical Chemistry, University of Vienna, Währingerstraße 17, A-1090 Wien, Austria
5 Santa Fe Institute, 1399 Hyde Park Rd., Santa Fe, NM 87501, USA
BMC Biology 2007, 5:25 doi:10.1186/1741-7007-5-25
Published: 18 June 2007
Non-coding RNAs (ncRNAs) are an emerging focus for both computational analysis and experimental research, resulting in a growing number of novel, non-protein-coding transcripts whose functions are often unknown. Whole-genome screens in higher eukaryotes, for example, provided evidence for a surprisingly large number of ncRNAs. To supplement these searches, we performed a computational analysis of seven yeast species and searched for new ncRNAs and RNA motifs.
A comparative analysis of the genomes of seven yeast species yielded roughly 2800 genomic loci that showed the hallmarks of evolutionarily conserved RNA secondary structures. A total of 74% of these regions overlapped with annotated non-coding or coding genes in yeast. Coding sequences that carry predicted structured RNA elements belong to a limited number of groups with common functions, suggesting that these RNA elements are involved in post-transcriptional regulation and/or cellular localization. About 700 conserved RNA structures were found outside annotated coding sequences and known ncRNA genes. Many of these predicted elements overlapped with UTRs of particular classes of protein-coding genes. In addition, a number of RNA elements overlapped with previously characterized antisense transcripts. Transcription of about 120 predicted elements located in promoter regions and other, previously unannotated, intergenic regions was supported by tiling array experiments, ESTs, or SAGE data.
Our computational predictions strongly suggest that yeasts harbor a substantial pool of several hundred novel ncRNAs. In addition, we describe a large number of RNA structures in coding sequences and also within antisense transcripts that were previously characterized using tiling arrays.