A motif-based search in bacterial genomes identifies the ortholog of the small RNA Yfr1 in all lineages of cyanobacteria
1 University of Freiburg, Faculty of Biology, Experimental Bioinformatics, Schänzlestr. 1, D-79104 Freiburg, Germany
2 Humboldt University Berlin, Institute for Theoretical Biology, Invalidenstrasse 43, D-10115 Berlin, Germany
BMC Genomics 2007, 8:375 doi:10.1186/1471-2164-8-375
Published: 17 October 2007
Non-coding RNAs (ncRNAs) are regulators of gene expression in all domains of life. They control growth and differentiation, virulence, motility and various stress responses. Identifying ncRNAs can be tedious because this class of molecules is heterogeneous and orthologs often lack sequence similarity, even among closely related species. The small ncRNA Yfr1 has previously been found in the Prochlorococcus/Synechococcus group of marine cyanobacteria.
Here we show that screening available genome sequences for an RNA motif, followed by experimental analysis, successfully detects this RNA in all lineages of cyanobacteria. Yfr1 is an abundant ncRNA, 54 to 69 nt in size, that is ubiquitous in cyanobacteria except for two low light-adapted strains of Prochlorococcus, MIT 9211 and SS120, in which it must have been lost secondarily. Yfr1 consists of two predicted stem-loop elements separated by an unpaired sequence of 16–20 nucleotides containing the ultraconserved undecanucleotide 5'-ACUCCUCACAC-3'.
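As a rough illustration of the sequence part of such a screen, the following minimal Python sketch reports every occurrence of the conserved undecanucleotide on both strands of a genome sequence. It is not the motif search used in this study: it ignores the two stem-loop elements and the 16–20 nt spacing, and the names `MOTIF_DNA`, `reverse_complement` and `find_motif_hits` are hypothetical, chosen only for this example.

```python
import re

# DNA form of the conserved undecanucleotide 5'-ACUCCUCACAC-3' (U written as T).
MOTIF_DNA = "ACTCCTCACAC"

# Translation table for complementing unambiguous DNA bases; other symbols
# (for example N) are left unchanged.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_motif_hits(genome: str, motif: str = MOTIF_DNA):
    """Return sorted (start, strand) tuples for exact motif matches.

    `start` is the 0-based forward-strand coordinate of the leftmost base
    covered by the match, regardless of strand. This is an assumption of
    the sketch, not a convention taken from the paper.
    """
    genome = genome.upper().replace("U", "T")
    hits = []
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        # A zero-width lookahead lets overlapping matches be reported too.
        for match in re.finditer(f"(?={re.escape(motif)})", seq):
            start = match.start()
            if strand == "-":
                # Map the coordinate back onto the forward strand.
                start = len(genome) - (match.start() + len(motif))
            hits.append((start, strand))
    return sorted(hits)
```

Exact matching of 11 nucleotides alone would not reproduce the specificity reported here; a realistic screen would also test candidate hits for the flanking stem-loop structures.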
Starting with an ncRNA previously found only in a narrow group of cyanobacteria, we show here the highly specific and sensitive identification of its homologs within all lineages of cyanobacteria, whereas it was not detected within the genome sequences of E. coli and of 7 other eubacteria belonging to the alpha-proteobacteria, Chlorobiaceae and spirochaetes. Integrating RNA motif prediction into computational pipelines for the detection of ncRNAs in bacteria appears to be a promising step towards improving the quality of such predictions.