Open Access Research article

Identification of putative regulatory upstream ORFs in the yeast genome using heuristics and evolutionary conservation

Marija Cvijović1,3, Daniel Dalevi2, Elizabeth Bilsland1,4, Graham JL Kemp2 and Per Sunnerhagen1

1Department of Cell and Molecular Biology, Lundberg Laboratory, Göteborg University, PO Box 462 SE-405 30 Göteborg, Sweden

2Department of Computer Science and Engineering, Chalmers University of Technology, SE-412 96 Göteborg, Sweden

3Max-Planck Institute for Molecular Genetics, Ihnestraße 63, D-14195 Berlin, Germany

4Biochemistry Department, University of Cambridge, 80 Tennis Court Road, Cambridge CB2 1GA, UK

BMC Bioinformatics 2007, 8:295 doi:10.1186/1471-2105-8-295

Published: 8 August 2007

Abstract

Background

The translational efficiency of an mRNA can be modulated by upstream open reading frames (uORFs) present in certain genes. A uORF can attenuate translation of the main ORF by interfering with translational reinitiation at the main start codon. uORFs also occur by chance in the genome, in which case they do not have a regulatory role. Since the sequence determinants for functional uORFs are not understood, it is difficult to discriminate functional from spurious uORFs by sequence analysis.

Results

We have used comparative genomics to identify novel uORFs in yeast with a high likelihood of having a translational regulatory role. We examined uORFs, previously shown to play a role in regulation of translation in Saccharomyces cerevisiae, for evolutionary conservation within seven Saccharomyces species. Inspection of the set of conserved uORFs yielded the following three characteristics useful for discrimination of functional from spurious uORFs: a length between 4 and 6 codons, a distance from the start of the main ORF between 50 and 150 nucleotides, and finally a lack of overlap with, and clear separation from, neighbouring uORFs. These derived rules are inherently associated with uORFs with properties similar to the GCN4 locus, and may not detect most uORFs of other types. uORFs with high scores based on these rules showed a much higher evolutionary conservation than randomly selected uORFs. In a genome-wide scan in S. cerevisiae, we found 34 conserved uORFs from 32 genes that we predict to be functional; subsequent analysis showed the majority of these to be located within transcripts. A total of 252 genes were found containing conserved uORFs with properties indicative of a functional role; all but 7 are novel. Functional content analysis of this set identified an overrepresentation of genes involved in transcriptional control and development.
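The three heuristics described above (length, distance to the main start codon, and separation from neighbouring uORFs) can be illustrated with a minimal Python sketch. This is not the authors' implementation, and it does not cover the evolutionary conservation step. Several details are assumptions made only for illustration: the codon count includes the start codon but not the stop codon, the distance is measured from the end of the uORF stop codon to the main ATG, and the `min_gap` threshold for "clear separation" is a hypothetical value. The function and parameter names are likewise invented for this example.

```python
import re
from typing import List, NamedTuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


class UORF(NamedTuple):
    start: int     # 0-based index of the A of the uORF ATG in the upstream sequence
    end: int       # index one past the last base of the uORF stop codon
    codons: int    # codons including ATG, excluding the stop (assumption)
    distance: int  # nucleotides between uORF end and the main ATG (assumption)


def find_uorfs(upstream: str) -> List[UORF]:
    """Find every ATG...stop reading frame contained in `upstream`.

    `upstream` is the sequence immediately 5' of the main ORF start codon;
    the main ATG itself is not part of the string.
    """
    seq = upstream.upper()
    uorfs = []
    # The lookahead lets the regex report overlapping ATG occurrences as well.
    for match in re.finditer(r"(?=ATG)", seq):
        start = match.start()
        # Walk codon by codon until the first in-frame stop codon.
        for i in range(start + 3, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                end = i + 3
                uorfs.append(UORF(start, end, (i - start) // 3, len(seq) - end))
                break
    return uorfs


def passes_heuristics(u: UORF, all_uorfs: List[UORF],
                      min_codons: int = 4, max_codons: int = 6,
                      min_dist: int = 50, max_dist: int = 150,
                      min_gap: int = 10) -> bool:
    """Apply the length, distance, and separation rules to one uORF.

    `min_gap` is a hypothetical threshold; the abstract gives no number
    for what counts as clear separation.
    """
    if not (min_codons <= u.codons <= max_codons):
        return False
    if not (min_dist <= u.distance <= max_dist):
        return False
    for other in all_uorfs:
        if other.start == u.start:
            continue  # skip the uORF itself
        # A negative gap means the two uORFs overlap.
        gap = max(other.start - u.end, u.start - other.end)
        if gap < min_gap:
            return False
    return True


def select_candidates(upstream: str, **thresholds) -> List[UORF]:
    """Return the uORFs in `upstream` that satisfy all three heuristics."""
    uorfs = find_uorfs(upstream)
    return [u for u in uorfs if passes_heuristics(u, uorfs, **thresholds)]
```

For example, `select_candidates("C" * 20 + "ATG" + "GCT" * 4 + "TAA" + "C" * 100)` returns a single candidate of 5 codons located 100 nucleotides upstream of the main start codon. Candidates selected this way would still need to be checked for conservation across species, which this sketch does not attempt.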

Conclusion

Evolutionary conservation of uORFs in yeasts can be traced up to 100 million years of separation. The conserved uORFs have certain characteristics with respect to length, distance from each other and from the main start codon, and folding energy of the sequence. These newly found characteristics can be used to facilitate detection of other conserved uORFs.
