<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2105-8-295</ui>
   <ji>1471-2105</ji>
   <fm>
      <dochead>Research article</dochead>
      <bibl>
         <title>
            <p>Identification of putative regulatory upstream ORFs in the yeast genome using heuristics and evolutionary conservation</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Cvijovi&#263;</snm>
               <fnm>Marija</fnm>
               <insr iid="I1"/>
               <insr iid="I3"/>
               <email>cvijovic@molgen.mpg.de</email>
            </au>
            <au id="A2">
               <snm>Dalevi</snm>
               <fnm>Daniel</fnm>
               <insr iid="I2"/>
               <email>dalevi@cs.chalmers.se</email>
            </au>
            <au id="A3">
               <snm>Bilsland</snm>
               <fnm>Elizabeth</fnm>
               <insr iid="I1"/>
               <insr iid="I4"/>
               <email>eb343@cam.ac.uk</email>
            </au>
            <au id="A4">
               <snm>Kemp</snm>
               <mi>JL</mi>
               <fnm>Graham</fnm>
               <insr iid="I2"/>
               <email>kemp@cs.chalmers.se</email>
            </au>
            <au id="A5" ca="yes">
               <snm>Sunnerhagen</snm>
               <fnm>Per</fnm>
               <insr iid="I1"/>
               <email>per.sunnerhagen@cmb.gu.se</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Department of Cell and Molecular Biology, Lundberg Laboratory, G&#246;teborg University, PO Box 462 SE-405 30 G&#246;teborg, Sweden</p>
            </ins>
            <ins id="I2">
               <p>Department of Computer Science and Engineering, Chalmers University of Technology, SE-412 96 G&#246;teborg, Sweden</p>
            </ins>
            <ins id="I3">
               <p>Max-Planck Institute for Molecular Genetics, Ihnestra&#223;e 63, D-14195 Berlin, Germany</p>
            </ins>
            <ins id="I4">
               <p>Biochemistry Department, University of Cambridge, 80 Tennis Court Road, Cambridge CB2 1GA, UK</p>
            </ins>
         </insg>
         <source>BMC Bioinformatics</source>
         <issn>1471-2105</issn>
         <pubdate>2007</pubdate>
         <volume>8</volume>
         <issue>1</issue>
         <fpage>295</fpage>
         <url>http://www.biomedcentral.com/1471-2105/8/295</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">17686169</pubid>
               <pubid idtype="doi">10.1186/1471-2105-8-295</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>08</day>
               <month>6</month>
               <year>2007</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>08</day>
               <month>8</month>
               <year>2007</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>08</day>
               <month>8</month>
               <year>2007</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2007</year>
         <collab>Cvijovi&#263; et al; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>The translational efficiency of an mRNA can be modulated by upstream open reading frames (uORFs) present in certain genes. A uORF can attenuate translation of the main ORF by interfering with translational reinitiation at the main start codon. uORFs also occur by chance in the genome, in which case they do not have a regulatory role. Since the sequence determinants for functional uORFs are not understood, it is difficult to discriminate functional from spurious uORFs by sequence analysis.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>We have used comparative genomics to identify novel uORFs in yeast with a high likelihood of having a translational regulatory role. We examined uORFs, previously shown to play a role in regulation of translation in <it>Saccharomyces cerevisiae</it>, for evolutionary conservation within seven <it>Saccharomyces </it>species. Inspection of the set of conserved uORFs yielded the following three characteristics useful for discrimination of functional from spurious uORFs: a length between 4 and 6 codons, a distance from the start of the main ORF between 50 and 150 nucleotides, and finally a lack of overlap with, and clear separation from, neighbouring uORFs. These derived rules are inherently associated with uORFs with properties similar to the <it>GCN4 </it>locus, and may not detect most uORFs of other types. uORFs with high scores based on these rules showed a much higher evolutionary conservation than randomly selected uORFs. In a genome-wide scan in <it>S. cerevisiae</it>, we found 34 conserved uORFs from 32 genes that we predict to be functional; subsequent analysis showed the majority of these to be located within transcripts. A total of 252 genes were found containing conserved uORFs with properties indicative of a functional role; all but 7 are novel. Functional content analysis of this set identified an overrepresentation of genes involved in transcriptional control and development.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>Evolutionary conservation of uORFs in yeasts can be traced up to 100 million years of separation. The conserved uORFs have certain characteristics with respect to length, distance from each other and from the main start codon, and folding energy of the sequence. These newly found characteristics can be used to facilitate detection of other conserved uORFs.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="bmc" subtype="user_supplied_xml" id="endnote"/>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>The expression of protein-coding genes in eukaryotes is regulated at several levels even after the transcript has been formed. Translation into protein requires assembly of ribosomes with initiation factors on the mRNA in the 5'-untranslated region (5'-UTR) near the initiation codon. After completion of a translation round, at the stop codon, termination factors cause the ribosome to dissociate and fall off the template. Scanning of the mRNA by the ribosome from its 5' end is seen as the major mechanism for locating the start codon of the main ORF <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. In some genes, one or more ORFs are present in the 5'-UTR. Such uORFs can negatively regulate translation of the main ORF by interfering with reassembly of the initiation complex at its start codon. Conceptually, this could occur through several mechanisms (for review, see <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr></abbrgrp>). The ribosome could remain bound to the mRNA downstream of the uORF, blocking further rounds of translation. In at least one case in yeast, <it>CPA1</it>, it has been convincingly shown that missense mutations at internal positions in the uORF abolish its function, implying that the uORF-encoded peptide is important for the effect on translation <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>. The working model proposes that the newly synthesised peptide blocks progression of the ribosome. There is recent evidence that such stalling induces the nonsense-mediated mRNA decay (NMD) pathway <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>. Yeast <it>GCN4 </it>is the best-investigated case of translational control through uORFs; in this case, however, the encoded peptide is not thought to play a functional role <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>. <it>GCN4 </it>translation is controlled by four uORFs.
Reinitiation downstream of uORF1 occurs at different distances from its stop codon depending on the cellular levels of eIF2-GTP bound to Met-tRNA (ternary complex). If this level is high, reinitiation will most frequently occur upstream of uORF4. The sequence downstream of uORF4 is unfavourable for reinitiation, and so translation of the main ORF is prevented. With low levels of ternary complex, uORF4 will be bypassed and the main ORF translated <abbrgrp><abbr bid="B7">7</abbr></abbrgrp>. For other genes, a negative correlation between the length of the uORF and frequency of downstream initiation has been observed <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>.</p>
         <p>Comparative genomics has emerged as a major tool for discerning important structural and regulatory elements in nucleic acid sequences. The optimal evolutionary distance between genomes to be compared depends on the property under investigation. Functional protein domains can be conserved throughout the eukaryotic kingdom and beyond, whereas regulatory <it>cis-</it>elements in DNA diverge much more rapidly, and thus require comparisons between closely related species for efficient detection. Genomes from the <it>Saccharomyces sensu stricto </it>group and more distantly related <it>Saccharomyces </it>species have been successfully employed to identify transcription factor binding sites in promoters <abbrgrp><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr><abbr bid="B12">12</abbr></abbrgrp>. Among these species, <it>S. paradoxus</it>, <it>S. mikatae</it>, <it>S. bayanus</it>, and <it>S. kudriavzevii </it>(all members of the <it>Saccharomyces sensu stricto </it>group) diverged from <it>S. cerevisiae </it>between 5 and 20 million years ago, while <it>S. castellii </it>and <it>S. kluyveri </it>are considerably more distant, with an estimated divergence around 100 million years ago <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>. Besides conservation of sequence, conservation of position and order (synteny) of genes or sequence elements can be used as a powerful complementary approach to identification in a complex genomic context, as has been shown for gene finding in the rat genome using alignments with the human and mouse counterparts <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>. Comparative genomics of three closely related species of <it>Aspergillus </it>has been used in an attempt to predict functional uORFs <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>, and the same approach has been applied to a comparison of human and mouse genomic sequences <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>.
Another analysis was recently performed using a comparison of seven <it>Saccharomyces </it>species' genomes to identify tentatively functional uORFs <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>.</p>
         <p>The present investigation combines two independent criteria for assessing the potential for a uORF to be functional in regulation: evolutionary conservation of sequence and position on the one hand, and conformity to certain properties that we have found to be associated with characterised uORFs with a regulatory role on the other. The latter have been coded into a scoring system, which we have used to rank uORF candidates in the <it>S. cerevisiae </it>genome. We have found 379 uORFs in 252 genes that fulfil these criteria and that we predict to be functional. Of these, 16 genes have previously been characterised at the translational level, and 7 of these contain 12 uORFs with regulatory roles. The remaining 367 uORFs identified in this study are novel. Since ranking according to our scoring system identifies novel uORFs with better than average evolutionary conservation, we infer that this combined approach is efficient.</p>
      </sec>
      <sec>
         <st>
            <p>Results</p>
         </st>
         <sec>
            <st>
               <p>Conservation of uORFs in GCN4 homologues in other fungi</p>
            </st>
            <p>To estimate the degree of evolutionary conservation of functional uORFs among fungal species, we first investigated the homologues of the <it>GCN4 </it>locus, which is well-characterised in <it>S. cerevisiae </it>with respect to the regulatory role of its four uORFs <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>. Using WU-BLAST2-TBLASTN at SGD, we identified <it>GCN4 </it>orthologue candidates in 18 fungal species. In every case, a single unambiguous homologous locus was found. All upstream regions were aligned, and uORFs were examined for similarity in sequence and distance from the main ORF (Fig. <figr fid="F1">1</figr>). All four uORFs are well conserved in all species up to and including <it>Ashbya gossypii</it>, with the sole exception of <it>Kluyveromyces lactis</it>. uORFs 1, 2, and 4 have discernible homologues at even longer evolutionary distances, as far as <it>Yarrowia lipolytica </it>(representing a split of > 200 MYr <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>). In even more distantly related fungi, representing basidiomycetes and filamentous ascomycetes, however, no homologous uORFs were found. These findings demonstrate that uORFs with a proven regulatory role in <it>S. cerevisiae </it>are indeed conserved in genomes throughout most of <it>Hemiascomycetes</it>. It is thus reasonable to expect conservation of uORFs with a regulatory role among <it>Saccharomyces </it>sister species, and to use this conservation as a criterion for classifying them as functional.</p>
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>Conservation of uORFs in the <it>GCN4 </it>locus of <it>S. cerevisiae </it>and homologues in 18 fungal species</p>
               </caption>
               <text>
                  <p>Conservation of uORFs in the <it>GCN4 </it>locus of <it>S. cerevisiae </it>and homologues in 18 fungal species. The species are ordered approximately according to evolutionary distance from <it>S. cerevisiae </it>[13]. uORFs which are conserved with respect to sequence and position within the 5' flanking region are connected by dotted lines. The start codon of the <it>GCN4 </it>coding sequence is located at position 0.</p>
               </text>
               <graphic file="1471-2105-8-295-1"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>Conservation between species among previously recognised uORFs</p>
            </st>
            <p>The starting point for our investigation was a set of 16 <it>S. cerevisiae </it>genes with characterised 5'-UTRs containing uORFs <abbrgrp><abbr bid="B3">3</abbr></abbrgrp> (Fig. <figr fid="F2">2</figr>, set A). Investigation of this set revealed 27 uORFs, for an average of 1.8 uORFs per gene. A summary of the properties of this set is given in Table <tblr tid="T1">1</tblr>. Among this set of uORFs, we discerned three subclasses with respect to their length and positioning (Fig. <figr fid="F3">3</figr>). The first and most abundant subclass, typified by <it>GCN4</it>, has short uORFs that do not overlap either with each other or with the main ORF. The second subclass, which includes <it>YAP2</it>, has short as well as longer uORFs, which overlap with the main ORF but not with each other. The third subclass, represented here by <it>PET111</it>, has short and long uORFs that overlap both with each other and with the main ORF.</p>
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>Evolutionary conservation of uORFs highlighted by Vilela and McCarthy [3]. Genes with conserved uORFs are shown in bold.</p>
               </caption>
               <tblbdy cols="4">
                  <r>
                     <c ca="left">
                        <p>Gene</p>
                     </c>
                     <c ca="left">
                        <p>uORF conservation<sup>1</sup></p>
                     </c>
                     <c ca="left">
                        <p>If predicted not to be functional, reason for this</p>
                     </c>
                     <c ca="left">
                        <p>Evidence about functional role</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="4">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>CLN3</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>yes (1/1; 4/6)</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>[26]</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>GCN4</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>yes (4/4; 7/7)</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>[6]</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>INO2</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>no (0/1; 0/6)</p>
                     </c>
                     <c ca="left">
                        <p>uORF too long</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>PPR1</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>no (0/1; 0/6)</p>
                     </c>
                     <c ca="left">
                        <p>uORF too close to main AUG</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>SCO1</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>no (0/1; 0/5)</p>
                     </c>
                     <c ca="left">
                        <p>uORF too close to main AUG</p>
                     </c>
                     <c ca="left">
                        <p>[32]<sup>2</sup></p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>CPA1</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>yes (1/1; 5/5)</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>[4]</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>HAP4</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>yes (2/2; 4/4)</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>[43]<sup>3</sup></p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>LEU4</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>no (0/1; 0/7)</p>
                     </c>
                     <c ca="left">
                        <p>uORF too close to main AUG</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>TIF4631</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>yes (4/6; 4/6)</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>[31]<sup>3</sup></p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>YAP1</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>yes (1/1; 3/5)</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>[27]</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>YAP2</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>yes (2/2; 3/3)</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>[27]</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>CBS1</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>no (0/1; 0/5)</p>
                     </c>
                     <c ca="left">
                        <p>uORF too close to main AUG</p>
                     </c>
                     <c ca="left">
                        <p>[32]<sup>2</sup></p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>DCD1</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>no (0/1; 0/7)</p>
                     </c>
                     <c ca="left">
                        <p>uORF too close to main AUG</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>HOL1</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>yes (1/1; 4/4)</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>[29]</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>PET111</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>yes (3/4; 3/4)</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>[30]<sup>4</sup></p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>SCH9</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>no (0/1; 0/6)</p>
                     </c>
                     <c ca="left">
                        <p>uORF too long (55 codons)</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>The <it>STA1-3 </it>genes mentioned by Vilela and McCarthy are not present in the standard S288c genome sequence and were not included in this analysis.</p>
                  <p><sup>1 </sup>Numbers in parentheses denote: (number of uORFs conserved/total number of uORFs; number of species where uORFs are conserved/total number of species where an orthologue could be identified)</p>
                  <p><sup>2 </sup>Evidence <b>against </b>translational control by uORFs</p>
                  <p><sup>3 </sup>Evidence for translation using an IRES mechanism</p>
                  <p><sup>4 </sup>Pet111 controls translation of another mRNA, but no evidence for uORF control of <it>PET111 </it>expression</p>
               </tblfn>
            </tbl>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>Flowchart of the steps in defining criteria to find novel uORFs that share characteristics with known functional uORFs</p>
               </caption>
               <text>
                  <p>Flowchart of the steps in defining criteria to find novel uORFs that share characteristics with known functional uORFs. Solid arrows denote partition of a gene set into subsets; dotted arrows denote that a gene set or an algorithm is influenced by or operates on something. Letters within brackets identify the different subsets referred to in the text. Set A was the initial training set; set A + B was the training set for the refined rule set.</p>
               </text>
               <graphic file="1471-2105-8-295-2"/>
            </fig>
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>Three major classes of organisation of uORFs found in the <it>S. cerevisiae </it>genome</p>
               </caption>
               <text>
                  <p>Three major classes of organisation of uORFs found in the <it>S. cerevisiae </it>genome. Not drawn to scale.</p>
               </text>
               <graphic file="1471-2105-8-295-3"/>
            </fig>
            <p>To investigate the extent to which these uORFs are conserved, we aligned the sequences from 1000 bp upstream of the start codon of each of these <it>S. cerevisiae </it>genes with their orthologues from the other members of the <it>Saccharomyces sensu stricto </it>group, plus <it>S. castellii </it>and <it>S. kluyveri </it>(for an example of visualisation of an alignment, see Fig. <figr fid="F4">4</figr>). The result is shown in Table <tblr tid="T1">1</tblr>. Nine of the 16 genes (<it>CLN3</it>, <it>CPA1</it>, <it>GCN4</it>, <it>HAP4</it>, <it>HOL1</it>, <it>PET111</it>, <it>TIF4631</it>, <it>YAP1</it>, <it>YAP2</it>) turned out to possess uORFs that are visibly conserved in most other <it>Saccharomyces </it>species where an orthologue could be identified. As expected, there was generally a gradual decline of conservation with increasing evolutionary distance. Thus, all 18 uORFs were conserved in <it>S. paradoxus</it>, <it>S. mikatae</it>, and <it>S. bayanus</it>; 10 were conserved in <it>S. castellii</it>, 8 in <it>S. kudriavzevii</it>, and 3 in <it>S. kluyveri</it>.</p>
            <fig id="F4">
               <title>
                  <p>Figure 4</p>
               </title>
               <caption>
                  <p>Alignment of a region containing uORF1 (closest to the start codon of the main ORF) from <it>S. cerevisiae YJL139c </it>(<it>YUR1</it>) with the orthologous sequences from four other <it>Saccharomyces </it>species</p>
               </caption>
               <text>
                  <p>Alignment of a region containing uORF1 (closest to the start codon of the main ORF) from <it>S. cerevisiae YJL139c </it>(<it>YUR1</it>) with the orthologous sequences from four other <it>Saccharomyces </it>species. <b>A</b>, sequence alignment. The start and stop codons of the uORF are marked in yellow. <b>B</b>, DNA sequence similarity profile of uORF1. <b>C</b>, DNA sequence similarity profile of the entire 5'-UTR of <it>YUR1 </it>and its homologues.</p>
               </text>
               <graphic file="1471-2105-8-295-4"/>
            </fig>
            <p>An analysis of the common properties of the 9 genes in which conservation of uORFs was evident revealed two features that the majority of them share, and which might be used to distinguish them from spurious uORFs. First, the uORFs are short, on average 6.5 codons, compared with an average of 12.9 codons for all uORFs in this set and 15.0 codons for the non-conserved uORFs. Second, the most downstream uORF lies no closer than 50 nt to the start codon of the main ORF, in most cases at a distance between 50 and 150 nt.</p>
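            <p>The features described above, together with the non-overlap criterion stated in the abstract, can be illustrated with a minimal sketch. It is not the scoring system used in this study, whose details are not given in this section. The names <it>UORF</it>, <it>rule_flags</it> and <it>rule_score</it> are hypothetical, as are the coordinate convention, the choice to count the start codon but not the stop codon in the length, the measurement of distance from the uORF stop codon, and the use of a simple count of satisfied rules as a score. The thresholds (4 to 6 codons, 50 to 150 nt) are those quoted in the abstract.</p>
            <p><![CDATA[
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UORF:
    """Hypothetical record; coordinates are 0-based offsets within the 5'-UTR."""
    start: int       # first base of the uORF start codon
    end: int         # one past the last base of the uORF stop codon
    utr_length: int  # the main ORF start codon begins at this offset

    @property
    def codons(self) -> int:
        # Assumption: the length counts the start codon but not the stop codon.
        return (self.end - self.start) // 3 - 1

    @property
    def distance_to_main(self) -> int:
        # Assumption: nucleotides between the uORF stop codon and the main start codon.
        return self.utr_length - self.end


def rule_flags(uorf, neighbours=(), min_codons=4, max_codons=6,
               min_dist=50, max_dist=150):
    """Report which of the three rules a uORF satisfies."""
    return {
        "length": uorf.codons in range(min_codons, max_codons + 1),
        "distance": uorf.distance_to_main in range(min_dist, max_dist + 1),
        # Only non-overlap is checked here; "clear separation" would need
        # an explicit gap threshold, which this sketch does not assume.
        "no_overlap": all(n.start >= uorf.end or uorf.start >= n.end
                          for n in neighbours),
    }


def rule_score(uorf, neighbours=()):
    """Illustrative score: the number of rules satisfied (0 to 3)."""
    return sum(rule_flags(uorf, neighbours).values())
```
]]></p>
            <p>Under these assumptions, a 5-codon uORF ending 82 nt upstream of the main start codon and having no neighbours satisfies all three rules, whereas an otherwise identical uORF ending 12 nt upstream fails the distance rule, comparable to the "too close to main AUG" entries in Table <tblr tid="T1">1</tblr>.</p>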
         </sec>
         <sec>
            <st>
               <p>Extension of heuristics for classification of functional uORFs in a larger dataset</p>
            </st>
            <p>In the second step, we extended our analysis to the whole collection of <it>S. cerevisiae </it>genes for which the extent of the 5'-UTR is known <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>. All 294 5'-UTR sequences were downloaded from the UTRResource database and analysed for their uORF content. In 90 of these genes, at least one uORF was found (Fig. <figr fid="F2">2</figr>, set B). The corresponding sequences from the other genomes were aligned as described above. Out of these 90 genes, 16 were found to contain at least one conserved uORF (average 1.7 uORFs per gene; Fig. <figr fid="F2">2</figr>, set D). The properties of uORFs, both conserved and non-conserved, in this set are summarised in Table <tblr tid="T2">2</tblr>, and the 16 genes with conserved uORFs detected in this work are listed in additional file <supplr sid="S1">1</supplr>.</p>
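            <p>Screening a 5'-UTR sequence for uORF content can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in this study: a uORF is taken to be an ATG followed by an in-frame stop codon within the supplied sequence, so uORFs that run past the end of the 5'-UTR into the main ORF are not reported. The function name <it>find_uorfs</it> is hypothetical.</p>
            <p><![CDATA[
```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_uorfs(utr):
    """Return (start, end) offsets of ATG...stop ORFs lying within a 5'-UTR.

    Offsets are 0-based; `end` is one past the last base of the stop codon.
    Assumption: only ORFs that terminate inside the supplied sequence count.
    """
    utr = utr.upper().replace("U", "T")  # accept RNA or DNA input
    hits = []
    for start in range(len(utr) - 2):
        if utr[start:start + 3] != "ATG":
            continue
        # Walk codon by codon in the frame set by this ATG.
        for pos in range(start + 3, len(utr) - 2, 3):
            if utr[pos:pos + 3] in STOP_CODONS:
                hits.append((start, pos + 3))
                break
    return hits
```
]]></p>
            <p>Every ATG is treated as a separate candidate, so two in-frame ATGs that share a stop codon yield two entries; whether such nested candidates should be merged is a design choice left open in this sketch.</p>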
            <suppl id="S1">
               <title>
                  <p>Additional file 1</p>
               </title>
               <text>
                  <p>uORFs in the dataset of verified 5'-UTRs by Pesole et al. <abbrgrp><abbr bid="B18">18</abbr></abbrgrp> that we found to be conserved. Numbering of uORFs is 5' to 3'.</p>
               </text>
               <file name="1471-2105-8-295-S1.doc">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <tbl id="T2">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p>Properties of uORFs found in 294 previously identified 5'-UTRs [18], after classification as evolutionarily conserved or non-conserved.</p>
               </caption>
               <tblbdy cols="3">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Conserved</p>
                     </c>
                     <c ca="left">
                        <p>Non-conserved</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="3">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Total number</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>16</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>74</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Average length (codons)</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>5.1</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>15.4</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Average distance (nt) from start codon of main ORF</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>61</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>121</b>
                        </p>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
            <p>We then reanalysed the combined set of 106 (16 + 90; Fig. <figr fid="F2">2</figr>, set A + B) uORF-containing 5'-UTRs, again looking for features that distinguish uORFs of the 25 (9 + 16; set C + D) 5'-UTRs where evolutionary conservation was detected, from those without detectable conservation.</p>
         </sec>
         <sec>
            <st>
               <p>Creation of an expert system and its implementation to discriminate functional from spurious uORFs on a genome-wide level</p>
            </st>
            <p>We wanted to analyse all 5'-flanking sequences of recognised genes in the <it>S. cerevisiae </it>genome, using the approximate criteria that we derived from the set of conserved uORFs in characterised 5'-UTRs. For this, we needed a formal implementation of the criteria that could also perform a genome-wide scan in a reasonable time. We used an expert system (see Materials and Methods) in which the following rules, derived from the analysis of the 106 uORF-containing genes (Fig. <figr fid="F2">2</figr>, set A + B), were encoded. The system gave as output a numeric score for each uORF based on: a) the length of the uORF (optimal 4 &#8211; 6 codons); b) the distance of the gene-proximal uORF from the main ORF (optimal 50 &#8211; 250 nt); c) the number of uORFs upstream of a main ORF (optimal &lt; 10). These values were stored in frame structures in an expert system shell. A score (certainty factor, cf) for each uORF was deduced using a set of production rules with associated cfs, and the highest score among the uORFs upstream of a given gene was assigned to that gene. A diagram visualising the length, position, certainty factor and conservation in other <it>Saccharomyces </it>species is produced automatically for each gene (Fig. <figr fid="F5">5</figr>). We analysed a total of 5602 intergenic sequences of recognised genes from <it>S. cerevisiae </it>(Fig. <figr fid="F2">2</figr>, set E). Since the length of the 5'-UTR was unknown in most cases, the entire intergenic sequences were used. Among these sequences, a total of 51904 potential uORFs were found. In our scoring system, 24449 uORFs distributed among the 5' flanks of 2735 genes (set F) were assigned a cf &#8805; 0.98.</p>
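            <p>As an illustration only, the three rules above could be combined into a single score roughly as follows. The sketch assumes the widely used MYCIN-style function for combining certainty factors. The individual rule weights are invented for the example, chosen so that a uORF satisfying all three rules reaches 0.98, and are not the values used in the expert system. The names <it>combine_cf</it>, <it>score_uorf</it> and <it>score_gene</it> are hypothetical.</p>

```python
def combine_cf(a, b):
    """Combine two certainty factors in [-1, 1] (MYCIN-style rule)."""
    if a >= 0 and b >= 0:
        return a + b * (1 - a)
    if 0 >= a and 0 >= b:
        return a + b * (1 + a)
    denom = 1 - min(abs(a), abs(b))
    # +1 combined with -1 is undefined; return 0.0 as a neutral fallback.
    return (a + b) / denom if denom > 0 else 0.0


def score_uorf(codons, proximal_distance_nt, n_uorfs):
    """Score one uORF; the ranges follow the text, the weights are invented."""
    evidence = [
        (6 >= codons >= 4, 0.8, -0.5),                 # a) uORF length
        (250 >= proximal_distance_nt >= 50, 0.8, -0.5),  # b) proximal distance
        (10 > n_uorfs, 0.5, -0.3),                     # c) number of uORFs
    ]
    cf = 0.0
    for satisfied, cf_if_true, cf_if_false in evidence:
        cf = combine_cf(cf, cf_if_true if satisfied else cf_if_false)
    return cf


def score_gene(uorfs):
    """uorfs: list of (codons, distance_nt). Returns the best uORF score."""
    if not uorfs:
        return None
    proximal = min(distance for _, distance in uorfs)
    return max(score_uorf(codons, proximal, len(uorfs)) for codons, _ in uorfs)


# Two uORFs upstream of one gene: a 5-codon uORF at 100 nt, a 20-codon one at 400 nt.
print(round(score_gene([(5, 100), (20, 400)]), 2))  # 0.98
```

            <p>In this sketch the distance rule is evaluated once per gene, on the uORF closest to the main ORF, following the wording of rule b). Applying it to each uORF separately would be an equally plausible reading.</p>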
            <fig id="F5">
               <title>
                  <p>Figure 5</p>
               </title>
               <caption>
                  <p>Schematic of the arrangement of uORFs in the 5' flank of <it>S. cerevisiae YJL139c </it>(<it>YUR1</it>) and its homologues in other <it>Saccharomyces </it>species</p>
               </caption>
               <text>
                  <p>Schematic of the arrangement of uORFs in the 5' flank of <it>S. cerevisiae YJL139c </it>(<it>YUR1</it>) and its homologues in other <it>Saccharomyces </it>species. This type of diagram is produced automatically for each gene, showing the intergenic sequence as a numbered axis; the coding sequence of the gene starts at position one of the intergenic sequence. uORFs are shown as boxes. The box colours indicate whether <it>S. cerevisiae </it>uORFs are predicted to be functional (red) or not functional (blue). uORFs from other species are represented by black boxes, since we do not predict their functionality. The rightmost uORF (uORF1) is identical to the one shown in Fig. 4.</p>
               </text>
               <graphic file="1471-2105-8-295-5"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>Conservation of uORFs that conform to newly derived rule set</p>
            </st>
            <p>We extracted the intergenic region of each of the 2735 genes and aligned it to its counterparts from the other 6 <it>Saccharomyces </it>species as described above. uORFs from <it>S. cerevisiae </it>with scores above 0.7 were visualised by colour-coding (red, see Fig. <figr fid="F5">5</figr>). We manually examined all alignments. We found that 379 uORFs distributed among 252 genes (Fig. <figr fid="F2">2</figr>, set G) showed clear conservation of sequence and position in at least 4 species. The mean score of these genes was 0.98, notably higher than the average score of the entire set (-0.09) and the average score of the genes selected for inspection of alignment (-0.005).</p>
            <p>The fact that uORFs with a high score were significantly better conserved indicates that the rules of our scoring system are indeed detecting features that have been conserved in evolution and, by inference, are likely to play a functional role. Out of the 16 previously characterised genes with uORFs (Table <tblr tid="T1">1</tblr>), 9 are conserved as previously mentioned, and 7 out of these 9 (<it>CLN3</it>, <it>GCN4</it>, <it>HAP4</it>, <it>HOL1</it>, <it>PET111</it>, <it>YAP1</it>, <it>YAP2</it>) were also found in the list of 252 genes with uORFs that we identified in the screen described above. By contrast, for a group of 40 randomly selected genes (with an average score of -0.09), the degree of conservation of uORFs was 11.4% in <it>S. paradoxus</it>, 2.4% in <it>S. mikatae</it>, 5.2% in <it>S. bayanus</it>, 2.3% in <it>S. castellii</it>, 6.7% in <it>S. kudriavzevii</it>, and 5.2% in <it>S. kluyveri</it>. The fact that the degree of conservation in this group does not follow the evolutionary closeness between species indicates that it does not reflect actual conservation of sequences. It should be noted that for <it>PET111 </it>and <it>YAP2</it>, only the shorter uORFs that do not overlap with the main ORF (<it>PET111 </it>uORF1 and uORF3; <it>YAP2 </it>uORF1; Fig. <figr fid="F3">3</figr>) received high scores. The complete list of 252 genes with conserved uORFs predicted to be functional is shown in additional file <supplr sid="S2">2</supplr>.</p>
            <suppl id="S2">
               <title>
                  <p>Additional file 2</p>
               </title>
               <text>
                  <p>The 252 genes with conserved uORFs and with a maximal certainty factor score (0.98). Information given for each gene from left to right: systematic name of gene, length of intergenic region in nucleotides, length of uORF in codons, position in nucleotides of uORF relative to ATG of main ORF. The first uORF listed is always the most distant from the main ATG. Genes appear in the list in no particular order.</p>
               </text>
               <file name="1471-2105-8-295-S2.doc">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <p>In the course of our work, the study by Zhang and Dietrich <abbrgrp><abbr bid="B19">19</abbr></abbrgrp> verified the 5' ends of a large set of <it>S. cerevisiae </it>mRNAs, 24 of which were shown to contain uORFs (additional file <supplr sid="S3">3</supplr>). We did not use these to modify our rule set, but examined to what extent they are conserved and predicted to be functional according to our work. The uORFs of three genes (<it>AGE1</it>, <it>PIC2 </it>and <it>PCL5</it>) are conserved and conform well to our rule set; those of another two (<it>AMN1 </it>and <it>URA5</it>) are conserved but receive lower scores since they deviate too much from the optimal length. Of the remaining 19 genes, 17 have uORFs that are not conserved in other species, one has no identifiable orthologues (<it>IMD1</it>), and one has no uORF at the indicated position (<it>YNR034W-A</it>).</p>
            <suppl id="S3">
               <title>
                  <p>Additional file 3</p>
               </title>
               <text>
                  <p>uORFs identified in verified 5'-UTRs by Zhang and Dietrich <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>. Numbering of uORFs is 5' to 3'.</p>
               </text>
               <file name="1471-2105-8-295-S3.doc">
                  <p>Click here for file</p>
               </file>
            </suppl>
         </sec>
         <sec>
            <st>
               <p>Sequence properties of conserved uORFs</p>
            </st>
            <p>Having identified a large set (379) of uORFs predicted to have biological function, we analysed these for common properties. First, we noted that there is no correspondence between the reading frames of functional uORFs and the frames of either the main ORF or of other uORFs upstream of the same gene.</p>
            <p>We noticed that a marked feature of uORFs with a high score and a high degree of conservation was a clear physical separation from other, low-scoring (and by inference spurious) uORFs. In our set of 252 genes, the average distance between a predicted functional uORF and a neighbouring functional uORF is 127 nt, whereas the average distance between a uORF predicted to be functional and its closest neighbouring non-functional uORF is 100 nt. A genome-wide investigation of the average distance between neighbouring uORFs in all intergenic regions of <it>S. cerevisiae </it>gave a value of 79 nt. This indicates that functional uORFs are indeed characterised by a wider uORF-free zone around them than spurious uORFs. We therefore decided to add this criterion when ranking uORFs according to their likelihood of having a functional role. From the group of 252 genes with high scores, we manually selected 32 cases (Fig. <figr fid="F2">2</figr>, set H) with the following properties: a) the uORF responsible for the high score given to that 5' flanking sequence was well separated from low-scoring uORFs; b) it lay at an optimal distance from the main ORF; c) it had an optimal length. This conforms to the properties of the 9 + 16 (set C + D) conserved uORFs that we initially identified. Of these 34 uORFs from 32 genes, all 34 (100%) are conserved in <it>S. paradoxus</it>, 29 (83%) in <it>S. bayanus</it>, 23 (66%) in <it>S. kudriavzevii</it>, 14 (40%) in <it>S. castellii</it>, and 3 (9%) in <it>S. kluyveri</it>. In the <it>S. mikatae </it>genome sequence, syntenic homologues could be identified for only 16 out of the 32 genes, and all 16 of these (100%) had conserved uORFs. These 32 genes, shown in Table <tblr tid="T3">3</tblr>, represent the cases where we make the strongest prediction for the presence of functional uORFs with a regulatory role.</p>
            <p>The uORFs in this sub-group are better conserved than the average for the group of 252 genes from which they were selected. In this larger set, only 85% of uORFs were conserved in <it>S. paradoxus</it>, 43% in <it>S. mikatae</it>, 37% in <it>S. bayanus</it>, 37% in <it>S. kudriavzevii</it>, 20% in <it>S. castellii</it>, and 11% in <it>S. kluyveri</it>.</p>
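            <p>The separation criterion can be made concrete with a small sketch. It assumes that each uORF is represented as a hypothetical tuple (start, end, functional) with half-open nucleotide coordinates on the intergenic sequence, and that "distance" means the gap to the nearest neighbour of the given class. The text does not specify how the averages of 127, 100 and 79 nt were computed, so this is one plausible reading rather than the procedure used here.</p>

```python
def gap(a, b):
    """Nucleotides separating two half-open intervals (0 if they overlap)."""
    return max(0, max(a[0], b[0]) - min(a[1], b[1]))


def mean_nearest_gap(uorfs, neighbour_is_functional):
    """Mean gap from each functional uORF to its nearest neighbour of one class.

    uorfs: list of (start, end, functional) tuples for a single gene.
    Returns None when no functional uORF has a neighbour of that class.
    """
    gaps = []
    for i, u in enumerate(uorfs):
        if not u[2]:
            continue
        others = [gap(u, v) for j, v in enumerate(uorfs)
                  if j != i and v[2] == neighbour_is_functional]
        if others:
            gaps.append(min(others))
    return sum(gaps) / len(gaps) if gaps else None


# Toy 5' flank with two predicted functional and two low-scoring uORFs.
flank = [(0, 15, False), (100, 118, True), (250, 265, True), (300, 330, False)]
print(mean_nearest_gap(flank, True))   # 132.0
print(mean_nearest_gap(flank, False))  # 60.0
```

            <p>Averaging these per-gene values over a gene set would give figures comparable in kind, though not necessarily in value, to those quoted above.</p>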
            <tbl id="T3">
               <title>
                  <p>Table 3</p>
               </title>
               <caption>
                  <p>32 newly identified genes with highly conserved uORFs strongly predicted by the rule set to be functional (marked in bold), with an optimal spacing to the main ORF and other uORFs. Numbering of uORFs is 3' to 5', as uORFs were found from intergenic sequences.</p>
               </caption>
               <tblbdy cols="6">
                  <r>
                     <c ca="left">
                        <p>
                           <b>ORF</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>Name</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>Length (nt) of intergenic sequence</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>Number and size (codons) of uORFs</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>Position (nt) relative to start codon of main ORF</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>Predicted length (nt) of 5'-UTR [20]<sup>a</sup></b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>YDL146W</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>
                              <it>LDB17</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>489</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>uORF1(3)</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>-53</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>514</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>YDL176W</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>Uncharacterised</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>376</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>uORF1(9)</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>-104</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>157</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>YDL205C</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>
                              <it>HEM3</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>860</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>uORF1(8)</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>-130</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>183</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>YEL061C</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>
                              <it>CIN8</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>509</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>uORF1(4)</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>-109</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>128</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>YER167W</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>
                              <it>BCK2</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>757</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>uORF1(7)</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>-245</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>286</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>YGL006W</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>
                              <it>PMC1</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>387</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>uORF1(4)</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>-144</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>438</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>YJL139C</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>
                              <it>YUR1</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>505</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>uORF1(8)</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>-55</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>60</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>YKL182W</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>
                              <it>FAS1</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>1030</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>uORF1(6)</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>-142</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>548</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>YLR047C</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>
                              <it>FRE8</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>825</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>uORF1(3)</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>-230</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>264</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>YLR427W</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>
                              <it>MAG2</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>315</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>uORF1(6)</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>-71</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>109</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>YMR145C</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>
                              <it>NDE1</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>1006</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>uORF1(4)</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>-165</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>215</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>YNL053W</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>
                              <it>MSG5</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>358</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>uORF1(4)</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>-104</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>120</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>YNL094W</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>
                              <it>APP1</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>771</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>uORF1(3)</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>-155</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>278</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>YNR016C</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>
                              <it>ACC1</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>1539</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>uORF1(3)</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>-342</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>540</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>YOL100W</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>
                              <it>PKH2</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>1317</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>uORF2(9)</b>
                        </p>
                        <p>
                           <b>uORF1(7)</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>-338</p>
                        <p>-81</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>255</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>YOL130W</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>
                              <it>ALR1</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>1042</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>uORF1(5)</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>-103</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>290</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>YOR061W</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>
                              <it>CKA2</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>371</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>uORF1(5)</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>-103</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>162</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>YOR124C</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>
                              <it>UBP2</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>388</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>uORF1(7)</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>-211</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>250</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>YOR137C</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>
                              <it>SIA1</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>628</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>uORF1(9)</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>-367</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>440</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>YOR231W</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>
                              <it>MKK1</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>488</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>uORF1(9)</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>-72</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>148</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>YOR254C</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>
                              <it>SEC63</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>248</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>uORF1(5)</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>-82</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>100</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>YPL057C</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>
                              <it>SUR1</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>373</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>uORF1(5)</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>-145</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>253</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>YPR026W</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>
                              <it>ATH1</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>821</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>uORF1(4)</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>-65</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>112</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>YER118C</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>
                              <it>SHO1</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>441</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>uORF1(8)</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>-212</p>
                     </c>
                     <c ca="left">
                        <p>206</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>YEL013W</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>VAC8</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>541</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>uORF2(3)</b>
                        </p>
                        <p>
                           <b>uORF1(3)</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>-364</p>
                        <p>-306</p>
                     </c>
                     <c ca="left">
                        <p>252</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>YEL026W</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>SNU13</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>692</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>uORF3(3)</b>
                        </p>
                        <p>uORF2(2)</p>
                        <p>uORF1(2)</p>
                     </c>
                     <c ca="left">
                        <p>-206</p>
                        <p>-198</p>
                        <p>-193</p>
                     </c>
                     <c ca="left">
                        <p>170</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>YLR009W</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>RLP24</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>454</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>uORF1(3)</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>-220</p>
                     </c>
                     <c ca="left">
                        <p>171</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>YLR243W</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Uncharacterised</p>
                     </c>
                     <c ca="left">
                        <p>320</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>uORF1(3)</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>-111</p>
                     </c>
                     <c ca="left">
                        <p>80</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>YML093W</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>UTP14</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>241</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>uORF1(4)</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>-99</p>
                     </c>
                     <c ca="left">
                        <p>21</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>YMR215W</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>GAS3</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>313</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>uORF1(4)</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>-147</p>
                     </c>
                     <c ca="left">
                        <p>62</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>YNL229C</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>URE2</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>457</p>
                     </c>
                     <c ca="left">
                        <p>uORF2(11)</p>
                        <p>
                           <b>uORF1(5)</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>-384</p>
                        <p>-285</p>
                     </c>
                     <c ca="left">
                        <p>217</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>YNR049C</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>MSO1</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>393</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>uORF1(3)</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>-142</p>
                     </c>
                     <c ca="left">
                        <p>123</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p><sup>a </sup>Genes where predicted functional uORFs are located within the estimated 5'-UTR are marked in bold.</p>
               </tblfn>
            </tbl>
            <p>Since we used genomic DNA to derive the uORFs for this study, it is important to consider whether they lie within the transcribed region (5'-UTR) of the gene in question. We manually examined the position of the 34 top-scoring uORFs (set H) using data from the recently published high-density <it>S. cerevisiae </it>transcriptome map obtained from tiling arrays <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>. In 23 of the 34 cases, the genomic uORF was unambiguously placed within the transcribed region (the corresponding genes are marked in bold in Table <tblr tid="T3">3</tblr>), and in one additional case (<it>SHO1</it>), it lay close to the predicted transcript start site. To determine to what extent genomic uORFs not predicted to be functional were transcribed, we picked 40 uORFs with the lowest scores that were located, on average, at the same distance from the start codon of the main ORF (250 nt) as the 34 uORFs in the top group in Table <tblr tid="T3">3</tblr>. In stark contrast, only 20% of these low-scoring uORFs were located within transcripts.</p>
            <p>The A/T-rich sequence downstream of <it>GCN4 </it>uORF1 and the G/C-rich sequence downstream of <it>GCN4 </it>uORF4 have been proposed to be essential for their translational regulatory properties. We therefore compared the G/C content of the 20 nt immediately upstream and downstream of all uORFs in the whole genome with that of the uORFs from the 32 top-scoring genes, in which the uORFs additionally have an optimal distance to the main ORF and are clearly separated from each other (Fig. <figr fid="F2">2</figr>, Table <tblr tid="T3">3</tblr>). We found an average G/C content of 38.6% upstream of high-scoring uORFs (vs. 36.9% for all uORFs in the whole genome), and 36.9% downstream of high-scoring uORFs (vs. 36.3% for all uORFs in the whole genome). We conclude that the G/C content of sequences flanking predicted functional uORFs does not deviate significantly from the genome average.</p>
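            <p>The flanking-sequence comparison can be illustrated with a minimal sketch. It assumes that each uORF is given by 0-based, end-exclusive coordinates on a sequence already oriented 5' to 3'; the function names and the toy sequence are hypothetical and are not taken from the analysis pipeline used here.</p>

```python
def gc_fraction(seq):
    """Return the fraction of G or C bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(seq.count(base) for base in "GC") / len(seq)


def flank_gc(sequence, start, end, flank=20):
    """G/C fraction of the `flank` nt upstream and downstream of a uORF.

    `start` and `end` are 0-based, end-exclusive coordinates of the uORF
    (start codon through stop codon). Flanks are truncated at the ends
    of the sequence.
    """
    upstream = sequence[max(0, start - flank):start]
    downstream = sequence[end:end + flank]
    return gc_fraction(upstream), gc_fraction(downstream)


# Toy example: an A/T-only upstream flank and a G/C-only downstream flank.
toy = "AT" * 10 + "ATGAAATAA" + "GC" * 10
up_gc, down_gc = flank_gc(toy, 20, 29)
```

            <p>Averaging the two returned values over a set of uORFs would give figures comparable in kind to the percentages quoted above.</p>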
            <p>Finally, we examined the sets of genes carrying candidate functional uORFs found in this work for the predicted folding energies of their 5'-UTRs. It has been shown that 5'-UTRs generally are more weakly folded than bulk or randomised sequences, and that strongly translated mRNAs tend to be even less folded <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>. We found that the predicted folding energies of the 200 nt immediately preceding the AUG of the main ORF were weaker for the initial set of genes containing previously recognised functional uORFs than for the average gene (Table <tblr tid="T4">4</tblr>). Interestingly, our newly found genes containing uORFs predicted here to be functional also have weaker folding energies in this region; most significantly for the 32 most highly ranked genes, and to a lesser extent also the larger set of 252 genes (Table <tblr tid="T4">4</tblr>).</p>
            <tbl id="T4">
               <title>
                  <p>Table 4</p>
               </title>
               <caption>
                  <p>Calculated minimum free folding energy of the 200 nt immediately upstream of the start codon of different sets of uORF-containing genes <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>.</p>
               </caption>
               <tblbdy cols="2">
                  <r>
                     <c ca="left">
                        <p>Set</p>
                     </c>
                     <c ca="left">
                        <p>Minimum free energy (kcal/mol)</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="2">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>9 genes in initial set with conserved uORFs (Table 1; Fig. 2 set C)</p>
                     </c>
                     <c ca="left">
                        <p>-25.8</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>32 genes with highly conserved uORFs with optimal spacing (Table 3; Fig. 2 set H)</p>
                     </c>
                     <c ca="left">
                        <p>-32.8</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>252 genes with highly conserved uORFs (additional file <supplr sid="S2">2</supplr>; Fig. 2 set G)</p>
                     </c>
                     <c ca="left">
                        <p>-35.4</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>All genes in genome</p>
                     </c>
                     <c ca="left">
                        <p>-36.6</p>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
         </sec>
         <sec>
            <st>
               <p>Possible role of peptide product of predicted functional uORFs</p>
            </st>
            <p>We then wanted to estimate the prevalence among regulatory uORFs of mechanisms that depend on the encoded peptide. We reasoned that if the encoded peptide is relevant, this should be reflected by the absence of frameshift mutations (<it>e.g</it>. one +1 followed by a -1 frameshift, thus preserving the length of the uORF but altering the peptide sequence) and by a high ratio of synonymous to non-synonymous mutations (d<sub>s</sub>/d<sub>n</sub>), similar to other protein-coding sequences. Among the 34 uORFs we investigated (from the 32 genes in Table <tblr tid="T3">3</tblr>), we found frameshifts in only one uORF, namely that of <it>YER118C </it>in <it>S. kudriavzevii</it>. As a complementary approach, we calculated the ratio of synonymous to non-synonymous substitutions in uORFs by comparing the orthologous sequences of <it>S. cerevisiae</it>, <it>S. paradoxus</it>, <it>S. mikatae</it>, and <it>S. bayanus</it>. For the uORFs in Table <tblr tid="T3">3</tblr>, the d<sub>s</sub>/d<sub>n </sub>ratio calculated over all substitutions is 0.41. This is significantly lower than the average d<sub>s</sub>/d<sub>n </sub>value we determined from 3268 protein-coding sequences from the same species, namely 1.80.</p>
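            <p>The distinction between synonymous and non-synonymous changes can be illustrated with a crude sketch that classifies differing codons in two aligned, gap-free, in-frame sequences using the standard genetic code. This is only a raw count: it is not normalised by the number of synonymous and non-synonymous sites, as a proper d<sub>s</sub>/d<sub>n</sub> estimate would be, and it is not the procedure used to obtain the values above. All names are hypothetical.</p>

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_codon_differences(seq_a, seq_b):
    """Count differing codon pairs as (synonymous, non_synonymous).

    Both sequences must be aligned without gaps and start in frame.
    Codon pairs containing characters other than A, C, G, T are skipped.
    """
    synonymous = non_synonymous = 0
    usable = min(len(seq_a), len(seq_b)) // 3 * 3
    for i in range(0, usable, 3):
        a = seq_a[i:i + 3].upper()
        b = seq_b[i:i + 3].upper()
        if a == b or a not in CODON_TABLE or b not in CODON_TABLE:
            continue
        if CODON_TABLE[a] == CODON_TABLE[b]:
            synonymous += 1
        else:
            non_synonymous += 1
    return synonymous, non_synonymous


# Toy example: AAA to AAG keeps lysine; GAT to GAA changes Asp to Glu.
counts = classify_codon_differences("ATGAAAGAT", "ATGAAGGAA")
```

            <p>A sequence under selection for its peptide product would be expected to show mostly synonymous differences in such a comparison.</p>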
            <p>As a further estimate of the likelihood that uORFs encode a functional peptide, we compared the codon adaptation index (CAI; <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>) of the set of 252 conserved uORFs in additional file <supplr sid="S2">2</supplr> (CAI = 0.151) with that of the entire group of 24449 uORFs (mostly non-functional; CAI = 0.149). This is to be contrasted with the indices for weakly (CAI = 0.19) and highly (CAI = 0.77) expressed protein-coding main ORFs <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>. There is thus no bias for a higher CAI in the conserved uORFs examined.</p>
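            <p>In its commonly used formulation, the CAI of a sequence is the geometric mean of the relative adaptiveness of its codons, where the relative adaptiveness of a codon is its frequency in a reference set of highly expressed genes divided by that of the most frequent synonymous codon. The sketch below follows this definition under stated assumptions: the reference codon counts are supplied by the caller, Met, Trp and stop codons are excluded, and codons absent from the reference are skipped rather than given a pseudocount. The function names and toy counts are hypothetical.</p>

```python
import math
from collections import defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def relative_adaptiveness(reference_counts):
    """Map codon to count / (highest count among its synonymous codons)."""
    best = defaultdict(int)
    for codon, n in reference_counts.items():
        amino = CODON_TABLE[codon]
        best[amino] = max(best[amino], n)
    return {codon: n / best[CODON_TABLE[codon]]
            for codon, n in reference_counts.items()
            if best[CODON_TABLE[codon]] > 0}


def cai(orf, weights):
    """Geometric mean of codon weights over an in-frame ORF.

    Met, Trp and stop codons are excluded; codons without a positive
    weight are skipped (an assumption of this sketch). Returns NaN if
    no codon can be scored.
    """
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3
    logs = []
    for i in range(0, usable, 3):
        codon = orf[i:i + 3]
        if CODON_TABLE.get(codon) in ("M", "W", "*"):
            continue
        w = weights.get(codon, 0)
        if w > 0:
            logs.append(math.log(w))
    return math.exp(sum(logs) / len(logs)) if logs else float("nan")


# Toy reference: AAA preferred over AAG for Lys; GAA and GAG equal for Glu.
toy_weights = relative_adaptiveness({"AAA": 10, "AAG": 5, "GAA": 8, "GAG": 8})
toy_cai = cai("ATGAAGGAATAA", toy_weights)  # scores AAG (0.5) and GAA (1.0)
```

            <p>Very short sequences such as uORFs contribute few codons each, so an index of this kind is more informative when computed over a pooled set than for any single uORF.</p>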
            <p>The sequences around the start codon that promote efficient translation are much less frequent in uORFs than in main ORFs <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>. In accordance with this, we did not find good fits to the <it>S. cerevisiae </it>consensus, (A/U)A(A/C)AA(A/C)<ul>AUG</ul>UC(U/C) <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>, in most high-scoring uORFs. For the positions with the greatest impact on translational efficiency, the base frequencies calculated from the set of 252 genes were not significantly different from bulk DNA: at -3, 35% A, 16% C, 20% G, 29% T; at +4, 32% A, 22% C, 17% G, 29% T.</p>
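            <p>The tabulation of base frequencies at -3 and +4 can be sketched as follows, assuming the usual numbering in which the A of the start codon is +1 and there is no position 0, so that -3 lies three bases upstream of the A and +4 immediately follows the start codon. Sequences are taken as DNA (T rather than U), and the function name and toy input are hypothetical.</p>

```python
from collections import Counter


def context_frequencies(sequences, starts):
    """Base frequencies at -3 and +4 relative to each start codon.

    `starts` holds the 0-based index of the A of the ATG in the matching
    sequence. Start codons too close to either end are skipped. Returns
    two dicts (for -3 and +4) keyed by A, C, G, T; both are empty if no
    start codon could be scored.
    """
    minus3, plus4 = Counter(), Counter()
    for seq, s in zip(sequences, starts):
        if s >= 3 and len(seq) > s + 3:
            minus3[seq[s - 3].upper()] += 1
            plus4[seq[s + 3].upper()] += 1

    def as_fractions(counter):
        total = sum(counter.values())
        return {b: counter[b] / total for b in "ACGT"} if total else {}

    return as_fractions(minus3), as_fractions(plus4)


# Toy example with two start codons, each at index 3.
freq_minus3, freq_plus4 = context_frequencies(["AAAATGG", "CCCATGT"], [3, 3])
```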
         </sec>
         <sec>
            <st>
               <p>Biological context of genes with predicted functional uORFs</p>
            </st>
            <p>In order to identify any common denominator for the biological function of these 252 genes, we performed a Gene Ontology (GO) term analysis at SGD. No single term unified the majority of the genes; however, there was a moderate overrepresentation of genes with the function "transcription regulator activity" (9.6% vs. 4.4% in the whole genome; P = 3.1 &#215; 10<sup>-4</sup>); see Table <tblr tid="T5">5</tblr>. There was also an overrepresentation of the cellular process "development" (10.4% vs. 5.4%; P = 10<sup>-3</sup>). The genes associated with "development" are mainly involved in establishment of cell polarity and sporulation. Related to this, we also noted an overrepresentation of genes with a role in pseudohyphal growth (2.4% vs. 0.6%; P = 7 &#215; 10<sup>-3</sup>), even though this category is not classified under "development" in GO. Most of the genes for pseudohyphal growth are also included in one of the other categories (cell polarity, transcription); see Table <tblr tid="T5">5</tblr>.</p>
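            <p>Overrepresentation of a GO term in a gene set is commonly assessed with a one-sided hypergeometric test. The sketch below shows that calculation as an illustration only; we assume this form of test for the example, and it may differ in detail (for instance in multiple-testing correction) from the statistics reported by the SGD tool. The function and parameter names are hypothetical.</p>

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """P(X >= k) for a hypergeometric draw without replacement.

    N: genes in the genome (background)
    K: background genes annotated with the term
    n: genes in the selected set
    k: selected genes annotated with the term
    """
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i)
                     for i in range(k, upper + 1))
    return favourable / comb(N, n)


# Toy example: all 5 selected genes carry a term held by 5 of 10 genes.
p_toy = hypergeom_enrichment_p(5, 5, 5, 10)  # 1 / comb(10, 5)
```

            <p>A small value indicates that the term occurs in the selected set more often than expected from its frequency in the background.</p>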
            <tbl id="T5">
               <title>
                  <p>Table 5</p>
               </title>
               <caption>
                  <p>Major functional classes for genes that harbour conserved uORFs predicted to play a regulatory role (Fig. 2, set G).</p>
               </caption>
               <tblbdy cols="4">
                  <r>
                     <c ca="center">
                        <p>
                           <b>Sporulation</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Transcription</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Filamentous growth</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Cell polarity</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="4">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c cspan="2" ca="center">
                        <p>
                           <it>FKH2</it>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c cspan="2" ca="center">
                        <p>
                           <it>FLO8</it>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c cspan="2" ca="center">
                        <p>
                           <it>SOK2</it>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c cspan="2" ca="center">
                        <p>
                           <it>BDF1</it>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c cspan="2" ca="center">
                        <p>
                           <it>BUD8</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c cspan="2" ca="center">
                        <p>
                           <it>CDC42</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>ADE16</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>CAT8</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>SHO1</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>BUD6</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>MDS3</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>ELP3</it>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>BUD22</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>MSO1</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>GCN4</it>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>CDC12</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>PRE1</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>HAP4</it>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>CKA2</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>RIM9</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>HFI1</it>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>HKR1</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>SMK1</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>MET32</it>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>RHO3</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>SSP1</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>RRN10</it>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>RRN11</it>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>RCS1</it>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>SIF2</it>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>SKN7</it>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>SOK2</it>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>SPT8</it>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>SRB7</it>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>SUT1</it>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>SWI5</it>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>TAF3</it>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>TAF12</it>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>URE2</it>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Discussion</p>
         </st>
         <sec>
            <st>
               <p>Properties of conserved uORFs</p>
            </st>
            <p>The independent properties that correlate with the newly found evolutionarily conserved uORFs are: a) short length, 4&#8211;6 codons; b) distance from main ORF between 50 and 250 nt; c) a distance to the nearest conserved uORF slightly greater than between neighbouring spurious uORFs; d) weaker folding energies of the most downstream 200 nt of the 5'-UTR than for the average gene; e) a 3-fold higher probability of being located within a transcript than randomly chosen uORFs in the genome at an equivalent distance from the main ORF. The first two of these features emerged from our evolutionary comparison of the initial set of uORFs with experimentally demonstrated regulatory function, where it was shown that conserved uORFs had these properties. These two rules were then used to rank all uORFs in the genome, facilitating the manual inspection of alignments with homologous regions from other genomes to reveal evolutionary conservation. The last three properties of evolutionarily conserved uORFs became apparent in the final analysis of the larger set of novel predicted functional uORFs. We believe that these rules of thumb can be helpful in the identification of functional uORFs from other genomes.</p>
            <p>Several factors underpin the approach we have used for discrimination of uORFs with a regulatory role from those arising in the genome by chance. The set of genome sequences from seven <it>Saccharomyces </it>species utilised in this work lends itself well to extracting putative <it>cis</it>-regulatory elements with bioinformatics methods. The reasons for this are threefold: a) the species represent a range of rather short evolutionary distances, suitable for detection of sequence features that change relatively rapidly; b) budding yeast genomes are less complex than those of most eukaryotes, with <it>e.g</it>. fewer repetitive elements and protein binding sites, and have short intergenic sequences; c) using seven genomes for comparison is inherently more powerful than two, such as man vs. mouse <abbrgrp><abbr bid="B16">16</abbr></abbrgrp> or three <it>Aspergillus </it>species <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>. Independently of the criterion of evolutionary conservation, we have developed a set of heuristic rules of length and spacing of uORFs, which we have used to pre-sort the 51904 uORFs found in the <it>S. cerevisiae </it>genome, in order to be able to concentrate efforts on the best candidates. Lastly, the visualisation tool we constructed allows immediate spotting of conserved uORFs in other species among candidate uORFs.</p>
            <p>It is noteworthy that among the 9 genes in the initial set where conservation of uORFs was found, there is evidence in the literature for a regulatory role of uORFs in six cases: <it>GCN4 </it><abbrgrp><abbr bid="B6">6</abbr></abbrgrp>, <it>CLN3 </it><abbrgrp><abbr bid="B26">26</abbr></abbrgrp>, <it>YAP1 </it><abbrgrp><abbr bid="B27">27</abbr></abbrgrp>, <it>YAP2 </it><abbrgrp><abbr bid="B27">27</abbr><abbr bid="B28">28</abbr></abbrgrp>, <it>HOL1 </it><abbrgrp><abbr bid="B29">29</abbr></abbrgrp>, and <it>CPA1 </it><abbrgrp><abbr bid="B4">4</abbr></abbrgrp>. We note that the uORFs of five out of these six genes (all except <it>CPA1</it>) were identified as functional by our automated scoring system. <it>CPA1 </it>was not identified because its uORF is much longer (20 codons) than the optimum in our scoring system (4 &#8211; 6 codons). The <it>CPA1 </it>uORF also belongs to a different functional class, where the encoded peptide has a direct role in the regulatory mechanism <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>, in contrast to the <it>GCN4</it>-like uORFs that likely make up the vast majority in the set we identified. Of the remaining three genes, <it>PET111 </it>is an interesting case in that it has been recognised that Pet111p acts to control translation of another mRNA, namely the mitochondrially encoded <it>COX2 </it><abbrgrp><abbr bid="B30">30</abbr></abbrgrp>. To our understanding, however, post-transcriptional control of <it>PET111 </it>itself by uORFs has not been considered. For <it>TIF4631</it>, itself encoding a translation factor, translational control through an internal ribosome entry site (IRES) mechanism has been argued <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>, but we are not aware that uORF-mediated control has been demonstrated. For <it>HAP4</it>, finally, we have not been able to find documentation in the literature about regulation through uORFs. Among the genes where no conserved uORFs were found, there are in fact reports in the literature indicating that the uORF is <it>not </it>functional for two of them: <it>CBS1 </it>and <it>SCO1 </it><abbrgrp><abbr bid="B32">32</abbr></abbrgrp>.</p>
            <p>It is relevant to compare the results of our investigation with those of Zhang and Dietrich <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>. There, a list of 15 genes containing 19 newly predicted functional uORFs is presented (additional file <supplr sid="S4">4</supplr>). Six of these (<it>FOL1</it>, <it>HEM3</it>, <it>MBR1</it>, <it>MKK1</it>, <it>RPC11</it>, <it>WSC3</it>) are also highly ranked (score &#8805; 0.98) with our methods; one of them (<it>HEM3</it>) is in our top list (Table <tblr tid="T3">3</tblr>). Of the remaining eight, several observations may explain why they were not highly scored by our methods. One gene, <it>IMD4</it>, is not present in other fungal genomes, and is given a low score by our methods since the uORF is too long. For five genes (<it>AVT2</it>, <it>TPK1</it>, <it>APC2</it>, <it>SPE4</it>, <it>SPH1</it>) the distance to the main ORF is too short. Two further genes have three uORFs each, and not all of them are conserved. Thus, uORF2 of <it>ARV1 </it>is conserved and gets an intermediate score, because it is too long, whereas the other uORFs are not conserved; uORF2 and uORF3 of <it>SLM2 </it>are conserved and get high scores whereas uORF1 is not conserved. Zhang and Dietrich <abbrgrp><abbr bid="B17">17</abbr></abbrgrp> used evolutionary conservation as the sole criterion for inclusion in the set to be considered. Because of the very large number of genes and uORFs to be investigated, we believe it is efficient to concentrate manual inspection of alignments on the cases with the highest likelihood of constituting true regulatory uORFs. We think this is the reason why we succeeded in identifying a much larger set of candidates in this work (252 vs. 35). We have noted that the average length of the <it>S. cerevisiae </it>5'-UTRs as measured by David et al. (260 nt; <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>) is higher than earlier estimates (&lt; 200 nt; <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>). This increases the number of yeast genes with a potential to be regulated by uORFs.</p>
            <suppl id="S4">
               <title>
                  <p>Additional file 4</p>
               </title>
               <text>
                  <p>uORFs predicted to be functional by Zhang and Dietrich <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>. Numbering of uORFs 5' to 3'.</p>
               </text>
               <file name="1471-2105-8-295-S4.doc">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <p>Based on identification of putative functional uORFs using comparisons between mouse and man, it has been suggested that the peptides encoded by regulatory uORFs in most cases are crucial to their function <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. Our results do not support this conclusion for yeast: a) among the nucleotide substitutions present in six <it>Saccharomyces </it>species within the uORFs most strongly predicted to be functional, we find no bias for synonymous vs. non-synonymous mutations; b) the lack of a codon bias or strong translation start sites for conserved uORFs gives no support for functional peptides being encoded by them; c) even in a small set of otherwise conserved uORFs, we find an example of a nonsense mutation. We conclude that for the majority of functional uORFs, the encoded peptide plays no regulatory role. It should be emphasised, however, that our analysis may be biased towards <it>GCN4</it>-type uORFs, with a regulatory mechanism that does not involve the encoded peptide.</p>
            <p>We have observed a correlation between the folding energies calculated by Ringner and Krogh <abbrgrp><abbr bid="B21">21</abbr></abbrgrp> for the 200 nt upstream of the start codon and the presence of a predicted functional uORF: 5'-UTRs with experimentally verified functional uORFs have weaker folding than average genes. The genes we predict in this work to have functional uORFs have weaker folding in this region than the average gene, but stronger than the previously recognised set. This indicates that we have selected a set of upstream regions enriched for functional uORFs (or uORFs with sequence properties similar to functional ones). Given that we find the optimal distance between a functional uORF and the start codon of the main ORF to be in the range 50 &#8211; 200 nt, it is not surprising that a correlation is found for upstream sequences of a similar length.</p>
         </sec>
         <sec>
            <st>
               <p>Generality of the findings</p>
            </st>
            <p>We used as a starting point for this investigation the well-documented regulatory uORFs of <it>S. cerevisiae GCN4</it>. We found their evolutionary conservation to extend quite far, even beyond <it>Ashbya</it>. We did not find another example of such extensive conservation among the set of high-scoring uORFs. In fact we have identified no other uORF that is preserved in all seven <it>Saccharomyces </it>species, not even among genes with previously well-characterised functional uORFs such as <it>CLN3</it>, <it>YAP1 </it>or <it>YAP2</it>. Several components of the pathway regulating <it>GCN4 </it>expression through modulation of translation of its mRNA, <it>e.g</it>. the protein kinase Gcn2, are conserved also in plants and animals. Translational control through uORFs could potentially be a very widespread mechanism for <it>GCN4 </it>homologues, and in this respect this gene could represent a special case. Another aspect of <it>GCN4 </it>is the arrangement of 4 uORFs acting together in an intricate regulatory pattern. It is only uORF4, the most gene-proximal one, that conforms to the criterion of being located within 150 nt from the start codon of the gene. Translation of this uORF precludes translation of the main gene <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>. It is thus conceivable that the uORFs predicted to be functional in this work represent a subgroup with negative regulatory properties.</p>
            <p>Within the group of conserved uORFs that we have examined, there is a high covariance between the uORF being short (&lt; 10 codons) and its lying at a certain distance (50 &#8211; 150 nucleotides) from the start codon of the main ORF. It is likely that we have defined a subset of genes containing uORFs similar to uORF4 of <it>GCN4</it>, which shares these properties. Other classes of genes with uORFs with a demonstrated functional role in translational regulation include <it>YAP1</it>, <it>YAP2 </it>and <it>PET111</it>. The uORFs of these genes are much longer (16 codons) and overlap with each other (<it>PET111</it>) and with the main ORF. It has been argued that the longer the uORF, the lower the reinitiation frequency immediately downstream of it <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>. The short uORFs in the <it>GCN4 </it>class may reflect the need for flexible reinitiation frequencies, using the uORF as a regulatory device: if the uORF is too long, then translation would be constitutively off. If so, then clearly the much longer uORFs in the other two classes should also completely repress translational reinitiation, given the narrow optimum for uORFs in the <it>GCN4 </it>class. It follows that the sequence requirements for uORFs in the other two classes have to follow different principles, and the mechanisms of action of these uORFs are presumably different from those in the <it>GCN4 </it>class. Indeed, post-termination events have been invoked to explain the action of uORFs in the <it>YAP2 </it>mRNA <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>.</p>
            <p>Our initial set of uORFs with a known functional role contained a large majority of <it>GCN4</it>-like genes, and this is a likely explanation why we have arrived at a set of rules that is biased in their favour and describes similar uORFs. Another, not mutually exclusive, explanation is that the <it>GCN4 </it>class is more homogeneous in terms of sequence requirements than other classes. A third alternative would be that <it>GCN4</it>-like uORFs are simply much more numerous in the genome, which would facilitate their detection.</p>
         </sec>
         <sec>
            <st>
               <p>Perspective</p>
            </st>
            <p>Regulation by uORFs is in principle detectable by several experimental methods. Using fractionation of mRNA bound to several ribosomes (polysomes) or to one ribosome or ribosomal subunit (monosomes), one can observe the <it>GCN4 </it>mRNA accumulating in the monosomal fraction (characteristic of translation initiation of uORFs) under conditions of good nitrogen availability, and migrating to polysomal fractions (indicative of translation of the main ORF) under conditions of nitrogen starvation <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>. With global approaches to translational regulation, one can separate polysomal from monosomal RNA and analyse the relative abundance of all cellular mRNAs on microarrays <abbrgrp><abbr bid="B34">34</abbr><abbr bid="B35">35</abbr><abbr bid="B36">36</abbr></abbrgrp>. To enrich for translationally regulated transcripts, Arava et al. <abbrgrp><abbr bid="B34">34</abbr></abbrgrp> examined mRNAs co-sedimenting with monosomes using this approach. A combination of microarray experiments displaying polysomal association under several different conditions should be an efficient way to experimentally verify the predictions from this work.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <sec>
            <st>
               <p>Sequence collections and databases</p>
            </st>
            <p>From a database of 5'-UTRs from genes where the transcript start sites have been mapped <abbrgrp><abbr bid="B18">18</abbr><abbr bid="B37">37</abbr></abbrgrp>, we extracted 294 5'-UTR sequences from <it>S. cerevisiae </it>and catalogued all uORFs (see electronic supplement). Genome sequences of <it>S. paradoxus, S. mikatae </it>and <it>S. bayanus</it>, as well as tabulated information about syntenic regions, were taken from Kellis et al. <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>, whereas the genome sequences from <it>S. kudriavzevii</it>, <it>S. castellii </it>and <it>S. kluyveri </it>were taken from Cliften et al., 2003 <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>. Both datasets were downloaded from the Saccharomyces Genome Database (SGD <abbrgrp><abbr bid="B38">38</abbr></abbrgrp>). 5' flanking sequences from orthologous genes were extracted from databases, and uORFs were detected in them in all six reading frames using getorf with no upper or lower limits set for ORF length <abbrgrp><abbr bid="B39">39</abbr></abbrgrp>. Intergenic sequences from the seven species were collected from the homepage of the Martha L. Bulyk laboratory at Harvard University <abbrgrp><abbr bid="B40">40</abbr></abbrgrp>.</p>
         </sec>
         <sec>
            <st>
               <p>Alignment and visualisation of conservation of uORFs</p>
            </st>
            <p>A series of Perl scripts <abbrgrp><abbr bid="B41">41</abbr></abbrgrp> were developed and used for performing large-scale batch analyses on the data. Upstream regions were extracted from <it>S. cerevisiae </it>and open reading frames were identified using the software getorf <abbrgrp><abbr bid="B39">39</abbr></abbrgrp>. The candidate uORFs were assessed by an expert system (see next section) to produce a list of candidates sorted by their obtained score based on a set of rules. These candidates were aligned to the homologous regions in the six other species to verify their integrity using the AlignX module of Vector NTI Suite (Informax) and the alignment was visualised along with its DNA similarity profile (Fig. <figr fid="F4">4</figr>). Overviews of candidate uORFs in the syntenic upstream regions of the seven species were also plotted using a custom Java application (<abbrgrp><abbr bid="B41">41</abbr></abbrgrp>; Fig. <figr fid="F5">5</figr>). We have maintained the established numbering of uORFs in the 5' to 3' direction for genes where the sequences were derived from the 5'-UTR of mRNAs (thus the well-characterised inhibitory uORF4 of <it>GCN4 </it>keeps its name), whereas numbering starts at the AUG of the main ORF and runs 3' to 5' for cases where genomic sequence was used. This is indicated in the respective tables.</p>
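            <p>The cataloguing step described above can be illustrated with a minimal Python sketch. It is not the getorf program or the Perl scripts used in this work: it scans only the forward strand of an upstream sequence, and the function name, the returned fields, and the convention of counting codons from the ATG up to but excluding the stop codon are assumptions made for illustration.</p>

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_uorfs(upstream):
    """Return the uORFs on the forward strand of an upstream sequence.

    `upstream` is assumed to end immediately before the start codon of
    the main ORF. Each hit records the 0-based start of the ATG, the
    position just past the stop codon, the length in codons (ATG
    included, stop excluded) and the distance in nt from the end of the
    uORF to the main ORF. uORFs without an in-frame stop inside
    `upstream` (i.e. overlapping the main ORF) are not reported.
    """
    seq = upstream.upper()
    hits = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon in the frame set by this ATG.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                end = pos + 3
                hits.append({
                    "start": start,
                    "end": end,
                    "codons": (pos - start) // 3,
                    "distance": len(seq) - end,
                })
                break
    return hits
```

            <p>Calling <it>find_uorfs</it> on the toy sequence CCATGAAATAGCCCCC reports a single uORF of 2 codons ending 5 nt upstream of the main ORF.</p>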
         </sec>
         <sec>
            <st>
               <p>Prediction of uORF functionality using an expert system</p>
            </st>
            <p>A simple expert system was constructed to predict which uORFs were likely to affect gene expression. Attribute values describing the properties of genes and uORFs were derived from different genome sequences using a suite of programs written in Perl and Java. Attributes of interest were intergenic sequence length, the number of uORFs, the length of each uORF, and the distance in nucleotides from the uORF to the start of the main gene. These values were loaded into frame structures in an expert system shell <abbrgrp><abbr bid="B41">41</abbr></abbrgrp>.</p>
            <p>The expert system uses a MYCIN-like certainty factor (cf) model for representing and reasoning with uncertain data and rules <abbrgrp><abbr bid="B42">42</abbr></abbrgrp>. Cfs are values in the range -1.0 to +1.0. A value of +1.0 means that we are sure of something; a value of -1.0 means that something is definitely untrue; a value of zero means that we know nothing about whether a piece of knowledge is true or not. A set of production rules for inferring whether a uORF was likely to affect gene expression was written manually and each rule was assigned a certainty factor representing our confidence in a consequent being true if all of the antecedents are true. These rules were loaded into the expert system's rule base, and forward chaining inference was used to apply the rules to the data. If the same prediction was made for a uORF using two or more different lines of inference, then the cfs associated with these were combined as in MYCIN <abbrgrp><abbr bid="B42">42</abbr></abbrgrp>. The resulting cf with which each uORF was predicted to affect gene expression was used to score the uORF.</p>
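            <p>As an illustration of how certainty factors from independent lines of inference can be merged, the following Python sketch implements the combination function commonly associated with MYCIN-style systems (in the form usually attributed to EMYCIN). It is a sketch under that assumption, not the code of the expert system shell used here; the function names are hypothetical.</p>

```python
from functools import reduce


def combine_cf(a, b):
    """Combine two certainty factors, each in the range -1.0 to +1.0."""
    if min(a, b) >= 0:
        # Two supporting lines of evidence reinforce each other.
        return a + b * (1 - a)
    if 0 >= max(a, b):
        # Two opposing lines of evidence reinforce each other.
        return a + b * (1 + a)
    # Conflicting evidence: the weaker factor damps the stronger one.
    denom = 1 - min(abs(a), abs(b))
    if denom == 0:
        raise ValueError("cannot combine cf = +1.0 with cf = -1.0")
    return (a + b) / denom


def combine_all(cfs):
    """Fold a sequence of certainty factors, starting from 'unknown' (0.0)."""
    return reduce(combine_cf, cfs, 0.0)
```

            <p>With this function, two rules supporting the same conclusion with cfs of 0.6 and 0.5 yield a combined cf of 0.8, so agreement between rules raises the score without ever exceeding +1.0.</p>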
            <p>As a first step, the rules were applied to training data consisting of a set of 16 genes containing uORFs, 9 of which were known to affect translational activity (see Fig. <figr fid="F2">2</figr>, set A). The rules and their associated cfs were adjusted by hand until the expert system could distinguish between positive and negative training examples. A threshold value for the cf score for positive examples was determined by looking at the cfs inferred for known functional uORFs. The attribute values of the expert system and their certainty factors are given in additional file <supplr sid="S5">5</supplr>.</p>
            <suppl id="S5">
               <title>
                  <p>Additional file 5</p>
               </title>
               <text>
                  <p>Attribute values of the expert system and their certainty factor</p>
               </text>
               <file name="1471-2105-8-295-S5.doc">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <p>Having built a rule base and selected a threshold score for predicting likely functional uORFs, the expert system was used to classify all uORF-containing genes in the <it>S. cerevisiae </it>genome as likely or unlikely to be regulated by uORFs. A gene was predicted to be a "good candidate" if at least one of its uORFs was inferred to have a functional role with a cf score above the selected threshold. The highest cf value for any one of a gene's uORFs was used as the score for the gene itself.</p>
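            <p>The gene-level classification described above amounts to taking the maximum uORF score per gene and comparing it with the threshold. A minimal Python sketch follows; the function name, the input layout, and any threshold value passed to it are hypothetical, since the actual threshold is not restated here.</p>

```python
def score_genes(uorf_scores, threshold):
    """Score each gene by the highest cf among its uORFs.

    `uorf_scores` maps a gene name to a list of cf scores, one per uORF.
    Returns a dict mapping gene name to (gene score, is_good_candidate),
    where a gene is a good candidate if its best uORF scores above the
    threshold. Genes without any uORF are skipped.
    """
    results = {}
    for gene, cfs in uorf_scores.items():
        if not cfs:
            continue
        best = max(cfs)
        results[gene] = (best, best > threshold)
    return results
```

            <p>For example, a gene whose uORFs score 0.20 and 0.95 receives a gene score of 0.95 and is flagged as a good candidate for any threshold below that value.</p>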
         </sec>
         <sec>
            <st>
               <p>Calculation of synonymous and non-synonymous substitutions</p>
            </st>
            <p>The ratio of synonymous to non-synonymous substitution mutations within uORFs and in protein-coding yeast DNA was calculated. Homologous sequences from the seven species were identified using BLASTN and aligned with CLUSTALW, and differences from the <it>S. cerevisiae </it>sequence were recorded.</p>
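            <p>A simplified version of this comparison can be sketched in Python as follows. The sketch assumes two gap-free, in-frame, unambiguous sequences of equal length taken from an alignment, uses the standard genetic code, and classifies each differing codon as a whole rather than each nucleotide substitution; it does not reproduce the BLASTN and CLUSTALW steps, and all names are illustrative.</p>

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def count_substitutions(ref, other):
    """Count codons that differ synonymously and non-synonymously.

    `ref` and `other` are aligned DNA sequences of equal length, in
    frame and without gaps. A differing codon pair is synonymous if both
    codons encode the same amino acid (or both are stop codons), and
    non-synonymous otherwise. Returns (synonymous, non_synonymous).
    """
    if len(ref) != len(other) or len(ref) % 3:
        raise ValueError("sequences must be in frame and of equal length")
    syn = nonsyn = 0
    for i in range(0, len(ref), 3):
        a = ref[i:i + 3].upper()
        b = other[i:i + 3].upper()
        if a == b:
            continue
        if CODON_TABLE[a] == CODON_TABLE[b]:
            syn += 1
        else:
            nonsyn += 1
    return syn, nonsyn
```

            <p>Comparing ATGAAACTT with ATGAAGCCT gives one synonymous change (AAA to AAG, both lysine) and one non-synonymous change (CTT to CCT, leucine to proline).</p>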
         </sec>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>We have identified criteria that distinguish uORFs in the yeast genome that are conserved in evolution. These are: short length of the uORF (4 &#8211; 6 codons); optimal distance from the main ORF (50 &#8211; 250 nt); greater than average distance to neighbouring uORFs; weaker than average folding energies of the 5'-UTR. These rules probably apply not to all functional uORFs in the genome, but to those similar to uORFs in <it>GCN4</it>. Evolutionary conservation of most uORFs identified extends to separation times between 20 and 100 million years ago, but that of the <it>GCN4 </it>uORFs extends considerably beyond that. Using these criteria, we have identified 252 genes with uORFs that we predict to be functional, and short-listed 32 among those. We subsequently determined that the majority of these are located within transcripts. We found no bias in G/C composition near uORFs. We also found no evidence indicating that the encoded peptide of most uORFs identified in this study would play a functional role in regulation. Genes containing uORFs predicted to be functional were enriched for functions in transcriptional control, cell polarity, sporulation and development, with several genes encompassing more than one of these categories.</p>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>MC compiled and extracted sequence data, formulated the initial rule set, assessed all alignments, and performed data analyses. DD wrote all Perl scripts and assisted in data analysis. EB formulated questions about sequence determinants of uORFs. GK implemented the rule set into expert system software. PS conceived the study and drafted the manuscript.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>We thank Marina Axelsson-Fisk and Olle Nerman for helpful discussions. Francesco Strino is acknowledged for development of a Java application for visualisation of predicted uORFs. This work was supported by a grant from the Swedish Research Council for Science and Technology (2003-3189) to P.S.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Pushing the limits of the scanning mechanism for initiation of translation</p>
            </title>
            <aug>
               <au>
                  <snm>Kozak</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Gene</source>
            <pubdate>2002</pubdate>
            <volume>299</volume>
            <fpage>1</fpage>
            <lpage>34</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0378-1119(02)01056-9</pubid>
                  <pubid idtype="pmpid" link="fulltext">12459250</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Upstream open reading frames as regulators of mRNA translation</p>
            </title>
            <aug>
               <au>
                  <snm>Morris</snm>
                  <fnm>DR</fnm>
               </au>
               <au>
                  <snm>Geballe</snm>
                  <fnm>AP</fnm>
               </au>
            </aug>
            <source>Mol Cell Biol</source>
            <pubdate>2000</pubdate>
            <volume>20</volume>
            <fpage>8635</fpage>
            <lpage>8642</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">86464</pubid>
                  <pubid idtype="pmpid" link="fulltext">11073965</pubid>
                  <pubid idtype="doi">10.1128/MCB.20.23.8635-8642.2000</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Regulation of fungal gene expression via short open reading frames in the mRNA 5'untranslated region</p>
            </title>
            <aug>
               <au>
                  <snm>Vilela</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>McCarthy</snm>
                  <fnm>JE</fnm>
               </au>
            </aug>
            <source>Mol Microbiol</source>
            <pubdate>2003</pubdate>
            <volume>49</volume>
            <fpage>859</fpage>
            <lpage>867</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1046/j.1365-2958.2003.03622.x</pubid>
                  <pubid idtype="pmpid" link="fulltext">12890013</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>A segment of mRNA encoding the leader peptide of the CPA1 gene confers repression by arginine on a heterologous yeast gene transcript</p>
            </title>
            <aug>
               <au>
                  <snm>Delbecq</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Werner</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Feller</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Filipkowski</snm>
                  <fnm>RK</fnm>
               </au>
               <au>
                  <snm>Messenguy</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Pi&#233;rard</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Mol Cell Biol</source>
            <pubdate>1994</pubdate>
            <volume>14</volume>
            <fpage>2378</fpage>
            <lpage>2390</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">358605</pubid>
                  <pubid idtype="pmpid" link="fulltext">8139542</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Ribosome occupancy of the yeast CPA1 upstream open reading frame termination codon modulates nonsense-mediated mRNA decay</p>
            </title>
            <aug>
               <au>
                  <snm>Gaba</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Jacobson</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Sachs</snm>
                  <fnm>MS</fnm>
               </au>
            </aug>
            <source>Mol Cell</source>
            <pubdate>2005</pubdate>
            <volume>20</volume>
            <fpage>449</fpage>
            <lpage>460</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.molcel.2005.09.019</pubid>
                  <pubid idtype="pmpid" link="fulltext">16285926</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Suppression of ribosomal reinitiation at upstream open reading frames in amino acid-starved cells forms the basis for GCN4 translational control</p>
            </title>
            <aug>
               <au>
                  <snm>Abastado</snm>
                  <fnm>JP</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>PF</fnm>
               </au>
               <au>
                  <snm>Jackson<