<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>gb-2001-2-11-research0048</ui>
   <ji>GBJ</ji>
   <fm>
      <dochead>Research</dochead>
      <bibl>
         <title>
            <p>Prediction of co-regulated genes in <it>Bacillus subtilis</it> on the basis of upstream elements conserved across three closely related species</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Terai</snm>
               <fnm>Goro</fnm>
               <insr iid="I1"/>
               <insr iid="I2"/>
            </au>
            <au id="A2">
               <snm>Takagi</snm>
               <fnm>Toshihisa</fnm>
               <insr iid="I1"/>
            </au>
            <au id="A3" ca="yes">
               <snm>Nakai</snm>
               <fnm>Kenta</fnm>
               <insr iid="I1"/>
               <email>knakai@ims.u-tokyo.ac.jp</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokane-dai, Minato-ku, Tokyo 108-8639, Japan</p>
            </ins>
            <ins id="I2">
               <p>INTEC Web and Genome Informatics Corp., 1-3-3 Shinsuna, Koto-ku, Tokyo 136-8637, Japan</p>
            </ins>
         </insg>
         <source>Genome Biology</source>
         <issn>1465-6906</issn>
         <pubdate>2001</pubdate>
         <volume>2</volume>
         <issue>11</issue>
         <fpage>research0048.1</fpage>
         <lpage>research0048.12</lpage>
         <url>http://genomebiology.com/2001/2/11/research/0048</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="doi">10.1186/gb-2001-2-11-research0048</pubid>
               <pubid idtype="pmpid">11737947</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>10</day>
               <month>7</month>
               <year>2001</year>
            </date>
         </rec>
         <revrec>
            <date>
               <day>6</day>
               <month>9</month>
               <year>2001</year>
            </date>
         </revrec>
         <acc>
            <date>
               <day>13</day>
               <month>9</month>
               <year>2001</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>15</day>
               <month>10</month>
               <year>2001</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2001</year>
         <collab>Terai et al., licensee BioMed Central Ltd</collab>
      </cpyrt>
      <shortabs>
         <p>Three closely related species of <it>Bacillus</it> were used to identify co-regulated genes in <it>Bacillus subtilis</it>. In total, 1,884 phylogenetically conserved elements were identified in the upstream intergenic regions of 1,568 <it>B. subtilis</it> genes.</p>
      </shortabs>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Identification of co-regulated genes is essential for elucidating transcriptional regulatory networks and the function of uncharacterized genes. Although co-regulated genes should share at least one common sequence element, it is generally difficult to identify them from the presence of this element alone because it is easily obscured by noise. To overcome this problem, we used sequence conservation across three closely related species: <it>Bacillus subtilis</it>, <it>B. halodurans</it> and <it>B. stearothermophilus</it>.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>Even though these species have a limited number of clearly orthologous genes, we obtained 1,884 phylogenetically conserved elements from the upstream intergenic regions of 1,568 <it>B. subtilis</it> genes. Similarity between these elements was used to cluster the genes. No other <it>a priori</it> knowledge of genes or elements was used. We identified some genes known or suggested to be regulated by a common transcription factor, as well as genes regulated by a common attenuation effector.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusions</p>
               </st>
               <p>We confirmed that our method generates relatively few false positives in clusters with higher scores and that general elements such as -35/-10 boxes and the Shine-Dalgarno sequence are not major obstacles. Moreover, we identified some plausible additional members of groups of known co-regulated genes. Thus, our approach is promising for exploring potentially co-regulated genes.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="BMC" subtype="man_spc_id" id="30010002">Bioinformatics</classification>
         <classification type="BMC" subtype="man_spc_id" id="30010016">Molecular biology</classification>
         <classification type="BMC" subtype="man_spc_id" id="30010014">Microbiology and parasitology</classification>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>Transcriptional regulatory networks are important for controlling many biological phenomena, such as development and cell proliferation. Even in bacteria, elucidation of such networks or identification of co-regulated genes (regulons) is essential for understanding many cellular processes. Because co-regulated genes are likely to serve the same purpose, identifying them can also provide clues to gene function. The microarray technique, which makes it possible to monitor the expression levels of thousands of genes in parallel, appears very powerful for identifying co-regulated genes, and several articles on this technique have been published [<abbr bid="B1">1</abbr>,<abbr bid="B2">2</abbr>,<abbr bid="B3">3</abbr>]. Even if experimental artifacts can be ignored, however, it is not always easy to find experimental conditions under which uncharacterized genes show differential expression patterns. Thus, it would be desirable to develop computational methods that can supplement such experimental techniques.</p>
         <p>In recent years, several computational approaches to identifying co-regulated genes have been reported. Because transcription is regulated by transcription factors that bind DNA in a sequence-specific manner, comparison of gene upstream regions could, in principle, identify co-regulated genes. Thus, the classical and most widely used method for predicting co-regulated genes is to search upstream regions for sequence segments similar to known binding sites for transcription factors [<abbr bid="B4">4</abbr>,<abbr bid="B5">5</abbr>,<abbr bid="B6">6</abbr>]. This approach is, however, applicable only when information on binding sites is available. Furthermore, as the DNA sequences recognized by a single transcription factor are only about 6-10 base pairs (bp) long and are not strictly conserved, many false-positive matches are unavoidable.</p>
         <p>One way to overcome this difficulty is to use conservation information across species. New members of co-regulated genes have been predicted on the basis of conservation of hypothetical transcriptional regulatory sites between several eubacteria such as <it>Escherichia coli</it> and <it>Haemophilus influenzae</it> [<abbr bid="B7">7</abbr>,<abbr bid="B8">8</abbr>,<abbr bid="B9">9</abbr>]. A similar approach was also applied to the analysis of four archaeal candidate regulons [<abbr bid="B10">10</abbr>]. In that approach, the heuristic that many binding sites are quasi-palindromic was also used. McGuire <it>et al.</it> have exploited the possibility of using conservation in a wider range of species [<abbr bid="B11">11</abbr>,<abbr bid="B12">12</abbr>]. To reduce false-positive hits, candidate genes were prescreened using <it>a priori</it> knowledge such as their function, the metabolic pathway they belong to, and their functional coupling predicted from conserved operons, protein fusions and correlated evolution. Techniques for detecting conserved elements in noncoding regions across species have also been studied [<abbr bid="B13">13</abbr>,<abbr bid="B14">14</abbr>,<abbr bid="B15">15</abbr>].</p>
         <p>For bacterial genes, McCue <it>et al.</it> developed an elaborate algorithm for detecting potential binding sites in sets of upstream regions of orthologous genes [<abbr bid="B16">16</abbr>]. Their method also assumes the palindromic nature of binding sites. Thus, it is evident that such a method would fail to detect non-palindromic binding sites, of which there are many. It is also questionable whether the molecular mechanisms of transcription in distantly related bacteria have been well conserved and whether each orthologous transcription factor recognizes exactly the same consensus pattern in each species. Furthermore, the problem of detecting conserved elements is not simple; we should carefully observe each case of conservation and optimize parameters to detect as many known binding sites as possible.</p>
         <p>In this paper, we used three closely related genome sequences to predict co-regulated genes of <it>Bacillus subtilis</it>. Our method consists of two parts: first, we identified phylogenetically conserved elements (PCEs) in the upstream intergenic regions of <it>B. subtilis</it> genes; then, the genes were clustered according to the similarity of the PCEs in their upstream regions. In addition, each of the resulting clusters, predicted to be co-regulated, was examined in terms of existing knowledge of regulons and functional information on the downstream genes. The species used for this analysis are <it>B. subtilis</it> [<abbr bid="B17">17</abbr>], <it>B. halodurans</it> [<abbr bid="B18">18</abbr>] and <it>B. stearothermophilus</it> (genome sequence incomplete; see Materials and methods). We selected these species for three reasons. First, because they are closely related, their regulatory mechanisms are also likely to be conserved, which makes the comparison of upstream regions of orthologous genes more straightforward to interpret. Second, we have constructed a database (DBTBS) of <it>B. subtilis</it> promoters and transcription factors by literature survey [<abbr bid="B19">19</abbr>,<abbr bid="B20">20</abbr>], which makes it easier to check the predictions and optimize parameters. Third, an international project on functional genomics of <it>B. subtilis</it>, including transcriptome analysis, is ongoing [<abbr bid="B21">21</abbr>]; our predictions therefore have a better chance of being tested experimentally. Here we report the results of our prediction of co-regulated genes in <it>B. subtilis</it> without any prior knowledge or assumptions. An extensive evaluation of these results is also described.</p>
      </sec>
      <sec>
         <st>
            <p>Results and discussion</p>
         </st>
         <sec>
            <st>
               <p>Detection of PCEs and their verification</p>
            </st>
            <p>We were able to analyze the upstream regions of 1,568 <it>B. subtilis</it> genes. For 675 of them, orthologous genes were found in both <it>B. halodurans</it> and <it>B. stearothermophilus</it>; for 706, in <it>B. halodurans</it> only; and for 187, in <it>B. stearothermophilus</it> only. The genome sequence of <it>B. stearothermophilus</it> is still incomplete; the available sequence totaled 3,286,068 bp as of 21 February 2001. If we assume that the genome of <it>B. stearothermophilus</it> is about the same size as that of <it>B. subtilis</it>, the available data roughly correspond to three-quarters of all its genes.</p>
            <p>Within the upstream regions of these 1,568 genes, we identified 1,884 PCEs. For comparison, we generated five pseudogenomes with scrambled upstream regions: we took all upstream regions of these genes and placed them at random in front of randomly chosen genes. The same PCE identification procedure was then applied to each pseudogenome. PCEs detected in these pseudogenomes can essentially be regarded as spurious. On average, 793 spurious PCEs were identified (standard deviation 26.7). Figure <figr fid="F1">1</figr> shows the histogram of the scores calculated for the actual and spurious PCEs. The scores of spurious PCEs are relatively low, suggesting that spurious PCEs are relatively short. Because the average number of spurious PCEs (793) is less than half of the 1,884 PCEs detected in the real genomes, we estimate that over half of the 1,884 PCEs are meaningful and that this proportion is higher for longer PCEs. The PCEs were also compared with known binding sites for transcription factors, using the DBTBS database [<abbr bid="B19">19</abbr>,<abbr bid="B20">20</abbr>] and a literature survey. Table <tblr tid="T1">1</tblr> summarizes the result for each known transcription factor. In total, 52 of 122 known binding sites overlapped with the PCEs. For some transcription factors, such as GltR, ComA and IolR, orthologs of the factor genes themselves could not be identified, whereas for others, such as DegU and GerE, orthologs of most of the regulated genes could not be found. On the other hand, 6 of 11 known binding sites of CcpA overlapped with PCEs.</p>
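            <p>The scrambling control and the resulting estimate can be illustrated with a minimal sketch. It assumes that scrambling amounts to a random permutation of upstream regions among genes, which is one possible reading of the procedure described above; the function names, gene names and toy sequences are hypothetical and are not taken from the original analysis. Only the counts 1,884 and 793 come from the text.</p>
            <p>
```python
import random


def scramble_upstream(upstream_by_gene, seed):
    """Build one 'pseudogenome' by permuting upstream regions among genes.

    Assumption: each upstream region is reassigned to exactly one gene.
    """
    rng = random.Random(seed)
    genes = list(upstream_by_gene)
    regions = [upstream_by_gene[g] for g in genes]
    rng.shuffle(regions)
    return dict(zip(genes, regions))


def meaningful_fraction(observed, mean_spurious):
    """Rough share of observed elements not accounted for by the control."""
    return 1.0 - mean_spurious / observed


# Toy input: hypothetical gene names and sequences, for illustration only.
toy = {"geneA": "ATGCAT", "geneB": "TTGACA", "geneC": "TATAAT", "geneD": "GGAGGT"}
pseudo = scramble_upstream(toy, seed=0)

# Counts reported in the text: 1,884 PCEs, 793 spurious PCEs on average.
fraction = meaningful_fraction(1884, 793)
print(f"{fraction:.3f}")
```
            </p>
            <p>With the reported counts, this rough estimate is about 0.58, in line with the statement that over half of the PCEs are meaningful. It is only a global figure and does not indicate which individual PCEs are spurious.</p>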
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>Histogram of PCE scores calculated from sequence alignments</p>
               </caption>
               <text>
                  <p>Histogram of PCE scores calculated from sequence alignments. <b>(a)</b> Three or <b>(b)</b> two sequences were aligned. Green bars correspond to the scores of actual PCEs and yellow bars to the scores of spurious PCEs generated by joining upstream regions with unrelated coding regions. Yellow bars show the average of five trials, with error bars.</p>
               </text>
               <graphic file="gb-2001-2-11-research0048-1"/>
            </fig>
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>Correspondence between known transcription factor binding sites and PCEs</p>
               </caption>
               <tblbdy cols="5">
                  <r>
                     <c ca="left">
                        <p>Factor name</p>
                     </c>
                     <c ca="center">
                        <p>Orthologs<sup>*</sup></p>
                     </c>
                     <c ca="center">
                        <p>Number of known sites<sup>&#8224;</sup></p>
                     </c>
                     <c ca="center">
                        <p>Number of sites to be detected<sup>&#8225;</sup></p>
                     </c>
                     <c ca="center">
                        <p>Number of overlaps<sup>&#167;</sup></p>
                     </c>
                  </r>
                  <r>
                     <c cspan="5">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>AbrB</p>
                     </c>
                     <c ca="center">
                        <p>H S</p>
                     </c>
                     <c ca="center">
                        <p>11 (1)</p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>AhrC</p>
                     </c>
                     <c ca="center">
                        <p>H S</p>
                     </c>
                     <c ca="center">
                        <p>5 (1)</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>AraR</p>
                     </c>
                     <c ca="center">
                        <p>H</p>
                     </c>
                     <c ca="center">
                        <p>5 (1)</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>BirA</p>
                     </c>
                     <c ca="center">
                        <p>H S</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>BltR</p>
                     </c>
                     <c ca="center">
                        <p>H</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>BmrR</p>
                     </c>
                     <c ca="center">
                        <p>None</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>CcpA</p>
                     </c>
                     <c ca="center">
                        <p>H S</p>
                     </c>
                     <c ca="center">
                        <p>33 (17)</p>
                     </c>
                     <c ca="center">
                        <p>11</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>CodY</p>
                     </c>
                     <c ca="center">
                        <p>H S</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>ComA</p>
                     </c>
                     <c ca="center">
                        <p>None</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>ComK</p>
                     </c>
                     <c ca="center">
                        <p>None</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>CtsR</p>
                     </c>
                     <c ca="center">
                        <p>H S</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>DegU</p>
                     </c>
                     <c ca="center">
                        <p>H S</p>
                     </c>
                     <c ca="center">
                        <p>14 (3)</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>DeoR</p>
                     </c>
                     <c ca="center">
                        <p>H</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>LexA</p>
                     </c>
                     <c ca="center">
                        <p>H</p>
                     </c>
                     <c ca="center">
                        <p>8</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>ExuR</p>
                     </c>
                     <c ca="center">
                        <p>S</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Fnr</p>
                     </c>
                     <c ca="center">
                        <p>H</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>GerE</p>
                     </c>
                     <c ca="center">
                        <p>H S</p>
                     </c>
                     <c ca="center">
                        <p>21 (2)</p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>GlnR</p>
                     </c>
                     <c ca="center">
                        <p>S</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>GltC</p>
                     </c>
                     <c ca="center">
                        <p>H S</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>GltR</p>
                     </c>
                     <c ca="center">
                        <p>None</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>GntR</p>
                     </c>
                     <c ca="center">
                        <p>None</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Hpr</p>
                     </c>
                     <c ca="center">
                        <p>H</p>
                     </c>
                     <c ca="center">
                        <p>8 (1)</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>HrcA</p>
                     </c>
                     <c ca="center">
                        <p>H S</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>IolR</p>
                     </c>
                     <c ca="center">
                        <p>None</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>LevR</p>
                     </c>
                     <c ca="center">
                        <p>S</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>LicT</p>
                     </c>
                     <c ca="center">
                        <p>H</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>LrpC</p>
                     </c>
                     <c ca="center">
                        <p>H</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Mta</p>
                     </c>
                     <c ca="center">
                        <p>H S</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>MtrB</p>
                     </c>
                     <c ca="center">
                        <p>H S</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>PhoP</p>
                     </c>
                     <c ca="center">
                        <p>H S</p>
                     </c>
                     <c ca="center">
                        <p>6 (2)</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>PyrR</p>
                     </c>
                     <c ca="center">
                        <p>H S</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>PurR</p>
                     </c>
                     <c ca="center">
                        <p>H S</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>RibC</p>
                     </c>
                     <c ca="center">
                        <p>H S</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>RocR</p>
                     </c>
                     <c ca="center">
                        <p>H</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>SacT</p>
                     </c>
                     <c ca="center">
                        <p>None</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>SacY</p>
                     </c>
                     <c ca="center">
                        <p>None</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>SenS</p>
                     </c>
                     <c ca="center">
                        <p>None</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>SinR</p>
                     </c>
                     <c ca="center">
                        <p>H</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Spo0A</p>
                     </c>
                     <c ca="center">
                        <p>H S</p>
                     </c>
                     <c ca="center">
                        <p>22 (1)</p>
                     </c>
                     <c ca="center">
                        <p>18</p>
                     </c>
                     <c ca="center">
                        <p>10</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>SpoIIID</p>
                     </c>
                     <c ca="center">
                        <p>H S</p>
                     </c>
                     <c ca="center">
                        <p>12 (5)</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>TnrA</p>
                     </c>
                     <c ca="center">
                        <p>H S</p>
                     </c>
                     <c ca="center">
                        <p>10</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>TreR</p>
                     </c>
                     <c ca="center">
                        <p>H S</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Xre</p>
                     </c>
                     <c ca="center">
                        <p>H</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>XylR</p>
                     </c>
                     <c ca="center">
                        <p>H S</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>MntR</p>
                     </c>
                     <c ca="center">
                        <p>H S</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Zur</p>
                     </c>
                     <c ca="center">
                        <p>H S</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Total</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>232 (34)</p>
                     </c>
                     <c ca="center">
                        <p>122</p>
                     </c>
                     <c ca="center">
                        <p>52</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p><sup>*</sup>Species that have an ortholog of the <it>B. subtilis</it> gene. H: <it>B. halodurans</it>; S: <it>B. stearothermophilus</it>. <sup>&#8224;</sup>Total number of experimentally verified binding sites shorter than 50 bp. The number of binding sites in the coding region is shown in parentheses. <sup>&#8225;</sup>Number of known binding sites in the region analyzed in this work. <sup>&#167;</sup>Number of analyzed sites overlapping with PCEs over 5 bp.</p>
               </tblfn>
            </tbl>
         </sec>
         <sec>
            <st>
               <p>Clustering of PCEs and its verification</p>
            </st>
            <p>The clustering process yielded 188 clusters, many of which contained known or possible co-regulated genes (see below). To estimate the number of false positives, we performed the same clustering procedure five times against the 1,884 PCEs of randomly shuffled sequences. Figure <figr fid="F2">2</figr> shows the histogram of similarity scores used during these clustering processes. It shows that many false-positive clusters can occur by chance around a cut-off score of 60, but that they are rare above a score of 80. Although about half of the detected PCEs might be false positives, such PCEs are usually short (Figure <figr fid="F1">1</figr>) and the similarity scores between them are relatively low (Figure <figr fid="F2">2</figr>, blue bars). We therefore conclude that non-meaningful PCEs are rarely included in our clustering results, at least in the clusters with higher scores.</p>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>Histogram of similarity scores used during the clustering process</p>
               </caption>
               <text>
                  <p>Histogram of similarity scores used during the clustering process. Red bars represent clustering of PCEs within the upstream regions of orthologous genes, green bars the clustering of PCEs with randomly shuffled sequences, and blue bars the clustering of PCEs identified when the upstream regions are linked to unrelated coding regions. For the green and blue bars, average values are shown with their error bars.</p>
               </text>
               <graphic file="gb-2001-2-11-research0048-2"/>
            </fig>
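            <p>The shuffled-sequence control described above can be illustrated with a minimal Python sketch. It is not the implementation used in this work: it assumes that each element is shuffled independently with its base composition preserved, and it replaces the actual similarity score with a hypothetical placeholder (toy_score, ungapped percent identity). All function names are illustrative.</p>
            <p>
```python
import random
from statistics import mean


def shuffle_sequence(seq, rng):
    """Return a random permutation of seq, preserving its base composition."""
    bases = list(seq)
    rng.shuffle(bases)
    return "".join(bases)


def toy_score(a, b):
    """Placeholder score (an assumption, not the score used in the paper):
    ungapped percent identity over the shorter of the two elements."""
    length = min(len(a), len(b))
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / length


def count_pairs_at_or_above(elements, score_fn, cutoff):
    """Count element pairs whose similarity score reaches the cutoff."""
    n = len(elements)
    return sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if score_fn(elements[i], elements[j]) >= cutoff
    )


def shuffled_control(elements, score_fn, cutoff, repeats=5, seed=0):
    """Average the pair count over several independently shuffled element sets."""
    rng = random.Random(seed)
    counts = []
    for _ in range(repeats):
        shuffled = [shuffle_sequence(e, rng) for e in elements]
        counts.append(count_pairs_at_or_above(shuffled, score_fn, cutoff))
    return mean(counts)
```
            </p>
            <p>Comparing count_pairs_at_or_above on the real elements with shuffled_control at the same cut-off gives a rough indication of how many high-scoring pairs would be expected by chance at that cut-off.</p>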
         </sec>
         <sec>
            <st>
               <p>Prediction of co-regulated genes</p>
            </st>
            <p>Of the 188 clusters obtained, we excluded 34 because they arose from the alignment of hypothetical Shine-Dalgarno (SD) sequences (see below). The remaining clusters, ranked by the highest similarity score within each cluster, are available as a table (see Additional data files). We expect that many members of each cluster will be co-regulated by a common factor, especially when their similarity scores are above 80. We now discuss the clustered genes in terms of some typical regulons (Table <tblr tid="T2">2</tblr>).</p>
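            <p>The filtering and ranking step can be sketched as follows. This is only an illustration under stated assumptions: the input is taken to be a mapping from a cluster identifier to the pairwise similarity scores recorded within that cluster, and the list of clusters to exclude (for example, those formed by SD sequences) is supplied by hand. The function names and data layout are hypothetical.</p>
            <p>
```python
def rank_clusters(cluster_scores, excluded_ids=()):
    """Drop excluded clusters and sort the rest by their best pairwise score.

    cluster_scores: dict mapping a cluster id to a list of similarity scores
    (assumed layout, not taken from the paper).
    """
    excluded = set(excluded_ids)
    best = {
        cid: max(scores)
        for cid, scores in cluster_scores.items()
        if cid not in excluded and scores
    }
    # Highest-scoring clusters first.
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


def high_confidence(ranked, cutoff=80):
    """Return the ids of clusters whose best score is above the cutoff."""
    return [cid for cid, score in ranked if score > cutoff]
```
            </p>
            <p>With made-up scores such as {"c1": [62, 95], "c2": [70]}, rank_clusters places c1 before c2, and high_confidence keeps only c1 at the default cut-off of 80.</p>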
            <tbl id="T2">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p>Comparison of some typical regulons with our results</p>
               </caption>
               <tblbdy cols="4">
                  <r>
                     <c ca="left">
                        <p>Regulon</p>
                     </c>
                     <c ca="center">
                        <p>Gene<sup>*</sup></p>
                     </c>
                     <c ca="center">
                        <p>Cluster information<sup>&#8224;</sup></p>
                     </c>
                     <c ca="center">
                        <p>Sequence of PCE<sup>&#8225;</sup></p>
                     </c>
                  </r>
                  <r>
                     <c cspan="4">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p><it>pyr</it> operon (regulator: PyrR)</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>pyrR</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>34</p>
                     </c>
                     <c ca="center">
                        <p>AGTCCAGAGAGGCTGAGAAGGA-T</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>pyrP</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>34</p>
                     </c>
                     <c ca="center">
                        <p>AATCCAGAGAGGTTG</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>pyrB</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>C</p>
                     </c>
                     <c ca="center">
                        <p>CAGAGAGGCTT</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>S-box regulon (regulator: unknown)</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>metK</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1,11</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>yusC</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>ykrW</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1,5,11</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>yjcI</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>5,11</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>metE</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>ykrT</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>5,11</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>yitJ</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>5,11</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>cysH</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>B</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>yoaD</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>A</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>yxjG</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>A</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>yxjH</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>A</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Hypothetical xanthine regulon (regulator: unknown)</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>purE</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>20</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>xpt</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>14,20</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>pbuG</it>
                           <sup>*</sup>
                        </p>
                     </c>
                     <c ca="center">
                        <p>14</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Aminoacyl-tRNA synthetases (regulator: uncharged tRNA)</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>serS</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>AGGGTGGCAACGCGAG</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>valS</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>AAAAAAGGTGGTACCGCGA</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>thrS</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>GAAAAAAGGGTGGAACCACGA</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>tyrS</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>TTAGTAGGGTGGTACCGCGA</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>leuS</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>AGGGTGGTACCGCGGG</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>tyrZ</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>AGGGTGGTACCGCGTG</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>ilvB</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>AGGGTGGTACCGCGGAAAG</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>pheS</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>AATAAGGGTGGTACCGCG</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>hisS</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>AACTAGGGTGGCACCACGGGTAT..</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>glyQ</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>GCAACTAGGGTGGAACCGCGGG</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>alaS</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>AGGGTGGTACCGCGAG-A</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>ileS</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>AGGGTGGTACCGCGAGA</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>proB</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>AAGGTGGTACCACGGA</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>cysE</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>D</p>
                     </c>
                     <c ca="center">
                        <p>C-AAACAGAGTGGAACCGCG</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>trpS</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>C</p>
                     </c>
                     <c ca="center">
                        <p>AGGGTGG</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>thrZ</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>A</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Heat-shock regulon (regulator: CtsR)</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>ctsR</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>GTCAAATATAGTCAAAGTCA</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>clpE</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>GGTCAAAGATAGTCAAA</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>dnaJ</it>
                           <sup>*</sup>
                        </p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>GAAAGTCAAAGTCAGGCAT</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>clpP</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>B</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>CcpA regulon<sup>&#167;</sup> (regulator: CcpA)</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>bglS</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>47</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>TAGAAAACGCTTTCAA</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>msmX</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>47</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>GTAAACGCTTTCTT</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>yvfK</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>47</p>
                     </c>
                     <c ca="center">
                        <p>...TCTT-TAAAGCGCTTTCAT</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>mfd</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>47</p>
                     </c>
                     <c ca="center">
                        <p>GACCAAAGCGTTTTT</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>bglP</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>59</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>AAATGAAAGCGTTGACA</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>sucC</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>59</p>
                     </c>
                     <c ca="center">
                        <p>TATAGAATGAAAGCGC</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>mmgA</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>D</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>ATTGTAAGCGCT</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>hutP</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>D</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>AGTTAATAGTTATCAGA</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>rbsR</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>D</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>GTAAACGGTTACATAAACA</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>yxjC</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>B</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>ackA</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>B</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>licB</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>B</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>acuA</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>B</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>acsA</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>E</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>xylA</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>E</p>
                     </c>
                     <c ca="center">
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>iolB</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>E</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>galT</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>E</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>uxaC</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>E</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>ydhO</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>E</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>acoA</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>E</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>araB</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>E</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>lcfA</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>E</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>dra</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>E</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>kdgA</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>E</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>yobO</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>E</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>treP</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>E</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>yxkJ</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>A</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>amyE</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>A</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>gntR</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>A</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>xynP</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>A</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>levD</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>A</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>dctP</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>A</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>citM</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>A</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p><sup>*</sup>Probable new members identified by our analysis are marked with an asterisk. <sup>&#8224;</sup>Cluster number(s) are shown when available; otherwise, one of the following situation codes is shown: A, orthologous genes not found; B, no overlap between known binding site and PCE; C, PCE overlaps with known site but is too short; D, PCE overlaps with known site but is slightly different; E, binding site lies within the coding region. <sup>&#8225;</sup>PCE sequence in <it>B. subtilis</it>. The region overlapping with a known binding site is shown in bold. <sup>&#167;</sup>CcpA-dependent genes identified by a systematic experiment [31] are not included.</p>
               </tblfn>
            </tbl>
            <sec>
               <st>
                  <p>Clusters 2 and 3: the T-box family</p>
               </st>
               <p>One of the most conspicuous clusters detected in our analysis was the so-called T-box family, which consists of many aminoacyl-tRNA synthetase operons and some operons related to amino-acid biosynthesis [<abbr bid="B22">22</abbr>]. These operons are known to be regulated by an attenuation mechanism in which an uncharged tRNA molecule acts as the effector. The PCE shared in cluster 2 is part of the attenuation region where an uncharged tRNA is believed to bind (the T-box), whereas the PCE in cluster 3 is a region loosely complementary to the T-box. All the members of cluster 3 are included in cluster 2. In addition to 11 aminoacyl-tRNA synthetases, <it>proB</it> and <it>ilvB</it> were clustered, which makes sense because their functions are related to amino-acid synthesis. However, three additional known members could not be detected: two of them had less similar or shorter PCEs and the third did not have an orthologous counterpart.</p>
            </sec>
            <sec>
               <st>
                  <p>Cluster 34: the <it>pyr</it> operon</p>
               </st>
               <p>The <it>pyr</it> operon contains at least three genes, each of which is directly regulated by PyrR, a transcription attenuation regulator ([<abbr bid="B23">23</abbr>] and Figure <figr fid="F3">3a</figr>). When transcribed, the leader region of each of these genes can form three different RNA secondary structures (terminator, antiterminator and anti-antiterminator) ([<abbr bid="B24">24</abbr>] and Figure <figr fid="F3">3b</figr>). PyrR binds to the anti-antiterminator regions of the mRNAs. Cluster 34 contains <it>pyrR</it> and <it>pyrP</it>, whose PCEs correspond to a part of each anti-antiterminator. The third gene, <it>pyrB</it>, was not detected, however, because its PCE was not conserved well enough to be long enough for clustering.</p>
               <fig id="F3">
                  <title>
                     <p>Figure 3</p>
                  </title>
                  <caption>
                     <p>Post-transcriptional regulation of the <it>pyr</it> operon</p>
                  </caption>
                  <text>
                     <p>Post-transcriptional regulation of the <it>pyr</it> operon. <b>(a)</b> The three attenuation regions in the operon. <b>(b)</b> Two alternative secondary structures of the transcript of each attenuation region. In the presence of high UMP concentration, PyrR binds to the anti-antiterminator and stabilizes the formation of the terminator structure, while preventing the formation of the antiterminator.</p>
                  </text>
                  <graphic file="gb-2001-2-11-research0048-3"/>
               </fig>
            </sec>
            <sec>
               <st>
                  <p>Clusters 1, 5 and 11: the S-box regulon</p>
               </st>
               <p>The S-box regulon is a hypothetical regulon related to methionine and/or cysteine biosynthesis. The leader regions of its putative transcriptional units have considerable sequence similarity and seem to form complex secondary structures similar to those in the <it>pyr</it> operon [<abbr bid="B25">25</abbr>]. Three different PCEs were identified in our analyses, each of which forms a cluster related to each of the others. The PCEs correspond to several parts of the hypothetical anti-antiterminator region, where an unidentified binding factor is postulated to stabilize its secondary structure [<abbr bid="B25">25</abbr>]. Of the eleven putative members of this regulon, seven were included in at least one of these clusters, whereas three could not be detected because they lack orthologous genes. The leader region of the remaining one, <it>cysH</it>, was very poorly conserved.</p>
            </sec>
            <sec>
               <st>
                  <p>Clusters 14 and 20: hypothetical xanthine metabolic regulon</p>
               </st>
               <p>It has been suggested that the expression of the <it>xpt-pbuX</it> operon in <it>B. subtilis</it> is regulated by a termination-antitermination control mechanism similar to the mechanism suggested for the <it>pur</it> biosynthesis operon, <it>purEKBCSLQFMNHD</it> [<abbr bid="B26">26</abbr>]. It has been speculated that the regulatory proteins of these two operons are the same because they seem to have the same effector - xanthine [<abbr bid="B26">26</abbr>]. Our results support this hypothesis because <it>xpt</it> and <it>purE</it> were clustered in cluster 20. <it>xpt</it> also belongs to another cluster, 14, with <it>pbuG</it>. As the PbuG protein has the characteristic Pfam [<abbr bid="B27">27</abbr>] domain of the xanthine/uracil permease family, <it>pbuG</it> is very likely to be a new member of the xanthine metabolism regulon.</p>
            </sec>
            <sec>
               <st>
                  <p>Cluster 6: class III heat-shock regulon</p>
               </st>
               <p>This cluster corresponds to a part of the class III heat-shock regulon, which is regulated by CtsR. Cluster 6 contains two of the three known genes that have experimentally verified CtsR-binding sites [<abbr bid="B28">28</abbr>,<abbr bid="B29">29</abbr>]. Interestingly, cluster 6 contains <it>dnaJ,</it> which belongs to the <it>dnaK</it> operon - <it>hrcA-grpE-dnaK-dnaJ-yqeT-yqeU-yqeV</it> [<abbr bid="B30">30</abbr>]. As the <it>dnaK</it> operon is involved in the class I heat-shock regulon (which corresponds to cluster 13) and as there is an internal promoter between <it>dnaK</it> and <it>dnaJ</it> [<abbr bid="B30">30</abbr>], there is likely to be regulatory overlap between the class I and the class III heat-shock regulons.</p>
            </sec>
            <sec>
               <st>
                  <p>Clusters 12, 47, 52 and 59: genes under glucose repression</p>
               </st>
               <p>The largest genetic network identified so far in <it>B. subtilis</it> is the regulatory system responsible for glucose repression, in which the transcription factor CcpA has a central role [<abbr bid="B6">6</abbr>]. In our analysis, few of the known CcpA-dependent genes were clustered, and those that were clustered were split into three subgroups (clusters 47, 52 and 59). Two members of cluster 47 have PCEs overlapping with the CcpA-binding site, and another member, <it>yufK</it>, was recently shown in a microarray experiment to be under glucose repression [<abbr bid="B31">31</abbr>]. In cluster 52, <it>araA</it> was also shown to be under glucose repression. It seems very likely that CcpA regulates all members of this cluster because their PCEs are similar to the CcpA-binding site and their functions are consistent with this hypothesis. As for cluster 59, both of its members, <it>bglP</it> and <it>sucC</it>, were shown to be under glucose repression [<abbr bid="B31">31</abbr>]. Many other genes are known to be regulated by CcpA. As shown in Table <tblr tid="T2">2</tblr>, in most of these cases the CcpA-binding site resides within the coding region, whereas in other cases the site may be less well conserved. As noted above, many of the known binding sites overlap with PCEs. Therefore, it seems possible that the split into subgroups has some biological meaning.</p>
               <p>There are also co-expressed genes that are subject to CcpA-independent glucose repression. All three members of cluster 12 were shown to be under glucose repression, two of which, <it>gapB</it> and <it>pckA,</it> were shown to be CcpA-independent in a recent systematic experiment [<abbr bid="B31">31</abbr>]. Our results support this because PCEs in cluster 12 are not similar to the CcpA-binding site.</p>
            </sec>
         </sec>
         <sec>
            <st>
               <p>Potentially new regulons/members</p>
            </st>
            <p>As described above, we found several potentially new members of known regulons: for example, <it>dnaJ</it> in cluster 6, <it>pbuG</it> in cluster 14, and <it>kduI</it> and <it>odhA</it> in cluster 52 (see table in Additional data files for more examples). In addition, <it>topA</it> in cluster 15 is likely to belong to the Spo0A regulon because the PCEs of this cluster are very similar to the Spo0A-binding site and its functions are related to sporulation. There are also potential regulons that have not been reported so far. For example, <it>aroA</it> and <it>aroF</it> in cluster 29 seem to constitute a regulon related to the metabolism of aromatic amino acids. In this regard, clusters 24 and 16 are especially interesting. Cluster 24 contains two genes (<it>dnaA</it> and <it>dnaN</it>) related to DNA replication, and its PCEs are very similar to the DnaA-binding site (DnaA-box: TTATCCACA). <it>yqeG</it>, another member of cluster 24, has two DnaA-like PCEs in its upstream region, which is consistent with the observation that DnaA boxes are often found in multiple copies. Moreover, cluster 16 contains <it>yqeG</it> and <it>dnaA</it>, and its PCEs are very similar to the Spo0A-binding site. Thus, it is likely that both DnaA and Spo0A bind to the upstream regions of <it>yqeG</it> and <it>dnaA</it>, suggesting a new crosstalk between the regulatory networks of DNA replication and sporulation. <it>yqeG</it>, whose function cannot be inferred by sequence similarity, may be involved in DNA replication and/or sporulation. As there are many additional cases where functionally related genes are included in the same cluster (see Additional data files), we expect that future experiments will prove that at least some of them are co-regulated.</p>
         </sec>
         <sec>
            <st>
               <p>On the possibility of misclustering due to general patterns</p>
            </st>
            <p>One concern with our method is that functionally unrelated genes may be clustered on the basis of general motifs such as the -35/-10 boxes and the SD sequence. We therefore investigated the occurrences of these motifs in the clusters.</p>
            <p>The SD sequence is relatively easy to detect because it lies at a fairly well-defined distance from the translation start site, which is known at least in principle. Using the criterion described in Materials and methods, we excluded 34 clusters, all members of which contain an SD-like PCE (Table <tblr tid="T3">3</tblr>). Many of these genes are translation related (that is, they encode ribosomal proteins and elongation factors). Possibly their SD sequences have been highly conserved to maximize translation efficiency. Another possibility is that there are factors that recognize such SD-like PCEs and that these clusters are co-regulated by them.</p>
            <tbl id="T3">
               <title>
                  <p>Table 3</p>
               </title>
               <caption>
                  <p>Clusters having SD-like PCEs</p>
               </caption>
               <tblbdy cols="2">
                  <r>
                     <c ca="left">
                        <p>Gene</p>
                     </c>
                     <c ca="left">
                        <p>Functional classification<sup>*</sup></p>
                     </c>
                  </r>
                  <r>
                     <c cspan="2">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>atpG</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Membrane bioenergetics</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>spoVG</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Sporulation</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>yyaA</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Sporulation</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>rpsS</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Ribosomal proteins</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>rpoC</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Elongation</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>rpsL</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Ribosomal proteins</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>rpoB</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Elongation</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>ydaO</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>None</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>ydcD</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>None</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>secG</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>None</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>sspE</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Sporulation</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>rplK</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Ribosomal proteins</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>sspA</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Sporulation</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>rpsF</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Ribosomal proteins</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>rplJ</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Ribosomal proteins</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>rplU</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Ribosomal proteins</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>ftsA</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Cell division</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>rpmE</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Ribosomal proteins</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>fusA</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Elongation</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>cysE</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Metabolism of amino acids and related molecules</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>yeeI</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>None</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>rpoA</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Elongation</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>gerE</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Regulation</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>sigA</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Initiation</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>gerM</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Germination</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>asnS</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Aminoacyl-tRNA synthetases</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>nusG</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Termination</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>ypjB</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>None</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>yjcI</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Metabolism of amino acids and related molecules</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>sigG</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Initiation</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>acpA</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Metabolism of lipids</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>prfA</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Termination</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>thdF</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Detoxification</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>minC</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Cell division</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>cwlJ</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Cell wall</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>hag</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Mobility and chemotaxis</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>aprX</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Metabolism of amino acids and related molecules</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>tsf</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Elongation</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>yvgY</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Transport/binding proteins and lipoproteins</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>yabR</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Metabolism of nucleotides and nucleic acids</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>yqfC</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>None</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>ileS</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Aminoacyl-tRNA synthetases</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>yocD</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Detoxification</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>gcvH</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Metabolism of amino acids and related molecules</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>rpsJ</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Ribosomal proteins</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>rplQ</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Ribosomal proteins</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>dnaA</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>DNA replication</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>thrS</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Aminoacyl-tRNA synthetases</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>ysgA</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>RNA modification</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>yjzC</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>None</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>ytdA</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Specific pathways</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>ywrD</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Metabolism of amino acids and related molecules</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>spoVT</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Regulation</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>dxs</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Specific pathways</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>pyrP</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Transport/binding proteins and lipoproteins</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>leuS</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Aminoacyl-tRNA synthetases</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>ykkC</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Protein folding</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>ylaN</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>None</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>yslB</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>None</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>thrS</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Aminoacyl-tRNA synthetases</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>rpsD</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Ribosomal proteins</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>yvdF</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Specific pathways</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>citG</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>TCA cycle</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>ykoY</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Detoxification</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>ripX</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Phage-related functions</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>trpE</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Metabolism of amino acids and related molecules</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>lepA</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Elongation</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>greA</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Elongation</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>ytaG</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>None</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>citZ</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>TCA cycle</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>ykzG</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>None</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>yocC</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>None</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>ybyB</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>None</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>pgk</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Main glycolytic pathways</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>pheS</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Aminoacyl-tRNA synthetases</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>pbpX</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Cell wall</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>rpsB</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Ribosomal proteins</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>ybxF</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Ribosomal proteins</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>gcaD</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Cell wall</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>yrrK</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>None</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>zur</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Regulation</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>yjbK</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>None</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>pdhA</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Main glycolytic pathways</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>yneF</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>None</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>fbaA</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Main glycolytic pathways</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>rocD</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Metabolism of amino acids and related molecules</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>cah</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Detoxification</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>appA</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Transport/binding proteins and lipoproteins</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>appA</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Transport/binding proteins and lipoproteins</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>yobV</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>None</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>spoIVA</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Sporulation</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p><sup>*</sup>Functional classification was obtained from the SubtiList website [42,43]. Genes belonging to the same cluster are grouped together.</p>
               </tblfn>
            </tbl>
            <p>The -35/-10 boxes are more difficult to detect than the SD sequence because the distance between the transcription and translation start sites is rather variable. Using the DBTBS database [<abbr bid="B19">19</abbr>,<abbr bid="B20">20</abbr>], we counted the known -35/-10 boxes that overlap with the PCEs. As shown in Table <tblr tid="T4">4</tblr>, 19% of them overlap with the PCEs on average. The presence of -35/-10 boxes may therefore have affected the formation of clusters 7, 22, 42, 53, 122, 129, 134 and 144. We do not regard this as a serious problem, however, because the conservation of these boxes is relatively weak and because many regulatory elements naturally overlap with the -35/-10 boxes. In other words, a PCE that overlaps with a -35/-10 box in a cluster does not by itself mean that the clustering is a mistake. Conversely, the absence of -35/-10-like elements around a PCE could also be problematic, because the region may then be an intergenic region within an operon rather than a promoter region. Because it is still difficult to predict the exact positions of promoters in bacterial genomes, we did not use information on promoter existence in our scheme. In future, it would be reasonable to incorporate the prediction of operon structure into our method [<abbr bid="B32">32</abbr>,<abbr bid="B33">33</abbr>,<abbr bid="B34">34</abbr>].</p>
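            <p>As an illustration only, the following Python sketch shows one way to read the counting rule in the footnote of Table 4 and the 19% figure quoted above. It is not the authors' implementation. The function name counts_as_overlap, the half-open (start, end) coordinate convention, and the choice to pool one -35 box and one -10 box per known site as the denominator are all assumptions made for this example; only the counts themselves are taken from Table 4.</p>

```python
# Counts transcribed from Table 4:
# sigma factor -> (known sites, -35 boxes overlapping a PCE, -10 boxes overlapping a PCE)
TABLE4 = {
    "SigA": (62, 9, 14),
    "SigB": (9, 0, 0),
    "SigD": (5, 0, 2),
    "SigE": (19, 3, 7),
    "SigF": (7, 2, 1),
    "SigG": (14, 3, 2),
    "SigH": (8, 0, 2),
    "SigK": (13, 3, 2),
    "SigL": (1, 0, 1),
    "SigW": (9, 4, 2),
    "SigX": (2, 0, 0),
}


def counts_as_overlap(box, pce, min_overlap=5):
    """Hypothetical helper for the Table 4 footnote rule.

    box and pce are (start, end) pairs in half-open coordinates (an assumption).
    A box of at least min_overlap bp must share min_overlap bp or more with the
    PCE; a shorter box must lie entirely inside the PCE.
    """
    box_start, box_end = box
    pce_start, pce_end = pce
    shared = max(0, min(box_end, pce_end) - max(box_start, pce_start))
    box_len = box_end - box_start
    if box_len >= min_overlap:
        return shared >= min_overlap
    return box_len > 0 and shared == box_len


# Pooled totals over all sigma factors.
total_sites = sum(sites for sites, _, _ in TABLE4.values())
total_overlaps = sum(m35 + m10 for _, m35, m10 in TABLE4.values())

# Assumption: each known site contributes one -35 box and one -10 box.
pooled_fraction = total_overlaps / (2 * total_sites)

print(total_sites, total_overlaps, round(100 * pooled_fraction))
```

            <p>Under these assumptions the pooled totals are 149 sites and 57 overlapping boxes, that is 57 of 298 boxes, which rounds to the 19% stated in the text. A per-sigma-factor average would give a different value, so the pooled reading is only one plausible interpretation of "on average".</p>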
            <tbl id="T4">
               <title>
                  <p>Table 4</p>
               </title>
               <caption>
                  <p>Number of -35/-10 boxes that overlap with PCEs for each sigma factor</p>
               </caption>
               <tblbdy cols="4">
                  <r>
                     <c ca="left">
                        <p>Sigma factor</p>
                     </c>
                     <c ca="center">
                        <p>Number of sites<sup>*</sup></p>
                     </c>
                     <c ca="center">
                        <p>Number of -35 boxes<sup>&#8224;</sup></p>
                     </c>
                     <c ca="center">
                        <p>Number of -10 boxes<sup>&#8225;</sup></p>
                     </c>
                  </r>
                  <r>
                     <c cspan="4">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>SigA</p>
                     </c>
                     <c ca="center">
                        <p>62</p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                     <c ca="center">
                        <p>14</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>SigB</p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>SigD</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>SigE</p>
                     </c>
                     <c ca="center">
                        <p>19</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>SigF</p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>SigG</p>
                     </c>
                     <c ca="center">
                        <p>14</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>SigH</p>
                     </c>
                     <c ca="center">
                        <p>8</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>SigK</p>
                     </c>
                     <c ca="center">
                        <p>13</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>SigL</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>SigW</p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>SigX</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>-35/-10 boxes that overlap with a PCE by 5 bp or more were counted. Boxes shorter than 5 bp were counted only if they fully overlapped with a PCE. <sup>*</sup>Number of known -35/-10 boxes that exist in the regions analyzed in our work. <sup>&#8224;</sup>Number of -35 boxes that overlap with a PCE. <sup>&#167;</sup>Number of -10 boxes that overlap with a PCE.</p>
               </tblfn>
            </tbl>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Conclusions</p>
         </st>
         <p>In this work, we aligned the upstream regions of orthologous genes between three closely related species and identified the PCEs within them. Genes of <it>B. subtilis</it> were then clustered according to the similarity of the PCEs in their upstream region. Most parameters in our method were determined such that as many known co-regulated genes as possible were clustered together, and the nature of the clustered genes was thoroughly investigated. In this sense, the use of closely related species, one of which has a long history of experimental research, was essential in our work.</p>
         <p>There are several potential difficulties in our approach. One is that the regulatory system of co-regulated genes must be conserved in at least one pair of species. In fact, even in the close relatives compared, only a proportion of genes had orthologous counterparts. However, this situation will improve as the number of sequenced bacterial genomes increases. Another is that it is difficult to cluster genes harboring relatively short and/or variable elements. For example, although many of the known binding sites for CcpA, AbrB, Spo0A and LexA overlap with PCEs, genes regulated by them were not clustered well with a reasonable value of the cut-off score. Currently, it is rather difficult to detect elements about 6 bp long. It seems biologically reasonable, however, that in some large regulons, such as the one regulated by CcpA, the binding affinity of the regulator is modulated for each element. Thus, the fact that not all members of a known large regulon are clustered is not always a failure of our approach. The third difficulty is related to the operon structure of bacterial genes. In some operons, the order of constituent genes is not conserved across species. Our method could not deal with cases in which the position of the first gene was changed. As noted above, future incorporation of operon prediction may be useful. In fact, there is already research combining the predictions of transcription units and transcription factor binding sites [<abbr bid="B8">8</abbr>].</p>
         <p>On the other hand, our method could detect not only the DNA-binding sites for transcription factors but also some binding sites in RNA or conserved RNA secondary structure elements. This seems to reflect the fact that <it>B. subtilis</it> heavily exploits the antitermination mechanism to control gene expression [<abbr bid="B22">22</abbr>]. Thus, our method could capture a global feature of the gene regulatory mechanism in <it>B. subtilis</it>, without any <it>a priori</it> knowledge about it.</p>
         <p>In conclusion, although it is difficult to detect the entire set of co-regulated genes with our method, it can be used as a powerful tool to explore them. In addition, our results can be used as criteria for comparing results from other methods, and are useful for developing a more elaborate method. Thus, our approach is a model for further studies.</p>
      </sec>
      <sec>
         <st>
            <p>Materials and methods</p>
         </st>
         <sec>
            <st>
               <p>Genome sequence data</p>
            </st>
            <p>Genome sequences of <it>B. subtilis</it> [<abbr bid="B17">17</abbr>] and <it>B. halodurans</it> [<abbr bid="B18">18</abbr>] with the annotation information were obtained from GenBank [<abbr bid="B35">35</abbr>] (accession numbers: AL009126 and BA000004, respectively). The unfinished genome sequence of <it>B. stearothermophilus</it> was downloaded from the website of the <it>B. stearothermophilus</it> genome-sequencing project at the University of Oklahoma [<abbr bid="B36">36</abbr>].</p>
         </sec>
         <sec>
            <st>
               <p>Identification of orthologous genes</p>
            </st>
            <p>Genes orthologous between <it>B. subtilis</it> and <it>B. halodurans</it> were obtained by finding the best-matching amino-acid sequence in each genome with BLASTP [<abbr bid="B37">37</abbr>]. As the genome of <it>B. stearothermophilus</it> was not annotated, orthologs between <it>B. subtilis</it> and <it>B. stearothermophilus</it> were obtained as follows: a TBLASTN search was done against the contig sequences of <it>B. stearothermophilus</it> for each amino-acid sequence of <it>B. subtilis</it>. If the best-hit alignment started before the tenth residue of the query, this translated counterpart was used as a BLASTP query against all <it>B. subtilis</it> sequences. If its best hit was identical to the initial query, they were regarded as orthologous.</p>
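            <p>The reciprocal test described above can be sketched as follows. This is a minimal illustration only: the function name, the dictionary layout and the 1-based query start coordinate are assumptions made for the example, and the BLAST searches themselves are assumed to have been run and parsed beforehand.</p>

```python
def reciprocal_best_hits(forward_hits, reverse_hits, max_query_start=10):
    """Return {query_gene: hit_id} for pairs that pass the reciprocal test.

    forward_hits: query gene -> (hit_id, query_start) of its best TBLASTN hit,
                  with query_start given as a 1-based residue position.
    reverse_hits: hit_id -> best B. subtilis gene found when the translated
                  hit is searched back with BLASTP.
    Both inputs are hypothetical, pre-parsed search results.
    """
    orthologs = {}
    for gene, (hit_id, query_start) in forward_hits.items():
        # Keep only alignments that start before the tenth query residue.
        if query_start >= max_query_start:
            continue
        # The back-search must recover the original query gene.
        if reverse_hits.get(hit_id) == gene:
            orthologs[gene] = hit_id
    return orthologs


forward = {"geneA": ("hit1", 1), "geneB": ("hit2", 25), "geneC": ("hit3", 3)}
reverse = {"hit1": "geneA", "hit2": "geneB", "hit3": "geneX"}
print(reciprocal_best_hits(forward, reverse))
```

            <p>In this toy input only geneA passes: geneB fails the start-position filter and geneC fails the back-search.</p>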
         </sec>
         <sec>
            <st>
               <p>Alignment of upstream regions</p>
            </st>
            <p>Although binding sites for transcription factors can sometimes exist in coding regions, we excluded <it>B. subtilis</it> genes with upstream intergenic regions of less than 50 bp from further analyses, in order to reduce potential noise. Next, the upstream 300 bp region of each <it>B. subtilis</it> gene and that of an orthologous gene, if any, were aligned with the local pairwise alignment program LALIGN [<abbr bid="B38">38</abbr>,<abbr bid="B39">39</abbr>]. The gap-opening penalty was set to 20, which is higher than the default value. Locally conserved regions in an upstream region of 300 bp from closely related species were realigned with the entire upstream region of <it>B. subtilis</it> without overlap. The most conserved element of either <it>B. halodurans</it> or <it>B. stearothermophilus</it> was first aligned with the upstream 300 bp sequence of <it>B. subtilis</it>. Next, the second most conserved element was aligned, unless this element overlapped with the previous alignment. This procedure was repeated for all detected elements. The final alignments are shown in DBTBS [<abbr bid="B19">19</abbr>,<abbr bid="B20">20</abbr>].</p>
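            <p>The non-overlapping realignment step resembles a greedy interval selection, which might be sketched as below. The tuple layout (score, start, end), the use of an alignment score as the measure of conservation, and the function name are assumptions for illustration; the alignment itself is not reproduced here.</p>

```python
def select_non_overlapping(elements):
    """Greedily keep elements in order of decreasing score.

    elements: list of (score, start, end) tuples, with end exclusive, giving
    the position of each locally conserved element on the B. subtilis
    upstream sequence. An element is skipped when it overlaps one that has
    already been kept.
    """
    kept = []
    for score, start, end in sorted(elements, key=lambda e: e[0], reverse=True):
        # Two half-open intervals overlap when the smaller end lies beyond
        # the larger start.
        overlaps = any(
            min(end, k_end) > max(start, k_start) for _, k_start, k_end in kept
        )
        if not overlaps:
            kept.append((score, start, end))
    # Report the kept elements in positional order.
    return sorted(kept, key=lambda e: e[1])


candidates = [(50, 10, 40), (30, 35, 60), (20, 100, 130)]
print(select_non_overlapping(candidates))
```

            <p>Here the second candidate is dropped because it overlaps the higher-scoring first one, while the third is retained.</p>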
         </sec>
         <sec>
            <st>
               <p>Identification of phylogenetically conserved elements (PCEs)</p>
            </st>
            <p>On the basis of the alignments described above, we defined PCEs within the upstream noncoding region as follows: first, 3 bp segments where all of the nucleotides were conserved for three species were sought. Then, each segment was extended until a consecutive unconserved site appeared for each direction. Unless its score was less than 10, the sequence was designated a PCE (for the scoring of PCEs, see below). To increase the number of PCEs, we also identified PCEs even when they were conserved in only two species under a more stringent condition: segments of 6 bp where the nucleotides were conserved at all positions were first sought. Then, each sequence was extended in each direction until it faced a 3 bp segment in which two of the positions were unconserved. Unless its score was less than 20, it was assigned as a PCE (the cut-off score was chosen by observing the number of spurious PCEs detected when the upstream regions are joined to unrelated coding sequences). Thus, a PCE is an alignment of three or two conserved fragments from different species.</p>
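            <p>The seeding step alone can be illustrated with a short sketch that reports maximal runs of fully conserved alignment columns of at least a given length (3 for three species, 6 for two). The extension rules and the score cut-offs are deliberately not implemented, and the input format (equal-length aligned strings with "-" for gaps) is an assumption of this example.</p>

```python
def conserved_runs(aligned, min_len):
    """Return (start, end) column ranges, end exclusive, of conserved runs.

    aligned: list of equal-length aligned sequences ("-" marks a gap).
    A column counts as conserved when every sequence carries the same
    nucleotide there. Only runs of at least min_len columns are reported.
    """
    reference = aligned[0]
    conserved = [
        reference[i] != "-" and all(seq[i] == reference[i] for seq in aligned)
        for i in range(len(reference))
    ]
    runs, start = [], None
    # A trailing False closes any run that reaches the end of the alignment.
    for i, flag in enumerate(conserved + [False]):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    return runs


three_way = ["ACGTTGACA", "ACGATGACA", "ACGCTGACT"]
print(conserved_runs(three_way, 3))
```

            <p>Each reported run would then serve as a starting point for the extension and scoring described in the text.</p>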
         </sec>
         <sec>
            <st>
               <p>Scoring PCEs</p>
            </st>
            <p>Suppose a PCE, denoted by M, consists of a set of fragments of (two or three) species, S. The score of M was defined by</p>
            <p>Score(M) = -log<sub>2</sub> [&lt;&#928;<sub>x</sub> F<sub>xi</sub><sup>Nx</sup>>] (i &#8712; S, x &#8712; A, T, G, C),</p>
            <p>where the brackets (&lt; >) denote an average over S, F<sub>xi</sub> denotes the fraction of nucleotide x in the 300 bp upstream sequence of species i, and N<sub>x</sub> is the number of positions at which nucleotide x is conserved over S in M. Thus, the score of PCEs becomes low if they are short and rich in frequent nucleotides.</p>
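            <p>As a worked illustration of the score, the sketch below evaluates the formula for given conserved-position counts and per-species nucleotide fractions. The function name and the dictionary-based inputs are assumptions for the example.</p>

```python
import math


def pce_score(conserved_counts, base_fractions):
    """Evaluate Score(M) = -log2 of the species-averaged product of F_xi ** N_x.

    conserved_counts: nucleotide -> N_x, the number of conserved positions
                      carrying that nucleotide in the PCE.
    base_fractions:   one dict per species, nucleotide -> F_xi, the fraction
                      of that nucleotide in the 300 bp upstream sequence.
    """
    products = []
    for fractions in base_fractions:
        product = 1.0
        for base, count in conserved_counts.items():
            product *= fractions[base] ** count
        products.append(product)
    # Average over the species in S, then take the negative base-2 logarithm.
    return -math.log2(sum(products) / len(products))


uniform = {"A": 0.25, "T": 0.25, "G": 0.25, "C": 0.25}
counts = {"A": 2, "T": 1, "G": 1, "C": 1}
print(pce_score(counts, [uniform, uniform, uniform]))
```

            <p>With uniform fractions of 0.25, each conserved position contributes 2 bits, so the five conserved positions in this example give a score of 10. In an AT-rich background, conserved A and T positions contribute less and conserved G and C positions contribute more.</p>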
         </sec>
         <sec>
            <st>
               <p>Clustering genes</p>
            </st>
            <p>Genes were clustered according to the similarity of PCEs in their upstream region. A raw similarity measure (s<sub>MN</sub>) between two PCEs, M and N, was defined by the sum of all pairwise alignment scores between any constituent sequences from both PCEs:</p>
            <p>s<sub>MN</sub> = &#931;<sub>mn</sub>L<sub>mn</sub> (m &#8712; sequences in M, n &#8712; sequences in N)</p>
            <p>L<sub>mn</sub> = max [l<sub>mn</sub>, <it>d</it> &#8226; l<sub>mn'</sub>],</p>
            <p>where l<sub>mn</sub> denotes the score of the Smith-Waterman local alignment algorithm [<abbr bid="B40">40</abbr>] between constituent sequences m and n (the match score, the mismatch cost and the gap cost were set to 1, 2 and 3, respectively); n' denotes the reverse complement of n; and <it>d</it> is an empirical cost for selecting n' (we set <it>d</it> = 0.9). As s<sub>MN</sub> becomes larger as the number of constituent sequences of M and N is larger, s<sub>MN</sub> was further normalized as follows:</p>
            <p>S<sub>MN</sub> = s<sub>MN</sub> &#8226; 9<it>b</it>/(k<sub>m</sub> &#8226; k<sub>n</sub>),</p>
            <p>where k<sub>m</sub> and k<sub>n</sub> denote the number of constituent sequences of M and N, respectively; <it>b</it> is again an empirical cost for smaller values of k<sub>m</sub> or k<sub>n</sub>:</p>
            <p><it>b</it> = 1.0 if both k<sub>m</sub> and k<sub>n</sub> are 3</p>
            <p><it>b</it> = 0.9 if only one of k<sub>m</sub> and k<sub>n</sub> is 2</p>
            <p><it>b</it> = 0.8 if both k<sub>m</sub> and k<sub>n</sub> are 2</p>
            <p>We used a simple algorithm, UPGMA [<abbr bid="B41">41</abbr>], to cluster genes. The UPGMA procedure was continued until no pair of PCEs had a normalized similarity value of more than 60. We chose all of the above-mentioned empirical parameters by observing the results for known co-regulated genes.</p>
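            <p>The normalized similarity between two PCEs can be sketched as follows. This is an illustration under stated assumptions: the gap cost is treated as linear (3 per gapped position), a PCE is represented simply as a list of its two or three constituent sequences, and all function names are chosen for the example. The UPGMA step itself is not shown.</p>

```python
def smith_waterman(a, b, match=1, mismatch=2, gap=3):
    """Best local alignment score, assuming a linear gap cost."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ch_a == ch_b else -mismatch)
            cell = max(0, diag, prev[j] - gap, curr[j - 1] - gap)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def normalized_similarity(pce_m, pce_n, d=0.9):
    """S_MN = s_MN * 9b / (k_m * k_n), with s_MN summed over sequence pairs."""
    raw = 0.0
    for m in pce_m:
        for n in pce_n:
            forward = smith_waterman(m, n)
            reverse = d * smith_waterman(m, reverse_complement(n))
            raw += max(forward, reverse)
    k_m, k_n = len(pce_m), len(pce_n)
    if k_m == 3 and k_n == 3:
        b = 1.0
    elif k_m == 2 and k_n == 2:
        b = 0.8
    else:
        b = 0.9  # assumed to apply when exactly one PCE has two sequences
    return raw * 9 * b / (k_m * k_n)


print(normalized_similarity(["TTGACA"] * 3, ["TTGACA"] * 3))
```

            <p>Because the raw sum runs over k<sub>m</sub> &#8226; k<sub>n</sub> sequence pairs, the factor 9/(k<sub>m</sub> &#8226; k<sub>n</sub>) rescales every comparison to the nine pairs of the three-species case before <it>b</it> is applied.</p>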
         </sec>
         <sec>
            <st>
               <p>Discarding clusters with SD-like PCEs</p>
            </st>
            <p>We discarded clusters when all of their members contained SD sequence-like elements. More specifically, a member was considered to have an SD-like element if the <it>B. subtilis</it> sequence of its PCE contained a 5 bp segment with at least two Gs and one A but no Cs, and if this segment lay within the region 20 bp upstream from the translation initiation site. The cluster was discarded if all of the other members also had such an element in the corresponding region.</p>
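            <p>The SD-like filter can be sketched as a sliding-window test. The coordinate convention (an upstream sequence that ends immediately before the translation initiation site, with the PCE given as a half-open interval on it), the reading of "one A" as at least one A, and the function names are assumptions for this example.</p>

```python
def has_sd_like_segment(upstream, pce_start, pce_end, window=5, region=20):
    """Test whether a PCE contains an SD-like 5 bp segment near the start site.

    upstream: B. subtilis upstream sequence ending just before the
              translation initiation site.
    The PCE occupies upstream[pce_start:pce_end]. Only windows that lie
    inside both the PCE and the last `region` bp are examined.
    """
    first = max(pce_start, len(upstream) - region)
    for i in range(first, pce_end - window + 1):
        segment = upstream[i:i + window]
        if segment.count("G") >= 2 and segment.count("A") >= 1 and "C" not in segment:
            return True
    return False


def discard_cluster(members):
    """members: list of (upstream, pce_start, pce_end), one per clustered gene."""
    return all(has_sd_like_segment(*member) for member in members)


near = "T" * 25 + "AGGAGG" + "T" * 9   # SD-like PCE within 20 bp of the start
far = "AGGAGG" + "T" * 34              # same element, but far upstream
print(discard_cluster([(near, 25, 31), (near, 25, 31)]))
print(discard_cluster([(near, 25, 31), (far, 0, 6)]))
```

            <p>The first cluster is discarded because every member carries an SD-like element close to the start site; the second is kept because one member does not.</p>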
         </sec>
      </sec>
      <sec>
         <st>
            <p>Additional data files</p>
         </st>
         <p>A <supplr sid="S1">table</supplr> showing all clusters ranked by the highest similarity score within each cluster is available as an Excel file.</p>
         <suppl id="S1">
            <title>
               <p>Additional data file 1</p>
            </title>
            <caption>
               <p>Table showing all clusters ranked by the highest similarity score within each cluster</p>
            </caption>
            <text>
               <p>Table showing all clusters ranked by the highest similarity score within each cluster</p>
            </text>
            <file name="gb-2001-2-11-research0048-S1.xls">
               <p>Click here for additional data file</p>
            </file>
         </suppl>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>We are grateful to Kenichi Yoshida and Joan Fujimura for critically reading the manuscript. We also thank the <it>Bacillus stearothermophilus</it> Genome Sequencing Project funded by NSF EPSCoR Program (Experimental Program to Stimulate Competitive Research Grant EPS-9550478) for providing the unfinished genome sequence of <it>B. stearothermophilus</it>. This work was partly supported by a Grant-in-Aid for Scientific Research on Priority Areas (C) "Genome Information Science" from the Ministry of Education, Culture, Sports, Science and Technology of Japan, by Special Coordination Funds for Promoting Science and Technology, and by Industrial Science and Technology Program from New Energy and Industrial Technology Development Organization, Japan.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Exploring the metabolic and genetic control of gene expression on a genomic scale.</p>
            </title>
            <aug>
               <au>
                  <snm>DeRisi</snm>
                  <fnm>JL</fnm>
               </au>
               <au>
                  <snm>Iyer</snm>
                  <fnm>VR</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>PO</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1997</pubdate>
            <volume>278</volume>
            <fpage>680</fpage>
            <lpage>686</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.278.5338.680</pubid>
                  <pubid idtype="pmpid" link="fulltext">9381177</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Cluster analysis and display of genome-wide expression patterns.</p>
            </title>
            <aug>
               <au>
                  <snm>Eisen</snm>
                  <fnm>MB</fnm>
               </au>
               <au>
                  <snm>Spellman</snm>
                  <fnm>PT</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>PO</fnm>
               </au>
               <au>
                  <snm>Botstein</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1998</pubdate>
            <volume>95</volume>
            <fpage>14863</fpage>
            <lpage>14868</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">24541</pubid>
                  <pubid idtype="pmpid" link="fulltext">9843981</pubid>
                  <pubid idtype="doi">10.1073/pnas.95.25.14863</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Systematic determination of genetic network architecture.</p>
            </title>
            <aug>
               <au>
                  <snm>Tavazoie</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Hughes</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Campbell</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Cho</snm>
                  <fnm>RJ</fnm>
               </au>
               <au>
                  <snm>Church</snm>
                  <fnm>GM</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>1999</pubdate>
            <volume>22</volume>
            <fpage>281</fpage>
            <lpage>285</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/10343</pubid>
                  <pubid idtype="pmpid" link="fulltext">10391217</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Prediction of transcriptional regulatory sites in the complete genome sequence of <it>Escherichia coli</it> K-12.</p>
            </title>
            <aug>
               <au>
                  <snm>Thieffry</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Salgado</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Huerta</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Collado-Vides</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>1998</pubdate>
            <volume>14</volume>
            <fpage>391</fpage>
            <lpage>400</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/14.5.391</pubid>
                  <pubid idtype="pmpid" link="fulltext">9682052</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>A comprehensive library of DNA-binding site matrices for 55 proteins applied to the complete <it>Escherichia coli</it> K-12 genome.</p>
            </title>
            <aug>
               <au>
                  <snm>Robinson</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>McGuire</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Church</snm>
                  <fnm>GM</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1998</pubdate>
            <volume>284</volume>
            <fpage>241</fpage>
            <lpage>254</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/jmbi.1998.2160</pubid>
                  <pubid idtype="pmpid" link="fulltext">9813115</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Evaluation and characterization of catabolite-responsive elements (<it>cre</it>) of <it>Bacillus subtilis</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Miwa</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Nakata</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Ogiwara</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Yamamoto</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Fujita</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2000</pubdate>
            <volume>28</volume>
            <fpage>1206</fpage>
            <lpage>1210</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">102602</pubid>
                  <pubid idtype="pmpid" link="fulltext">10666464</pubid>
                  <pubid idtype="doi">10.1093/nar/28.5.1206</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Computer analysis of transcription regulatory patterns in completely sequenced bacterial genomes.</p>
            </title>
            <aug>
               <au>
                  <snm>Mironov</snm>
                  <fnm>AA</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
               <au>
                  <snm>Roytberg</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Gelfand</snm>
                  <fnm>MS</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1999</pubdate>
            <volume>27</volume>
            <fpage>2981</fpage>
            <lpage>2989</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">148515</pubid>
                  <pubid idtype="pmpid" link="fulltext">10390542</pubid>
                  <pubid idtype="doi">10.1093/nar/27.14.2981</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>A comparative genomics approach to prediction of new members of regulons.</p>
            </title>
            <aug>
               <au>
                  <snm>Tan</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Moreno-Hagelsieb</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Collado-Vides</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Stormo</snm>
                  <fnm>GD</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2001</pubdate>
            <volume>11</volume>
            <fpage>566</fpage>
            <lpage>584</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gr.149301</pubid>
                  <pubid idtype="pmpid" link="fulltext">11282972</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Conservation of the binding site for arginine repressor in all bacterial lineages.</p>
            </title>
            <aug>
               <au>
                  <snm>Makarova</snm>
                  <fnm>KS</fnm>
               </au>
               <au>
                  <snm>Mironov</snm>
                  <fnm>AA</fnm>
               </au>
               <au>
                  <snm>Gelfand</snm>
                  <fnm>MS</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2001</pubdate>
            <volume>2</volume>
            <fpage>research0013.1</fpage>
            <lpage>research0013.8</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">31482</pubid>
                  <pubid idtype="pmpid" link="fulltext">11305941</pubid>
                  <pubid idtype="doi">10.1186/gb-2001-2-4-research0013</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Prediction of transcription regulatory sites in Archaea by a comparative genomic approach.</p>
            </title>
            <aug>
               <au>
                  <snm>Gelfand</snm>
                  <fnm>MS</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
               <au>
                  <snm>Mironov</snm>
                  <fnm>AA</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2000</pubdate>
            <volume>28</volume>
            <fpage>695</fpage>
            <lpage>705</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">102549</pubid>
                  <pubid idtype="pmpid" link="fulltext">10637320</pubid>
                  <pubid idtype="doi">10.1093/nar/28.3.695</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Conservation of DNA regulatory motifs and discovery of new motifs in microbial genomes.</p>
            </title>
            <aug>
               <au>
                  <snm>McGuire</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Hughes</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Church</snm>
                  <fnm>GM</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2000</pubdate>
            <volume>10</volume>
            <fpage>744</fpage>
            <lpage>757</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gr.10.6.744</pubid>
                  <pubid idtype="pmpid" link="fulltext">10854408</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Predicting regulons and their cis-regulatory motifs by comparative genomics.</p>
            </title>
            <aug>
               <au>
                  <snm>McGuire</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Church</snm>
                  <fnm>GM</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2000</pubdate>
            <volume>28</volume>
            <fpage>4523</fpage>
            <lpage>4530</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">113887</pubid>
                  <pubid idtype="pmpid" link="fulltext">11071941</pubid>
                  <pubid idtype="doi">10.1093/nar/28.22.4523</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Long human-mouse sequence alignments reveal novel regulatory elements: a reason to sequence the mouse genome.</p>
            </title>
            <aug>
               <au>
                  <snm>Hardison</snm>
                  <fnm>RC</fnm>
               </au>
               <au>
                  <snm>Oeltjen</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>W</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>1997</pubdate>
            <volume>7</volume>
            <fpage>959</fpage>
            <lpage>966</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">9331366</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Active conservation of noncoding sequences revealed by three-way species comparisons.</p>
            </title>
            <aug>
               <au>
                  <snm>Dubchak</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Brudno</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Loots</snm>
                  <fnm>GG</fnm>
               </au>
               <au>
                  <snm>Pachter</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Mayor</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Rubin</snm>
                  <fnm>EM</fnm>
               </au>
               <au>
                  <snm>Frazer</snm>
                  <fnm>KA</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2000</pubdate>
            <volume>10</volume>
            <fpage>1304</fpage>
            <lpage>1306</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">117030</pubid>
                  <pubid idtype="pmpid" link="fulltext">10984448</pubid>
                  <pubid idtype="doi">10.1101/gr.142200</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Human-mouse genome comparisons to locate regulatory sites.</p>
            </title>
            <aug>
               <au>
                  <snm>Wasserman</snm>
                  <fnm>WW</fnm>
               </au>
               <au>
                  <snm>Palumbo</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Thompson</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Fickett</snm>
                  <fnm>JW</fnm>
               </au>
               <au>
                  <snm>Lawrence</snm>
                  <fnm>CE</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2000</pubdate>
            <volume>26</volume>
            <fpage>225</fpage>
            <lpage>228</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/79965</pubid>
                  <pubid idtype="pmpid" link="fulltext">11017083</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Phylogenetic footprinting of transcription factor binding sites in proteobacterial genomes.</p>
            </title>
            <aug>
               <au>
                  <snm>McCue</snm>
                  <fnm>LA</fnm>
               </au>
               <au>
                  <snm>Thompson</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Carmack</snm>
                  <fnm>CS</fnm>
               </au>
               <au>
                  <snm>Ryan</snm>
                  <fnm>MP</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>JS</fnm>
               </au>
               <au>
                  <snm>Derbyshire</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Lawrence</snm>
                  <fnm>CE</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2001</pubdate>
            <volume>29</volume>
            <fpage>774</fpage>
            <lpage>782</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">30389</pubid>
                  <pubid idtype="pmpid" link="fulltext">11160901</pubid>
                  <pubid idtype="doi">10.1093/nar/29.3.774</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>The complete genome sequence of the Gram-positive bacterium <it>Bacillus subtilis</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Kunst</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Ogasawara</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Moszer</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Albertini</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Alloni</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Azevedo</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Bertero</snm>
                  <fnm>MG</fnm>
               </au>
               <au>
                  <snm>Bessieres</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Bolotin</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Borchert</snm>
                  <fnm>S</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nature</source>
            <pubdate>1997</pubdate>
            <volume>390</volume>
            <fpage>249</fpage>
            <lpage>256</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/36786</pubid>
                  <pubid idtype="pmpid" link="fulltext">9384377</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Complete genome sequence of the alkaliphilic bacterium <it>Bacillus halodurans</it> and genomic sequence comparison with <it>Bacillus subtilis</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Takami</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Nakasone</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Takaki</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Maeno</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Sasaki</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Masui</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Fuji</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Hirama</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Nakamura</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Ogasawara</snm>
                  <fnm>N</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2000</pubdate>
            <volume>28</volume>
            <fpage>4317</fpage>
            <lpage>4331</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">113120</pubid>
                  <pubid idtype="pmpid" link="fulltext">11058132</pubid>
                  <pubid idtype="doi">10.1093/nar/28.21.4317</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>DBTBS: A database of <it>Bacillus subtilis</it> promoters and transcription factors.</p>
            </title>
            <aug>
               <au>
                  <snm>Ishii</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Yoshida</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Terai</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Fujita</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Nakai</snm>
                  <fnm>K</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2001</pubdate>
            <volume>29</volume>
            <fpage>278</fpage>
            <lpage>280</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">29858</pubid>
                  <pubid idtype="pmpid" link="fulltext">11125112</pubid>
                  <pubid idtype="doi">10.1093/nar/29.1.278</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>DBTBS</p>
            </title>
            <url>http://elmo.ims.u-tokyo.ac.jp/dbtbs</url>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Systematic function analysis of <it>Bacillus subtilis</it> genes.</p>
            </title>
            <aug>
               <au>
                  <snm>Ogasawara</snm>
                  <fnm>N</fnm>
               </au>
            </aug>
            <source>Res Microbiol</source>
            <pubdate>2000</pubdate>
            <volume>151</volume>
            <fpage>129</fpage>
            <lpage>134</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0923-2508(00)00118-2</pubid>
                  <pubid idtype="pmpid" link="fulltext">10865958</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>Aminoacyl-tRNA synthetase gene regulation in <it>Bacillus subtilis</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Condon</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Grunberg-Manago</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Putzer</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Biochimie</source>
            <pubdate>1996</pubdate>
            <volume>78</volume>
            <fpage>381</fpage>
            <lpage>389</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0300-9084(96)84744-4</pubid>
                  <pubid idtype="pmpid" link="fulltext">8915527</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>Roles of the three transcriptional attenuators of the <it>Bacillus subtilis</it> pyrimidine biosynthetic operon in the regulation of its expression.</p>
            </title>
            <aug>
               <au>
                  <snm>Lu</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Turner</snm>
                  <fnm>RJ</fnm>
               </au>
               <au>
                  <snm>Switzer</snm>
                  <fnm>RL</fnm>
               </au>
            </aug>
            <source>J Bacteriol</source>
            <pubdate>1995</pubdate>
            <volume>177</volume>
            <fpage>1315</fpage>
            <lpage>1325</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">176739</pubid>
                  <pubid idtype="pmpid" link="fulltext">7868607</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>Function of RNA secondary structures in transcriptional attenuation of the <it>Bacillus subtilis pyr</it> operon.</p>
            </title>
            <aug>
               <au>
                  <snm>Lu</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Turner</snm>
                  <fnm>RJ</fnm>
               </au>
               <au>
                  <snm>Switzer</snm>
                  <fnm>RL</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1996</pubdate>
            <volume>93</volume>
            <fpage>14462</fpage>
            <lpage>14467</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">26155</pubid>
                  <pubid idtype="pmpid" link="fulltext">8962074</pubid>
                  <pubid idtype="doi">10.1073/pnas.93.25.14462</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>The S box regulon: a new global transcription termination control system for methionine and cysteine biosynthesis genes in Gram-positive bacteria.</p>
            </title>
            <aug>
               <au>
                  <snm>Grundy</snm>
                  <fnm>FJ</fnm>
               </au>
               <au>
                  <snm>Henkin</snm>
                  <fnm>TM</fnm>
               </au>
            </aug>
            <source>Mol Microbiol</source>
            <pubdate>1998</pubdate>
            <volume>30</volume>
            <fpage>737</fpage>
            <lpage>749</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1046/j.1365-2958.1998.01105.x</pubid>
                  <pubid idtype="pmpid" link="fulltext">10094622</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>Xanthine metabolism in <it>Bacillus subtilis</it>: characterization of the <it>xpt-pbuX</it> operon and evidence for purine- and nitrogen-controlled expression of genes involved in xanthine salvage and catabolism.</p>
            </title>
            <aug>
               <au>
                  <snm>Christiansen</snm>
                  <fnm>LC</fnm>
               </au>
               <au>
                  <snm>Schou</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Nygaard</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Saxild</snm>
                  <fnm>HH</fnm>
               </au>
            </aug>
            <source>J Bacteriol</source>
            <pubdate>1997</pubdate>
            <volume>179</volume>
            <fpage>2540</fpage>
            <lpage>2550</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">179002</pubid>
                  <pubid idtype="pmpid" link="fulltext">9098051</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>The Pfam protein families database.</p>
            </title>
            <aug>
               <au>
                  <snm>Bateman</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Birney</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Durbin</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Eddy</snm>
                  <fnm>SR</fnm>
               </au>
               <au>
                  <snm>Howe</snm>
                  <fnm>KL</fnm>
               </au>
               <au>
                  <snm>Sonnhammer</snm>
                  <fnm>EL</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2000</pubdate>
            <volume>28</volume>
            <fpage>263</fpage>
            <lpage>266</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">102420</pubid>
                  <pubid idtype="pmpid" link="fulltext">10592242</pubid>
                  <pubid idtype="doi">10.1093/nar/28.1.263</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>CtsR, a novel regulator of stress and heat shock response, controls <it>clp</it> and molecular chaperone gene expression in Gram-positive bacteria.</p>
            </title>
            <aug>
               <au>
                  <snm>Derre</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Rapoport</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Msadek</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Mol Microbiol</source>
            <pubdate>1999</pubdate>
            <volume>31</volume>
            <fpage>117</fpage>
            <lpage>131</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1046/j.1365-2958.1999.01152.x</pubid>
                  <pubid idtype="pmpid" link="fulltext">9987115</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>ClpE, a novel type of HSP100 ATPase, is part of the CtsR heat shock regulon of <it>Bacillus subtilis</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Derre</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Rapoport</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Devine</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Rose</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Msadek</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Mol Microbiol</source>
            <pubdate>1999</pubdate>
            <volume>32</volume>
            <fpage>581</fpage>
            <lpage>593</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1046/j.1365-2958.1999.01374.x</pubid>
                  <pubid idtype="pmpid" link="fulltext">10320580</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>The <it>dnaK</it> operon of <it>Bacillus subtilis</it> is heptacistronic.</p>
            </title>
            <aug>
               <au>
                  <snm>Homuth</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Masuda</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Mogk</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Kobayashi</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Schumann</snm>
                  <fnm>W</fnm>
               </au>
            </aug>
            <source>J Bacteriol</source>
            <pubdate>1997</pubdate>
            <volume>179</volume>
            <fpage>1153</fpage>
            <lpage>1164</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">178811</pubid>
                  <pubid idtype="pmpid" link="fulltext">9023197</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>Combined transcriptome and proteome analysis as a powerful approach to study genes under glucose repression in <it>Bacillus subtilis</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Yoshida</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Kobayashi</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Miwa</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Kang</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Matsunaga</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Yamaguchi</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Tojo</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Yamamoto</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Nishi</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Ogasawara</snm>
                  <fnm>N</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2001</pubdate>
            <volume>29</volume>
            <fpage>683</fpage>
            <lpage>692</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">30401</pubid>
                  <pubid idtype="pmpid" link="fulltext">11160890</pubid>
                  <pubid idtype="doi">10.1093/nar/29.3.683</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>Modeling and predicting transcriptional units of <it>Escherichia coli</it> genes using hidden Markov models.</p>
            </title>
            <aug>
               <au>
                  <snm>Yada</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Nakao</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Totoki</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Nakai</snm>
                  <fnm>K</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>1999</pubdate>
            <volume>15</volume>
            <fpage>987</fpage>
            <lpage>993</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/15.12.987</pubid>
                  <pubid idtype="pmpid" link="fulltext">10745988</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B33">
            <title>
               <p>Operons in <it>Escherichia coli</it>: genomic analyses and predictions.</p>
            </title>
            <aug>
               <au>
                  <snm>Salgado</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Moreno-Hagelsieb</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>TF</fnm>
               </au>
               <au>
                  <snm>Collado-Vides</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2000</pubdate>
            <volume>97</volume>
            <fpage>6652</fpage>
            <lpage>6657</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">18690</pubid>
                  <pubid idtype="pmpid" link="fulltext">10823905</pubid>
                  <pubid idtype="doi">10.1073/pnas.110147297</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B34">
            <title>
               <p>Prediction of operons in microbial genomes.</p>
            </title>
            <aug>
               <au>
                  <snm>Ermolaeva</snm>
                  <fnm>MD</fnm>
               </au>
               <au>
                  <snm>White</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Salzberg</snm>
                  <fnm>SL</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2001</pubdate>
            <volume>29</volume>
            <fpage>1216</fpage>
            <lpage>1221</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">29727</pubid>
                  <pubid idtype="pmpid" link="fulltext">11222772</pubid>
                  <pubid idtype="doi">10.1093/nar/29.5.1216</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B35">
            <title>
               <p>GenBank</p>
            </title>
            <url>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Genome</url>
         </bibl>
         <bibl id="B36">
            <title>
               <p>The <it>Bacillus stearothermophilus</it> genome-sequencing project</p>
            </title>
            <url>http://www.genome.ou.edu/bstearo.html</url>
         </bibl>
         <bibl id="B37">
            <title>
               <p>Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.</p>
            </title>
            <aug>
               <au>
                  <snm>Altschul</snm>
                  <fnm>SF</fnm>
               </au>
               <au>
                  <snm>Madden</snm>
                  <fnm>TL</fnm>
               </au>
               <au>
                  <snm>Schaffer</snm>
                  <fnm>AA</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Lipman</snm>
                  <fnm>DJ</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1997</pubdate>
            <volume>25</volume>
            <fpage>3389</fpage>
            <lpage>3402</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">146917</pubid>
                  <pubid idtype="pmpid" link="fulltext">9254694</pubid>
                  <pubid idtype="doi">10.1093/nar/25.17.3389</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B38">
            <title>
               <p>LALIGN</p>
            </title>
            <url>ftp://ftp.virginia.edu/pub/fasta</url>
         </bibl>
         <bibl id="B39">
            <title>
               <p>Flexible sequence similarity searching with the FASTA3 program package.</p>
            </title>
            <aug>
               <au>
                  <snm>Pearson</snm>
                  <fnm>WR</fnm>
               </au>
            </aug>
            <source>Methods Mol Biol</source>
            <pubdate>2000</pubdate>
            <volume>132</volume>
            <fpage>185</fpage>
            <lpage>219</lpage>
            <xrefbib>
               <pubid idtype="pmpid">10547837</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B40">
            <title>
               <p>Identification of common molecular subsequences.</p>
            </title>
            <aug>
               <au>
                  <snm>Smith</snm>
                  <fnm>TF</fnm>
               </au>
               <au>
                  <snm>Waterman</snm>
                  <fnm>MS</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1981</pubdate>
            <volume>147</volume>
            <fpage>195</fpage>
            <lpage>197</lpage>
            <xrefbib>
               <pubid idtype="pmpid">7265238</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B41">
            <title>
               <p>A statistical method for evaluating systematic relationships.</p>
            </title>
            <aug>
               <au>
                  <snm>Sokal</snm>
                  <fnm>RR</fnm>
               </au>
               <au>
                  <snm>Michener</snm>
                  <fnm>CD</fnm>
               </au>
            </aug>
            <source>Univ Kansas Sci Bull</source>
            <pubdate>1958</pubdate>
            <volume>38</volume>
            <fpage>1409</fpage>
            <lpage>1438</lpage>
         </bibl>
         <bibl id="B42">
            <title>
               <p>The complete genome of <it>Bacillus subtilis</it>: from sequence annotation to data management and analysis.</p>
            </title>
            <aug>
               <au>
                  <snm>Moszer</snm>
                  <fnm>I</fnm>
               </au>
            </aug>
            <source>FEBS Lett</source>
            <pubdate>1998</pubdate>
            <volume>430</volume>
            <fpage>28</fpage>
            <lpage>36</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0014-5793(98)00620-6</pubid>
                  <pubid idtype="pmpid" link="fulltext">9678589</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B43">
            <title>
               <p>SubtiList</p>
            </title>
            <url>http://genolist.pasteur.fr/SubtiList</url>
         </bibl>
      </refgrp>
   </bm>
</art>
