<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2164-10-23</ui>
   <ji>1471-2164</ji>
   <fm>
      <dochead>Research article</dochead>
      <bibl>
         <title>
            <p>Computational prediction of cAMP receptor protein (CRP) binding sites in cyanobacterial genomes</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Xu</snm>
               <fnm>Minli</fnm>
               <insr iid="I1"/>
               <email>mxu5@uncc.edu</email>
            </au>
            <au id="A2" ca="yes">
               <snm>Su</snm>
               <fnm>Zhengchang</fnm>
               <insr iid="I1"/>
               <email>zcsu@uncc.edu</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Department of Bioinformatics and Genomics, Bioinformatics Research Center, the University of North Carolina at Charlotte, Charlotte, NC 28233, USA</p>
            </ins>
         </insg>
         <source>BMC Genomics</source>
         <issn>1471-2164</issn>
         <pubdate>2009</pubdate>
         <volume>10</volume>
         <issue>1</issue>
         <fpage>23</fpage>
         <url>http://www.biomedcentral.com/1471-2164/10/23</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">19146659</pubid>
               <pubid idtype="doi">10.1186/1471-2164-10-23</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>31</day>
               <month>5</month>
               <year>2008</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>15</day>
               <month>1</month>
               <year>2009</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>15</day>
               <month>1</month>
               <year>2009</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2009</year>
         <collab>Xu and Su; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Cyclic AMP receptor protein (CRP), also known as catabolite gene activator protein (CAP), is an important transcriptional regulator widely distributed in many bacteria. The biological processes under the regulation of CRP are highly diverse among different groups of bacterial species. Elucidation of CRP regulons in cyanobacteria will further our understanding of the physiology and ecology of this important group of microorganisms. Previously, CRP has been experimentally studied in only two cyanobacterial strains: <it>Synechocystis sp</it>. PCC 6803 and <it>Anabaena sp</it>. PCC 7120; therefore, a systematic genome-scale study of the potential CRP target genes and binding sites in cyanobacterial genomes is urgently needed.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>We have predicted and analyzed the CRP binding sites and regulons in 12 sequenced cyanobacterial genomes using a highly effective <it>cis</it>-regulatory binding site scanning algorithm. Our results show that cyanobacterial CRP binding sites are very similar to those in <it>E. coli</it>; however, the regulons are very different from those of <it>E. coli</it>. Furthermore, CRP regulons in different cyanobacterial species/ecotypes are also highly diversified, ranging from photosynthesis, carbon fixation and nitrogen assimilation, to chemotaxis and signal transduction. In addition, our prediction indicates that <it>crp </it>genes in modern cyanobacteria are likely inherited from a common ancestral gene in their last common ancestor, and have adapted various cellular functions in different environments, while some cyanobacteria lost their <it>crp </it>genes as well as CRP binding sites during the course of evolution.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>The CRP regulons in cyanobacteria are highly diversified, probably as a result of divergent evolution to adapt to various ecological niches. Cyanobacterial CRPs may function as lineage-specific regulators participating in various cellular processes, and are important in some lineages. However, they are dispensable in some other lineages. The loss of CRPs in these species leads to the rapid loss of their binding sites in the genomes.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="bmc" subtype="user_supplied_xml" id="endnote"/>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>Cyclic AMP receptor protein (CRP), also known as catabolite gene activator protein (CAP), is an important transcriptional regulator widely distributed in a variety of bacterial groups <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr></abbrgrp>. The biological processes under the regulation of CRP are highly diverse, including energy metabolism <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr></abbrgrp>, cell division and development <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>, toxin production <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>, competence development <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>, quorum sensing <abbrgrp><abbr bid="B7">7</abbr></abbrgrp> and cellular motility <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr></abbrgrp>. CRP belongs to the CRP/FNR transcription factor (TF) superfamily <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>, whose members are generally believed to function as global regulators throughout the eubacteria <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>. Each member of the CRP/FNR superfamily contains an N-terminal effector binding domain and a C-terminal helix-turn-helix (HTH) DNA binding domain (DBD) <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>. The TFs of this superfamily form homodimers <it>in vivo</it>, and are activated by the binding of specific small effector molecules to their effector binding domains <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>. The CRP dimer is activated by the binding of two cAMP molecules to the effector binding domain of each subunit, which causes a conformational change in the DBDs, allowing each to bind to half of a specific pseudo-palindromic DNA sequence in the promoters of the genes that are under CRP regulation <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>. 
Upon binding, CRP interacts with the C-terminal domain of the alpha subunit of RNA polymerase, affecting the binding of RNA polymerase to the promoter and thereby changing the transcription initiation rate of the target gene <abbrgrp><abbr bid="B14">14</abbr><abbr bid="B15">15</abbr><abbr bid="B16">16</abbr><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr></abbrgrp>.</p>
         <p>The functions of CRP as well as its target genes (the CRP regulons) have been well studied in <it>E. coli </it>and other heterotrophic bacteria <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>, and it seems that all the sequenced <it>E. coli </it>genomes encode one copy of the <it>crp </it>gene. CRP in <it>E. coli </it>is characterized as a global regulator, which controls the expression of more than 200 transcriptional units involved in various important biological processes of this organism <abbrgrp><abbr bid="B20">20</abbr><abbr bid="B21">21</abbr></abbrgrp>. Through decades of research, 269 CRP binding sites (RegulonDB release 5.8 <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>) in this species have been experimentally identified, which show a pseudo-palindromic consensus in the form of TGTGAN<sub>6</sub>TCACA. More recently, slightly different CRP binding sites with the consensus TGCGAN<sub>6</sub>TCGCA were also identified in <it>E. coli </it>and other &#947;-proteobacteria <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>. One of the major functions of CRP in <it>E. coli </it>involves the transcriptional regulation of genes related to organic carbon assimilation and energy metabolism <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr></abbrgrp>.</p>
         <p>As the life of <it>E. coli </it>and other heterotrophic organisms relies on the assimilation of organic carbon sources from the environment, it is not surprising that CRP works as an important global regulator to coordinate a variety of biological processes in these organisms. Cyanobacteria, on the other hand, are a group of autotrophic organisms capable of oxygenic photosynthesis; therefore, they do not rely on organic carbon sources from the environment. Intriguingly, at least half of the sequenced cyanobacterial genomes encode at least one copy of the <it>crp </it>gene (see below). CRP proteins have been experimentally studied in two cyanobacterial strains, i.e. <it>Synechocystis sp</it>. PCC 6803 (PCC6803) <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr><abbr bid="B23">23</abbr><abbr bid="B24">24</abbr><abbr bid="B25">25</abbr><abbr bid="B26">26</abbr></abbrgrp> and <it>Anabaena sp</it>. PCC 7120 (PCC7120) <abbrgrp><abbr bid="B27">27</abbr><abbr bid="B28">28</abbr></abbrgrp>. In the PCC6803 genome, the open reading frame (ORF) <it>sll1371 </it>encodes a homologue to the <it>E. coli crp </it>gene <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>, and has been named <it>sycrp1</it>. It has been shown that the product of this gene, SyCRP1, forms a homodimer, which can bind cAMP with high affinity <it>in vitro </it><abbrgrp><abbr bid="B24">24</abbr></abbrgrp>. Furthermore, in the presence of cAMP, SyCRP1 could form a complex with DNA that contains the consensus CRP binding site similar to that in <it>E. coli </it>(TGTGAN<sub>6</sub>TCACA) <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>. Further studies have revealed that SyCRP1 was essential for type IV pilus biogenesis and was involved in cell motility in PCC6803 <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr><abbr bid="B25">25</abbr><abbr bid="B29">29</abbr><abbr bid="B30">30</abbr></abbrgrp>. 
On the other hand, in the PCC7120 genome two ORFs, <it>alr0295 </it>and <it>alr2325</it>, were found to encode putative CRPs, and were named <it>ancrpA </it>and <it>ancrpB</it>, respectively <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>. Equilibrium dialysis measurements showed that both AnCRPA and AnCRPB (the gene products of <it>ancrpA </it>and <it>ancrpB</it>, respectively) could bind cAMP. An electrophoretic mobility shift assay (EMSA) further demonstrated that AnCRPA could bind to the consensus CRP binding site in <it>E. coli </it><abbrgrp><abbr bid="B27">27</abbr></abbrgrp>. It has also been reported that both AnCRPA and AnCRPB are functional in PCC7120: the former regulates the expression of several genes involved in nitrogen fixation <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>, and the latter controls the genes induced by nitrogen depletion <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>. A few CRP binding sites in these two genomes have also been experimentally determined, which were found to form a palindromic motif with the consensus sequence TGTGAN<sub>6</sub>TCACA similar to that in <it>E. coli </it><abbrgrp><abbr bid="B25">25</abbr><abbr bid="B28">28</abbr></abbrgrp>. In addition, the promoter regions of most of these identified CRP-activated genes in these two cyanobacterial genomes also contain an <it>E. coli </it>-10 &#963;<sup>70</sup>-like box (TAN<sub>3</sub>T), located ~22 bp downstream of the CRP binding site. These studies also suggested that CRPs in cyanobacteria might regulate a very different set of genes from those in <it>E. coli</it>. However, a systematic genome-scale study of the potential CRP target genes as well as CRP binding sites in cyanobacteria is hitherto lacking. 
In this paper, we have predicted the CRP regulons as well as CRP binding sites in 12 sequenced cyanobacterial genomes that encode at least one copy of the <it>crp </it>gene using a highly effective motif scanning algorithm <abbrgrp><abbr bid="B32">32</abbr><abbr bid="B33">33</abbr></abbrgrp>. We have also investigated the degradation of the CRP binding sites in the remaining sequenced cyanobacterial genomes, in which the <it>crp </it>genes were lost during the course of evolution.</p>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <sec>
            <st>
               <p>1. Materials</p>
            </st>
            <p>Genome sequences, predicted ORFs and annotation files of the following 29 cyanobacterial genomes were downloaded from the NCBI website at <url>ftp://ftp.ncbi.nlm.nih.gov/genomes/Bacteria</url>: <it>Acaryochloris marina </it>MBIC11017 (MBIC11017), <it>Anabaena variabilis </it>ATCC 29413 (ATCC29413), <it>Anabaena sp</it>. PCC 7120 (PCC7120), <it>Gloeobacter violaceus </it>PCC 7421 (PCC7421), <it>Prochlorococcus marinus str</it>. AS9601 (AS9601), <it>Prochlorococcus marinus str</it>. MIT 9211 (MIT9211), <it>Prochlorococcus marinus str</it>. MIT 9215 (MIT9215), <it>Prochlorococcus marinus str</it>. MIT 9301 (MIT9301), <it>Prochlorococcus marinus </it>MIT9303 (MIT9303), <it>Prochlorococcus marinus </it>MIT9312 (MIT9312), <it>Prochlorococcus marinus str</it>. MIT 9313 (MIT9313), <it>Prochlorococcus marinus str</it>. MIT 9515 (MIT9515), <it>Prochlorococcus marinus str</it>. NATL1A (NATL1A), <it>Prochlorococcus marinus str</it>. NATL2A (NATL2A), <it>Prochlorococcus marinus </it>CCMP1375 (CCMP1375), <it>Prochlorococcus marinus </it>MED4 (MED4), <it>Synechococcus elongatus </it>PCC 6301 (PCC6301), <it>Synechococcus elongatus </it>PCC 7942 (PCC7942), <it>Synechococcus sp</it>. CC9311 (CC9311), <it>Synechococcus sp</it>. CC9605 (CC9605), <it>Synechococcus sp</it>. JA-2-3B'a(2&#8211;13) (B-Prime), <it>Synechococcus sp</it>. JA-3-3Ab (A-Prime), <it>Synechocystis sp</it>. PCC 6803 (PCC6803), <it>Synechococcus sp</it>. CC9902 (CC9902), <it>Synechococcus sp</it>. RCC307 (RCC307), <it>Synechococcus sp</it>. WH 7803 (WH7803), <it>Synechococcus sp</it>. WH8102 (WH8102), <it>Thermosynechococcus elongatus </it>BP-1 (BP-1), and <it>Trichodesmium erythraeum </it>IMS101 (IMS101).</p>
         </sec>
         <sec>
            <st>
               <p>2. Prediction of operons</p>
            </st>
            <p>We predicted operon structures in each cyanobacterial genome using the Operon Finder Software (OFS) developed by Westover <it>et al </it><abbrgrp><abbr bid="B34">34</abbr></abbrgrp>. OFS predicts operons based on three pieces of information: the intergenic distance, the functional relatedness of gene annotations, and conserved gene neighborhoods. In this study, both a multi-gene operon and a singleton operon containing only one gene are referred to as a transcription unit (TU).</p>
         </sec>
         <sec>
            <st>
               <p>3. Prediction of orthologues</p>
            </st>
            <p>A simple bidirectional best hit (BDBH) approach using the BLASTP program with an E-value cutoff of 10<sup>-10</sup> was used for the prediction of orthologous genes between each pair of genomes.</p>
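            <p>The reciprocal filtering step of a BDBH search can be illustrated with a minimal Python sketch. It assumes that the BLASTP results for both search directions have already been parsed into (query, subject, E-value) tuples; the function names and the gene identifiers in the usage example are hypothetical and are not part of the original pipeline.</p>

```python
def best_hits(hits, evalue_cutoff=1e-10):
    """Map each query to its subject with the smallest E-value.

    `hits` is an iterable of (query, subject, evalue) tuples; hits whose
    E-value exceeds the cutoff are ignored.
    """
    best = {}
    for query, subject, evalue in hits:
        if evalue > evalue_cutoff:
            continue
        if query not in best or best[query][1] > evalue:
            best[query] = (subject, evalue)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(hits_ab, hits_ba, evalue_cutoff=1e-10):
    """Return sorted (gene_a, gene_b) pairs that are each other's best hit."""
    best_ab = best_hits(hits_ab, evalue_cutoff)
    best_ba = best_hits(hits_ba, evalue_cutoff)
    return sorted(
        (a, b) for a, b in best_ab.items() if best_ba.get(b) == a
    )


# Hypothetical parsed BLASTP output (genome A against B, and B against A).
hits_ab = [("geneA1", "geneB1", 1e-40), ("geneA1", "geneB2", 1e-25),
           ("geneA2", "geneB2", 1e-3)]
hits_ba = [("geneB1", "geneA1", 1e-38), ("geneB2", "geneA1", 1e-20)]
pairs = bidirectional_best_hits(hits_ab, hits_ba)
```

            <p>In this hypothetical example only the pair (geneA1, geneB1) is retained: geneB2 has geneA1 as its best hit, but the best hit of geneA1 is geneB1, and the hit of geneA2 fails the E-value cutoff.</p>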
         </sec>
         <sec>
            <st>
               <p>4. Phylogenetic analysis</p>
            </st>
            <p>To construct the CRP tree, the full-length amino acid sequences of cyanobacterial CRPs were identified by the criteria described above using SyCRP1 (sll1371) in PCC6803 as the query sequence. Multiple sequence alignments of the identified cyanobacterial CRP sequences and that of <it>E. coli </it>K12 were performed using ClustalW implemented in MEGA <abbrgrp><abbr bid="B35">35</abbr></abbrgrp> with default settings. A neighbor-joining (NJ) tree with Poisson correction was constructed using the MEGA program with the <it>E. coli </it>CRP (GI:16131236) being the outgroup. To construct the tree in Figure S1 (see Additional file <supplr sid="S1">1</supplr>), we used SyCRP1 as the query sequence to search the RefSeq database using BLASTP with an E-value cutoff of 10<sup>-7</sup>. If there were multiple hits from a species, the hit with the smallest E-value was identified as the CRP in that species. The resulting sequences were used to construct an unrooted tree in a similar way as described above. To construct the species tree, the DNA sequences of the 16S rRNA genes of the sequenced cyanobacteria and that of <it>E. coli </it>K12 were aligned using ClustalW with manual refinement. After the indels were discarded, the final alignment contained 1311 positions. A neighbor-joining tree of the 16S rRNA gene sequences was constructed with the <it>E. coli </it>K12 sequence being the outgroup using the Kimura 2-parameter model. Statistical significance at each node in the trees was evaluated using 1000 bootstrap resamplings.</p>
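            <p>For reference, the Kimura 2-parameter distance between two aligned sequences is d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites showing transitional and transversional differences, respectively. The following is a minimal Python sketch of this formula, assuming two gap-free, pre-aligned sequences containing only A, C, G and T; it is an illustration and not the MEGA implementation, and the function name is hypothetical.</p>

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned, gap-free sequences."""
    if len(seq1) != len(seq2) or not seq1:
        raise ValueError("sequences must be aligned, non-empty and of equal length")
    transitions = 0
    transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a == b:
            continue
        pair = {a, b}
        # A mismatch within purines or within pyrimidines is a transition;
        # any other mismatch is counted as a transversion.
        if pair.issubset(PURINES) or pair.issubset(PYRIMIDINES):
            transitions += 1
        else:
            transversions += 1
    p = transitions / len(seq1)
    q = transversions / len(seq1)
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if not (term1 > 0 and term2 > 0):
        raise ValueError("sequences are too divergent for the K2P correction")
    return -0.5 * math.log(term1 * math.sqrt(term2))


# One transition (A to G) in ten aligned positions: P = 0.1, Q = 0.
example_distance = k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")
```

            <p>A matrix of such pairwise distances is the input to neighbor-joining tree construction.</p>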
            <suppl id="S1">
               <title>
                  <p>Additional File 1</p>
               </title>
               <text>
                  <p><b>Figure S1</b>. A protein tree containing 202 SyCRP1 orthologues from a wide range of bacterial species. The red-labeled branch contains the CRPs from cyanobacteria.</p>
               </text>
               <file name="1471-2164-10-23-S1.emf">
                  <p>Click here for file</p>
               </file>
            </suppl>
         </sec>
         <sec>
            <st>
               <p>5. Phylogenetic footprinting and construction of the profile of CRP binding sites in cyanobacteria</p>
            </st>
            <p>A previous screening of the target genes of SyCRP1 in PCC6803 using microarrays showed that 18 genes in 13 putative TUs were down-regulated in a <it>sycrp1 </it>disruptant compared with the wild type strain <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>. Our preliminary manual scanning of the upstream promoter regions of these TUs revealed that four of them contain a putative CRP binding site similar to those in <it>E. coli </it>(the first four TUs in Table <tblr tid="T1">1</tblr>). The remaining nine TUs, which do not contain putative CRP binding sites, are likely to be regulated indirectly by CRP through different regulators. In addition, since the <it>crp </it>gene is auto-regulated in <it>E. coli </it>as well as in many other species <abbrgrp><abbr bid="B36">36</abbr></abbrgrp>, we also manually scanned the upstream inter-TU region of <it>sycrp1 </it>(<it>sll1371</it>) and identified a putative CRP binding site in it (Table <tblr tid="T1">1</tblr>). We thus used the orthologues (if they exist and were identified by the BDBH method) of these five TUs for the phylogenetic footprinting of CRP binding sites in cyanobacterial genomes.</p>
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>Transcriptional units in PCC6803 used for the initial phylogenetic footprinting analysis.</p>
               </caption>
               <tblbdy cols="3">
                  <r>
                     <c ca="left">
                        <p>ORF(s) within TU</p>
                     </c>
                     <c ca="left">
                        <p>Putative CRP binding site<sup>1</sup></p>
                     </c>
                     <c ca="left">
                        <p>Position<sup>2</sup></p>
                     </c>
                  </r>
                  <r>
                     <c cspan="3">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>slr1667 slr1668 ssr2786</p>
                     </c>
                     <c ca="left">
                        <p><b>TGTGA</b>TCTGGG<b>TCACA</b></p>
                     </c>
                     <c ca="left">
                        <p>-245</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>slr2015 slr2016 slr2017 slr2018</p>
                     </c>
                     <c ca="left">
                        <p><b>GGTGT</b>TTATTG<b>TCACA</b></p>
                     </c>
                     <c ca="left">
                        <p>-346</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>sll0443 sll0444 sll0445 sll0446 sll0447 sll0448 sll0449</p>
                     </c>
                     <c ca="left">
                        <p><b>GGTGA</b>TTAAGT<b>TCCCA</b></p>
                     </c>
                     <c ca="left">
                        <p>-371</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>slr0442</p>
                     </c>
                     <c ca="left">
                        <p><b>TGTGA</b>TCCAGA<b>TCACA</b></p>
                     </c>
                     <c ca="left">
                        <p>-189</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>sll1371 sll1372 sll1373</p>
                     </c>
                     <c ca="left">
                        <p><b>AGTGA</b>AAAAAC<b>TCACT</b></p>
                     </c>
                     <c ca="left">
                        <p>-143</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>1. Bold, the highly conserved positions within the palindromic CRP binding sites.</p>
                  <p>2. Position of the putative CRP binding site relative to the first codon in the TU.</p>
               </tblfn>
            </tbl>
            <p>For this purpose, we pooled the entire upstream inter-TU regions of these five TUs in PCC6803, as well as those of TUs containing their orthologous genes in other cyanobacterial genomes which encode at least one <it>crp </it>gene. The motif finding programs CUBIC <abbrgrp><abbr bid="B37">37</abbr></abbrgrp> and BioProspector <abbrgrp><abbr bid="B38">38</abbr></abbrgrp> were then applied to predict CRP binding sites in these sequences. CUBIC is a graph-theoretic algorithm that identifies highly similar <it>k</it>-mers in a set of pooled sequences, whereas BioProspector uses a Gibbs sampling strategy to find overrepresented <it>k</it>-mers in a set of pooled sequences. Putative CRP binding sites with high scores were manually selected from the motif finding results, and were used to build a preliminary profile of the CRP binding sites in cyanobacterial genomes.</p>
            <p>Since the number of binding sites used to construct the preliminary profile of CRP binding sites was relatively small (for the reason, see the Results section), in order to minimize possible bias of binding site sampling, we conducted a one-round iteration to obtain a more representative profile of the CRP binding sites. To this end, we scanned cyanobacterial genomes with the preliminary profile using the techniques described below, and picked the high scoring sites from each genome to construct the more representative profile of the CRP binding sites. We then used this final profile for genome-scale predictions of CRP binding sites in the cyanobacterial genomes.</p>
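            <p>The notion of a binding site profile can be illustrated with a minimal Python sketch that turns a set of aligned sites into a position-specific log-odds matrix. Purely for illustration, the five putative sites listed in Table 1 are used as input; the actual preliminary profile was built from sites selected from the motif finding results. The pseudocount of 1, the uniform background frequency of 0.25 and all function names are assumptions of this sketch, not parameters reported for the original analysis.</p>

```python
import math
from collections import Counter

# The five putative CRP binding sites from Table 1 (all 16 bp long).
SITES = [
    "TGTGATCTGGGTCACA",
    "GGTGTTTATTGTCACA",
    "GGTGATTAAGTTCCCA",
    "TGTGATCCAGATCACA",
    "AGTGAAAAAACTCACT",
]
ALPHABET = "ACGT"


def count_matrix(sites):
    """Count the bases observed at each position of the aligned sites."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    return [Counter(site[i] for site in sites) for i in range(length)]


def log_odds_profile(sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds profile (assumed pseudocount and uniform background)."""
    total = len(sites) + pseudocount * len(ALPHABET)
    return [
        {base: math.log2((column[base] + pseudocount) / total / background)
         for base in ALPHABET}
        for column in count_matrix(sites)
    ]


def score_kmer(profile, kmer):
    """Sum the per-position log-odds values of a k-mer of profile length."""
    return sum(column[base] for column, base in zip(profile, kmer))


profile = log_odds_profile(SITES)
site_score = score_kmer(profile, SITES[0])
poly_a_score = score_kmer(profile, "A" * 16)
```

            <p>Under these assumptions, a sequence resembling the input sites receives a higher score than an unrelated 16-mer such as a run of adenines. A one-round iteration of the kind described above would rebuild the profile from the high-scoring sites found by scanning with the preliminary profile.</p>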
         </sec>
         <sec>
            <st>
               <p>6. Genome-wide prediction of CRP binding sites</p>
            </st>
            <p>The whole-genome screening of all possible CRP binding sites was performed using an algorithm that we have developed previously <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>. This algorithm is designed to enhance the prediction specificity by integrating the information of co-occurrence of multiple binding sites in the upstream region of a gene and that in the upstream regions of its orthologues in related genomes. Briefly, the final profile of the CRP binding sites obtained above was first used to scan all the inter-TU regions of the cyanobacterial genomes. The best motif was returned for each inter-TU region. Then the 19 to 31 bp downstream region of each putative binding site was further scanned for an <it>E. coli </it>-10 &#963;<sup>70</sup>-like box (TAN<sub>3</sub>T), to which a &#963;-factor of RNA polymerase is likely to bind to transcribe the downstream TU, using a corresponding profile that we have constructed previously from cyanobacterial genomes <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>. Then, the upstream inter-TU regions of the orthologues of the genes in that particular TU in other cyanobacterial genomes were scanned for similar CRP binding sites and -10 like boxes. A score that combines these three pieces of information was computed to rank the putative CRP binding sites for each possible CRP-regulated TU.</p>
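            <p>The search for a -10 like box downstream of a putative CRP binding site can be sketched in Python as follows. This is a simplification: it matches the consensus TAN<sub>3</sub>T with a regular expression instead of scoring with a profile, and it assumes that the 19 to 31 bp window is measured from the end of the binding site. The function name, the window convention and the test sequence are hypothetical.</p>

```python
import re

# Consensus -10 like box TANNNT; the lookahead also reports overlapping matches.
MINUS10_BOX = re.compile(r"(?=(TA[ACGT]{3}T))")


def find_minus10_boxes(seq, site_end, window_lo=19, window_hi=31):
    """Return start positions of TANNNT boxes lying entirely inside the window.

    `site_end` is the 0-based index just past the putative CRP binding site.
    The window spans site_end + window_lo up to site_end + window_hi
    (an assumed convention for this sketch).
    """
    window_start = site_end + window_lo
    window_end = site_end + window_hi
    window = seq[window_start:window_end].upper()
    return [window_start + match.start() for match in MINUS10_BOX.finditer(window)]


# Hypothetical upstream region: a 16 bp site, a 22 bp spacer, then TATAAT.
crp_site = "TGTGATCTGGGTCACA"
upstream = crp_site + "C" * 22 + "TATAAT" + "C" * 10
box_positions = find_minus10_boxes(upstream, site_end=len(crp_site))
```

            <p>In this constructed example the box starts 22 bp downstream of the end of the site, consistent with the spacing of ~22 bp noted for experimentally characterized promoters, so it falls inside the scanned window.</p>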
            <p>Specifically, let <it>M </it>be the final profile of the CRP binding sites obtained above. For each predicted transcription unit <it>U</it>(<it>g</it><sub>1</sub>, ..., <it>g</it><sub><it>n</it></sub>) containing genes <it>g</it><sub>1</sub>, ..., <it>g</it><sub><it>n </it></sub>in a genome <it>G</it>, we extracted the entire upstream intergenic region (if that region was longer than 800 bp, only the 800 bp upstream of the translation start site were extracted) and denoted it as <inline-formula><m:math name="1471-2164-10-23-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:mi>I</m:mi><m:mrow><m:mi>U</m:mi><m:mo stretchy="false">(</m:mo><m:msub><m:mi>g</m:mi><m:mn>1</m:mn></m:msub><m:mo>,</m:mo><m:mn>...</m:mn><m:mo>,</m:mo><m:msub><m:mi>g</m:mi><m:mi>n</m:mi></m:msub><m:mo stretchy="false">)</m:mo></m:mrow></m:msub></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemysaK0aaSbaaSqaaiabdwfavjabcIcaOiabdEgaNnaaBaaameaacqaIXaqmaeqaaSGaeiilaWIaeiOla4IaeiOla4IaeiOla4IaeiilaWIaem4zaC2aaSbaaWqaaiabd6gaUbqabaWccqGGPaqkaeqaaaaa@39E0@</m:annotation></m:semantics></m:math></inline-formula>. We also extracted a random DNA sequence with the same length as <inline-formula><m:math name="1471-2164-10-23-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:mi>I</m:mi><m:mrow><m:mi>U</m:mi><m:mo stretchy="false">(</m:mo><m:msub><m:mi>g</m:mi><m:mn>1</m:mn></m:msub><m:mo>,</m:mo><m:mn>...</m:mn><m:mo>,</m:mo><m:msub><m:mi>g</m:mi><m:mi>n</m:mi></m:msub><m:mo stretchy="false">)</m:mo></m:mrow></m:msub></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemysaK0aaSbaaSqaaiabdwfavjabcIcaOiabdEgaNnaaBaaameaacqaIXaqmaeqaaSGaeiilaWIaeiOla4IaeiOla4IaeiOla4IaeiilaWIaem4zaC2aaSbaaWqaaiabd6gaUbqabaWccqGGPaqkaeqaaaaa@39E0@</m:annotation></m:semantics></m:math></inline-formula> from the coding region, denoted as <inline-formula><m:math name="1471-2164-10-23-i2" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:mi>C</m:mi><m:mrow><m:mi>U</m:mi><m:mo stretchy="false">(</m:mo><m:msub><m:mi>g</m:mi><m:mn>1</m:mn></m:msub><m:mo>,</m:mo><m:mn>...</m:mn><m:mo>,</m:mo><m:msub><m:mi>g</m:mi><m:mi>n</m:mi></m:msub><m:mo stretchy="false">)</m:mo></m:mrow></m:msub></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaem4qam0aaSbaaSqaaiabdwfavjabcIcaOiabdEgaNnaaBaaameaacqaIXaqmaeqaaSGaeiilaWIaeiOla4IaeiOla4IaeiOla4IaeiilaWIaem4zaC2aaSbaaWqaaiabd6gaUbqabaWccqGGPaqkaeqaaaaa@39D4@</m:annotation></m:semantics></m:math></inline-formula>. We say that <inline-formula><m:math name="1471-2164-10-23-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:mi>I</m:mi><m:mrow><m:mi>U</m:mi><m:mo stretchy="false">(</m:mo><m:msub><m:mi>g</m:mi><m:mn>1</m:mn></m:msub><m:mo>,</m:mo><m:mn>...</m:mn><m:mo>,</m:mo><m:msub><m:mi>g</m:mi><m:mi>n</m:mi></m:msub><m:mo stretchy="false">)</m:mo></m:mrow></m:msub></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemysaK0aaSbaaSqaaiabdwfavjabcIcaOiabdEgaNnaaBaaameaacqaIXaqmaeqaaSGaeiilaWIaeiOla4IaeiOla4IaeiOla4IaeiilaWIaem4zaC2aaSbaaWqaaiabd6gaUbqabaWccqGGPaqkaeqaaaaa@39E0@</m:annotation></m:semantics></m:math></inline-formula> and <inline-formula><m:math name="1471-2164-10-23-i2" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:mi>C</m:mi><m:mrow><m:mi>U</m:mi><m:mo stretchy="false">(</m:mo><m:msub><m:mi>g</m:mi><m:mn>1</m:mn></m:msub><m:mo>,</m:mo><m:mn>...</m:mn><m:mo>,</m:mo><m:msub><m:mi>g</m:mi><m:mi>n</m:mi></m:msub><m:mo stretchy="false">)</m:mo></m:mrow></m:msub></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaem4qam0aaSbaaSqaaiabdwfavjabcIcaOiabdEgaNnaaBaaameaacqaIXaqmaeqaaSGaeiilaWIaeiOla4IaeiOla4IaeiOla4IaeiilaWIaem4zaC2aaSbaaWqaaiabd6gaUbqabaWccqGGPaqkaeqaaaaa@39D4@</m:annotation></m:semantics></m:math></inline-formula> are associated with <it>U</it>(<it>g</it><sub>1</sub>, ..., <it>g</it><sub><it>n</it></sub>) and with each of the genes <it>g</it><sub>1</sub>, ..., <it>g</it><sub><it>n </it></sub>as well. All the extracted <inline-formula><m:math name="1471-2164-10-23-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:mi>I</m:mi><m:mrow><m:mi>U</m:mi><m:mo stretchy="false">(</m:mo><m:msub><m:mi>g</m:mi><m:mn>1</m:mn></m:msub><m:mo>,</m:mo><m:mn>...</m:mn><m:mo>,</m:mo><m:msub><m:mi>g</m:mi><m:mi>n</m:mi></m:msub><m:mo stretchy="false">)</m:mo></m:mrow></m:msub></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemysaK0aaSbaaSqaaiabdwfavjabcIcaOiabdEgaNnaaBaaameaacqaIXaqmaeqaaSGaeiilaWIaeiOla4IaeiOla4IaeiOla4IaeiilaWIaem4zaC2aaSbaaWqaaiabd6gaUbqabaWccqGGPaqkaeqaaaaa@39E0@</m:annotation></m:semantics></m:math></inline-formula> from one genome are denoted as set <it>I</it><sub><it>U</it></sub>, and all the <inline-formula><m:math name="1471-2164-10-23-i2" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:mi>C</m:mi><m:mrow><m:mi>U</m:mi><m:mo stretchy="false">(</m:mo><m:msub><m:mi>g</m:mi><m:mn>1</m:mn></m:msub><m:mo>,</m:mo><m:mn>...</m:mn><m:mo>,</m:mo><m:msub><m:mi>g</m:mi><m:mi>n</m:mi></m:msub><m:mo stretchy="false">)</m:mo></m:mrow></m:msub></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaem4qam0aaSbaaSqaaiabdwfavjabcIcaOiabdEgaNnaaBaaameaacqaIXaqmaeqaaSGaeiilaWIaeiOla4IaeiOla4IaeiOla4IaeiilaWIaem4zaC2aaSbaaWqaaiabd6gaUbqabaWccqGGPaqkaeqaaaaa@39D4@</m:annotation></m:semantics></m:math></inline-formula> in the same genome are denoted as set <it>C</it><sub><it>U</it></sub>. For each <it>t </it>&#8712; <it>I</it><sub><it>U </it></sub>or <it>t </it>&#8712; <it>C</it><sub><it>U</it></sub>, we scan for possible CRP binding sites using the profile <it>M</it>. The score of a putative binding site found in <it>t </it>by scanning with profile <it>M </it>is defined as</p>
            <p>
               <display-formula id="M1">
                  <m:math name="1471-2164-10-23-i3" xmlns:m="http://www.w3.org/1998/Math/MathML">
                     <m:semantics>
                        <m:mrow>
                           <m:msub>
                              <m:mi>s</m:mi>
                              <m:mi>M</m:mi>
                           </m:msub>
                           <m:mo stretchy="false">(</m:mo>
                           <m:mi>t</m:mi>
                           <m:mo stretchy="false">)</m:mo>
                           <m:mo>=</m:mo>
                           <m:munder>
                              <m:mrow>
                                 <m:mi>max</m:mi>
                                 <m:mo>&#8289;</m:mo>
                              </m:mrow>
                              <m:mrow>
                                 <m:mi>h</m:mi>
                                 <m:mo>&#8834;</m:mo>
                                 <m:mi>t</m:mi>
                              </m:mrow>
                           </m:munder>
                           <m:mstyle displaystyle="true">
                              <m:munderover>
                                 <m:mo>&#8721;</m:mo>
                                 <m:mrow>
                                    <m:mi>i</m:mi>
                                    <m:mo>=</m:mo>
                                    <m:mn>1</m:mn>
                                 </m:mrow>
                                 <m:mi>l</m:mi>
                              </m:munderover>
                              <m:mrow>
                                 <m:msub>
                                    <m:mi>I</m:mi>
                                    <m:mi>i</m:mi>
                                 </m:msub>
                              </m:mrow>
                           </m:mstyle>
                           <m:mi>ln</m:mi>
                           <m:mo>&#8289;</m:mo>
                           <m:mfrac>
                              <m:mrow>
                                 <m:mi>p</m:mi>
                                 <m:mo stretchy="false">(</m:mo>
                                 <m:mi>i</m:mi>
                                 <m:mo>,</m:mo>
                                 <m:mi>h</m:mi>
                                 <m:mo stretchy="false">(</m:mo>
                                 <m:mi>i</m:mi>
                                 <m:mo stretchy="false">)</m:mo>
                                 <m:mo stretchy="false">)</m:mo>
                              </m:mrow>
                              <m:mrow>
                                 <m:mi>q</m:mi>
                                 <m:mo stretchy="false">(</m:mo>
                                 <m:mi>h</m:mi>
                                 <m:mo stretchy="false">(</m:mo>
                                 <m:mi>i</m:mi>
                                 <m:mo stretchy="false">)</m:mo>
                                 <m:mo stretchy="false">)</m:mo>
                              </m:mrow>
                           </m:mfrac>
                           <m:mo>,</m:mo>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaem4Cam3aaSbaaSqaaiabd2eanbqabaGccqGGOaakcqWG0baDcqGGPaqkcqGH9aqpdaWfqaqaaiGbc2gaTjabcggaHjabcIha4bWcbaGaemiAaGMaeyOGIWSaemiDaqhabeaakmaaqahabaGaemysaK0aaSbaaSqaaiabdMgaPbqabaaabaGaemyAaKMaeyypa0JaeGymaedabaGaemiBaWganiabggHiLdGccyGGSbaBcqGGUbGBjuaGdaWcaaqaaiabdchaWjabcIcaOiabdMgaPjabcYcaSiabdIgaOjabcIcaOiabdMgaPjabcMcaPiabcMcaPaqaaiabdghaXjabcIcaOiabdIgaOjabcIcaOiabdMgaPjabcMcaPiabcMcaPaaakiabcYcaSaaa@5B65@</m:annotation>
                     </m:semantics>
                  </m:math>
               </display-formula>
            </p>
            <p>
               <display-formula id="M2">
                  <m:math name="1471-2164-10-23-i4" xmlns:m="http://www.w3.org/1998/Math/MathML">
                     <m:semantics>
                        <m:mrow>
                           <m:msub>
                              <m:mi>I</m:mi>
                              <m:mi>i</m:mi>
                           </m:msub>
                           <m:mo>=</m:mo>
                           <m:mo stretchy="false">(</m:mo>
                           <m:mstyle displaystyle="true">
                              <m:munder>
                                 <m:mo>&#8721;</m:mo>
                                 <m:mrow>
                                    <m:mi>b</m:mi>
                                    <m:mo>&#8712;</m:mo>
                                    <m:mo>{</m:mo>
                                    <m:mi>A</m:mi>
                                    <m:mo>,</m:mo>
                                    <m:mi>C</m:mi>
                                    <m:mo>,</m:mo>
                                    <m:mi>G</m:mi>
                                    <m:mo>,</m:mo>
                                    <m:mi>T</m:mi>
                                    <m:mo>}</m:mo>
                                 </m:mrow>
                              </m:munder>
                              <m:mrow>
                                 <m:mi>p</m:mi>
                                 <m:mo stretchy="false">(</m:mo>
                                 <m:mi>i</m:mi>
                                 <m:mo>,</m:mo>
                                 <m:mi>b</m:mi>
                                 <m:mo stretchy="false">)</m:mo>
                                 <m:mi>ln</m:mi>
                                 <m:mo>&#8289;</m:mo>
                                 <m:mfrac>
                                    <m:mrow>
                                       <m:mi>p</m:mi>
                                       <m:mo stretchy="false">(</m:mo>
                                       <m:mi>i</m:mi>
                                       <m:mo>,</m:mo>
                                       <m:mi>b</m:mi>
                                       <m:mo stretchy="false">)</m:mo>
                                    </m:mrow>
                                    <m:mrow>
                                       <m:mi>q</m:mi>
                                       <m:mo stretchy="false">(</m:mo>
                                       <m:mi>b</m:mi>
                                       <m:mo stretchy="false">)</m:mo>
                                    </m:mrow>
                                 </m:mfrac>
                              </m:mrow>
                           </m:mstyle>
                           <m:mo stretchy="false">)</m:mo>
                           <m:mo>/</m:mo>
                           <m:mi>a</m:mi>
                           <m:mo>,</m:mo>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemysaK0aaSbaaSqaaiabdMgaPbqabaGccqGH9aqpcqGGOaakdaaeqbqaaiabdchaWjabcIcaOiabdMgaPjabcYcaSiabdkgaIjabcMcaPiGbcYgaSjabc6gaULqbaoaalaaabaGaemiCaaNaeiikaGIaemyAaKMaeiilaWIaemOyaiMaeiykaKcabaGaemyCaeNaeiikaGIaemOyaiMaeiykaKcaaaWcbaGaemOyaiMaeyicI4Saei4EaSNaemyqaeKaeiilaWIaem4qamKaeiilaWIaem4raCKaeiilaWIaemivaqLaeiyFa0habeqdcqGHris5aOGaeiykaKIaei4la8IaemyyaeMaeiilaWcaaa@58AC@</m:annotation>
                     </m:semantics>
                  </m:math>
               </display-formula>
            </p>
            <p>
               <display-formula id="M3">
                  <m:math name="1471-2164-10-23-i5" xmlns:m="http://www.w3.org/1998/Math/MathML">
                     <m:semantics>
                        <m:mrow>
                           <m:mi>a</m:mi>
                           <m:mo>=</m:mo>
                           <m:mfrac>
                              <m:mrow>
                                 <m:mi>n</m:mi>
                                 <m:mo>+</m:mo>
                                 <m:mn>1</m:mn>
                              </m:mrow>
                              <m:mrow>
                                 <m:mi>n</m:mi>
                                 <m:mo>+</m:mo>
                                 <m:mn>4</m:mn>
                              </m:mrow>
                           </m:mfrac>
                           <m:mi>ln</m:mi>
                           <m:mo>&#8289;</m:mo>
                           <m:mo stretchy="false">(</m:mo>
                           <m:mi>n</m:mi>
                           <m:mo>+</m:mo>
                           <m:mn>1</m:mn>
                           <m:mo stretchy="false">)</m:mo>
                           <m:mo>&#8722;</m:mo>
                           <m:mi>ln</m:mi>
                           <m:mo>&#8289;</m:mo>
                           <m:mo stretchy="false">(</m:mo>
                           <m:mi>n</m:mi>
                           <m:mo>+</m:mo>
                           <m:mn>4</m:mn>
                           <m:mo stretchy="false">)</m:mo>
                           <m:mo>&#8722;</m:mo>
                           <m:mfrac>
                              <m:mn>1</m:mn>
                              <m:mrow>
                                 <m:mi>n</m:mi>
                                 <m:mo>+</m:mo>
                                 <m:mn>4</m:mn>
                              </m:mrow>
                           </m:mfrac>
                           <m:mstyle displaystyle="true">
                              <m:munder>
                                 <m:mo>&#8721;</m:mo>
                                 <m:mrow>
                                    <m:mi>b</m:mi>
                                    <m:mo>&#8712;</m:mo>
                                    <m:mo>{</m:mo>
                                    <m:mi>A</m:mi>
                                    <m:mo>,</m:mo>
                                    <m:mi>C</m:mi>
                                    <m:mo>,</m:mo>
                                    <m:mi>G</m:mi>
                                    <m:mo>,</m:mo>
                                    <m:mi>T</m:mi>
                                    <m:mo>}</m:mo>
                                 </m:mrow>
                              </m:munder>
                              <m:mrow>
                                 <m:mi>ln</m:mi>
                                 <m:mo>&#8289;</m:mo>
                                 <m:mi>q</m:mi>
                                 <m:mo stretchy="false">(</m:mo>
                                 <m:mi>b</m:mi>
                                 <m:mo stretchy="false">)</m:mo>
                                 <m:mo>&#8722;</m:mo>
                              </m:mrow>
                           </m:mstyle>
                           <m:mfrac>
                              <m:mi>n</m:mi>
                              <m:mrow>
                                 <m:mi>n</m:mi>
                                 <m:mo>+</m:mo>
                                 <m:mn>4</m:mn>
                              </m:mrow>
                           </m:mfrac>
                           <m:mi>ln</m:mi>
                           <m:mo>&#8289;</m:mo>
                           <m:munder>
                              <m:mrow>
                                 <m:mi>min</m:mi>
                                 <m:mo>&#8289;</m:mo>
                              </m:mrow>
                              <m:mrow>
                                 <m:mi>b</m:mi>
                                 <m:mo>&#8712;</m:mo>
                                 <m:mo>{</m:mo>
                                 <m:mi>A</m:mi>
                                 <m:mo>,</m:mo>
                                 <m:mi>C</m:mi>
                                 <m:mo>,</m:mo>
                                 <m:mi>G</m:mi>
                                 <m:mo>,</m:mo>
                                 <m:mi>T</m:mi>
                                 <m:mo>}</m:mo>
                              </m:mrow>
                           </m:munder>
                           <m:mi>q</m:mi>
                           <m:mo stretchy="false">(</m:mo>
                           <m:mi>b</m:mi>
                           <m:mo stretchy="false">)</m:mo>
                           <m:mo>,</m:mo>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemyyaeMaeyypa0tcfa4aaSaaaeaacqWGUbGBcqGHRaWkcqaIXaqmaeaacqWGUbGBcqGHRaWkcqaI0aanaaGccyGGSbaBcqGGUbGBcqGGOaakcqWGUbGBcqGHRaWkcqaIXaqmcqGGPaqkcqGHsislcyGGSbaBcqGGUbGBcqGGOaakcqWGUbGBcqGHRaWkcqaI0aancqGGPaqkcqGHsisljuaGdaWcaaqaaiabigdaXaqaaiabd6gaUjabgUcaRiabisda0aaakmaaqafabaGagiiBaWMaeiOBa4MaemyCaeNaeiikaGIaemOyaiMaeiykaKIaeyOeI0caleaacqWGIbGycqGHiiIZcqGG7bWEcqWGbbqqcqGGSaalcqWGdbWqcqGGSaalcqWGhbWrcqGGSaalcqWGubavcqGG9bqFaeqaniabggHiLdqcfa4aaSaaaeaacqWGUbGBaeaacqWGUbGBcqGHRaWkcqaI0aanaaGccyGGSbaBcqGGUbGBdaWfqaqaaiGbc2gaTjabcMgaPjabc6gaUbWcbaGaemOyaiMaeyicI4Saei4EaSNaemyqaeKaeiilaWIaem4qamKaeiilaWIaem4raCKaeiilaWIaemivaqLaeiyFa0habeaakiabdghaXjabcIcaOiabdkgaIjabcMcaPiabcYcaSaaa@8148@</m:annotation>
                     </m:semantics>
                  </m:math>
               </display-formula>
            </p>
            <p>where <it>l </it>is the length of the binding sites of <it>M</it>, <it>h </it>is any substring of <it>t </it>of length <it>l</it>, <it>h</it>(<it>i</it>) is the base at the <it>i</it>-th position of <it>h</it>, <it>p</it>(<it>i, b</it>) is the relative frequency of base <it>b </it>at position <it>i </it>in <it>M</it>, <it>q</it>(<it>b</it>) is the background frequency of base <it>b</it>, and <it>n </it>is the number of sequences used to construct <it>M</it>. When computing <it>p</it>(<it>i, b</it>), a pseudocount of 1 is added to the count of each base at each position, and <it>a </it>is a normalization constant that keeps <it>I</it><sub><it>i </it></sub>within the range [0,1].</p>
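            <p>As an illustration only, the scoring scheme of equations 1 to 3 can be sketched in Python as follows. This is a minimal sketch, not the authors' implementation: the function names (normalizer, build_profile, score) are hypothetical, the profile is assumed to be built from equal-length aligned sites over the alphabet A, C, G, T, and only one strand of the scanned sequence is considered.</p>

```python
import math

BASES = "ACGT"

def normalizer(n, q):
    """Constant a of equation 3 for n training sites and background q."""
    return ((n + 1) / (n + 4) * math.log(n + 1)
            - math.log(n + 4)
            - sum(math.log(q[b]) for b in BASES) / (n + 4)
            - n / (n + 4) * math.log(min(q[b] for b in BASES)))

def build_profile(sites, q):
    """Return per-position frequencies p(i, b) and weights I_i (equation 2)."""
    n, length = len(sites), len(sites[0])
    a = normalizer(n, q)
    freqs, weights = [], []
    for i in range(length):
        column = [site[i] for site in sites]
        # Pseudocount of 1 per base, so each column sums to n + 4.
        p = {b: (column.count(b) + 1) / (n + 4) for b in BASES}
        info = sum(p[b] * math.log(p[b] / q[b]) for b in BASES)
        freqs.append(p)
        weights.append(info / a)
    return freqs, weights

def score(t, freqs, weights, q):
    """s_M(t) of equation 1: best weighted log-ratio over all windows h of t."""
    length = len(freqs)
    best = -math.inf  # returned unchanged if t is shorter than the profile
    for start in range(len(t) - length + 1):
        h = t[start:start + length]
        total = sum(weights[i] * math.log(freqs[i][h[i]] / q[h[i]])
                    for i in range(length))
        best = max(best, total)
    return best
```

            <p>With a uniform background, a fully conserved column receives the weight <it>I</it><sub><it>i </it></sub>= 1, which is the upper bound that the constant <it>a </it>is designed to enforce. Sequences containing characters other than A, C, G and T are not handled by this sketch.</p>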
            <p>When multiple profiles are considered for scanning <it>t</it>, we sum the individual scores <it>s</it><sub><it>M</it></sub>(<it>t</it>), as defined by</p>
            <p>
               <display-formula id="M4">
                  <m:math name="1471-2164-10-23-i6" xmlns:m="http://www.w3.org/1998/Math/MathML">
                     <m:semantics>
                        <m:mrow>
                           <m:msub>
                              <m:mi>s</m:mi>
                              <m:mrow>
                                 <m:msub>
                                    <m:mi>M</m:mi>
                                    <m:mn>1</m:mn>
                                 </m:msub>
                                 <m:mn>...</m:mn>
                                 <m:msub>
                                    <m:mi>M</m:mi>
                                    <m:mi>z</m:mi>
                                 </m:msub>
                              </m:mrow>
                           </m:msub>
                           <m:mo stretchy="false">(</m:mo>
                           <m:mi>t</m:mi>
                           <m:mo stretchy="false">)</m:mo>
                           <m:mo>=</m:mo>
                           <m:mstyle displaystyle="true">
                              <m:munderover>
                                 <m:mo>&#8721;</m:mo>
                                 <m:mrow>
                                    <m:mi>j</m:mi>
                                    <m:mo>=</m:mo>
                                    <m:mn>1</m:mn>
                                 </m:mrow>
                                 <m:mi>z</m:mi>
                              </m:munderover>
                              <m:mrow>
                                 <m:msub>
                                    <m:mi>s</m:mi>
                                    <m:mrow>
                                       <m:msub>
                                          <m:mi>M</m:mi>
                                          <m:mi>j</m:mi>
                                       </m:msub>
                                    </m:mrow>
                                 </m:msub>
                                 <m:mo stretchy="false">(</m:mo>
                                 <m:mi>t</m:mi>
                                 <m:mo stretchy="false">)</m:mo>
                              </m:mrow>
                           </m:mstyle>
                           <m:mo>.</m:mo>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaem4Cam3aaSbaaSqaaiabd2eannaaBaaameaacqaIXaqmaeqaaSGaeiOla4IaeiOla4IaeiOla4Iaemyta00aaSbaaWqaaiabdQha6bqabaaaleqaaOGaeiikaGIaemiDaqNaeiykaKIaeyypa0ZaaabCaeaacqWGZbWCdaWgaaWcbaGaemyta00aaSbaaWqaaiabdQgaQbqabaaaleqaaOGaeiikaGIaemiDaqNaeiykaKcaleaacqWGQbGAcqGH9aqpcqaIXaqmaeaacqWG6bGEa0GaeyyeIuoakiabc6caUaaa@4940@</m:annotation>
                     </m:semantics>
                  </m:math>
               </display-formula>
            </p>
            <p>For this study, we use <it>M</it><sub>1 </sub>for the CRP binding sites and <it>M</it><sub>2 </sub>for the -10-like box (TAN<sub>3</sub>T).</p>
            <p>For the inter-TU sequence <it>t </it>associated with <it>U</it>(<it>g</it><sub>1</sub>, ..., <it>g</it><sub><it>n</it></sub>) in genome <it>G</it>, if <it>g</it><sub><it>i </it></sub>has orthologues in <it>m</it><sub><it>i </it></sub>closely related genomes <inline-formula><m:math name="1471-2164-10-23-i7" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:mi>G</m:mi><m:mn>1</m:mn></m:msub><m:mo>,</m:mo><m:mn>...</m:mn><m:mo>,</m:mo><m:msub><m:mi>G</m:mi><m:mrow><m:msub><m:mi>m</m:mi><m:mi>i</m:mi></m:msub></m:mrow></m:msub></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaem4raC0aaSbaaSqaaiabigdaXaqabaGccqGGSaalcqGGUaGlcqGGUaGlcqGGUaGlcqGGSaalcqWGhbWrdaWgaaWcbaGaemyBa02aaSbaaWqaaiabdMgaPbqabaaaleqaaaaa@36B7@</m:annotation></m:semantics></m:math></inline-formula>, we denote by <it>o</it><sub><it>k</it></sub>(<it>g</it><sub><it>i</it></sub>) the inter-TU sequence upstream of the TU containing the orthologue of <it>g</it><sub><it>i </it></sub>in genome <it>G</it><sub><it>k</it></sub>. When the presence of similar motifs in <it>o</it><sub><it>k</it></sub>(<it>g</it><sub><it>i</it></sub>) is also considered, the score of co-occurrence of the multiple binding sites in <it>t </it>is redefined as</p>
            <p>
               <display-formula id="M5">
                  <m:math name="1471-2164-10-23-i8" xmlns:m="http://www.w3.org/1998/Math/MathML">
                     <m:semantics>
                        <m:mrow>
                           <m:mi>s</m:mi>
                           <m:mo stretchy="false">(</m:mo>
                           <m:mi>t</m:mi>
                           <m:mo stretchy="false">)</m:mo>
                           <m:mo>=</m:mo>
                           <m:msub>
                              <m:mi>s</m:mi>
                              <m:mrow>
                                 <m:msub>
                                    <m:mi>M</m:mi>
                                    <m:mn>1</m:mn>
                                 </m:msub>
                                 <m:mn>...</m:mn>
                                 <m:msub>
                                    <m:mi>M</m:mi>
                                    <m:mi>z</m:mi>
                                 </m:msub>
                              </m:mrow>
                           </m:msub>
                           <m:mo stretchy="false">(</m:mo>
                           <m:mi>t</m:mi>
                           <m:mo stretchy="false">)</m:mo>
                           <m:mo>+</m:mo>
                           <m:munder>
                              <m:mrow>
                                 <m:mi>max</m:mi>
                                 <m:mo>&#8289;</m:mo>
                              </m:mrow>
                              <m:mrow>
                                 <m:mn>1</m:mn>
                                 <m:mo>&lt;</m:mo>
                                 <m:mi>i</m:mi>
                                 <m:mo>&lt;</m:mo>
                                 <m:mi>n</m:mi>
                              </m:mrow>
                           </m:munder>
                           <m:mstyle displaystyle="true">
                              <m:munderover>
                                 <m:mo>&#8721;</m:mo>
                                 <m:mrow>
                                    <m:mi>j</m:mi>
                                    <m:mo>=</m:mo>
                                    <m:mn>1</m:mn>
                                 </m:mrow>
                                 <m:mi>z</m:mi>
                              </m:munderover>
                              <m:mrow>
                                 <m:mstyle displaystyle="true">
                                    <m:munderover>
                                       <m:mo>&#8721;</m:mo>
                                       <m:mrow>
                                          <m:mi>k</m:mi>
                                          <m:mo>=</m:mo>
                                          <m:mn>1</m:mn>
                                       </m:mrow>
                                       <m:mrow>
                                          <m:msub>
                                             <m:mi>m</m:mi>
                                             <m:mi>i</m:mi>
                                          </m:msub>
                                       </m:mrow>
                                    </m:munderover>
                                    <m:mrow>
                                       <m:mfrac>
                                          <m:mrow>
                                             <m:msub>
                                                <m:mi>l</m:mi>
                                                <m:mi>j</m:mi>
                                             </m:msub>
                                             <m:mo>&#8722;</m:mo>
                                             <m:msub>
                                                <m:mi>d</m:mi>
                                                <m:mrow>
                                                   <m:mi>i</m:mi>
                                                   <m:mo>,</m:mo>
                                                   <m:mi>j</m:mi>
                                                   <m:mo>,</m:mo>
                                                   <m:mi>k</m:mi>
                                                </m:mrow>
                                             </m:msub>
                                          </m:mrow>
                                          <m:mrow>
                                             <m:msub>
                                                <m:mi>m</m:mi>
                                                <m:mi>i</m:mi>
                                             </m:msub>
                                             <m:msub>
                                                <m:mi>l</m:mi>
                                                <m:mi>j</m:mi>
                                             </m:msub>
                                          </m:mrow>
                                       </m:mfrac>
                                       <m:msub>
                                          <m:mi>s</m:mi>
                                          <m:mrow>
                                             <m:msub>
                                                <m:mi>M</m:mi>
                                                <m:mi>j</m:mi>
                                             </m:msub>
                                          </m:mrow>
                                       </m:msub>
                                       <m:mo stretchy="false">(</m:mo>
                                       <m:msub>
                                          <m:mi>o</m:mi>
                                          <m:mi>k</m:mi>
                                       </m:msub>
                                       <m:mo stretchy="false">(</m:mo>
                                       <m:msub>
                                          <m:mi>g</m:mi>
                                          <m:mi>i</m:mi>
                                       </m:msub>
                                       <m:mo stretchy="false">)</m:mo>
                                    </m:mrow>
                                 </m:mstyle>
                                 <m:mo stretchy="false">)</m:mo>
                              </m:mrow>
                           </m:mstyle>
                           <m:mo>,</m:mo>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaem4CamNaeiikaGIaemiDaqNaeiykaKIaeyypa0Jaem4Cam3aaSbaaSqaaiabd2eannaaBaaameaacqaIXaqmaeqaaSGaeiOla4IaeiOla4IaeiOla4Iaemyta00aaSbaaWqaaiabdQha6bqabaaaleqaaOGaeiikaGIaemiDaqNaeiykaKIaey4kaSYaaCbeaeaacyGGTbqBcqGGHbqycqGG4baEaSqaaiabigdaXiabgYda8iabdMgaPjabgYda8iabd6gaUbqabaGcdaaeWbqaamaaqahabaqcfa4aaSaaaeaacqWGSbaBdaWgaaqaaiabdQgaQbqabaGaeyOeI0Iaemizaq2aaSbaaeaacqWGPbqAcqGGSaalcqWGQbGAcqGGSaalcqWGRbWAaeqaaaqaaiabd2gaTnaaBaaabaGaemyAaKgabeaacqWGSbaBdaWgaaqaaiabdQgaQbqabaaaaOGaem4Cam3aaSbaaSqaaiabd2eannaaBaaameaacqWGQbGAaeqaaaWcbeaakiabcIcaOiabd+gaVnaaBaaaleaacqWGRbWAaeqaaOGaeiikaGIaem4zaC2aaSbaaSqaaiabdMgaPbqabaGccqGGPaqkaSqaaiabdUgaRjabg2da9iabigdaXaqaaiabd2gaTnaaBaaameaacqWGPbqAaeqaaaqdcqGHris5aOGaeiykaKcaleaacqWGQbGAcqGH9aqpcqaIXaqmaeaacqWG6bGEa0GaeyyeIuoakiabcYcaSaaa@78FD@</m:annotation>
                     </m:semantics>
                  </m:math>
               </display-formula>
            </p>
            <p>where <it>d</it><sub><it>i</it>, <it>j</it>, <it>k </it></sub>is the Hamming distance between the sites found with profile <it>M</it><sub><it>j </it></sub>in <it>t </it>and in <it>o</it><sub><it>k</it></sub>(<it>g</it><sub><it>i</it></sub>), and <it>l</it><sub><it>j </it></sub>is the length of the binding site motif of profile <it>M</it><sub><it>j</it></sub>.</p>
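            <p>The combination of equations 4 and 5 can be sketched as follows. This is a hedged illustration rather than the procedure actually used: each profile is represented by a hypothetical callable that returns the best score and the best site in a sequence, and consensus_scanner is a toy stand-in that simply counts matches to a consensus string instead of applying the profile score of equation 1. All sequences are assumed to be at least as long as the motif.</p>

```python
def hamming(x, y):
    """Number of mismatching positions between two equal-length strings."""
    return sum(1 for a, b in zip(x, y) if a != b)

def consensus_scanner(consensus):
    """Toy stand-in for a profile M_j: score a window by consensus matches."""
    def best_site(seq):
        length = len(consensus)
        windows = [seq[s:s + length] for s in range(len(seq) - length + 1)]
        site = max(windows, key=lambda w: length - hamming(w, consensus))
        return length - hamming(site, consensus), site
    return best_site

def combined_score(t, ortholog_seqs, scanners):
    """s(t) of equation 5.

    ortholog_seqs[i] lists the sequences o_k(g_i) for gene g_i of the TU;
    scanners[j] plays the role of profile M_j.
    """
    own = [scan(t) for scan in scanners]
    base = sum(s for s, _ in own)              # equation 4
    support = []
    for seqs in ortholog_seqs:                 # one entry per gene g_i
        m_i = len(seqs)
        total = 0.0
        for scan, (_, site_t) in zip(scanners, own):
            l_j = len(site_t)
            for o in seqs:
                s_o, site_o = scan(o)
                d = hamming(site_t, site_o)    # d_{i,j,k}
                total += (l_j - d) / (m_i * l_j) * s_o
        support.append(total)
    # Only the best-supported gene of the TU contributes to the score.
    return base + max(support, default=0.0)
```

            <p>In this sketch, a gene without orthologues contributes zero support, and a TU with no orthologue information falls back to the sum of equation 4.</p>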
         </sec>
         <sec>
            <st>
               <p>7. Statistical evaluation of the predicted binding sites</p>
            </st>
            <p>We evaluated the statistical significance of our predictions by comparing the probability of finding a high-scoring binding site in inter-TU sequences with that of finding a site with an equally high score in randomly extracted coding sequences <abbrgrp><abbr bid="B32">32</abbr><abbr bid="B33">33</abbr></abbrgrp>. Let <it>p</it>(<it>S</it><sub><it>t</it></sub> &gt; <it>s</it>) be the probability that an extracted sequence <it>t </it>(<it>t </it>&#8712; <it>I</it><sub><it>U </it></sub>or <it>t </it>&#8712; <it>C</it><sub><it>U</it></sub>) contains a putative binding site with a score (<it>S</it><sub><it>t</it></sub>) larger than <it>s</it>. To avoid possible sampling bias, 300 <it>C</it><sub><it>u</it></sub> sequences were randomly generated for each TU in each genome, and <it>S</it><sub><it>Cu </it></sub>was computed for each <it>C</it><sub><it>u</it></sub>. A log-odds ratio (LOR) function, defined as follows, was then used to estimate the confidence of the prediction:</p>
            <p>
               <display-formula id="M6">
                  <m:math name="1471-2164-10-23-i9" xmlns:m="http://www.w3.org/1998/Math/MathML">
                     <m:semantics>
                        <m:mrow>
                           <m:mi>L</m:mi>
                           <m:mi>O</m:mi>
                           <m:mi>R</m:mi>
                           <m:mo stretchy="false">(</m:mo>
                           <m:mi>s</m:mi>
                           <m:mo stretchy="false">)</m:mo>
                           <m:mo>=</m:mo>
                           <m:mi>ln</m:mi>
                           <m:mo>&#8289;</m:mo>
                           <m:mfrac>
                              <m:mrow>
                                 <m:mi>p</m:mi>
                                 <m:mo stretchy="false">(</m:mo>
                                 <m:msub>
                                    <m:mi>S</m:mi>
                                    <m:mrow>
                                       <m:msub>
                                          <m:mi>I</m:mi>
                                          <m:mi>U</m:mi>
                                       </m:msub>
                                    </m:mrow>
                                 </m:msub>
                                 <m:mo>></m:mo>
                                 <m:mi>s</m:mi>
                                 <m:mo stretchy="false">)</m:mo>
                              </m:mrow>
                              <m:mrow>
                                 <m:mi>p</m:mi>
                                 <m:mo stretchy="false">(</m:mo>
                                 <m:msub>
                                    <m:mi>S</m:mi>
                                    <m:mrow>
                                       <m:msub>
                                          <m:mi>C</m:mi>
                                          <m:mi>U</m:mi>
                                       </m:msub>
                                    </m:mrow>
                                 </m:msub>
                                 <m:mo>></m:mo>
                                 <m:mi>s</m:mi>
                                 <m:mo stretchy="false">)</m:mo>
                              </m:mrow>
                           </m:mfrac>
                           <m:mo>.</m:mo>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemitaWKaem4ta8KaemOuaiLaeiikaGIaem4CamNaeiykaKIaeyypa0JagiiBaWMaeiOBa4wcfa4aaSaaaeaacqWGWbaCcqGGOaakcqWGtbWudaWgaaqaaiabdMeajnaaBaaabaGaemyvaufabeaaaeqaaiabg6da+iabdohaZjabcMcaPaqaaiabdchaWjabcIcaOiabdofatnaaBaaabaGaem4qam0aaSbaaeaacqWGvbqvaeqaaaqabaGaeyOpa4Jaem4CamNaeiykaKcaaiabc6caUaaa@4A9D@</m:annotation>
                     </m:semantics>
                  </m:math>
               </display-formula>
            </p>
            <p>Since <inline-formula><m:math name="1471-2164-10-23-i10" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:mi>p</m:mi><m:mo stretchy="false">(</m:mo><m:msub><m:mi>S</m:mi><m:mrow><m:msub><m:mi>C</m:mi><m:mi>U</m:mi></m:msub></m:mrow></m:msub><m:mo>></m:mo><m:mi>s</m:mi><m:mo stretchy="false">)</m:mo></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemiCaaNaeiikaGIaem4uam1aaSbaaSqaaiabdoeadnaaBaaameaacqWGvbqvaeqaaaWcbeaakiabg6da+iabdohaZjabcMcaPaaa@3546@</m:annotation></m:semantics></m:math></inline-formula> is the probability of type-I error for testing the null hypothesis that <it>I</it><sub><it>u </it></sub>does not contain a binding site when <inline-formula><m:math name="1471-2164-10-23-i11" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:mi>S</m:mi><m:mrow><m:msub><m:mi>I</m:mi><m:mi>U</m:mi></m:msub></m:mrow></m:msub></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaem4uam1aaSbaaSqaaiabdMeajnaaBaaameaacqWGvbqvaeqaaaWcbeaaaaa@2FB6@</m:annotation></m:semantics></m:math></inline-formula> is greater than a cutoff <it>s</it>, we used it to estimate the false positive rate of the prediction results. In this way, <inline-formula><m:math name="1471-2164-10-23-i10" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:mi>p</m:mi><m:mo stretchy="false">(</m:mo><m:msub><m:mi>S</m:mi><m:mrow><m:msub><m:mi>C</m:mi><m:mi>U</m:mi></m:msub></m:mrow></m:msub><m:mo>></m:mo><m:mi>s</m:mi><m:mo stretchy="false">)</m:mo></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemiCaaNaeiikaGIaem4uam1aaSbaaSqaaiabdoeadnaaBaaameaacqWGvbqvaeqaaaWcbeaakiabg6da+iabdohaZjabcMcaPaaa@3546@</m:annotation></m:semantics></m:math></inline-formula> can also be regarded as an empirical <it>p</it>-value, and a cutoff of <it>p </it>&lt; 0.01 was used for the CRP binding site prediction in each genome.</p>
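            <p>As an illustration only, the two empirical probabilities and the LOR defined above could be computed along the lines of the following minimal Python sketch. The score lists are made-up toy values, and the function names (survival, log_odds_ratio) are hypothetical; they are assumptions for this example and are not taken from the software used in this study.</p>
            <p>
```python
import math


def survival(scores, s):
    """Empirical p(S > s): fraction of scores strictly greater than s."""
    return sum(1 for x in scores if x > s) / len(scores)


def log_odds_ratio(inter_tu_scores, coding_scores, s):
    """LOR(s) = ln(p(S_IU > s) / p(S_CU > s)).

    Returns None when either empirical probability is zero,
    because the logarithm of the ratio is then undefined.
    """
    p_iu = survival(inter_tu_scores, s)
    p_cu = survival(coding_scores, s)
    if p_iu == 0 or p_cu == 0:
        return None
    return math.log(p_iu / p_cu)


# Toy, invented scores (assumption): best site score per inter-TU region
# and per coding region used as the background set.
inter_tu_scores = [2.0, 3.0, 5.0, 6.5, 7.0, 7.5, 8.0, 9.0]
coding_scores = [1.0, 2.0, 2.5, 3.0, 3.5, 4.0, 5.5, 7.2]

cutoff = 6.0
empirical_p = survival(coding_scores, cutoff)  # estimated false positive rate
lor = log_odds_ratio(inter_tu_scores, coding_scores, cutoff)
print(empirical_p, lor)
```
            </p>
            <p>With these toy scores and a cutoff of 6.0, five of the eight inter-TU scores and one of the eight coding scores exceed the cutoff, so the empirical <it>p</it>-value is 0.125 and the LOR equals ln 5. In practice the cutoff would be raised until the empirical <it>p</it>-value falls below the chosen threshold.</p>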
         </sec>
      </sec>
      <sec>
         <st>
            <p>Results</p>
         </st>
         <sec>
            <st>
               <p>1. Conservation of the DBDs of the CRP proteins in cyanobacterial genomes</p>
            </st>
            <p>Using the BDBH algorithm and the criterion described in Methods, we identified orthologues of SyCRP1 (sll1371) of PCC6803 in 12 of the 29 sequenced cyanobacterial genomes. Each of these 12 genomes encodes one SyCRP1 homologue, except for the PCC6803 and PCC7120 genomes, which each contain two: sll1371 and sll1924 (SyCRP2) in PCC6803, and alr0295 (AnCRPA) and alr2325 (AnCRPB) in PCC7120. These homologues were identified by a unidirectional BLASTP search using SyCRP1 of PCC6803 as the query and an E-value cutoff of 10<sup>-20</sup>. The phylogenetic tree of these 12 CRP orthologues shows that they fall into two distinct groups (Figure <figr fid="F1">1a</figr>). There are subtle but clear differences between the DBDs of the two groups. Nonetheless, the DBD residues that directly contact the DNA, as revealed by the crystal structure of the <it>E. coli </it>CRP/DNA complex <abbrgrp><abbr bid="B39">39</abbr></abbrgrp>, are highly conserved in all these cyanobacterial CRP sequences, suggesting that they might bind to similar DNA sequences (Figure <figr fid="F1">1b</figr>).</p>
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>(a) Phylogenetic relationships of CRPs from 12 cyanobacterial genomes</p>
               </caption>
               <text>
                  <p><b>(a) Phylogenetic relationships of CRPs from 12 cyanobacterial genomes</b>. The tree is rooted with the CRP of <it>E. coli </it>K12, and bootstrap values are shown on the nodes. <b>(b) </b>Multiple sequence alignment of the DNA binding domains of the 12 cyanobacterial CRPs. Each DNA binding domain contains a helix-turn-helix motif in which residues 1&#8211;9 form the first helix, residues 10&#8211;12 form the turn and residues 13&#8211;25 form the second helix. In particular, residues 13&#8211;21 form the DNA recognition helix, with residues 13, 14 and 18 (indicated by solid dots) in direct contact with DNA through hydrogen bonds <abbrgrp><abbr bid="B39">39</abbr></abbrgrp>. <b>(c) </b>Logo representation of the profile of the 112 putative CRP binding sites predicted by the phylogenetic footprinting technique. The logo was generated by the WebLogo server <abbrgrp><abbr bid="B49">49</abbr></abbrgrp>.</p>
               </text>
               <graphic file="1471-2164-10-23-1"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>2. Profiles of the CRP binding sites predicted by phylogenetic footprinting</p>
            </st>
            <p>When we searched the 18 genes located in the five TUs (Table <tblr tid="T1">1</tblr>) in PCC6803 that are likely to be regulated by SyCRP1 against the 12 cyanobacterial genomes that encode <it>crp </it>genes, we found a total of 30 orthologues in five genomes. These 30 orthologous genes are located in a total of 10 TUs in these five genomes, indicating that these TUs are not well conserved in these 12 genomes. Using the phylogenetic footprinting technique (see Methods), we predicted a total of eight putative CRP binding sites from the 10 upstream inter-TU regions, suggesting that the CRP binding sites are largely shared by these orthologous genes. To increase the representation of the profile of the CRP binding sites and to minimize the possible bias of our original choice of the five TUs, we performed a one-round iteration of putative CRP binding site scanning using the preliminary profile constructed from these eight putative CRP binding sites (see Methods). From these preliminary whole-genome scanning results, we selected a total of 112 putative CRP binding sites (Table S1 in Additional file <supplr sid="S2">2</supplr>) with high scores to construct the final profile of CRP binding sites. These sites display a strong pseudo-palindromic structure with the consensus TGTGAN<sub>6</sub>TCACA (Figure <figr fid="F1">1c</figr>), which is similar to the canonical CRP binding sites in <it>E. coli</it>, suggesting that the pattern of CRP binding sites is well conserved between cyanobacteria and <it>E. coli</it>. This result is also in agreement with the observation that the binding sites of members of the CRP/FNR superfamily maintain a high level of conservation across different lineages <abbrgrp><abbr bid="B40">40</abbr></abbrgrp>.</p>
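            <p>The palindromic character of the consensus can be checked with a short sketch: the left half-site TGTGA is the reverse complement of the right half-site TCACA. The helper names below are hypothetical, and the second sequence is an invented site used only to show an imperfect (pseudo-palindromic) case; neither comes from the predicted sites in this study.</p>
            <p>
```python
# Watson-Crick complements; N (any base) pairs with N.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T, N."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def half_sites_match(site, half_len=5):
    """True when the left half-site equals the reverse complement of the right one."""
    return site[:half_len] == reverse_complement(site[-half_len:])


# Consensus from the text: TGTGA, six unspecified bases, TCACA.
consensus = "TGTGA" + "N" * 6 + "TCACA"

# Invented example (assumption) with one substitution in the right half-site.
imperfect_site = "TGTGATCTAGATCGCA"

print(reverse_complement(consensus) == consensus)
print(half_sites_match(consensus), half_sites_match(imperfect_site))
```
            </p>
            <p>The consensus is its own reverse complement, whereas a site with a substitution in one half-site is not, which is the sense in which real sites are described as pseudo-palindromic.</p>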
            <suppl id="S2">
               <title>
                  <p>Additional File 2</p>
               </title>
               <text>
                  <p><b>Table S1</b>. Putative CRP binding sites used to construct the profile of CRP binding sites. <b>Table S2</b>. Most conserved putative CRP-regulated genes/TUs in 12 cyanobacterial genomes. <b>Tables S3&#8211;S14</b>. Predicted CRP binding sites in 12 cyanobacterial genomes at <it>p</it>-value &lt; 0.01.</p>
               </text>
               <file name="1471-2164-10-23-S2.doc">
                  <p>Click here for file</p>
               </file>
            </suppl>
         </sec>
         <sec>
            <st>
               <p>3. Genome-wide prediction of CRP binding sites in the 12 cyanobacterial genomes</p>
            </st>
            <p>We then applied our motif scanning algorithm to predict putative CRP binding sites in the 12 cyanobacterial genomes that encode a <it>crp </it>gene, using the profile of the 112 putative CRP binding sites as well as that of the previously prepared -10-like box from cyanobacteria <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>. In every genome, the log-odds ratio (LOR) is high when the score <it>s </it>is high (Figure <figr fid="F2">2</figr>), suggesting that these genomes are likely to contain functional CRP binding sites. The predictions in each of the 12 genomes at <it>p</it>-value &lt; 0.01 are listed in Tables S3-S14 in Additional file <supplr sid="S2">2</supplr>, and all the prediction results are available upon request. In general, all these predicted CRP promoters contain a pseudo-palindromic CRP binding site, and most of them also contain a downstream TA-rich &#963;-factor binding site; they are therefore likely to be true CRP promoters. As shown in Table S12 (see Additional file <supplr sid="S2">2</supplr>), the five CRP binding sites associated with the five TUs (shown in bold) in PCC6803, which we selected as the starting point of our genome-wide prediction, were not necessarily ranked very high (they were ranked 4th, 12th, 14th, 17th, and 43rd) among the 59 predicted CRP binding sites in that genome, suggesting that our initial choice of the five TUs did not bias our prediction toward them. Table <tblr tid="T2">2</tblr> summarizes the predictions in the 12 cyanobacterial genomes, and the most prevalent genes of predicted CRP regulons are listed in Table S2 in Additional file <supplr sid="S2">2</supplr>.</p>
            <tbl id="T2">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p>Summary of the genome-wide CRP binding site predictions.</p>
               </caption>
               <tblbdy cols="9">
                  <r>
                     <c ca="left">
                        <p>Genome</p>
                     </c>
                     <c ca="center">
                        <p>No. of TUs</p>
                     </c>
                     <c ca="center">
                        <p>% genes shared with <it>E. coli </it>K12<sup>1</sup></p>
                     </c>
                     <c ca="center">
                        <p>Score at p &lt; 0.05</p>
                     </c>
                     <c ca="center">
                        <p>LOR at p &lt; 0.05</p>
                     </c>
                     <c ca="center">
                        <p>No. of sites predicted at p &lt; 0.05</p>
                     </c>
                     <c ca="center">
                        <p>Score at p &lt; 0.01</p>
                     </c>
                     <c ca="center">
                        <p>LOR at p &lt; 0.01</p>
                     </c>
                     <c ca="center">
                        <p>No. of sites predicted at p &lt; 0.01</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="9">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>MBIC11017</p>
                     </c>
                     <c ca="center">
                        <p>3406</p>
                     </c>
                     <c ca="center">
                        <p>16.18</p>
                     </c>
                     <c ca="center">
                        <p>6.42</p>
                     </c>
                     <c ca="center">
                        <p>0.96</p>
                     </c>
                     <c ca="center">
                        <p>445</p>
                     </c>
                     <c ca="center">
                        <p>6.85</p>
                     </c>
                     <c ca="center">
                        <p>1.60</p>
                     </c>
                     <c ca="center">
                        <p>174</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>ATCC29413</p>
                     </c>
                     <c ca="center">
                        <p>3278</p>
                     </c>
                     <c ca="center">
                        <p>19.98</p>
                     </c>
                     <c ca="center">
                        <p>6.95</p>
                     </c>
                     <c ca="center">
                        <p>1.14</p>
                     </c>
                     <c ca="center">
                        <p>521</p>
                     </c>
                     <c ca="center">
                        <p>7.41</p>
                     </c>
                     <c ca="center">
                        <p>1.85</p>
                     </c>
                     <c ca="center">
                        <p>211</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>A-prime</p>
                     </c>
                     <c ca="center">
                        <p>1429</p>
                     </c>
                     <c ca="center">
                        <p>29.08</p>
                     </c>
                     <c ca="center">
                        <p>6.03</p>
                     </c>
                     <c ca="center">
                        <p>0.32</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                     <c ca="center">
                        <p>6.4</p>
                     </c>
                     <c ca="center">
                        <p>0.98</p>
                     </c>
                     <c ca="center">
                        <p>41</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>B-prime</p>
                     </c>
                     <c ca="center">
                        <p>1495</p>
                     </c>
                     <c ca="center">
                        <p>29.20</p>
                     </c>
                     <c ca="center">
                        <p>6.1</p>
                     </c>
                     <c ca="center">
                        <p>0.37</p>
                     </c>
                     <c ca="center">
                        <p>109</p>
                     </c>
                     <c ca="center">
                        <p>6.51</p>
                     </c>
                     <c ca="center">
                        <p>1.08</p>
                     </c>
                     <c ca="center">
                        <p>44</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>PCC7120</p>
                     </c>
                     <c ca="center">
                        <p>3300</p>
                     </c>
                     <c ca="center">
                        <p>18.56</p>
                     </c>
                     <c ca="center">
                        <p>6.89</p>
                     </c>
                     <c ca="center">
                        <p>1.17</p>
                     </c>
                     <c ca="center">
                        <p>537</p>
                     </c>
                     <c ca="center">
                        <p>7.34</p>
                     </c>
                     <c ca="center">
                        <p>2.01</p>
                     </c>
                     <c ca="center">
                        <p>249</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>MIT9313</p>
                     </c>
                     <c ca="center">
                        <p>1340</p>
                     </c>
                     <c ca="center">
                        <p>32.19</p>
                     </c>
                     <c ca="center">
                        <p>6.47</p>
                     </c>
                     <c ca="center">
                        <p>0.63</p>
                     </c>
                     <c ca="center">
                        <p>124</p>
                     </c>
                     <c ca="center">
                        <p>6.83</p>
                     </c>
                     <c ca="center">
                        <p>1.55</p>
                     </c>
                     <c ca="center">
                        <p>63</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>MIT9303</p>
                     </c>
                     <c ca="center">
                        <p>1786</p>
                     </c>
                     <c ca="center">
                        <p>24.62</p>
                     </c>
                     <c ca="center">
                        <p>6.34</p>
                     </c>
                     <c ca="center">
                        <p>0.69</p>
                     </c>
                     <c ca="center">
                        <p>186</p>
                     </c>
                     <c ca="center">
                        <p>6.81</p>
                     </c>
                     <c ca="center">
                        <p>1.10</p>
                     </c>
                     <c ca="center">
                        <p>55</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>CC9311</p>
                     </c>
                     <c ca="center">
                        <p>1656</p>
                     </c>
                     <c ca="center">
                        <p>26.89</p>
                     </c>
                     <c ca="center">
                        <p>6.22</p>
                     </c>
                     <c ca="center">
                        <p>0.43</p>
                     </c>
                     <c ca="center">
                        <p>130</p>
                     </c>
                     <c ca="center">
                        <p>6.65</p>
                     </c>
                     <c ca="center">
                        <p>0.66</p>
                     </c>
                     <c ca="center">
                        <p>34</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>CC9605</p>
                     </c>
                     <c ca="center">
                        <p>1391</p>
                     </c>
                     <c ca="center">
                        <p>28.42</p>
                     </c>
                     <c ca="center">
                        <p>5.93</p>
                     </c>
                     <c ca="center">
                        <p>0.67</p>
                     </c>
                     <c ca="center">
                        <p>237</p>
                     </c>
                     <c ca="center">
                        <p>6.37</p>
                     </c>
                     <c ca="center">
                        <p>1.31</p>
                     </c>
                     <c ca="center">
                        <p>90</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>PCC6803</p>
                     </c>
                     <c ca="center">
                        <p>1622</p>
                     </c>
                     <c ca="center">
                        <p>27.14</p>
                     </c>
                     <c ca="center">
                        <p>6.34</p>
                     </c>
                     <c ca="center">
                        <p>0.76</p>
                     </c>
                     <c ca="center">
                        <p>181</p>
                     </c>
                     <c ca="center">
                        <p>6.82</p>
                     </c>
                     <c ca="center">
                        <p>1.19</p>
                     </c>
                     <c ca="center">
                        <p>59</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>IMS101</p>
                     </c>
                     <c ca="center">
                        <p>3253</p>
                     </c>
                     <c ca="center">
                        <p>19.34</p>
                     </c>
                     <c ca="center">
                        <p>7.13</p>
                     </c>
                     <c ca="center">
                        <p>1.01</p>
                     </c>
                     <c ca="center">
                        <p>451</p>
                     </c>
                     <c ca="center">
                        <p>7.68</p>
                     </c>
                     <c ca="center">
                        <p>1.40</p>
                     </c>
                     <c ca="center">
                        <p>132</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>BP-1</p>
                     </c>
                     <c ca="center">
                        <p>1075</p>
                     </c>
                     <c ca="center">
                        <p>31.41</p>
                     </c>
                     <c ca="center">
                        <p>6.16</p>
                     </c>
                     <c ca="center">
                        <p>0.62</p>
                     </c>
                     <c ca="center">
                        <p>101</p>
                     </c>
                     <c ca="center">
                        <p>6.68</p>
                     </c>
                     <c ca="center">
                        <p>0.93</p>
                     </c>
                     <c ca="center">
                        <p>29</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>1. Calculated as the number of genes that have an orthologue in the <it>E. coli </it>K12 genome normalized to the total number of genes in that genome.</p>
               </tblfn>
            </tbl>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>Evaluation of the genome-wide prediction of CRP regulons in 12 cyanobacterial genomes</p>
               </caption>
               <text>
                  <p><b>Evaluation of the genome-wide prediction of CRP regulons in 12 cyanobacterial genomes</b>. The green curves represent the probability <inline-formula><m:math name="1471-2164-10-23-i12" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:mi>p</m:mi><m:mo stretchy="false">(</m:mo><m:msub><m:mi>S</m:mi><m:mrow><m:msub><m:mi>I</m:mi><m:mi>U</m:mi></m:msub></m:mrow></m:msub><m:mo>></m:mo><m:mi>s</m:mi><m:mo stretchy="false">)</m:mo></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemiCaaNaeiikaGIaem4uam1aaSbaaSqaaiabdMeajnaaBaaameaacqWGvbqvaeqaaaWcbeaakiabg6da+iabdohaZjabcMcaPaaa@3552@</m:annotation></m:semantics></m:math></inline-formula> and the blue ones represent <inline-formula><m:math name="1471-2164-10-23-i10" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:mi>p</m:mi><m:mo stretchy="false">(</m:mo><m:msub><m:mi>S</m:mi><m:mrow><m:msub><m:mi>C</m:mi><m:mi>U</m:mi></m:msub></m:mrow></m:msub><m:mo>></m:mo><m:mi>s</m:mi><m:mo stretchy="false">)</m:mo></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemiCaaNaeiikaGIaem4uam1aaSbaaSqaaiabdoeadnaaBaaameaacqWGvbqvaeqaaaWcbeaakiabg6da+iabdohaZjabcMcaPaaa@3546@</m:annotation></m:semantics></m:math></inline-formula>. The red curves are the log-odds ratio (LOR), defined as <inline-formula><m:math name="1471-2164-10-23-i13" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:mi>L</m:mi><m:mi>O</m:mi><m:mi>R</m:mi><m:mo stretchy="false">(</m:mo><m:mi>s</m:mi><m:mo stretchy="false">)</m:mo><m:mo>=</m:mo><m:mi>ln</m:mi><m:mo>&#8289;</m:mo><m:mo stretchy="false">(</m:mo><m:mi>p</m:mi><m:mo stretchy="false">(</m:mo><m:msub><m:mi>S</m:mi><m:mrow><m:msub><m:mi>I</m:mi><m:mi>U</m:mi></m:msub></m:mrow></m:msub><m:mo>></m:mo><m:mi>s</m:mi><m:mo stretchy="false">)</m:mo><m:mo>/</m:mo><m:mi>p</m:mi><m:mo stretchy="false">(</m:mo><m:msub><m:mi>S</m:mi><m:mrow><m:msub><m:mi>C</m:mi><m:mi>U</m:mi></m:msub></m:mrow></m:msub><m:mo>></m:mo><m:mi>s</m:mi><m:mo stretchy="false">)</m:mo><m:mo stretchy="false">)</m:mo></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemitaWKaem4ta8KaemOuaiLaeiikaGIaem4CamNaeiykaKIaeyypa0JagiiBaWMaeiOBa4MaeiikaGIaemiCaaNaeiikaGIaem4uam1aaSbaaSqaaiabdMeajnaaBaaameaacqWGvbqvaeqaaaWcbeaakiabg6da+iabdohaZjabcMcaPiabc+caViabdchaWjabcIcaOiabdofatnaaBaaaleaacqWGdbWqdaWgaaadbaGaemyvaufabeaaaSqabaGccqGH+aGpcqWGZbWCcqGGPaqkcqGGPaqkaaa@4BBD@</m:annotation></m:semantics></m:math></inline-formula> (see Methods).</p>
               </text>
               <graphic file="1471-2164-10-23-2"/>
            </fig>
            <p>It has been estimated that CRP controls the expression of more than 200 TUs in <it>E. coli </it><abbrgrp><abbr bid="B2">2</abbr></abbrgrp>, and RegulonDB (release 5.8) contains 288 experimentally verified CRP binding sites. By comparison, the number of predicted CRP binding sites at <it>p</it>-value &lt; 0.01 in each cyanobacterial genome is relatively small, ranging from 29 in BP-1 (containing 1075 putative TUs) to 249 in PCC7120 (containing 3300 putative TUs). This is notable given that the <it>E. coli </it>K12 genome encodes a total of 2070 putative TUs (predicted by the algorithm described in Methods), and that some of these cyanobacterial genomes contain many more TUs/genes (Table <tblr tid="T2">2</tblr>) than the <it>E. coli </it>K12 genome does. This result suggests that cyanobacterial CRPs might regulate fewer genes than the <it>E. coli </it>CRP does. We then asked whether the target genes of CRPs are conserved between <it>E. coli </it>and cyanobacteria, as well as among the 12 cyanobacterial genomes, given that the DBDs of CRPs as well as their binding sites are highly conserved.</p>
            <p>Although these 12 cyanobacterial genomes share 13.88&#8211;32.19% of their genes with <it>E. coli </it>(Table <tblr tid="T2">2</tblr>), their CRP regulons have almost no genes in common with that of <it>E. coli </it>K12 (Tables <tblr tid="T3">3</tblr>, and S3-S14 in Additional file <supplr sid="S2">2</supplr>), suggesting that the CRPs in cyanobacteria have adapted during the course of evolution to regulate very different sets of genes from those in <it>E. coli</it>. On the other hand, the majority of the CRP targets in these cyanobacterial genomes are not conserved either; rather, only a small portion of the CRP targets are shared by more than 2 of the 12 cyanobacterial genomes (Figure <figr fid="F3">3</figr>), even though our motif scanning algorithm might be biased toward the genes that have orthologues in reference genomes (see Methods). In the extreme cases, many CRP targets are species or lineage specific, suggesting that the targets of CRP have changed very rapidly since the speciation of these cyanobacteria. For example, only 32 out of the 149 putative CRP target genes in the PCC6803 genome are conserved in at least 2 of the 12 cyanobacterial genomes. The only exception is the PCC7120 genome, in which the number is 218 out of 442. However, this is largely because the reference genomes include a closely related species, ATCC29413. Moreover, the most conserved putative CRP-regulated genes are shared by only five genomes (Figure <figr fid="F3">3</figr>). In addition, 17 of the 29 sequenced cyanobacterial genomes do not encode the <it>crp </it>gene, suggesting that CRP is not required for the life of these 17 organisms; alternatively, the function of CRP in these organisms may have been replaced by another TF.</p>
            <tbl id="T3">
               <title>
                  <p>Table 3</p>
               </title>
               <caption>
                  <p>The CRP regulons are not conserved between PCC6803, PCC7120 and <it>E. coli </it>K12.</p>
               </caption>
               <tblbdy cols="4">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p><it>E. coli </it>K12</p>
                     </c>
                     <c ca="center">
                        <p>PCC6803</p>
                     </c>
                     <c ca="center">
                        <p>PCC7120</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="4">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>No. of genes</p>
                     </c>
                     <c ca="center">
                        <p>4133</p>
                     </c>
                     <c ca="center">
                        <p>3172</p>
                     </c>
                     <c ca="center">
                        <p>5366</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>No. of TUs</p>
                     </c>
                     <c ca="center">
                        <p>2070</p>
                     </c>
                     <c ca="center">
                        <p>1622</p>
                     </c>
                     <c ca="center">
                        <p>3300</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>No. of CRP-regulated genes</p>
                     </c>
                     <c ca="center">
                        <p>410</p>
                     </c>
                     <c ca="center">
                        <p>382 (p &lt; 0.05), 149 (p &lt; 0.01)</p>
                     </c>
                     <c ca="center">
                        <p>969 (p &lt; 0.05), 442 (p &lt; 0.01)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>No. of CRP-regulated TUs</p>
                     </c>
                     <c ca="center">
                        <p>270</p>
                     </c>
                     <c ca="center">
                        <p>181 (p &lt; 0.05), 59 (p &lt; 0.01)</p>
                     </c>
                     <c ca="center">
                        <p>537 (p &lt; 0.05), 249 (p &lt; 0.01)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>No. of CRP-regulated genes shared with <it>E. coli </it>K12</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                     <c ca="center">
                        <p>7 (p &lt; 0.05), 2 (p &lt; 0.01)</p>
                     </c>
                     <c ca="center">
                        <p>17 (p &lt; 0.05), 6 (p &lt; 0.01)</p>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>Distribution of shared putative CRP-regulated genes in sequenced cyanobacterial genomes</p>
               </caption>
               <text>
                  <p><b>Distribution of shared putative CRP-regulated genes in sequenced cyanobacterial genomes</b>. Only two representative examples, from PCC6803 and PCC7120, are shown for clarity. The horizontal axis is the number of genomes that share putative CRP-regulated genes with PCC6803 or PCC7120, and the vertical axis is the number of shared genes. Of the 149 predicted CRP-regulated genes (at <it>p </it>&lt; 0.01) in PCC6803, 117 (78.5%) are species specific and the remaining 32 (21.5%) are shared with other genomes; of the 442 predicted CRP-regulated genes (at <it>p </it>&lt; 0.01) in PCC7120, 224 (50.7%) are species specific and the remaining 218 (49.3%) are shared with other genomes, most of which (191) are shared with the closely related ATCC29413 genome.</p>
               </text>
               <graphic file="1471-2164-10-23-3"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>4. Functional classification of CRP regulons in cyanobacteria</p>
            </st>
            <p>Based on the functional annotations of the CRP target genes that we have predicted in this study, CRP seems to be involved in a rather diverse spectrum of functions in cyanobacteria (Table <tblr tid="T4">4</tblr>) as summarized below.</p>
            <tbl id="T4">
               <title>
                  <p>Table 4</p>
               </title>
               <caption>
                  <p>Putative CRP-regulated genes involved in different biological processes.</p>
               </caption>
               <tblbdy cols="7">
                  <r>
                     <c ca="center">
                        <p>Genome</p>
                     </c>
                     <c ca="center">
                        <p>Photosynthesis and carbon fixation</p>
                     </c>
                     <c ca="center">
                        <p>Carbon metabolism</p>
                     </c>
                     <c ca="center">
                        <p>Nitrogen assimilation</p>
                     </c>
                     <c ca="center">
                        <p>Transporters/porins</p>
                     </c>
                     <c ca="center">
                        <p>Kinases</p>
                     </c>
                     <c ca="center">
                        <p>Transcription factors</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>MBIC11017</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>AM1_1272</it>
                        </p>
                        <p>
                           <it>AM1_1560</it>
                        </p>
                        <p>
                           <it>AM1_0526</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>AM1_2193</it>
                        </p>
                        <p>
                           <it>AM1_2114</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>AM1_2462</it>
                        </p>
                        <p>
                           <it>AM1_5490</it>
                        </p>
                        <p>
                           <it>AM1_0481</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>AM1_1165</it>
                        </p>
                        <p>
                           <it>AM1_6038</it>
                        </p>
                        <p>
                           <it>AM1_0773</it>
                        </p>
                        <p>
                           <it>AM1_2986</it>
                        </p>
                        <p>
                           <it>AM1_3534</it>
                        </p>
                        <p>
                           <it>AM1_4901</it>
                        </p>
                        <p>
                           <it>AM1_2335</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>AM1_5208</it>
                        </p>
                        <p>
                           <it>AM1_5107</it>
                        </p>
                        <p>
                           <it>AM1_2792</it>
                        </p>
                        <p>
                           <it>AM1_5169</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>AM1_3844</it>
                        </p>
                        <p>
                           <it>AM1_4185</it>
                        </p>
                        <p>
                           <it>AM1_5140</it>
                        </p>
                        <p>
                           <it>AM1_0632</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>ATCC29413</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>Ava_4451</it>
                        </p>
                        <p>
                           <it>Ava_3710</it>
                        </p>
                        <p>
                           <it>Ava_0640</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>Ava_1491</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>Ava_4669</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>Ava_0687</it>
                        </p>
                        <p>
                           <it>Ava_4995</it>
                        </p>
                        <p>
                           <it>Ava_1172</it>
                        </p>
                        <p>
                           <it>Ava_0874</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>Ava_4457</it>
                        </p>
                        <p>
                           <it>Ava_0873</it>
                        </p>
                        <p>
                           <it>Ava_0613</it>
                        </p>
                        <p>
                           <it>Ava_4753</it>
                        </p>
                        <p>
                           <it>Ava_1559</it>
                        </p>
                        <p>
                           <it>Ava_3542</it>
                        </p>
                        <p>
                           <it>Ava_1149</it>
                        </p>
                        <p>
                           <it>Ava_3207</it>
                        </p>
                        <p>
                           <it>Ava_3867</it>
                        </p>
                        <p>
                           <it>Ava_4503</it>
                        </p>
                        <p>
                           <it>Ava_3009</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>Ava_3703</it>
                        </p>
                        <p>
                           <it>Ava_1629</it>
                        </p>
                        <p>
                           <it>Ava_2629</it>
                        </p>
                        <p>
                           <it>Ava_1558</it>
                        </p>
                        <p>
                           <it>Ava_1021</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>A-prime</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>CYA_0295</it>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>CYA_2315</it>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>B-prime</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>CYB_2824</it>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>CYB_0465</it>
                        </p>
                        <p>
                           <it>CYB_2795</it>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>PCC7120</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>asr0847</it>
                        </p>
                        <p>
                           <it>alr0317 alr0318</it>
                        </p>
                        <p>
                           <it>alr0523 alr0524 alr0525</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>alr0169</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>alr0874</it>
                        </p>
                        <p>
                           <it>all3335 all3334 all3333 all3332</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>alr2210 alr2211 alr2212 alr2213</it>
                        </p>
                        <p>
                           <it>alr2118 alr2119 asr2220</it>
                        </p>
                        <p>
                           <it>all3335</it>
                        </p>
                        <p>
                           <it>alr3938</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>all0853</it>
                        </p>
                        <p>
                           <it>alr3037</it>
                        </p>
                        <p>
                           <it>alr1192</it>
                        </p>
                        <p>
                           <it>all1191</it>
                        </p>
                        <p>
                           <it>all3207</it>
                        </p>
                        <p>
                           <it>alr0428</it>
                        </p>
                        <p>
                           <it>all4668</it>
                        </p>
                        <p>
                           <it>alr1665</it>
                        </p>
                        <p>
                           <it>alr3268</it>
                        </p>
                        <p>
                           <it>alr2137</it>
                        </p>
                        <p>
                           <it>all0323</it>
                        </p>
                        <p>
                           <it>all3767</it>
                        </p>
                        <p>
                           <it>all2883</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>alr1044</it>
                        </p>
                        <p>
                           <it>all0187</it>
                        </p>
                        <p>
                           <it>all2962</it>
                        </p>
                        <p>
                           <it>alr1941</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>MIT9313</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>PMT1665</it>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>PMT2110</it>
                        </p>
                        <p>
                           <it>PMT1524</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>PMT0265</it>
                        </p>
                        <p>
                           <it>PMT0845</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>PMT0986</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>MIT9303</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>P9303_22121</it>
                        </p>
                        <p>
                           <it>P9303_22711</it>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>P9303_04171</it>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>P9303_17661</it>
                        </p>
                        <p>
                           <it>P9303_11401</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>CC9311</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>sync_0219</it>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>CC9605</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>Syncc9605_1640</it>
                        </p>
                        <p>
                           <it>Syncc9605_0485</it>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>Syncc9605_1575</it>
                        </p>
                        <p>
                           <it>Syncc9605_2642</it>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>Syncc9605_2284</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>PCC6803</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>sll1577 sll1578 sll1579 sll1580 ssl3093</it>
                        </p>
                        <p>
                           <it>sll0634</it>
                        </p>
                        <p>
                           <it>slr1739</it>
                        </p>
                        <p>
                           <it>sll1874 sll1875</it>
                        </p>
                        <p>
                           <it>slr1459</it>
                        </p>
                        <p>
                           <it>slr1838 slr1839</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>sll1709</it>
                        </p>
                        <p>
                           <it>slr2082 slr2083</it>
                        </p>
                        <p>
                           <it>ssl2153</it>
                        </p>
                        <p>
                           <it>sll0084</it>
                        </p>
                        <p>
                           <it>slr1349</it>
                        </p>
                        <p>
                           <it>slr1350</it>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>slr1392</it>
                        </p>
                        <p>
                           <it>sll0537</it>
                        </p>
                        <p>
                           <it>sll0240</it>
                        </p>
                        <p>
                           <it>slr1452 slr1453 slr1454 slr1455 slr1457</it>
                        </p>
                        <p>
                           <it>slr1488</it>
                        </p>
                        <p>
                           <it>sll0273</it>
                        </p>
                        <p>
                           <it>slr1950</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>slr1400</it>
                        </p>
                        <p>
                           <it>slr1805</it>
                        </p>
                        <p>
                           <it>slr0484</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>sll1708</it>
                        </p>
                        <p>
                           <it>sll1371</it>
                        </p>
                        <p>
                           <it>slr1489</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>IMS101</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>Tery_4669</it>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>Tery_2879</it>
                        </p>
                        <p>
                           <it>Tery_1324</it>
                        </p>
                        <p>
                           <it>Tery_4986</it>
                        </p>
                        <p>
                           <it>Tery_3858</it>
                        </p>
                        <p>
                           <it>Tery_0199</it>
                        </p>
                        <p>
                           <it>Tery_2779</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>Tery_1627</it>
                        </p>
                        <p>
                           <it>Tery_3423</it>
                        </p>
                        <p>
                           <it>Tery_2051</it>
                        </p>
                        <p>
                           <it>Tery_4912</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>Tery_1557</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>BP-1</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>tsr0033</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>tlr1171</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>tll0330</it>
                        </p>
                        <p>
                           <it>tlr2000 tlr2001 tlr2002 tlr2003</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>tll0559</it>
                        </p>
                        <p>
                           <it>tlr0335</it>
                        </p>
                        <p>
                           <it>tlr2000 tlr2001 tlr2002 tlr2003</it>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <it>tll0328</it>
                        </p>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
            <sec>
               <st>
                  <p>4.1. Photosynthesis and carbon fixation</p>
               </st>
               <p>Various numbers of genes involved in photosynthesis and carbon fixation were predicted to bear a CRP-regulated promoter in the 12 cyanobacterial genomes that encode <it>crp </it>genes. Specifically, a total of 14 (at <it>p </it>&lt; 0.01) photosystem I and II reaction center genes were predicted to be regulated by CRP, including <it>AM1_0526, asr0847, Ava_4451, Ava_0640, asr0847, CYA_0295, CYB_2824, PMT1665, P9303_22121, P9303_22711, Syncc9605_1640, sll0634, slr1739</it>, and <it>Tery_4669</it>. Several <it>ccmK </it>genes, which encode carbon dioxide concentrating mechanism proteins, including <it>alr0317-0318 </it>and <it>slr1838-slr1839</it>, were also predicted to bear a CRP binding site. Moreover, putative CRP-regulated promoters were found for the phycobiliprotein family light-harvesting genes, including <it>alr0523-0525, sll1577-1580, ssl3093, slr1459 </it>and <it>tsr0033</it>. Consistent with these results, it has been previously shown that cellular cAMP levels change significantly in response to environmental stimuli such as the light-dark cycle <abbrgrp><abbr bid="B41">41</abbr></abbrgrp>. Thus, cyanobacterial CRPs are likely to participate in the regulation of photosynthesis pathways.</p>
            </sec>
            <sec>
               <st>
                  <p>4.2. Nitrogen assimilation</p>
               </st>
               <p>Three nitrogenase-related proteins, AM1_2462, Ava_4669 and alr0874 in MBIC11017, ATCC29413, and PCC7120, respectively, were predicted to be CRP-regulated. Furthermore, an operon encoding a nitrate transporter (<it>all3332-3335</it>) in PCC7120 was predicted to bear a CRP binding site. It has been previously reported that nitrogen starvation results in a 3&#8211;4-fold increase in the intracellular cAMP level in <it>Anabaena variabilis </it><abbrgrp><abbr bid="B42">42</abbr><abbr bid="B43">43</abbr></abbrgrp>. Based on our predictions, a possible scenario for the role of CRP in the signaling pathway of nitrogen assimilation is as follows. Nitrogen starvation somehow increases adenylyl cyclase activity, leading to an increase in the intracellular cAMP level. The activation of CRP by cAMP then lowers the transcription level of genes such as those encoding the nitrate transporter (<it>all3332-3335</it>), while it enhances the expression of genes such as <it>alr0874 </it>(nitrogenase reductase) and the nitrogenase genes (<it>AM1_2462, Ava_4669</it>), as the cell switches to the more energy-intensive fixation of nitrogen gas. Nonetheless, an in-depth study is needed to elucidate the details of the role that CRP may play in nitrogen fixation in these cyanobacteria. Since not all cyanobacterial species are capable of nitrogen fixation, this role of CRP would be restricted to the species that are, such as MBIC11017, ATCC29413, and PCC7120.</p>
            </sec>
            <sec>
               <st>
                  <p>4.3. Transporters and porins</p>
               </st>
               <p>A few genes coding for transporters and porins were predicted to be CRP-regulated, including several ion transporters in PCC7120 (<it>alr2210-2213 </it>and <it>alr2118-2120</it>) and PCC6803 (<it>slr1392 </it>and <it>slr1950</it>). In addition, several antiporters and ABC transporters were predicted to be CRP-regulated in various genomes, e.g. CYA_2315, Ava_0687, sll0240, slr1452-1457, and tll0559.</p>
            </sec>
            <sec>
               <st>
                  <p>4.4. Kinases and two-component signal transduction systems</p>
               </st>
               <p>Dozens of genes coding for kinases and two-component signal transduction systems were predicted to be CRP-regulated, suggesting that CRP might play an important role in the response to environmental changes in cyanobacteria. Interestingly, it has been reported that the CRPs in PCC6803 are involved in phototaxis, as both <it>sycrp1 </it>and adenylyl cyclase mutants showed impaired phototaxis <abbrgrp><abbr bid="B9">9</abbr><abbr bid="B29">29</abbr></abbrgrp>. However, the genes involved in signal transduction for pilus assembly and phototaxis, as listed in <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>, were not predicted to be CRP-regulated by our algorithm; therefore, they might be regulated by CRP indirectly.</p>
            </sec>
            <sec>
               <st>
                  <p>4.5. Other functions unique to a species/lineage</p>
               </st>
               <p>Among the top hits of our predictions, a large portion of the putative CRP-regulated genes are species or lineage specific. The functions of these genes vary from genome to genome, and include type IV pilus synthesis in PCC6803 (<it>slr1667-1668 </it>and <it>slr2015-2018</it>) and MBIC11017 (<it>AM1_3323-3324</it>); various transposases in PCC7120 (<it>all3624</it>), BP-1 (<it>tll2385</it>) and IMS101 (<it>Tery_0925</it>); a methyltransferase in MBIC11017 (<it>AM1_5474</it>); aldo/keto reductases in A/B-Prime (<it>CYA_0976 </it>and <it>CYB_2928</it>); nblA in BP-1 (<it>tsr0033</it>); a TPR repeat containing protein in ATCC29413 (<it>Ava_3483</it>); a peptidase in MIT9313 (<it>PMT1940</it>); and a nuclease in CC9311 (<it>sync_1258</it>). However, the functions of many other species or lineage specific putative CRP-regulated genes are largely unknown; most of them are annotated as hypothetical proteins, such as <it>slr0442</it>, <it>sll1268</it>, <it>ssr2848 </it>and <it>sll1924 </it>in PCC6803, <it>asr4669 </it>in PCC7120, <it>AM1_3950-3951</it>, <it>AM1_4103</it>, <it>AM1_4957 </it>and <it>AM1_2209-2210 </it>in MBIC11017, <it>CYA_0127/CYB_2776 </it>in A/B-Prime, <it>Tery_2530 </it>and <it>Tery_1044 </it>in IMS101, <it>Ava_3757 </it>in ATCC29413, <it>PMT1492 </it>and <it>PMT1223 </it>in MIT9313, <it>P9303_06191</it>, <it>P9303_04111 </it>and <it>P9303_12031 </it>in MIT9303, <it>sync_093 </it>and <it>sync_1261 </it>in CC9311, and <it>Syncc9605_0955 </it>and <it>Syncc9605_0452 </it>in CC9605. It would be interesting to experimentally characterize the functions of these genes as well as the roles that CRP plays in their transcriptional regulation.</p>
            </sec>
         </sec>
         <sec>
            <st>
               <p>5. crp genes were lost in some cyanobacterial genomes during the course of evolution</p>
            </st>
            <p>The presence of <it>crp </it>genes in some, but not all, cyanobacterial genomes raises the question of their evolutionary origin: were <it>crp </it>genes in cyanobacteria acquired through horizontal gene transfer (HGT) from other species, or were they vertically inherited from their last common ancestor and then lost in some cyanobacterial species/strains? To address this question, we constructed a species tree of the 29 sequenced cyanobacterial genomes based on their 16S rRNA gene sequences (Figure <figr fid="F4">4</figr>). Although some nodes in the tree are not well supported by the currently available sequence data, the relationships of the species in the tree are in excellent agreement with a previously inferred cyanobacterial species tree <abbrgrp><abbr bid="B44">44</abbr></abbrgrp>. As shown in Figure <figr fid="F4">4</figr>, the genomes encoding a <it>crp </it>gene do not form a monophyletic group; rather, they are scattered sporadically across the tree. On the other hand, all the known cyanobacterial CRPs form a monophyletic group (Figure S1 in Additional file <supplr sid="S1">1</supplr>), suggesting that they are likely derived from a common ancestor. Therefore, the current sporadic distribution of the <it>crp </it>genes within the sequenced cyanobacteria likely resulted from differential gene losses during the course of evolution. Independent acquisition of the <it>crp </it>gene by individual cyanobacterial lineages/species, though theoretically possible, is less parsimonious since it would entail multiple HGT events from the same or closely related donors. Therefore, we conclude that the 17 cyanobacterial genomes that do not encode a <it>crp </it>gene lost their original copies during the course of evolution as they adapted to their respective environments. 
It is interesting to note that most of the species lacking a <it>crp </it>gene are marine cyanobacteria; these species often have a reduced genome and inhabit a relatively stable oligotrophic environment. On the other hand, the species that contain a <it>crp </it>gene are distributed in both freshwater and terrestrial environments. This again suggests that the <it>crp </it>gene might be beneficial to species that live in a variable environment, and that those that have adapted to a more stable environment, such as the oligotrophic ocean, lost their <it>crp </it>genes.</p>
            <fig id="F4">
               <title>
                  <p>Figure 4</p>
               </title>
               <caption>
                  <p>Relationships between the 29 sequenced cyanobacterial species/strains inferred from their 16S rRNA genes</p>
               </caption>
               <text>
                  <p><b>Relationships between the 29 sequenced cyanobacterial species/strains inferred from their 16S rRNA genes</b>. The tree is rooted with the 16S rRNA gene of <it>E. coli </it>K12. Bootstrap values are shown on the nodes. The cyanobacterial species/strains encoding a <it>crp </it>gene are indicated by bold font.</p>
               </text>
               <graphic file="1471-2164-10-23-4"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>6. Degradation of CRP binding sites in cyanobacterial genomes after the crp genes were lost</p>
            </st>
            <p>We have previously shown that when a genome loses a TF in the course of evolution, it rapidly loses the cognate binding sites in its inter-TU regions <abbrgrp><abbr bid="B32">32</abbr><abbr bid="B33">33</abbr></abbrgrp>. To test whether this conclusion extends to CRP, we applied our CRP binding site prediction algorithm to the 17 cyanobacterial genomes that do not encode a CRP orthologue. Indeed, in four genomes, namely CC9902, RCC307, WH8102 and WH7803 (Figure S2 in Additional file <supplr sid="S3">3</supplr>), the LOR function oscillates around zero as the score <it>s </it>increases, indicating that there is no significant difference between the signal of CRP binding sites in the inter-TU regions and that in the randomly selected coding regions (see Methods). These results strongly suggest that these four genomes are unlikely to contain functional CRP binding sites, which is in agreement with our previous observation <abbrgrp><abbr bid="B32">32</abbr><abbr bid="B33">33</abbr></abbrgrp>. However, there is a clear CRP binding site signal in the remaining 13 genomes, as indicated by their relatively high LOR when the score <it>s </it>is high, suggesting that CRP-like binding sites exist in the inter-TU regions of these genomes. The reason for this unexpected observation is unknown, but one explanation would be that these sites are bound by a different regulator in these genomes. To identify TFs in these genomes that are likely to bind these CRP-like binding sites, we analyzed the distribution of the CRP/FNR superfamily in these genomes and found that they all encode at least one member of the superfamily. These members are likely to recognize the CRP-like binding sites, as it has been shown that members of the CRP/FNR superfamily recognize similar consensus sequences and that binding specificity is achieved through competitive binding among the members.</p>
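            <p>To make this comparison concrete, the following is a minimal Python sketch of a log-odds-ratio curve. It assumes, for illustration only, that LOR(s) is the natural log of the fraction of inter-TU candidate sites scoring at least s divided by the corresponding fraction in the randomly selected coding regions; the exact definition used in this study is given in Methods and may differ. The function names, the pseudocount, and the toy scores are hypothetical.</p>

```python
import math


def log_odds_ratio(inter_tu_scores, coding_scores, s, pseudocount=1.0):
    """Assumed LOR at threshold s: log of the ratio between the fraction of
    inter-TU sites and the fraction of coding-region sites scoring at least s.
    The pseudocount (an assumption) avoids taking the log of zero."""
    n_inter = sum(1 for x in inter_tu_scores if x >= s)
    n_coding = sum(1 for x in coding_scores if x >= s)
    frac_inter = (n_inter + pseudocount) / (len(inter_tu_scores) + pseudocount)
    frac_coding = (n_coding + pseudocount) / (len(coding_scores) + pseudocount)
    return math.log(frac_inter / frac_coding)


def lor_curve(inter_tu_scores, coding_scores, thresholds):
    """Return (threshold, LOR) pairs for plotting LOR against the score s."""
    return [(s, log_odds_ratio(inter_tu_scores, coding_scores, s))
            for s in thresholds]


# Toy scores (hypothetical): the inter-TU set is enriched for high scores.
inter_tu = [1.0, 2.0, 3.0, 8.0, 9.0]
coding = [1.0, 2.0, 3.0, 4.0, 5.0]
curve = lor_curve(inter_tu, coding, [2.0, 8.0])
```

            <p>In this toy case the curve is zero at the low threshold and positive at the high one, which is the qualitative pattern described for the 13 genomes; a curve that stays near zero at all thresholds would correspond to the four genomes without a detectable signal.</p>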
            <suppl id="S3">
               <title>
                  <p>Additional File 3</p>
               </title>
               <text>
                  <p><b>Figure S2</b>. Genome-wide scanning for CRP-like binding sites in the 17 genomes that do not encode a CRP protein.</p>
               </text>
               <file name="1471-2164-10-23-S3.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Discussion</p>
         </st>
         <sec>
            <st>
               <p>1. CRP is a lineage-specific global regulator</p>
            </st>
            <p>Studies have shown that CRP in <it>E. coli </it>functions as an important global regulator controlling the expression of genes involved in many pathways, such as those of carbon and energy metabolism. It has also been reported that CRP regulates a variety of genes in other bacterial species. For instance, one recent study showed that a CRP-like protein regulates genes involved in quorum sensing, motility and intestinal colonization in <it>Vibrio cholerae </it><abbrgrp><abbr bid="B7">7</abbr></abbrgrp>. In this study, we suggest that CRP in cyanobacteria has distinct functions. First, our results show that the members of the CRP regulons in cyanobacteria have little in common with those in <it>E. coli </it>(Table <tblr tid="T3">3</tblr>), which is consistent with the observation that the genes whose expression is most affected by <it>sycrp1 </it>disruption in PCC6803 are involved in type IV pilus synthesis <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B25">25</abbr><abbr bid="B29">29</abbr></abbrgrp>. In contrast, genes that are involved in carbon and energy metabolism, as seen in <it>E. coli</it>, are not significantly affected by <it>sycrp1 </it>disruption <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B25">25</abbr><abbr bid="B29">29</abbr></abbrgrp>. Second, cyanobacterial CRPs seem to regulate distinct sets of genes specific to each lineage or strain, as we cannot clearly identify a particular set of genes under CRP control that is common to most of the 12 cyanobacterial genomes (Figure <figr fid="F3">3</figr>, Table S2 in Additional file <supplr sid="S2">2</supplr>). Third, more than half (17) of the 29 completely sequenced cyanobacterial genomes do not encode a CRP orthologue, suggesting that CRP is dispensable in these strains/species. For a closely related group of species, it is unlikely that an essential global regulator can be replaced or lost in some genomes while present in the others. 
Therefore, we conclude that CRP is not a global regulator with conserved functions; instead, it is likely a lineage-specific global regulator.</p>
            <p>In fact, it has been shown that global regulators are not necessarily conserved in moderately related species. For instance, none of the 7 global regulators in <it>E. coli </it>is shared with the 6 in <it>B. subtilis </it><abbrgrp><abbr bid="B11">11</abbr></abbrgrp>. Furthermore, it is not surprising that CRP in cyanobacteria does not function as a conserved global regulator as seen in <it>E. coli</it>, given that CRP is mainly involved in the transcriptional regulation of genes related to carbon metabolism in <it>E. coli</it>, while organic carbon assimilation is no longer a constraint for the growth of autotrophic cyanobacteria. On the other hand, since nitrogen assimilation is a constraint for cyanobacteria, the nitrogen assimilation regulator NtcA, a member of the CRP/FNR family that is encoded in all 29 sequenced genomes, has been characterized as one of the conserved global regulators in cyanobacteria. Thus, it seems that global regulators are often lineage-specific and that the environment plays a vital role in determining which TF functions as a global regulator, and which genes are regulated by the TF.</p>
         </sec>
         <sec>
            <st>
               <p>2. The functions of CRP in cyanobacteria are highly diverse</p>
            </st>
            <p>It has been reported that SyCRP1 regulates cellular motility in PCC6803, as <it>sycrp1 </it>disruptants were non-motile and showed reduced type IV pilus biogenesis <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>. In another study, microarray gene expression profiling showed that the operons <it>slr1667-slr1668 </it>and <it>slr2015-slr2018 </it>in PCC6803, which are involved in type IV pilus biosynthesis <abbrgrp><abbr bid="B29">29</abbr><abbr bid="B45">45</abbr></abbrgrp>, were down-regulated in <it>sycrp1 </it>disruptants <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>. Consistent with these findings, we have predicted putative CRP binding sites for these genes. However, among the 29 cyanobacterial genomes, the <it>slr1667-slr1668 </it>genes are unique to PCC6803, and orthologues of <it>slr2015-slr2018 </it>were found only in MBIC11017 (Table S2 in Additional file <supplr sid="S2">2</supplr>). Thus, this function of CRP is likely to be restricted to these two species/strains. In addition, based on our predictions of CRP regulons in the 12 cyanobacterial genomes, we argue that CRP might also be involved in other functions in different cyanobacterial lineages/strains, including carbon fixation, photosynthesis, nitrogen fixation, ion channels/transporters, and two-component signal transduction (Table <tblr tid="T4">4</tblr>, Tables S3&#8211;S14 in Additional file <supplr sid="S2">2</supplr>). Furthermore, as a large portion of our predicted CRP binding sites are associated with hypothetical proteins, CRP might be involved in the regulation of other novel functions yet to be discovered. In this regard, we have provided a set of candidates for further experimental characterization. However, due to the lack of sufficient information about the functions of the CRP target genes, it is currently rather difficult to derive a general pathway model involving CRP regulons in cyanobacteria. 
Lastly, AnCrpA was shown to regulate the expression of genes related to nitrogen fixation in PCC7120 <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>, including <it>all1517 </it>(<it>nifB</it>), <it>all1439</it>, <it>all1432 </it>(<it>hesA</it>), <it>alr2515 </it>(<it>coax-II</it>), and <it>alr2834 </it>(<it>hepC</it>). However, our algorithm did not find high scoring CRP binding sites for these genes; the possible reasons for this are addressed below.</p>
         </sec>
         <sec>
            <st>
               <p>3. Binding sites of AnCrpA in Anabaena sp. PCC7120</p>
            </st>
            <p>Our failure to identify high scoring CRP binding sites in the upstream regions of these nitrogen fixation genes in PCC7120 is actually in agreement with the finding by Suzuki and coworkers, who suggested that AnCrpA might have a different binding site pattern from the conventional pseudo-palindromic motif <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>. However, an <it>in vitro </it>binding affinity test using EMSA showed that AnCrpA could bind to the conventional palindromic motif <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>. A possible explanation of this inconsistency is that AnCrpA in PCC7120 could form a monomer or a heterodimer in addition to a homodimer, given that two CRP homologues are encoded in this genome and one of them does not have a DBD. Such a monomer or heterodimer might favor a non-canonical CRP binding site, while the homodimer retains its ability to bind to the conventional palindromic motif. A more in-depth study is needed to verify this hypothesis. This explanation is in agreement with the result that our algorithm predicts many statistically significant CRP binding sites for other genes in PCC7120 (Table S7 in Additional file <supplr sid="S2">2</supplr>).</p>
         </sec>
         <sec>
            <st>
               <p>4. Origin of crp genes in cyanobacteria and implications for the evolution of CRP regulatory networks in bacteria</p>
            </st>
            <p>Because the <it>crp </it>gene is widely distributed in many distantly related bacterial groups, including actinobacteria, aquificales, bacteroidetes, chlamydiae, chloroflexi, cyanobacteria, deinococci, firmicutes, fibrobacteres, planctomycetes, proteobacteria, spirochaetes, etc. (Figure S1 in Additional file <supplr sid="S1">1</supplr>), it was likely present in the common ancestor of the extant eubacterial lineages. Under this scenario, the <it>crp </it>gene in this ancestor was flexible enough to regulate different sets of genes. Alternatively, it is also possible that the <it>crp </it>gene originally evolved in a specific bacterial lineage and subsequently spread to other groups via HGT. Such HGT events may benefit the recipient organism given the flexibility of CRP in regulating different biochemical activities, which are often coupled to environmental stimuli leading to the generation of the intracellular signaling molecule cAMP. Therefore, the <it>crp </it>gene in the ancestor of modern cyanobacteria was acquired by either vertical inheritance or an ancient HGT event from other lineages. Some cyanobacteria lost their <it>crp </it>genes since harboring the gene might not necessarily increase their fitness in their new environments. On the other hand, in other cyanobacteria, the <it>crp </it>genes were adapted to better meet their unique physiology and environmental requirements. In other groups of bacteria, <it>crp </it>evolved to regulate other lineage/species specific functions. For instance, it regulates carbon and energy metabolism in <it>E. coli</it>, cell-cell communication in <it>Stenotrophomonas maltophilia </it><abbrgrp><abbr bid="B46">46</abbr></abbrgrp>, and quorum sensing in <it>Vibrio cholerae </it><abbrgrp><abbr bid="B7">7</abbr></abbrgrp>, etc.</p>
         </sec>
         <sec>
            <st>
               <p>5. Rapid degradation of CRP binding sites</p>
            </st>
            <p>Evolution of <it>cis</it>-regulatory binding sites is an interesting, but not well-studied problem. The 17 cyanobacterial genomes that do not encode a CRP orthologue provided us with an excellent opportunity to examine the degradation of CRP binding sites. We expected that high scoring CRP binding sites would not be present in those genomes, as previous studies have indicated that binding sites rapidly fade out when the corresponding TF is lost during the course of evolution <abbrgrp><abbr bid="B32">32</abbr><abbr bid="B47">47</abbr></abbrgrp>. We do see such fading in the CC9902, RCC307, WH7803, and WH8102 genomes (Figure S2 in Additional file <supplr sid="S3">3</supplr>). These observations are consistent with the well-accepted rule that "if a genome does not encode a TF, then it does not contain the corresponding binding sites". Surprisingly, however, in the remaining 13 genomes, high scoring CRP-like binding sites seem to appear in the inter-TU regions with a higher probability than in the randomly selected coding regions (Figure S2 in Additional file <supplr sid="S3">3</supplr>), suggesting that sequence patterns similar to CRP binding sites exist in these cyanobacterial genomes. A possible explanation of this unexpected observation is that these genomes encode TFs that recognize binding sites similar to CRP binding sites. Indeed, at least one member of the CRP/FNR superfamily is encoded in each of these 13 genomes, and it has been shown that the binding sites of the members of this TF superfamily are well conserved across many species with a wide range of evolutionary distances <abbrgrp><abbr bid="B40">40</abbr></abbrgrp>. Thus, the CRP-like binding sites found in these genomes are likely to be recognized by these non-CRP TFs of the superfamily. 
The specificity of these similar binding sites is likely to be achieved through the competition among homologous TFs for the same binding site, which is governed by their thermodynamic equilibrium <abbrgrp><abbr bid="B48">48</abbr></abbrgrp>.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>In this paper, we have predicted CRP binding sites in 12 cyanobacterial genomes that encode a CRP orthologue using a highly accurate motif scanning algorithm. Based on the analysis of these predictions as well as the experimental data available to us, we conclude that 1) CRP has rather different functions in cyanobacteria than in <it>E. coli</it>; 2) cyanobacterial CRP also has a very diverse spectrum of functions in different lineages or species/strains, and is even dispensable in some species/strains; 3) CRPs in modern cyanobacteria are likely to be vertically inherited from their last common ancestor, and some cyanobacteria lost their <it>crp </it>genes during the course of evolution to adapt to their new environments; and 4) once the <it>crp </it>gene is lost, its binding sites degrade rapidly. Although many of our predictions still await experimental verification, we have provided a high-quality candidate set for further experimental characterization of the CRP binding sites and regulons in this important group of bacteria.</p>
      </sec>
      <sec>
         <st>
            <p>Abbreviations</p>
         </st>
         <p>ORF: Open reading frame; TU: Transcription unit; BDBH: bidirectional best hit; CRP: cAMP receptor protein; DBD: DNA binding domain; HTH: helix-turn-helix; EMSA: electrophoretic mobility shift assay; TF: transcription factor; HGT: horizontal gene transfer; LOR: log-odds ratio; MBIC11017: <it>Acaryochloris marina </it>MBIC11017; ATCC29413: <it>Anabaena variabilis </it>ATCC 29413; A-prime: <it>Synechococcus sp</it>. JA-3-3Ab; B-prime: <it>Synechococcus sp</it>. JA-2-3B'a(2-13); PCC7421: <it>Gloeobacter violaceus </it>PCC 7421; PCC7120: <it>Nostoc sp</it>. PCC 7120; CCMP1375: <it>Prochlorococcus marinus </it>CCMP1375; MED4: <it>Prochlorococcus marinus </it>MED4; AS9601: <it>Prochlorococcus marinus </it>AS9601; MIT9211: <it>Prochlorococcus marinus </it>MIT 9211; MIT9215: <it>Prochlorococcus marinus </it>MIT 9215; MIT9301: <it>Prochlorococcus marinus </it>MIT 9301; MIT9312: <it>Prochlorococcus marinus </it>MIT 9312; MIT9303: <it>Prochlorococcus marinus </it>MIT9303; MIT9313: <it>Prochlorococcus marinus </it>MIT9313; MIT9515: <it>Prochlorococcus marinus </it>MIT 9515; NATL1A: <it>Prochlorococcus marinus </it>NATL1A; NATL2A: <it>Prochlorococcus marinus </it>NATL2A; PCC7942: <it>Synechococcus elongatus </it>PCC 7942; PCC6301: <it>Synechococcus elongatus </it>PCC 6301; RCC307: <it>Synechococcus sp</it>. RCC307; WH7803: <it>Synechococcus sp</it>. WH 7803; WH8102: <it>Synechococcus sp</it>. WH8102; CC9605: <it>Synechococcus sp</it>. CC9605; CC9902: <it>Synechococcus sp</it>. CC9902; CC9311: <it>Synechococcus sp</it>. CC9311; PCC6803: <it>Synechocystis sp</it>. PCC 6803; BP-1: <it>Thermosynechococcus elongatus </it>BP-1; and IMS101: <it>Trichodesmium erythraeum </it>IMS101.</p>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>M.X. designed and conducted the experiments, and wrote the manuscript. Z.S. conceived the project and wrote the manuscript. All authors read and approved the final manuscript.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>This research was supported by a start-up fund from the University of North Carolina at Charlotte to Z.S. We would like to thank Drs. Anthony Fodor, Devaki Bhaya, and Jingling Huang for their critical reading of this manuscript and suggestions. We would also like to thank the two anonymous reviewers whose comments have greatly improved this paper.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Transcriptional regulation by cAMP and its receptor protein</p>
            </title>
            <aug>
               <au>
                  <snm>Kolb</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Busby</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Buc</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Garges</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Adhya</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Annu Rev Biochem</source>
            <pubdate>1993</pubdate>
            <volume>62</volume>
            <fpage>749</fpage>
            <lpage>795</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1146/annurev.bi.62.070193.003533</pubid>
                  <pubid idtype="pmpid" link="fulltext">8394684</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Identification of the CRP regulon using in vitro and in vivo transcriptional profiling</p>
            </title>
            <aug>
               <au>
                  <snm>Zheng</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Constantinidou</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Hobman</snm>
                  <fnm>JL</fnm>
               </au>
               <au>
                  <snm>Minchin</snm>
                  <fnm>SD</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2004</pubdate>
            <volume>32</volume>
            <issue>19</issue>
            <fpage>5874</fpage>
            <lpage>5893</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">528793</pubid>
                  <pubid idtype="pmpid" link="fulltext">15520470</pubid>
                  <pubid idtype="doi">10.1093/nar/gkh908</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Anaerobic citrate metabolism and its regulation in enterobacteria</p>
            </title>
            <aug>
               <au>
                  <snm>Bott</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Arch Microbiol</source>
            <pubdate>1997</pubdate>
            <volume>167</volume>
            <issue>2/3</issue>
            <fpage>78</fpage>
            <lpage>88</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1007/s002030050419</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>The galactose regulon of Escherichia coli</p>
            </title>
            <aug>
               <au>
                  <snm>Weickert</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Adhya</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Mol Microbiol</source>
            <pubdate>1993</pubdate>
            <volume>10</volume>
            <issue>2</issue>
            <fpage>245</fpage>
            <lpage>251</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1111/j.1365-2958.1993.tb01950.x</pubid>
                  <pubid idtype="pmpid">7934815</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Control mechanism of the Escherichia coli K-12 cell cycle is triggered by the cyclic AMP-cyclic AMP receptor protein complex</p>
            </title>
            <aug>
               <au>
                  <snm>Utsumi</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Noda</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Kawamukai</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Komano</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>J Bacteriol</source>
            <pubdate>1989</pubdate>
            <volume>171</volume>
            <issue>5</issue>
            <fpage>2909</fpage>
            <lpage>2912</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">209987</pubid>
                  <pubid idtype="pmpid" link="fulltext">2540158</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>The gene encoding cAMP receptor protein is required for competence development in Haemophilus influenzae Rd</p>
            </title>
            <aug>
               <au>
                  <snm>Chandler</snm>
                  <fnm>MS</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1992</pubdate>
            <volume>89</volume>
            <issue>5</issue>
            <fpage>1626</fpage>
            <lpage>1630</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">48505</pubid>
                  <pubid idtype="pmpid" link="fulltext">1542653</pubid>
                  <pubid idtype="doi">10.1073/pnas.89.5.1626</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>The cyclic AMP receptor protein modulates quorum sensing, motility and multiple genes that affect intestinal colonization in Vibrio cholerae</p>
            </title>
            <aug>
               <au>
                  <snm>Liang</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Pascual-Montano</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Silva</snm>
                  <fnm>AJ</fnm>
               </au>
               <au>
                  <snm>Benitez</snm>
                  <fnm>JA</fnm>
               </au>
            </aug>
            <source>Microbiology</source>
            <pubdate>2007</pubdate>
            <volume>153</volume>
            <fpage>2964</fpage>
            <lpage>2975</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1099/mic.0.2007/006668-0</pubid>
                  <pubid idtype="pmpid" link="fulltext">17768239</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>A cAMP receptor protein, SYCRP1, is responsible for the cell motility of Synechocystis sp. PCC 6803</p>
            </title>
            <aug>
               <au>
                  <snm>Yoshimura</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Yoshihara</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Okamoto</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Ikeuchi</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Ohmori</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Plant Cell Physiol</source>
            <pubdate>2002</pubdate>
            <volume>43</volume>
            <issue>4</issue>
            <fpage>460</fpage>
            <lpage>463</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/pcp/pcf050</pubid>
                  <pubid idtype="pmpid" link="fulltext">11978874</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Phototaxis and Impaired Motility in Adenylyl Cyclase and Cyclase Receptor Protein Mutants of Synechocystis sp. Strain PCC 6803</p>
            </title>
            <aug>
               <au>
                  <snm>Bhaya</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Nakasugi</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Fazeli</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Burriesci</snm>
                  <fnm>MS</fnm>
               </au>
            </aug>
            <source>J Bacteriol</source>
            <pubdate>2006</pubdate>
            <volume>188</volume>
            <issue>20</issue>
            <fpage>7306</fpage>
            <lpage>7310</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1636242</pubid>
                  <pubid idtype="pmpid" link="fulltext">17015670</pubid>
                  <pubid idtype="doi">10.1128/JB.00573-06</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Phylogeny of the bacterial superfamily of Crp-Fnr transcription regulators: exploiting the metabolic spectrum by controlling alternative gene programs</p>
            </title>
            <aug>
               <au>
                  <snm>Korner</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Sofia</snm>
                  <fnm>HJ</fnm>
               </au>
               <au>
                  <snm>Zumft</snm>
                  <fnm>WG</fnm>
               </au>
            </aug>
            <source>FEMS Microbiol Rev</source>
            <pubdate>2003</pubdate>
            <volume>27</volume>
            <issue>5</issue>
            <fpage>559</fpage>
            <lpage>592</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0168-6445(03)00066-4</pubid>
                  <pubid idtype="pmpid" link="fulltext">14638413</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Evolutionary dynamics of prokaryotic transcriptional regulatory networks</p>
            </title>
            <aug>
               <au>
                  <snm>Madan Babu</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Teichmann</snm>
                  <fnm>SA</fnm>
               </au>
               <au>
                  <snm>Aravind</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>2006</pubdate>
            <volume>358</volume>
            <issue>2</issue>
            <fpage>614</fpage>
            <lpage>633</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.jmb.2006.02.019</pubid>
                  <pubid idtype="pmpid" link="fulltext">16530225</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Structure of catabolite gene activator protein at 2.9 &#197; resolution suggests binding to left-handed B-DNA</p>
            </title>
            <aug>
               <au>
                  <snm>McKay</snm>
                  <fnm>DB</fnm>
               </au>
               <au>
                  <snm>Steitz</snm>
                  <fnm>TA</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>1981</pubdate>
            <volume>290</volume>
            <issue>5809</issue>
            <fpage>744</fpage>
            <lpage>749</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/290744a0</pubid>
                  <pubid idtype="pmpid">6261152</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Allosteric regulation of the cAMP receptor protein</p>
            </title>
            <aug>
               <au>
                  <snm>Harman</snm>
                  <fnm>JG</fnm>
               </au>
            </aug>
            <source>Biochim Biophys Acta</source>
            <pubdate>2001</pubdate>
            <volume>1547</volume>
            <issue>1</issue>
            <fpage>1</fpage>
            <lpage>17</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11343786</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Transcription activation by catabolite activator protein (CAP)</p>
            </title>
            <aug>
               <au>
                  <snm>Busby</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Ebright</snm>
                  <fnm>RH</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1999</pubdate>
            <volume>293</volume>
            <issue>2</issue>
            <fpage>199</fpage>
            <lpage>213</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/jmbi.1999.3161</pubid>
                  <pubid idtype="pmpid" link="fulltext">10550204</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Bipartite functional map of the E. coli RNA polymerase alpha subunit: involvement of the C-terminal region in transcription activation by cAMP-CRP</p>
            </title>
            <aug>
               <au>
                  <snm>Igarashi</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Ishihama</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>1991</pubdate>
            <volume>65</volume>
            <issue>6</issue>
            <fpage>1015</fpage>
            <lpage>1022</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0092-8674(91)90553-B</pubid>
                  <pubid idtype="pmpid" link="fulltext">1646077</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Detection of the protein-protein interaction between cyclic AMP receptor protein and RNA polymerase, by (13)C-carbonyl NMR</p>
            </title>
            <aug>
               <au>
                  <snm>Lee</snm>
                  <fnm>TW</fnm>
               </au>
               <au>
                  <snm>Won</snm>
                  <fnm>HS</fnm>
               </au>
               <au>
                  <snm>Park</snm>
                  <fnm>SH</fnm>
               </au>
               <au>
                  <snm>Kyogoku</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Lee</snm>
                  <fnm>BJ</fnm>
               </au>
            </aug>
            <source>J Biochem</source>
            <pubdate>2001</pubdate>
            <volume>130</volume>
            <issue>1</issue>
            <fpage>57</fpage>
            <lpage>61</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11432780</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Transcription activation at class II CAP-dependent promoters: two interactions between CAP and RNA polymerase</p>
            </title>
            <aug>
               <au>
                  <snm>Niu</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Kim</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Tau</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Heyduk</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Ebright</snm>
                  <fnm>RH</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>1996</pubdate>
            <volume>87</volume>
            <issue>6</issue>
            <fpage>1123</fpage>
            <lpage>1134</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0092-8674(00)81806-1</pubid>
                  <pubid idtype="pmpid" link="fulltext">8978616</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>A common role of CRP in transcription activation: CRP acts transiently to stimulate events leading to open complex formation at a diverse set of promoters</p>
            </title>
            <aug>
               <au>
                  <snm>Tagami</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Aiba</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>EMBO J</source>
            <pubdate>1998</pubdate>
            <volume>17</volume>
            <issue>6</issue>
            <fpage>1759</fpage>
            <lpage>1767</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1170523</pubid>
                  <pubid idtype="pmpid" link="fulltext">9501097</pubid>
                  <pubid idtype="doi">10.1093/emboj/17.6.1759</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>Cyclic AMP in prokaryotes</p>
            </title>
            <aug>
               <au>
                  <snm>Botsford</snm>
                  <fnm>JL</fnm>
               </au>
               <au>
                  <snm>Harman</snm>
                  <fnm>JG</fnm>
               </au>
            </aug>
            <source>Microbiol Rev</source>
            <pubdate>1992</pubdate>
            <volume>56</volume>
            <issue>1</issue>
            <fpage>100</fpage>
            <lpage>122</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">372856</pubid>
                  <pubid idtype="pmpid" link="fulltext">1315922</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>The comprehensive updated regulatory network of Escherichia coli K-12</p>
            </title>
            <aug>
               <au>
                  <snm>Salgado</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Santos-Zavaleta</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Gama-Castro</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Peralta-Gil</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Penaloza-Spinola</snm>
                  <fnm>MI</fnm>
               </au>
               <au>
                  <snm>Martinez-Antonio</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Karp</snm>
                  <fnm>PD</fnm>
               </au>
               <au>
                  <snm>Collado-Vides</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2006</pubdate>
            <volume>7</volume>
            <issue>1</issue>
            <fpage>5</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1382256</pubid>
                  <pubid idtype="pmpid" link="fulltext">16398937</pubid>
                  <pubid idtype="doi">10.1186/1471-2105-7-5</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>RegulonDB (version 4.0): transcriptional regulation, operon organization and growth conditions in Escherichia coli K-12</p>
            </title>
            <aug>
               <au>
                  <snm>Salgado</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Gama-Castro</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Martinez-Antonio</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Diaz-Peredo</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Sanchez-Solano</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Peralta-Gil</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Garcia-Alonso</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Jimenez-Jacinto</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Santos-Zavaleta</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Bonavides-Martinez</snm>
                  <fnm>C</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2004</pubdate>
            <volume>32</volume>
            <issue>Database issue</issue>
            <fpage>D303</fpage>
            <lpage>306</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">308874</pubid>
                  <pubid idtype="pmpid" link="fulltext">14681419</pubid>
                  <pubid idtype="doi">10.1093/nar/gkh140</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>Non-canonical CRP sites control competence regulons in Escherichia coli and many other gamma-proteobacteria</p>
            </title>
            <aug>
               <au>
                  <snm>Cameron</snm>
                  <fnm>AD</fnm>
               </au>
               <au>
                  <snm>Redfield</snm>
                  <fnm>RJ</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2006</pubdate>
            <volume>34</volume>
            <issue>20</issue>
            <fpage>6001</fpage>
            <lpage>6014</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1635313</pubid>
                  <pubid idtype="pmpid" link="fulltext">17068078</pubid>
                  <pubid idtype="doi">10.1093/nar/gkl734</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>Genomic survey of cAMP and cGMP signalling components in the cyanobacterium Synechocystis PCC 6803</p>
            </title>
            <aug>
               <au>
                  <snm>Ochoa de Alda</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Houmard</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Microbiology</source>
            <pubdate>2000</pubdate>
            <volume>146</volume>
            <issue>Pt 12</issue>
            <fpage>3183</fpage>
            <lpage>3194</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11101676</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>Identification and characterization of a novel cAMP receptor protein in the cyanobacterium Synechocystis sp. PCC 6803</p>
            </title>
            <aug>
               <au>
                  <snm>Yoshimura</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Hisabori</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Yanagisawa</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Ohmori</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>2000</pubdate>
            <volume>275</volume>
            <issue>9</issue>
            <fpage>6241</fpage>
            <lpage>6245</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1074/jbc.275.9.6241</pubid>
                  <pubid idtype="pmpid" link="fulltext">10692419</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Screening for the target gene of cyanobacterial cAMP receptor protein SYCRP1</p>
            </title>
            <aug>
               <au>
                  <snm>Yoshimura</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Yanagisawa</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kanehisa</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Ohmori</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Mol Microbiol</source>
            <pubdate>2002</pubdate>
            <volume>43</volume>
            <issue>4</issue>
            <fpage>843</fpage>
            <lpage>853</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1046/j.1365-2958.2002.02790.x</pubid>
                  <pubid idtype="pmpid" link="fulltext">12085767</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>Systematic single base-pair substitution analysis of DNA binding by the cAMP receptor protein in cyanobacterium Synechocystis sp. PCC 6803</p>
            </title>
            <aug>
               <au>
                  <snm>Omagari</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Yoshimura</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Takano</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Hao</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Ohmori</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Sarai</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Suyama</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>FEBS Lett</source>
            <pubdate>2004</pubdate>
            <volume>563</volume>
            <issue>1&#8211;3</issue>
            <fpage>55</fpage>
            <lpage>58</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0014-5793(04)00248-0</pubid>
                  <pubid idtype="pmpid" link="fulltext">15063722</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>Two cAMP receptor proteins with different biochemical properties in the filamentous cyanobacterium Anabaena sp. PCC 7120</p>
            </title>
            <aug>
               <au>
                  <snm>Suzuki</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Yoshimura</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Hisabori</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Ohmori</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>FEBS Lett</source>
            <pubdate>2004</pubdate>
            <volume>571</volume>
            <issue>1&#8211;3</issue>
            <fpage>154</fpage>
            <lpage>160</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.febslet.2004.06.074</pubid>
                  <pubid idtype="pmpid" link="fulltext">15280034</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>AnCrpA, a cAMP receptor protein, regulates nif-related gene expression in the cyanobacterium Anabaena sp. strain PCC 7120 grown with nitrate</p>
            </title>
            <aug>
               <au>
                  <snm>Suzuki</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Yoshimura</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Ehira</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Ikeuchi</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Ohmori</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>FEBS Lett</source>
            <pubdate>2007</pubdate>
            <volume>581</volume>
            <issue>1</issue>
            <fpage>21</fpage>
            <lpage>28</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.febslet.2006.11.070</pubid>
                  <pubid idtype="pmpid" link="fulltext">17173896</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>Phototactic motility in the unicellular cyanobacterium Synechocystis sp. PCC 6803</p>
            </title>
            <aug>
               <au>
                  <snm>Yoshihara</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Ikeuchi</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Photochem Photobiol Sci</source>
            <pubdate>2004</pubdate>
            <volume>3</volume>
            <issue>6</issue>
            <fpage>512</fpage>
            <lpage>518</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1039/b402320j</pubid>
                  <pubid idtype="pmpid" link="fulltext">15170479</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>Novel motility mutants of Synechocystis strain PCC 6803 generated by in vitro transposon mutagenesis</p>
            </title>
            <aug>
               <au>
                  <snm>Bhaya</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Takahashi</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Shahi</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Grossman</snm>
                  <fnm>AR</fnm>
               </au>
            </aug>
            <source>J Bacteriol</source>
            <pubdate>2001</pubdate>
            <volume>183</volume>
            <issue>20</issue>
            <fpage>6140</fpage>
            <lpage>6143</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">99694</pubid>
                  <pubid idtype="pmpid" link="fulltext">11567015</pubid>
                  <pubid idtype="doi">10.1128/JB.183.20.6140-6143.2001</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>Dynamic transcriptional changes in response to rehydration in Anabaena sp. PCC 7120</p>
            </title>
            <aug>
               <au>
                  <snm>Higo</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Suzuki</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Ikeuchi</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Ohmori</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Microbiology</source>
            <pubdate>2007</pubdate>
            <volume>153</volume>
            <issue>Pt 11</issue>
            <fpage>3685</fpage>
            <lpage>3694</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1099/mic.0.2007/009233-0</pubid>
                  <pubid idtype="pmpid" link="fulltext">17975076</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>Comparative genomics analysis of NtcA regulons in cyanobacteria: regulation of nitrogen assimilation and its coupling to photosynthesis</p>
            </title>
            <aug>
               <au>
                  <snm>Su</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Olman</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Mao</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Xu</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2005</pubdate>
            <volume>33</volume>
            <issue>16</issue>
            <fpage>5156</fpage>
            <lpage>5171</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1214546</pubid>
                  <pubid idtype="pmpid" link="fulltext">16157864</pubid>
                  <pubid idtype="doi">10.1093/nar/gki817</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B33">
            <title>
               <p>Computational prediction of Pho regulons in cyanobacteria</p>
            </title>
            <aug>
               <au>
                  <snm>Su</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Olman</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Xu</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>BMC Genomics</source>
            <pubdate>2007</pubdate>
            <volume>8</volume>
            <fpage>156</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1906773</pubid>
                  <pubid idtype="pmpid" link="fulltext">17559671</pubid>
                  <pubid idtype="doi">10.1186/1471-2164-8-156</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B34">
            <title>
               <p>Operon prediction without a training set</p>
            </title>
            <aug>
               <au>
                  <snm>Westover</snm>
                  <fnm>BP</fnm>
               </au>
               <au>
                  <snm>Buhler</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Sonnenburg</snm>
                  <fnm>JL</fnm>
               </au>
               <au>
                  <snm>Gordon</snm>
                  <fnm>JI</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2005</pubdate>
            <volume>21</volume>
            <issue>7</issue>
            <fpage>880</fpage>
            <lpage>888</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/bti123</pubid>
                  <pubid idtype="pmpid" link="fulltext">15539453</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B35">
            <title>
               <p>MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0</p>
            </title>
            <aug>
               <au>
                  <snm>Tamura</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Dudley</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Nei</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Kumar</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>2007</pubdate>
            <volume>24</volume>
            <issue>8</issue>
            <fpage>1596</fpage>
            <lpage>1599</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/molbev/msm092</pubid>
                  <pubid idtype="pmpid" link="fulltext">17488738</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B36">
            <title>
               <p>Autoregulation of the Escherichia coli crp gene: CRP is a transcriptional repressor for its own gene</p>
            </title>
            <aug>
               <au>
                  <snm>Aiba</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>1983</pubdate>
            <volume>32</volume>
            <issue>1</issue>
            <fpage>141</fpage>
            <lpage>149</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0092-8674(83)90504-4</pubid>
                  <pubid idtype="pmpid" link="fulltext">6297782</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B37">
            <title>
               <p>CUBIC: identification of regulatory binding sites through data clustering</p>
            </title>
            <aug>
               <au>
                  <snm>Olman</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Xu</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Xu</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>J Bioinform Comput Biol</source>
            <pubdate>2003</pubdate>
            <volume>1</volume>
            <issue>1</issue>
            <fpage>21</fpage>
            <lpage>40</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1142/S0219720003000162</pubid>
                  <pubid idtype="pmpid" link="fulltext">15290780</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B38">
            <title>
               <p>BioProspector: discovering conserved DNA motifs in upstream regulatory regions of co-expressed genes</p>
            </title>
            <aug>
               <au>
                  <snm>Liu</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Brutlag</snm>
                  <fnm>DL</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>JS</fnm>
               </au>
            </aug>
            <source>Pac Symp Biocomput</source>
            <pubdate>2001</pubdate>
            <fpage>127</fpage>
            <lpage>138</lpage>
            <xrefbib>
               <pubid idtype="pmpid">11262934</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B39">
            <title>
               <p>Crystal structure of a CAP-DNA complex: the DNA is bent by 90 degrees</p>
            </title>
            <aug>
               <au>
                  <snm>Schultz</snm>
                  <fnm>SC</fnm>
               </au>
               <au>
                  <snm>Shields</snm>
                  <fnm>GC</fnm>
               </au>
               <au>
                  <snm>Steitz</snm>
                  <fnm>TA</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1991</pubdate>
            <volume>253</volume>
            <issue>5023</issue>
            <fpage>1001</fpage>
            <lpage>1007</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1653449</pubid>
                  <pubid idtype="pmpid" link="fulltext">1653449</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B40">
            <title>
               <p>Comparative analysis of regulatory patterns in bacterial genomes</p>
            </title>
            <aug>
               <au>
                  <snm>Gelfand</snm>
                  <fnm>MS</fnm>
               </au>
               <au>
                  <snm>Novichkov</snm>
                  <fnm>PS</fnm>
               </au>
               <au>
                  <snm>Novichkova</snm>
                  <fnm>ES</fnm>
               </au>
               <au>
                  <snm>Mironov</snm>
                  <fnm>AA</fnm>
               </au>
            </aug>
            <source>Brief Bioinform</source>
            <pubdate>2000</pubdate>
            <volume>1</volume>
            <issue>4</issue>
            <fpage>357</fpage>
            <lpage>371</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bib/1.4.357</pubid>
                  <pubid idtype="pmpid" link="fulltext">11465053</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B41">
            <title>
               <p>Rapid change in cyclic 3',5'-AMP concentration triggered by a light-off or light-on signal in Anabaena cylindrica</p>
            </title>
            <aug>
               <au>
                  <snm>Ohmori</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Ohmori</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Hasunuma</snm>
                  <fnm>K</fnm>
               </au>
            </aug>
            <source>Archives of Microbiology</source>
            <pubdate>1988</pubdate>
            <volume>150</volume>
            <fpage>203</fpage>
            <lpage>204</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1007/BF00425163</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B42">
            <title>
               <p>Effect of nitrogen starvation on the level of adenosine 3',5'-monophosphate in Anabaena variabilis</p>
            </title>
            <aug>
               <au>
                  <snm>Hood</snm>
                  <fnm>EE</fnm>
               </au>
               <au>
                  <snm>Armour</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Ownby</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Handa</snm>
                  <fnm>AK</fnm>
               </au>
               <au>
                  <snm>Bressan</snm>
                  <fnm>RA</fnm>
               </au>
            </aug>
            <source>Biochimica et Biophysica Acta (BBA) &#8211; General Subjects</source>
            <pubdate>1979</pubdate>
            <volume>588</volume>
            <issue>2</issue>
            <fpage>193</fpage>
            <lpage>200</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1016/0304-4165(79)90202-2</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B43">
            <title>
               <p>Changes in cyclic AMP phosphodiesterase activity in Anabaena variabilis during growth and nitrogen starvation</p>
            </title>
            <aug>
               <au>
                  <snm>Ownby</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Kuenzi</snm>
                  <fnm>FE</fnm>
               </au>
            </aug>
            <source>FEMS Microbiology Letters</source>
            <pubdate>1982</pubdate>
            <volume>15</volume>
            <issue>3</issue>
            <fpage>243</fpage>
            <lpage>247</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1111/j.1574-6968.1982.tb00076.x</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B44">
            <title>
               <p>Genome divergence in two Prochlorococcus ecotypes reflects oceanic niche differentiation</p>
            </title>
            <aug>
               <au>
                  <snm>Rocap</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Larimer</snm>
                  <fnm>FW</fnm>
               </au>
               <au>
                  <snm>Lamerdin</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Malfatti</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Chain</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Ahlgren</snm>
                  <fnm>NA</fnm>
               </au>
               <au>
                  <snm>Arellano</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Coleman</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Hauser</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Hess</snm>
                  <fnm>WR</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nature</source>
            <pubdate>2003</pubdate>
            <volume>424</volume>
            <issue>6952</issue>
            <fpage>1042</fpage>
            <lpage>1047</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature01947</pubid>
                  <pubid idtype="pmpid" link="fulltext">12917642</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B45">
            <title>
               <p>Molecular physiology of cAMP signaling cascade in Synechocystis PCC 6803</p>
            </title>
            <aug>
               <au>
                  <snm>Yoshimura</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Yanagisawa</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Hisabori</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Ohmori</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>The 11th International Symposium on Phototrophic Prokaryotes: 2003; Tokyo, Japan</source>
            <pubdate>2003</pubdate>
            <fpage>182</fpage>
         </bibl>
         <bibl id="B46">
            <title>
               <p>A cyclic AMP receptor protein-regulated cell-cell communication system mediates expression of a FecA homologue in Stenotrophomonas maltophilia</p>
            </title>
            <aug>
               <au>
                  <snm>Huang</snm>
                  <fnm>TP</fnm>
               </au>
               <au>
                  <snm>Wong</snm>
                  <fnm>AC</fnm>
               </au>
            </aug>
            <source>Appl Environ Microbiol</source>
            <pubdate>2007</pubdate>
            <volume>73</volume>
            <issue>15</issue>
            <fpage>5034</fpage>
            <lpage>5040</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1951048</pubid>
                  <pubid idtype="pmpid" link="fulltext">17574998</pubid>
                  <pubid idtype="doi">10.1128/AEM.00366-07</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B47">
            <title>
               <p>Large-Scale Turnover of Functional Transcription Factor Binding Sites in Drosophila</p>
            </title>
            <aug>
               <au>
                  <snm>Moses</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Pollard</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Nix</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Iyer</snm>
                  <fnm>VN</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>X-Y</fnm>
               </au>
               <au>
                  <snm>Biggin</snm>
                  <fnm>MD</fnm>
               </au>
               <au>
                  <snm>Eisen</snm>
                  <fnm>MB</fnm>
               </au>
            </aug>
            <source>PLoS Computational Biology</source>
            <pubdate>2006</pubdate>
            <volume>2</volume>
            <issue>10</issue>
            <fpage>e130</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1599766</pubid>
                  <pubid idtype="pmpid" link="fulltext">17040121</pubid>
                  <pubid idtype="doi">10.1371/journal.pcbi.0020130</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B48">
            <title>
               <p>Predicting expression patterns from regulatory sequence in Drosophila segmentation</p>
            </title>
            <aug>
               <au>
                  <snm>Segal</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Raveh-Sadka</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Schroeder</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Unnerstall</snm>
                  <fnm>U</fnm>
               </au>
               <au>
                  <snm>Gaul</snm>
                  <fnm>U</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2008</pubdate>
            <volume>451</volume>
            <issue>7178</issue>
            <fpage>535</fpage>
            <lpage>540</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature06496</pubid>
                  <pubid idtype="pmpid" link="fulltext">18172436</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B49">
            <title>
               <p>WebLogo: a sequence logo generator</p>
            </title>
            <aug>
               <au>
                  <snm>Crooks</snm>
                  <fnm>GE</fnm>
               </au>
               <au>
                  <snm>Hon</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Chandonia</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Brenner</snm>
                  <fnm>SE</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2004</pubdate>
            <volume>14</volume>
            <issue>6</issue>
            <fpage>1188</fpage>
            <lpage>1190</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">419797</pubid>
                  <pubid idtype="pmpid" link="fulltext">15173120</pubid>
                  <pubid idtype="doi">10.1101/gr.849004</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>

