<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2105-9-395</ui>
   <ji>1471-2105</ji>
   <fm>
      <dochead>Methodology article</dochead>
      <bibl>
         <title>
            <p>GBNet: Deciphering regulatory rules in the co-regulated genes using a Gibbs sampler enhanced Bayesian network approach</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Shen</snm>
               <fnm>Li</fnm>
               <insr iid="I1"/>
               <email>shen@ucsd.edu</email>
            </au>
            <au id="A2">
               <snm>Liu</snm>
               <fnm>Jie</fnm>
               <insr iid="I1"/>
               <email>jliu@mccammon.ucsd.edu</email>
            </au>
            <au id="A3" ca="yes">
               <snm>Wang</snm>
               <fnm>Wei</fnm>
               <insr iid="I1"/>
               <email>wei-wang@ucsd.edu</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Department of Chemistry and Biochemistry, University of California, San Diego, California, USA</p>
            </ins>
         </insg>
         <source>BMC Bioinformatics</source>
         <issn>1471-2105</issn>
         <pubdate>2008</pubdate>
         <volume>9</volume>
         <issue>1</issue>
         <fpage>395</fpage>
         <url>http://www.biomedcentral.com/1471-2105/9/395</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">18811979</pubid>
               <pubid idtype="doi">10.1186/1471-2105-9-395</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>08</day>
               <month>7</month>
               <year>2008</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>24</day>
               <month>9</month>
               <year>2008</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>24</day>
               <month>9</month>
               <year>2008</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2008</year>
         <collab>Shen et al; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Combinatorial regulation by transcription factors (TFs) is important in determining complex gene expression patterns, particularly in higher organisms. Deciphering the regulatory rules between cooperative TFs is a critical step towards understanding the mechanisms of combinatorial regulation.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>We present a Bayesian network approach called GBNet to search for DNA motifs that may cooperate in transcriptional regulation and for the sequence constraints that these motifs may satisfy. GBNet outperformed the other available methods on both simulated and yeast data. We also demonstrated the usefulness of GBNet for learning regulatory rules between YY1, a human TF, and its co-factors. Most of the rules learned by GBNet for YY1 and its co-factors were supported by the literature. In addition, a spacing constraint between YY1 and E2F was supported by independent TF binding experiments.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>We thus conclude that GBNet is a useful tool for deciphering the "grammar" of transcriptional regulation.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="bmc" subtype="user_supplied_xml" id="endnote"/>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>Decoding regulatory interactions between transcription factors (TFs) and their target genes is critical for understanding the complex gene expression patterns that arise in response to extra- or intra-cellular signals. Many computational methods have been developed to identify the cis-regulatory elements recognized by TFs <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. These DNA motifs have also been determined by experimental measurements <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>. The accumulation of known TF motifs makes it possible to address a more challenging question: understanding the combinatorial regulation of TFs and deciphering the rules by which TFs cooperate with each other, which is particularly important for studying transcriptional regulation in higher organisms <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>.</p>
         <p>Previous efforts have mainly focused on inferring which TFs may function together <abbrgrp><abbr bid="B4">4</abbr><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr></abbrgrp>. However, these studies cannot reveal the mechanisms of combinatorial regulation, namely whether a TF motif has a positional preference relative to the transcription start site (TSS) or whether the order of two motifs matters for their cooperation. The importance of such regulatory "grammar" has been observed in numerous studies. For example, the position of the binding site of the repressor Giant relative to those of the Gal4 activators determined transcription of a reporter gene in the embryo of <it>Drosophila melanogaster </it><abbrgrp><abbr bid="B14">14</abbr></abbrgrp>.</p>
         <p>Searching for the sequence constraints between TF motifs is a difficult task. As we will see below, simply enumerating all possible sequence features and conducting a statistical test to evaluate the significance of each is computationally expensive for even a modest number of candidate motifs. Alternative methods are thus needed to tackle this problem. Recently, Elemento <it>et al</it>. developed a motif finding algorithm called FIRE that can predict various sequence constraints of motifs and whether pairs of motifs interact with each other <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>. FIRE is based on the mutual information between the sequence features of interest and gene expression. However, because Elemento <it>et al</it>. emphasized removing false positives, the relatively small number of predicted motifs implies that some true motifs might be missed. For example, only 17 DNA motifs and 6 RNA motifs were identified from 78 clusters generated from 173 microarray experiments in yeast <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>. In addition, FIRE only considers consensus sequences with mismatches, which is a relatively simple representation of motifs compared with position weight matrices. More importantly, FIRE cannot consider the joint effects of multiple rules. FIRE tests the rules individually, and the computational cost of enumerating all possible combinations of rules would be too high (see below). Therefore, synergy between rules, for example, cannot be detected by FIRE.</p>
         <p>In the present study, we adopted a Bayesian network approach to identify regulatory grammars because a Bayesian network explicitly models the nonlinear relationship between sequence rules. Our goal is to find enriched constraints for DNA motifs, such as the spacing between TF binding sites and the positional bias of a TF site relative to the TSS, in a group of sequences, often the promoter sequences of a set of co-regulated genes. This can be considered a generalization of motif finding algorithms. It is important to emphasize that we do not aim to predict gene expression from sequence, which is the goal of studies such as those of Beer and Tavazoie <abbrgrp><abbr bid="B16">16</abbr></abbrgrp> and Yuan <it>et al</it>. <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>.</p>
         <p>We implemented a Gibbs sampling procedure to search for the optimal Bayesian network structure. We call our method GBNet, <b><ul>G</ul></b>ibbs sampler enhanced <b><ul>B</ul></b>ayesian <b><ul>Net</ul></b>work, and the software is available at <url>http://modem.ucsd.edu/shenli/gbnet.tgz</url>. To demonstrate the strength of our search strategy, we compared the performance of GBNet with that of BBNet, in which a greedy search algorithm is used to find the optimal Bayesian network. We applied both methods to simulated data as well as to yeast and human data. The results showed that Gibbs sampling performs much better than greedy search in finding sequence constraints between cooperative TFs. We also demonstrated that numerous sequence features identified by GBNet for the human transcription factor YY1 were supported by the literature and by experimental evidence.</p>
      </sec>
      <sec>
         <st>
            <p>Results</p>
         </st>
         <sec>
            <st>
               <p>GBNet: a Gibbs-sampler enhanced Bayesian network</p>
            </st>
            <p>Uncovering the transcriptional grammar in a group of genes exhibiting similar expression patterns may reveal the mechanisms of combinatorial regulation by transcription factors. We adopted a Bayesian network to model the non-linear regulatory relationship between sequence features and gene expression (Fig. <figr fid="F1">1</figr>). The structure of the Bayesian network represents the grammar (regulatory rules) of cis-regulation. Our aim is to maximize the posterior probability of the network structure given the data, i.e. the Bayesian score of Eq. (1) (see Methods).</p>
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>Searching for grammar of combinatorial regulation between transcription factors using a Bayesian network approach</p>
               </caption>
               <text>
                  <p>
                     <b>Searching for grammar of combinatorial regulation between transcription factors using a Bayesian network approach.</b>
                  </p>
               </text>
               <graphic file="1471-2105-9-395-1"/>
            </fig>
            <p>Because the number of sequence features grows exponentially with the number of candidate motifs, searching for a set of optimal sequence features is not trivial. We employed a Gibbs sampler enhanced global search strategy to tackle this problem (see Methods for details). Six sequence features as defined in <abbrgrp><abbr bid="B16">16</abbr></abbrgrp> were considered: presence of a motif, distance from the transcription start site (TSS), spacing between two motifs, orientation of a motif, presence of a second copy of a motif, and order between two motifs. GBNet can therefore be considered a generalization of sequence motif finding: instead of searching for enriched consensus motifs, it searches for enriched combinations of motifs that satisfy specific constraints.</p>
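            <p>The six feature types can be pictured as simple predicates over the motif occurrences found in one promoter. The following Python sketch is illustrative only and is not taken from the GBNet software: the Hit record, the function names, and the exact definitions (for instance, measuring spacing between site start positions) are assumptions made for this example.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One motif occurrence in a promoter (hypothetical representation)."""
    motif: str
    pos: int     # distance upstream of the TSS, in bp
    strand: str  # "+" or "-"


def positions(hits, motif):
    """Positions of all occurrences of one motif."""
    return [h.pos for h in hits if h.motif == motif]


def has_motif(hits, motif):
    """Presence of a motif."""
    return len(positions(hits, motif)) >= 1


def within_tss(hits, motif, max_dist):
    """Distance from the TSS: some occurrence lies within max_dist bp."""
    return any(max_dist >= p for p in positions(hits, motif))


def spacing_within(hits, motif_a, motif_b, max_gap):
    """Spacing between two motifs: some pair lies within max_gap bp."""
    return any(max_gap >= abs(a - b)
               for a in positions(hits, motif_a)
               for b in positions(hits, motif_b))


def on_strand(hits, motif, strand):
    """Orientation of a motif: some occurrence lies on the given strand."""
    return any(h.motif == motif and h.strand == strand for h in hits)


def has_second_copy(hits, motif):
    """Presence of a second copy of a motif."""
    return len(positions(hits, motif)) >= 2


def is_upstream_of(hits, motif_a, motif_b):
    """Order between two motifs: some motif_a site is farther from the TSS."""
    return any(a > b
               for a in positions(hits, motif_a)
               for b in positions(hits, motif_b))


# Toy promoter with one PAC site and one RRPE site (made-up positions).
promoter = [Hit("PAC", 100, "+"), Hit("RRPE", 130, "-")]
print(spacing_within(promoter, "PAC", "RRPE", 40))
```

            <p>Each predicate yields one binary value per promoter, which is the kind of input a parent node of the Bayesian network could take; how GBNet actually encodes the features is described in Methods and in <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>.</p>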
         </sec>
         <sec>
            <st>
               <p>Validation of GBNet on simulated data</p>
            </st>
            <p>We first validated the performance of GBNet using simulated data. To keep the sequences as natural as possible, we took the sequences of the 114 promoters in the fourth yeast cluster of <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. We then implanted a spacing constraint between two yeast motifs (a distance of less than 40 bp between the PAC and RRPE motifs) in a fraction of the genes ranging from 40% to 80% in steps of 10%. The original instances of PAC and RRPE were removed. These simulated sequences and the weight matrices of the 666 yeast motifs taken from <abbrgrp><abbr bid="B16">16</abbr></abbrgrp> were input to GBNet to identify enriched sequence constraints between these motifs. We used the same 1789 background sequences as in <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>.</p>
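            <p>The construction of such a dataset can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used to generate the data in this study: the function names are hypothetical, the two sites are fixed placeholder strings rather than samples from the PAC and RRPE weight matrices, the gap is counted from the end of the first site to the start of the second, and the removal of pre-existing motif instances is omitted.</p>

```python
import random


def implant_pair(seq, site_a, site_b, max_gap, rng):
    """Overwrite two sites in seq, separated by fewer than max_gap bp."""
    if len(site_a) + len(site_b) + max_gap > len(seq):
        raise ValueError("sequence too short for the requested spacing")
    gap = rng.randrange(max_gap)  # 0 .. max_gap - 1 bp between the sites
    used = len(site_a) + gap + len(site_b)
    start = rng.randrange(len(seq) - used + 1)
    second = start + len(site_a) + gap
    return (seq[:start] + site_a + seq[start + len(site_a):second]
            + site_b + seq[second + len(site_b):])


def simulate(promoters, fraction, site_a, site_b, max_gap, rng):
    """Implant the spacing rule in a given fraction of the promoters."""
    n_implant = round(fraction * len(promoters))
    chosen = set(rng.sample(range(len(promoters)), n_implant))
    return [implant_pair(s, site_a, site_b, max_gap, rng) if i in chosen else s
            for i, s in enumerate(promoters)]


# Toy background of ten identical sequences; the two site strings are
# placeholders chosen for this example only.
rng = random.Random(0)
toy_promoters = ["C" * 200 for _ in range(10)]
simulated = simulate(toy_promoters, 0.4, "GATGAG", "AAATTT", 40, rng)
print(sum(s != "C" * 200 for s in simulated), "of 10 promoters carry the rule")
```

            <p>Overwriting bases rather than inserting them keeps every promoter at its original length, so positions relative to the TSS remain comparable across sequences.</p>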
            <p>GBNet successfully learned the implanted spacing rule in all five simulated datasets, whereas BBNet learned it in none of them (Table <tblr tid="T1">1</tblr>). For example, when 40% of the genes contained the spacing rule, the single motifs PAC and RRPE were ranked 2nd and 5th, respectively, among all motifs under consideration. BBNet learned only the presence of PAC, whereas GBNet still found the spacing constraint between the PAC and RRPE motifs. In all five datasets, the rules found by GBNet gave better Bayesian scores than those found by BBNet (Table <tblr tid="T1">1</tblr>), which suggests a better fit to the data. It is important to point out that the Bayesian networks learned by GBNet were not necessarily more complex than those learned by BBNet. On the datasets in which 60% and 70% of the genes contained the spacing rule, GBNet even produced Bayesian networks with fewer rules than BBNet (Table <tblr tid="T1">1</tblr>). This analysis showed that GBNet outperforms BBNet in searching for the best Bayesian network structure, even for such simple simulated data with only one implanted sequence constraint.</p>
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>Sequence constraints learned by BBNet and GBNet in the five simulated datasets</p>
               </caption>
               <tblbdy cols="6">
                  <r>
                     <c ca="left">
                        <p>Perc<sup>a</sup></p>
                     </c>
                     <c ca="left">
                        <p>Rank<sup>b</sup></p>
                     </c>
                     <c cspan="2" ca="left">
                        <p>BBNet</p>
                     </c>
                     <c cspan="2" ca="left">
                        <p>GBNet</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c cspan="4">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>BS</p>
                     </c>
                     <c ca="left">
                        <p>Rules</p>
                     </c>
                     <c ca="left">
                        <p>BS</p>
                     </c>
                     <c ca="left">
                        <p>Rules</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>0.4</p>
                     </c>
                     <c ca="left">
                        <p>2,5</p>
                     </c>
                     <c ca="left">
                        <p>-130.49</p>
                     </c>
                     <c ca="left">
                        <p>1. Distance to TSS of M604:180</p>
                        <p>2. Presence of PAC</p>
                        <p>3. Presence of M599</p>
                     </c>
                     <c ca="left">
                        <p>-123.48</p>
                     </c>
                     <c ca="left">
                        <p>1. Distance to TSS of M604:140</p>
                        <p>2. <b>Distance between RRPE and PAC:40</b></p>
                        <p>3. Distance to TSS of M599:160</p>
                        <p>4. Presence of M593</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>0.5</p>
                     </c>
                     <c ca="left">
                        <p>1,3</p>
                     </c>
                     <c ca="left">
                        <p>-120.03</p>
                     </c>
                     <c ca="left">
                        <p>1. Presence of PAC</p>
                        <p>2. Distance to TSS of M604:180</p>
                        <p>3. Presence of M599</p>
                     </c>
                     <c ca="left">
                        <p>-109.28</p>
                     </c>
                     <c ca="left">
                        <p>1. Distance to TSS of M604:200</p>
                        <p>2. <b>Distance between RRPE and PAC:40</b></p>
                        <p>3. Distance to TSS of M599:480</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>0.6</p>
                     </c>
                     <c ca="left">
                        <p>1,3</p>
                     </c>
                     <c ca="left">
                        <p>-114.19</p>
                     </c>
                     <c ca="left">
                        <p>1. Presence of PAC</p>
                        <p>2. Distance to TSS of M604:180</p>
                        <p>3. Presence of RRPE</p>
                        <p>4. Presence of M599</p>
                     </c>
                     <c ca="left">
                        <p>-102.11</p>
                     </c>
                     <c ca="left">
                        <p>1. <b>Distance between PAC and RRPE:40</b></p>
                        <p>2. Distance to TSS of M604:140</p>
                        <p>3. Distance between M604 and M599:500</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>0.7</p>
                     </c>
                     <c ca="left">
                        <p>1,2</p>
                     </c>
                     <c ca="left">
                        <p>-102.68</p>
                     </c>
                     <c ca="left">
                        <p>1. Presence of PAC</p>
                        <p>2. Presence of RRPE</p>
                        <p>3. Distance to TSS of M604:140</p>
                     </c>
                     <c ca="left">
                        <p>-91.18</p>
                     </c>
                     <c ca="left">
                        <p>1. <b>Distance between PAC and RRPE:40</b></p>
                        <p>2. Distance to TSS of M604:140</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>0.8</p>
                     </c>
                     <c ca="left">
                        <p>1,2</p>
                     </c>
                     <c ca="left">
                        <p>-85.85</p>
                     </c>
                     <c ca="left">
                        <p>1. Presence of PAC</p>
                        <p>2. Presence of RRPE</p>
                        <p>3. Distance to TSS of M604:140</p>
                     </c>
                     <c ca="left">
                        <p>-70.13</p>
                     </c>
                     <c ca="left">
                        <p>1. <b>Distance between PAC and RRPE:40</b></p>
                        <p>2. Distance between M604 and M599:340</p>
                        <p>3. Distance to TSS of M604:140</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>Motifs labeled MXXX are AlignACE motif matrices taken from <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. BS: Bayesian score.</p>
                  <p><sup>a</sup>The fraction of sequences satisfying the spacing rule between the PAC and RRPE motifs, ranging from 0.4 to 0.8. <sup>b</sup>The single-motif ranks of PAC and RRPE in each dataset: the first is PAC and the second is RRPE.</p>
               </tblfn>
            </tbl>
         </sec>
         <sec>
            <st>
               <p>Validation of GBNet on the PAC and RRPE example</p>
            </st>
            <p>We then validated the performance of GBNet using a real dataset. We took the fourth yeast cluster of <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>, in which Beer and Tavazoie found two regulatory rules for PAC and RRPE: 1. PAC (M600) is within 140 bp of ATG; 2. RRPE (M602) is within 240 bp of ATG. When both rules were satisfied, the genes containing the two motifs showed highly correlated expression patterns across a variety of conditions. When neither of these rules was satisfied, the gene expression patterns were indistinguishable from the background <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>.</p>
            <p>In <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>, the above two rules were learned from bootstrap samples rather than directly from the original sequences in the fourth yeast cluster. We generated numerous sets of 10 bootstrap samples. BBNet identified the two rules simultaneously in only 1 to 3 out of 10 bootstrap samples, and they were not necessarily the most abundant rules learned by BBNet from these samples. Unlike Beer and Tavazoie, whose goal was to predict gene expression from sequence, we aim to identify sequence constraints between cooperative TF binding motifs in a group of genes with coherent expression patterns. Therefore, bootstrapping is not an option for our purpose. When we applied BBNet and GBNet to the original sequences in the fourth yeast cluster, BBNet correctly found the first rule for PAC but found only the presence of RRPE instead of its distance constraint (Table S1). In contrast, GBNet successfully found both rules.</p>
            <p>To illustrate why GBNet could find the two rules but BBNet could not, we examined each step of the Bayesian network structure learning (Fig. <figr fid="F2">2</figr>). When the search reached a local optimum (state 1 in Fig. <figr fid="F2">2</figr>) with a Bayesian score of -101.3, the network contained two parent nodes (Fig. <figr fid="F2">2</figr>): "distance to ATG of PAC" and "presence of RRPE". Adding the "distance to ATG of RRPE" node would decrease the Bayesian score. Therefore, the greedy search in BBNet stopped without adding this rule, and the search was trapped in the local optimum. In contrast, GBNet attempted a Metropolis jump with an acceptance probability calculated from the difference between the Bayesian scores before and after the jump (see Methods): the closer the two Bayesian scores, the more likely the jump was to be accepted. To further enhance the sampling power, simulated annealing was also employed in GBNet, and multiple iterations were executed until the model converged at a specific temperature. As a result of this search strategy, the "distance to ATG of RRPE" rule was added by GBNet even though the Bayesian score became worse (state 2 in Fig. <figr fid="F2">2</figr>): -103.93 versus -101.3. Next, the Bayesian score was improved to -96.26 by removing the "presence of RRPE" node (state 3 in Fig. <figr fid="F2">2</figr>). The two correct rules were thus found and kept until the end of the search. This example illustrates the advantage of the search strategy implemented in GBNet over the greedy search algorithm in BBNet in avoiding being trapped in a local optimum.</p>
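            <p>The effect of the acceptance step can be illustrated with the three Bayesian scores quoted above. The sketch below uses the textbook Metropolis criterion applied to log-scale scores, which is an assumption made for illustration; the exact acceptance probability used by GBNet is the one given in Methods. The function names and the cooling schedule are hypothetical.</p>

```python
import math
import random


def acceptance_probability(score_old, score_new, temperature):
    """Textbook Metropolis rule for log-scale scores (assumed form)."""
    if score_new >= score_old:
        return 1.0  # improving moves are always accepted
    return math.exp((score_new - score_old) / temperature)


def metropolis_accept(score_old, score_new, temperature, rng):
    """Accept or reject one proposed change to the network."""
    p = acceptance_probability(score_old, score_new, temperature)
    return p > rng.random()


# Bayesian scores quoted in the text for states 1, 2 and 3.
state1, state2, state3 = -101.3, -103.93, -96.26

for temperature in (4.0, 2.0, 1.0):  # hypothetical cooling schedule
    p = acceptance_probability(state1, state2, temperature)
    print(f"T={temperature}: P(accept state 1 to state 2) = {p:.3f}")

# The move from state 2 to state 3 improves the score.
print(acceptance_probability(state2, state3, 1.0))
```

            <p>Under this assumed rule, the worsening move from state 1 to state 2 is accepted with probability exp(-2.63), roughly 0.07, at a temperature of 1, and more often at higher temperatures, whereas a greedy search never takes it. The subsequent move to state 3 improves the score and is always accepted.</p>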
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>An example of the Bayesian network learning procedure in BBNet and GBNet</p>
               </caption>
               <text>
                  <p><b>An example of the Bayesian network learning procedure in BBNet and GBNet.</b> The sequences were taken from the fourth yeast cluster in <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. The magenta line represents the landscape of the Bayesian score (absolute value). The learning steps involving motifs other than PAC and RRPE are omitted for clarity. The parent nodes of the regulatory rules learned in the three key steps are shown on the right.</p>
               </text>
               <graphic file="1471-2105-9-395-2"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>Applying GBNet to the 49 yeast clusters</p>
            </st>
            <p>The above analyses suggested that GBNet can find the rules of combinatorial regulation between TFs. For a large-scale comparison between GBNet and BBNet, we then applied both to the 49 yeast clusters of 2770 genes in <abbrgrp><abbr bid="B16">16</abbr></abbrgrp> (Table S1). We compared GBNet and BBNet on the following aspects using the original data without bootstrap sampling.</p>
            <sec>
               <st>
                  <p>GBNet fits better models to the data than BBNet</p>
               </st>
               <p>A Bayesian score reflects how well a model fits the data. The rules learned by GBNet gave better Bayesian scores than those learned by BBNet in 47 clusters (Table S1 in Additional file <supplr sid="S1">1</supplr>). The sum of the Bayesian scores over all 49 clusters is -4394.3 for BBNet and -4306.6 for GBNet. On average, GBNet improved the Bayesian score by ~1.8 per cluster relative to BBNet. Again, the Bayesian networks learned by GBNet are not substantially more complex than those learned by BBNet, as shown by the average number of rules per cluster: 2.1 for BBNet vs. 2.3 for GBNet.</p>
               <suppl id="S1">
                  <title>
                     <p>Additional file 1</p>
                  </title>
                  <text>
                     <p>
                        <b>Microsoft Excel file containing results of Bayesian score, rules, etc. of 49 yeast clusters from both BBNet and GBNet.</b>
                     </p>
                  </text>
                  <file name="1471-2105-9-395-S1.xls">
                     <p>Click here for file</p>
                  </file>
               </suppl>
            </sec>
            <sec>
               <st>
                  <p>GBNet finds more biologically interesting rules</p>
               </st>
               <p>From the 49 yeast clusters, BBNet and GBNet learned 105 and 112 regulatory rules in total, respectively. Consistent with the observation in <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>, most (100, or 95%) of the regulatory rules learned by BBNet were simply "presence of a motif", which could also be learned by any motif finding algorithm. Because the search started with "presence of a motif" and BBNet is easily trapped in local optima, it is not surprising that other types of sequence constraints were underrepresented. Although "presence of a motif" still accounts for the majority of the rules learned by GBNet, its share is only 73% (82/112), and the proportion of other types of constraints increased substantially (Fig. <figr fid="F3">3</figr>). Finding rules other than presence distinguishes GBNet from other motif finding algorithms. This feature is particularly important for studying combinatorial regulation in higher organisms such as human.</p>
               <fig id="F3">
                  <title>
                     <p>Figure 3</p>
                  </title>
                  <caption>
                     <p>The number of different types of regulatory rules learned by BBNet and GBNet</p>
                  </caption>
                  <text>
                     <p>
                        <b>The number of different types of regulatory rules learned by BBNet and GBNet.</b>
                     </p>
                  </text>
                  <graphic file="1471-2105-9-395-3"/>
               </fig>
            </sec>
            <sec>
               <st>
                  <p>GBNet searches more thoroughly in the rule space</p>
               </st>
               <p>To further demonstrate that GBNet is less prone to being trapped in local optima, we examined the ranks of the single motifs that appear in the rules learned by GBNet and BBNet. In the search for the optimal Bayesian network, all motifs under consideration were first sorted in descending order of their individual Bayesian scores, which reflect how well an individual motif can explain the data. The motifs were then added to the Bayesian network in this order to speed up convergence of the search. Therefore, it is not unexpected that a large portion (43%) of the motifs present in the rules learned by BBNet had the highest individual ranks (Fig. <figr fid="F4">4</figr>). By comparison, GBNet found rules involving motifs that gave lower Bayesian scores when considered individually (lower individual ranks) but higher (better) Bayesian scores when considered together under specific sequence constraints (Fig. <figr fid="F4">4</figr>).</p>
               <fig id="F4">
                  <title>
                     <p>Figure 4</p>
                  </title>
                  <caption>
                     <p>Distribution of motif ranks in BBNet and GBNet</p>
                  </caption>
                  <text>
                     <p>
                        <b>Distribution of motif ranks in BBNet and GBNet. Ties are in orange.</b>
                     </p>
                  </text>
                  <graphic file="1471-2105-9-395-4"/>
               </fig>
            </sec>
         </sec>
         <sec>
            <st>
               <p>GBNet searches much more efficiently in the rule space than enumeration</p>
            </st>
            <p>An alternative to developing models like GBNet for learning sequence constraints is to enumerate all possible rules and select the best-scoring ones, using either the Bayesian score or mutual information as in FIRE. However, the number of possible combinations of rules is so large that this straightforward approach is computationally prohibitive. For example, to learn spacing constraints among 50 candidate motifs in yeast, one needs to consider 19 function depths (0.05&#8211;0.95 with an interval of 0.05) for each motif and 30 possible distances between two motifs (20&#8211;600 bp with an interval of 20 bp). In total, (50 &#215; 49/2) &#215; 19 &#215; 19 &#215; 30 = 13,266,750 statistical tests have to be computed to learn a single spacing constraint. For a distance constraint from the TSS (positional bias), 50 &#215; 19 &#215; 30 = 28,500 tests need to be performed. Therefore, when the combination of these two rules (spacing constraint and positional bias) is considered, there are 13,266,750 &#215; 28,500 = 3.78 &#215; 10<sup>11</sup> possibilities in total, which makes enumeration infeasible. By comparison, GBNet computed on average 570,000 Bayesian scores per cluster while considering combinations of all six types of constraints. The running time of GBNet was 3 to 4 hours per cluster on a desktop computer with a 1.8 GHz CPU; at the same rate per score, enumeration would take more than 1.99 &#215; 10<sup>6</sup> to 2.65 &#215; 10<sup>6</sup> hours per cluster for the two types of rules mentioned above alone. The efficiency difference between GBNet and enumeration becomes more pronounced when more candidate motifs are considered for learning sequence constraints, because GBNet scales better than linearly with the number of candidate motifs (Table S2). In contrast, the computational cost of enumeration is polynomial in the number of candidate motifs when considering one rule and exponential when considering combinations of rules.</p>
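            <p>The counts in this comparison can be reproduced with a few lines of Python. The variable and function names are introduced here for illustration only, and the running-time estimate assumes, as the comparison above does, that one enumerated test costs about as much as one Bayesian score computed by GBNet.</p>

```python
from math import comb

N_MOTIFS = 50
N_DEPTHS = len(range(5, 100, 5))       # function depths 0.05 to 0.95, step 0.05
N_DISTANCES = len(range(20, 601, 20))  # 20 to 600 bp, step 20 bp
GBNET_SCORES_PER_CLUSTER = 570_000     # average reported in the text

# One spacing rule: a motif pair, a depth for each motif, and a distance.
spacing_tests = comb(N_MOTIFS, 2) * N_DEPTHS ** 2 * N_DISTANCES
# One positional-bias rule: a motif, a depth, and a distance to the TSS.
position_tests = N_MOTIFS * N_DEPTHS * N_DISTANCES
# Every pairing of one spacing rule with one positional-bias rule.
combined_tests = spacing_tests * position_tests


def enumeration_hours(gbnet_hours):
    """Scale the GBNet run time by the ratio of scores to be computed."""
    return combined_tests / GBNET_SCORES_PER_CLUSTER * gbnet_hours


print(spacing_tests, position_tests, combined_tests)
print(enumeration_hours(3), enumeration_hours(4))
```

            <p>Adding a third rule type would multiply the combined count by a further factor of comparable size, which is the exponential growth referred to above.</p>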
         </sec>
         <sec>
            <st>
               <p>Dissecting transcriptional regulatory rules of a human transcription factor YY1</p>
            </st>
            <p>Combinatorial regulation is much more prevalent in higher organisms than in yeast. To demonstrate the usefulness of GBNet, we applied it to study transcriptional regulation by the human transcription factor (TF) YY1, which plays essential roles during development <abbrgrp><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr><abbr bid="B20">20</abbr></abbrgrp>. For comparison, we also analyzed this human dataset using BBNet. Our previous study showed that YY1 mainly binds to the 1.5 kbp regions around the transcription start site <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>. Therefore, we focused on searching for sequence constraints between YY1 and its cofactors in the proximal promoters.</p>
            <sec>
               <st>
                  <p>Identifying YY1 target genes and clustering of gene expression profiles</p>
               </st>
               <p>ChIP-chip analysis of YY1 binding has been conducted using a whole-genome promoter array in human HeLa cells <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>. We used a Gibbs sampler-based computational algorithm, called <b>GI</b>bbs sampler for finding <b>T</b>ranscription factor <b>TAR</b>get genes (GITTAR) <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>, which integrates sequence motif and ChIP-chip binding information, to identify a set of confident YY1 target genes. The intuition behind the GITTAR algorithm is the same as that of MODEM <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>, namely that genes containing the YY1 motif and showing a significant ChIP-chip ratio are likely to be YY1 targets. GITTAR identified 968 such genes, and the average of their log2 ratios is 2.47 &#177; 0.70, which deviates significantly from the background (0.23 &#177; 0.65). A 12-bp motif defined by GITTAR (Fig. S1 in Additional file <supplr sid="S2">2</supplr>) was used in the following analyses.</p>
               <suppl id="S2">
                  <title>
                     <p>Additional file 2</p>
                  </title>
                  <text>
                     <p>
                        <b>Microsoft Word file containing supplemental information of the main article, Fig. S1&#8211;S2 and Table S2&#8211;S4.</b>
                     </p>
                  </text>
                  <file name="1471-2105-9-395-S2.doc">
                     <p>Click here for file</p>
                  </file>
               </suppl>
               <p>Because YY1 can cooperate with various TFs, we used gene expression profiles to define co-regulated subgroups of the YY1 target genes. Su et al. <abbrgrp><abbr bid="B23">23</abbr></abbrgrp> performed microarray experiments in 79 human tissues, and 782 YY1 target genes identified by GITTAR were probed in their arrays. We found five clusters among these YY1 target genes using a hierarchical clustering algorithm <abbrgrp><abbr bid="B24">24</abbr></abbrgrp> (Fig. <figr fid="F5">5</figr>). Clusters H1 to H4 were selected based on a correlation cutoff of 0.60 and a cluster size cutoff of 10 genes. Cluster H5 was manually selected because its members were significantly up-regulated and tightly correlated in testis tissues (correlation = 0.64), even though the average pairwise correlation over all 79 tissues was only 0.33. Cluster H5 represents tissue-specific expression of YY1 targets, which makes its underlying mechanism of transcriptional regulation interesting to examine.</p>
               <fig id="F5">
                  <title>
                     <p>Figure 5</p>
                  </title>
                  <caption>
                     <p>Heatmap of YY1 target gene expression patterns</p>
                  </caption>
                  <text>
                     <p>
                        <b>Heatmap of YY1 target gene expression patterns.</b>
                     </p>
                  </text>
                  <graphic file="1471-2105-9-395-5"/>
               </fig>
            </sec>
            <sec>
               <st>
                  <p>Finding enriched motifs in each cluster</p>
               </st>
               <p>To search for potential YY1 co-factors, we collected 505 motifs of human TFs from the TRANSFAC database (Version 10.2) <abbrgrp><abbr bid="B25">25</abbr></abbrgrp> and examined their enrichment in each of the five clusters against the genes outside the cluster but included in the YY1 ChIP-chip study, using Fisher's exact test <abbrgrp><abbr bid="B26">26</abbr></abbrgrp> based on the hypergeometric distribution (see Methods). In addition, we conducted <it>de novo </it>motif finding using BioProspector <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>. Not surprisingly, the YY1 motif was always ranked at the top in Fisher's test. Numerous other motifs, such as E2Fs, CREB, ELK1 and NFY, were also significantly enriched. In total, a list of 74 motifs was compiled (Table S3) as the candidate motifs for further analysis of combinatorial rules by GBNet and BBNet.</p>
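               <p>A one-sided Fisher's exact test for enrichment is equivalent to the upper tail of the hypergeometric distribution, which can be sketched as follows. The function name, its arguments and the toy numbers are hypothetical and are not taken from the YY1 analysis; whether a one-sided or two-sided test was used here is not stated, so the one-sided form is an assumption.</p>

```python
from math import comb

def enrichment_p_value(k, n, K, N):
    """One-sided hypergeometric tail P(X >= k) for motif enrichment.

    N: genes in the universe (cluster plus comparison genes)
    K: genes in the universe whose promoter contains the motif
    n: genes in the cluster
    k: cluster genes whose promoter contains the motif
    """
    upper = min(n, K)
    # Sum exact integer counts first, then divide once to limit rounding error.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)

# Toy numbers for illustration only (not from the YY1 data).
p = enrichment_p_value(k=8, n=10, K=20, N=100)
print(f"{p:.3g}")
```

               <p>A small value indicates that the motif occurs in the cluster more often than expected when genes are drawn at random from the comparison set.</p>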
            </sec>
            <sec>
               <st>
                  <p>Learning combinatorial regulation between YY1 and its co-factors</p>
               </st>
               <p>The regulatory rules learned by GBNet and BBNet for all five clusters, along with their P-values and Bayesian scores, are listed in Table <tblr tid="T2">2</tblr>. Consistent with the observations in the simulated data and the yeast clusters, GBNet found sequence constraints between cooperative TFs in every cluster, while BBNet only learned the presence of motifs, which can also be found by other means. The GBNet rules also achieved higher Bayesian scores than the BBNet presence rules, which suggests a better fit to the data. Again, the Bayesian networks learned by GBNet are not more complex than those learned by BBNet. In cluster H3, GBNet gave a Bayesian network with one fewer rule than BBNet, yet its Bayesian score is 10.0 higher. Two spacing constraints were found in H3: YY1-E2F and E2F-ELK1. We examined the gene expression pairwise correlation (PC) of the target genes of the two spacing constraints. While the E2F-ELK1 pair only marginally raises the PC, the YY1-E2F pair significantly improves the PC compared with the background (Fig. <figr fid="F6">6</figr>). This shows that the YY1-E2F pair is much more specific than the E2F-ELK1 pair in regulating the transcriptional levels of their target genes. Finally, combining the two spacing constraints gives the highest PC (Fig. <figr fid="F6">6</figr>).</p>
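               <p>The pairwise correlation analysis can be illustrated with the minimal sketch below, which collects the correlation of every pair of expression profiles in a gene set. It assumes that PC denotes the Pearson correlation between expression profiles across tissues, which the text does not state explicitly; the function names and the toy profiles are hypothetical.</p>

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (assumed non-constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def pairwise_correlations(profiles):
    """Correlation of every unordered pair of expression profiles."""
    return [pearson(x, y) for x, y in combinations(profiles, 2)]

# Toy expression profiles (three genes, three tissues), for illustration only.
toy_profiles = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 2.0, 1.0]]
pcs = pairwise_correlations(toy_profiles)
print(pcs, sum(pcs) / len(pcs))
```

               <p>Comparing the resulting distribution for the target genes of a rule with that of a background gene set gives the kind of comparison described above.</p>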
               <tbl id="T2">
                  <title>
                     <p>Table 2</p>
                  </title>
                  <caption>
                     <p>Sequence constraints learned by BBNet and GBNet in the five human YY1 clusters. The functional depth for each motif is in parentheses.</p>
                  </caption>
                  <tblbdy cols="5">
                     <r>
                        <c ca="left">
                           <p>Cluster</p>
                        </c>
                        <c cspan="2" ca="left">
                           <p>BBNet</p>
                        </c>
                        <c cspan="2" ca="left">
                           <p>GBNet</p>
                        </c>
                     </r>
                     <r>
                        <c>
                           <p/>
                        </c>
                        <c cspan="4">
                           <hr/>
                        </c>
                     </r>
                     <r>
                        <c>
                           <p/>
                        </c>
                        <c ca="left">
                           <p>Rules, P-value</p>
                        </c>
                        <c ca="left">
                           <p>Bayesian Score</p>
                        </c>
                        <c ca="left">
                           <p>Rules, P-value</p>
                        </c>
                        <c ca="left">
                           <p>Bayesian Score</p>
                        </c>
                     </r>
                     <r>
                        <c cspan="5">
                           <hr/>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>H1</p>
                        </c>
                        <c ca="left">
                           <p>Presence of YY1 (0.02), 4.05E-12</p>
                        </c>
                        <c ca="left">
                           <p>-18.14</p>
                        </c>
                        <c ca="left">
                           <p>Distance between ETS (0.01) and YY1 (0.02): 120 bp, 5.32E-13</p>
                        </c>
                        <c ca="left">
                           <p>-17.23</p>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>H2</p>
                        </c>
                        <c ca="left">
                           <p>Presence of YY1 (0.01), 5.43E-10</p>
                        </c>
                        <c ca="left">
                           <p>-22.64</p>
                        </c>
                        <c ca="left">
                           <p>Distance between WT1 (0.02) and YY1 (0.01): 40 bp, 1.09E-10</p>
                        </c>
                        <c ca="left">
                           <p>-21.92</p>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>H3</p>
                        </c>
                        <c ca="left">
                           <p>Presence of YY1 (0.01), 3.94E-114</p>
                           <p>Presence of E2F (0.2), 1.54E-33</p>
                           <p>Presence of ELK1 (0.04), 1.13E-20</p>
                        </c>
                        <c ca="left">
                           <p>-161.70</p>
                        </c>
                        <c ca="left">
                           <p>Distance between YY1 (0.01) and E2F (0.01): 40 bp, 1.67E-121</p>
                           <p>Distance between ELK1 (0.04) and E2F (0.01): 220 bp, 7.64E-26</p>
                        </c>
                        <c ca="left">
                           <p>-151.71</p>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>H4</p>
                        </c>
                        <c ca="left">
                           <p>Presence of YY1 (0.03), 8.64E-6</p>
                        </c>
                        <c ca="left">
                           <p>-20.56</p>
                        </c>
                        <c ca="left">
                           <p>Distance between YY1 (0.01) and E2F1 (0.1): 520 bp, 8.82E-9</p>
                        </c>
                        <c ca="left">
                           <p>-17.71</p>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>H5</p>
                        </c>
                        <c ca="left">
                           <p>Presence of YY1 (0.02), 1.90E-6</p>
                        </c>
                        <c ca="left">
                           <p>-21.39</p>
                        </c>
                        <c ca="left">
                           <p>Distance between YY1 (0.02) and ELK1 (0.01): 160 bp, 9.79E-8</p>
                        </c>
                        <c ca="left">
                           <p>-19.98</p>
                        </c>
                     </r>
                  </tblbdy>
               </tbl>
               <fig id="F6">
                  <title>
                     <p>Figure 6</p>
                  </title>
                  <caption>
                     <p>Gene expression pairwise correlation distribution for target genes of the two spacing constraints found by GBNet on cluster H3</p>
                  </caption>
                  <text>
                     <p>
                        <b>Gene expression pairwise correlation distribution for target genes of the two spacing constraints found by GBNet on cluster H3.</b>
                     </p>
                  </text>
                  <graphic file="1471-2105-9-395-6"/>
               </fig>
            </sec>
            <sec>
               <st>
                  <p>Literature evidence to support the rules learned by GBNet</p>
               </st>
               <p>Most of the rules learned by GBNet are supported by the literature: YY1 and ETS family proteins have been shown to form a complex that helps to maintain the normal function of the human immune system <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>; both YY1 and ETS are required for the transcriptional regulation of a variety of cellular processes such as adipocyte differentiation <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>; YY1 has been shown to physically interact with E2F family proteins for the specificity of E2F function <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>; synergistic cooperation has been observed between YY1 and E2F1 for the transcriptional activity of p73, through a mechanism involving a physical interaction <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>; and two independent groups have verified that YY1 and ELK1 co-ordinate the expression of the SOD gene <abbrgrp><abbr bid="B32">32</abbr><abbr bid="B33">33</abbr></abbrgrp>. In addition, GBNet identified two new cooperative pairs: YY1-WT1 and E2F-ELK1. It is worth pointing out that GBNet specifies how the above TFs cooperate with each other and thus provides specific guidance for experimental validation.</p>
            </sec>
            <sec>
               <st>
                  <p>Independent E2F ChIP-chip experiments support the YY1-E2F distance constraint</p>
               </st>
               <p>Direct evidence supporting the sequence constraints found by GBNet came from a recent study on the binding of E2F family members. Farnham and colleagues conducted ChIP-chip experiments on the E2F family members E2F1, E2F4 and E2F6, using the same promoter array that we used for our YY1 study (see Methods and <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>). They showed that E2F family members mainly bind to promoter regions within 2 kb of the transcription start site (TSS) in HeLa cells and that their binding is interchangeable <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>. Among the 2815 human genes that are targets of any of the three E2F family members, 496 are in common with the 968 GITTAR YY1 targets (P-value &lt; 2.5e-167 based on the hypergeometric distribution).</p>
               <p>To confirm the distance constraint between YY1 and E2F family members, we examined how many of the YY1 and E2F sites that satisfy the distance constraint (within 40 bp) were supported by the E2F ChIP-chip experiments (Fig. <figr fid="F7">7</figr>). Because the probes in the promoter array were not uniformly distributed in each promoter, an E2F site predicted by GBNet may fall into a gap between probes. In addition, the sonicated DNA segments in ChIP-chip experiments had a length of hundreds of base pairs. Therefore, if a predicted E2F site is within a short distance of a probe with a significant binding ratio, it is likely that E2F proteins bind to the predicted site. Among the 170 YY1-E2F motif pairs predicted by GBNet in the cluster H3 genes, 79% were close to (within 300 bp of) a probe with a significant binding ratio of more than 2-fold (Table S4; see Additional file 2 for more details). As a control, 104 genes that contain a YY1 site but do not satisfy the YY1-E2F spacing constraint were selected from the genome. Among these control genes, only 20% contain a probe (within 300 bp) with a significant binding ratio of more than 2-fold. The statistical significance of the difference between the two groups (p-value = 1.4e-22) was evaluated by Fisher's exact test. This suggests that most of the E2F sites predicted by GBNet were bound by E2F proteins, and that the majority of the YY1-E2F distance constraints identified by GBNet were thus supported by the E2F ChIP-chip experiments.</p>
               <fig id="F7">
                  <title>
                     <p>Figure 7</p>
                  </title>
                  <caption>
                     <p>YY1 and E2F pairs predicted by GBNet were confirmed by ChIP-chip experiments</p>
                  </caption>
                  <text>
                     <p><b>YY1 and E2F pairs predicted by GBNet were confirmed by ChIP-chip experiments.</b> 79% of the 170 YY1-E2F pairs constrained by the distance were found to have probes with significant binding ratio change (more than 2-fold) within 300 bps.</p>
                  </text>
                  <graphic file="1471-2105-9-395-7"/>
               </fig>
            </sec>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Discussion and conclusion</p>
         </st>
         <p>Combinatorial regulation by transcription factors is critical in the control of gene expression, particularly in higher organisms. For the purpose of reconstructing transcriptional regulatory networks, understanding the molecular mechanisms and deciphering the grammar of combinatorial regulation are the natural steps after finding the binding sites of TFs <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>. Identifying cooperative TFs and learning sequence constraints between their motifs can provide great insight into building mechanistic and quantitative models of transcriptional regulation <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>.</p>
         <p>We have developed a Bayesian network approach to find regulatory rules enriched in a set of foreground sequences, for example the promoters of a set of co-regulated genes, compared with the background sequences. This method can be applied to any genome, as its success in yeast and human shows. We designed a powerful search strategy for Bayesian network structure learning by employing Gibbs sampling and simulated annealing. Compared with exhaustive enumeration, GBNet can find the optimal rules much more efficiently. The more candidate motifs are under consideration, the greater the savings in computational cost that GBNet achieves over enumeration. Given the improved search strategy, it is not surprising that GBNet outperforms BBNet, which employs a greedy search for the optimal network structure, on all the datasets we have tested, including simulated, yeast and human data.</p>
         <p>In the present study, we focused on the six sequence constraints used in <abbrgrp><abbr bid="B16">16</abbr></abbrgrp> and on analyzing combinatorial regulation in proximal promoters. Other sequence constraints clearly exist, particularly those for the interactions between distal enhancers and promoters and those related to other regulatory elements such as silencers. New approaches are emerging to define all the regulatory elements, for example, using chromatin modification patterns <abbrgrp><abbr bid="B35">35</abbr></abbrgrp> or protein-DNA interaction data to predict enhancers <abbrgrp><abbr bid="B36">36</abbr></abbrgrp>. The accumulation of such knowledge can help GBNet learn rules that involve other regulatory elements or long-range regulatory interactions such as the looping interaction between distal enhancers and promoters. It is straightforward to search for other types of rules with GBNet without significantly increasing the computational cost.</p>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <sec>
            <st>
               <p>GBNet</p>
            </st>
            <p>The fitness of the Bayesian network model can be evaluated by Equation 1 in <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>, which is the posterior probability of the network structure given the data <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. To minimize round-off errors, we use the logarithm of this posterior probability to define the Bayesian score in this paper:</p>
            <p>
               <display-formula id="M1">
                  <m:math name="1471-2105-9-395-i1" xmlns:m="http://www.w3.org/1998/Math/MathML">
                     <m:semantics>
                        <m:mrow>
                           <m:msub>
                              <m:mrow>
                                 <m:mi>log</m:mi>
                                 <m:mo>&#8289;</m:mo>
                              </m:mrow>
                              <m:mrow>
                                 <m:mn>10</m:mn>
                              </m:mrow>
                           </m:msub>
                           <m:mo stretchy="false">(</m:mo>
                           <m:mi>P</m:mi>
                           <m:mo stretchy="false">(</m:mo>
                           <m:msub>
                              <m:mi>N</m:mi>
                              <m:mi>s</m:mi>
                           </m:msub>
                           <m:mo>|</m:mo>
                           <m:mi>D</m:mi>
                           <m:mo stretchy="false">)</m:mo>
                           <m:mo stretchy="false">)</m:mo>
                           <m:mo>=</m:mo>
                           <m:mo>&#8722;</m:mo>
                           <m:msub>
                              <m:mi>N</m:mi>
                              <m:mi>p</m:mi>
                           </m:msub>
                           <m:msub>
                              <m:mrow>
                                 <m:mi>log</m:mi>
                                 <m:mo>&#8289;</m:mo>
                              </m:mrow>
                              <m:mrow>
                                 <m:mn>10</m:mn>
                              </m:mrow>
                           </m:msub>
                           <m:mo stretchy="false">(</m:mo>
                           <m:mi>K</m:mi>
                           <m:mo stretchy="false">)</m:mo>
                           <m:mo>+</m:mo>
                           <m:mstyle displaystyle="true">
                              <m:munderover>
                                 <m:mo>&#8721;</m:mo>
                                 <m:mrow>
                                    <m:mi>j</m:mi>
                                    <m:mo>=</m:mo>
                                    <m:mn>1</m:mn>
                                 </m:mrow>
                                 <m:mi>q</m:mi>
                              </m:munderover>
                              <m:mrow>
                                 <m:msub>
                                    <m:mrow>
                                       <m:mi>log</m:mi>
                                       <m:mo>&#8289;</m:mo>
                                    </m:mrow>
                                    <m:mrow>
                                       <m:mn>10</m:mn>
                                    </m:mrow>
                                 </m:msub>
                                 <m:mfrac>
                                    <m:mrow>
                                       <m:mi>&#915;</m:mi>
                                       <m:mo stretchy="false">(</m:mo>
                                       <m:msub>
                                          <m:mi>a</m:mi>
                                          <m:mi>j</m:mi>
                                       </m:msub>
                                       <m:mo stretchy="false">)</m:mo>
                                    </m:mrow>
                                    <m:mrow>
                                       <m:mi>&#915;</m:mi>
                                       <m:mo stretchy="false">(</m:mo>
                                       <m:msub>
                                          <m:mi>a</m:mi>
                                          <m:mi>j</m:mi>
                                       </m:msub>
                                       <m:mo>+</m:mo>
                                       <m:msub>
                                          <m:mi>N</m:mi>
                                          <m:mi>j</m:mi>
                                       </m:msub>
                                       <m:mo stretchy="false">)</m:mo>
                                    </m:mrow>
                                 </m:mfrac>
                              </m:mrow>
                           </m:mstyle>
                           <m:mstyle displaystyle="true">
                              <m:munderover>
                                 <m:mo>&#8721;</m:mo>
                                 <m:mrow>
                                    <m:mi>k</m:mi>
                                    <m:mo>=</m:mo>
                                    <m:mn>0</m:mn>
                                 </m:mrow>
                                 <m:mi>r</m:mi>
                              </m:munderover>
                              <m:mrow>
                                 <m:msub>
                                    <m:mrow>
                                       <m:mi>log</m:mi>
                                       <m:mo>&#8289;</m:mo>
                                    </m:mrow>
                                    <m:mrow>
                                       <m:mn>10</m:mn>
                                    </m:mrow>
                                 </m:msub>
                                 <m:mfrac>
                                    <m:mrow>
                                       <m:mi>&#915;</m:mi>
                                       <m:mo stretchy="false">(</m:mo>
                                       <m:msub>
                                          <m:mi>a</m:mi>
                                          <m:mrow>
                                             <m:mi>j</m:mi>
                                             <m:mi>k</m:mi>
                                          </m:mrow>
                                       </m:msub>
                                       <m:mo>+</m:mo>
                                       <m:msub>
                                          <m:mi>N</m:mi>
                                          <m:mrow>
                                             <m:mi>j</m:mi>
                                             <m:mi>k</m:mi>
                                          </m:mrow>
                                       </m:msub>
                                       <m:mo stretchy="false">)</m:mo>
                                    </m:mrow>
                                    <m:mrow>
                                       <m:mi>&#915;</m:mi>
                                       <m:mo stretchy="false">(</m:mo>
                                       <m:msub>
                                          <m:mi>a</m:mi>
                                          <m:mrow>
                                             <m:mi>j</m:mi>
                                             <m:mi>k</m:mi>
                                          </m:mrow>
                                       </m:msub>
                                       <m:mo stretchy="false">)</m:mo>
                                    </m:mrow>
                                 </m:mfrac>
                              </m:mrow>
                           </m:mstyle>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aqatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGagiiBaWMaei4Ba8Maei4zaC2aaSbaaSqaaiabigdaXiabicdaWaqabaGccqGGOaakcqWGqbaucqGGOaakcqWGobGtdaWgaaWcbaGaem4CamhabeaakiabcYha8jabdseaejabcMcaPiabcMcaPiabg2da9iabgkHiTiabd6eaonaaBaaaleaacqWGWbaCaeqaaOGagiiBaWMaei4Ba8Maei4zaC2aaSbaaSqaaiabigdaXiabicdaWaqabaGccqGGOaakcqWGlbWscqGGPaqkcqGHRaWkdaaeWbqaaiGbcYgaSjabc+gaVjabcEgaNnaaBaaaleaacqaIXaqmcqaIWaamaeqaaKqbaoaalaaabaGaeu4KdCKaeiikaGIaemyyae2aaSbaaeaacqWGQbGAaeqaaiabcMcaPaqaaiabfo5ahjabcIcaOiabdggaHnaaBaaabaGaemOAaOgabeaacqGHRaWkcqWGobGtdaWgaaqaaiabdQgaQbqabaGaeiykaKcaaaWcbaGaemOAaOMaeyypa0JaeGymaedabaGaemyCaehaniabggHiLdGcdaaeWbqaaiGbcYgaSjabc+gaVjabcEgaNnaaBaaaleaacqaIXaqmcqaIWaamaeqaaKqbaoaalaaabaGaeu4KdCKaeiikaGIaemyyae2aaSbaaeaacqWGQbGAcqWGRbWAaeqaaiabgUcaRiabd6eaonaaBaaabaGaemOAaOMaem4AaSgabeaacqGGPaqkaeaacqqHtoWrcqGGOaakcqWGHbqydaWgaaqaaiabdQgaQjabdUgaRbqabaGaeiykaKcaaaWcbaGaem4AaSMaeyypa0JaeGimaadabaGaemOCaihaniabggHiLdaaaa@8925@</m:annotation>
                     </m:semantics>
                  </m:math>
               </display-formula>
            </p>
            <p>where <it>N</it><sub><it>s </it></sub>is the network structure, <it>D </it>is the data, &#915; (&#183;) is the gamma function, <it>N</it><sub><it>p </it></sub>is the number of parent nodes, log<sub>10</sub>(<it>K</it>) is a network parameter (see below), <it>q </it>is the number of possible parent states, <it>r </it>+ 1 is the number of possible child states, <it>a</it><sub><it>j </it></sub>= &#931; <it>a</it><sub><it>jk</it></sub>, <it>N</it><sub><it>j </it></sub>= &#931; <it>N</it><sub><it>jk</it></sub>, <it>N</it><sub><it>jk </it></sub>is the number of samples with child state <it>k </it>when the parent state is <it>j</it>, and <it>a</it><sub><it>jk </it></sub>is the prior count. In BBNet, a greedy search algorithm was employed to search for the best network structure: the search stopped when adding a new parent node (sequence constraint) could not further improve the Bayesian score. This procedure is prone to becoming trapped in a local optimum. To improve the search, we implemented a Metropolis jump in GBNet each time a parent node (sequence constraint) was added or the functional depth of a motif was updated. In addition, simulated annealing was used to search for the global optimum (see Fig. S2 for a comparison between the two search algorithms). 
A change to the Bayesian network structure was accepted by a probability of <inline-formula><m:math name="1471-2105-9-395-i2" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:mi>min</m:mi><m:mo>&#8289;</m:mo><m:mrow><m:mo>(</m:mo><m:mrow><m:mn>1</m:mn><m:mo>,</m:mo><m:mtext>&#160;</m:mtext><m:msup><m:mrow><m:mrow><m:mo>(</m:mo><m:mrow><m:mfrac><m:mrow><m:mi>P</m:mi><m:mo stretchy="false">(</m:mo><m:msub><m:msup><m:mi>N</m:mi><m:mo>&#8242;</m:mo></m:msup><m:mi>s</m:mi></m:msub><m:mo>|</m:mo><m:mi>D</m:mi><m:mo stretchy="false">)</m:mo></m:mrow><m:mrow><m:mi>P</m:mi><m:mo stretchy="false">(</m:mo><m:msub><m:mi>N</m:mi><m:mi>s</m:mi></m:msub><m:mo>|</m:mo><m:mi>D</m:mi><m:mo stretchy="false">)</m:mo></m:mrow></m:mfrac></m:mrow><m:mo>)</m:mo></m:mrow></m:mrow><m:mrow><m:mfrac><m:mn>1</m:mn><m:mi>T</m:mi></m:mfrac></m:mrow></m:msup></m:mrow><m:mo>)</m:mo></m:mrow></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aqatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGagiyBa0MaeiyAaKMaeiOBa42aaeWaaeaacqaIXaqmcqGGSaalcqqGGaaidaqadaqcfayaamaalaaabaGaemiuaaLaeiikaGIafmOta4KbauaadaWgaaqaaiabdohaZbqabaGaeiiFaWNaemiraqKaeiykaKcabaGaemiuaaLaeiikaGIaemOta40aaSbaaeaacqWGZbWCaeqaaiabcYha8jabdseaejabcMcaPaaaaOGaayjkaiaawMcaamaaCaaaleqajuaGbaWaaSaaaeaacqaIXaqmaeaacqWGubavaaaaaaGccaGLOaGaayzkaaaaaa@498D@</m:annotation></m:semantics></m:math></inline-formula>, where <inline-formula><m:math name="1471-2105-9-395-i3" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:msup><m:mi>N</m:mi><m:mo>&#8242;</m:mo></m:msup><m:mi>s</m:mi></m:msub></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aqatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOta4KbauaadaWgaaWcbaGaem4Camhabeaaaaa@2EA2@</m:annotation></m:semantics></m:math></inline-formula> and <it>N</it><sub><it>s </it></sub>are network structures after and before the change and <it>T </it>is the temperature. In simulated annealing, <it>T </it>was decreased exponentially as <it>T </it>&#8592; &#945;<sup>n</sup><it>T</it>, where <it>n </it>is the number of iterations and &#945; is the rate of change. The searching procedure stopped when there was no change detected at a specific temperature after a number of attempts or a sufficient number of temperature changes had been made. In this work, the initial temperature was set to 5.0 and &#945; = 0.9 for both yeast and human data. The simulation moved to the next temperature if either the number of iterations reached 20 or the number of structure changes reached 500. The maximum number of temperature changes was set to 20. In our tests, GBNet was always able to find the optimum using these parameters.</p>
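            <p>The acceptance rule and cooling schedule described above can be sketched in Python as follows. This is a minimal illustration, not the GBNet implementation. It assumes that scores are available as log<sub>10</sub> posterior probabilities and that the schedule amounts to geometric cooling, with <it>T </it>multiplied by &#945; after each temperature step. The names acceptance_probability, anneal, toy_score and toy_propose are hypothetical, and the stopping rule is simplified to a fixed number of attempts per temperature.</p>
            <p>
```python
import random


def acceptance_probability(log10_new, log10_old, temperature):
    """Return min(1, (P_new / P_old) ** (1 / T)) given log10 posteriors."""
    exponent = (log10_new - log10_old) / temperature
    if exponent >= 0:
        return 1.0
    return 10.0 ** exponent


def anneal(state, propose, log10_score, t0=5.0, alpha=0.9,
           max_temperature_steps=20, attempts_per_temperature=20, seed=0):
    """Simulated annealing with Metropolis acceptance (illustrative only)."""
    rng = random.Random(seed)
    score = log10_score(state)
    best_state, best_score = state, score
    temperature = t0
    for _ in range(max_temperature_steps):
        changed = False
        for _ in range(attempts_per_temperature):
            candidate = propose(state, rng)
            candidate_score = log10_score(candidate)
            p_accept = acceptance_probability(candidate_score, score, temperature)
            if p_accept > rng.random():
                state, score = candidate, candidate_score
                changed = True
                if score > best_score:
                    best_state, best_score = state, score
        if not changed:
            break  # no accepted change at this temperature: stop searching
        temperature *= alpha  # geometric cooling (assumed reading of the schedule)
    return best_state, best_score


# Toy usage: integer "structures" scored by a hypothetical function peaking at 3.
def toy_score(x):
    return -float((x - 3) ** 2)


def toy_propose(x, rng):
    return x + rng.choice((-1, 1))


best_state, best_score = anneal(0, toy_propose, toy_score)
```
            </p>
            <p>In this sketch a move that improves the score is always accepted, whereas a move that lowers it is accepted with a probability that shrinks as the temperature falls. The default arguments mirror the initial temperature of 5.0, &#945; = 0.9 and the maximum of 20 temperature changes stated above.</p>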
            <p>The background sequences were selected as follows. For the yeast data, the same background as Beer and Tavazoie (2004) was used for a fair comparison. For the human YY1 data, the number of background sequences was set to five times the size of the cluster under consideration. This background size was determined heuristically to balance discrimination and statistical significance. All genes were ranked in ascending order of their correlation to the mean expression profile of the cluster, and the least correlated or most anti-correlated genes were selected as the background. The structural parameter, log<sub>10 </sub><it>K </it>in Eq. (1), helps avoid overfitting in learning the Bayesian network structure by penalizing complex network structures <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. The value of log<sub>10 </sub><it>K </it>in <abbrgrp><abbr bid="B16">16</abbr></abbrgrp> was used for the yeast data, and a heuristic value of 5.0 was chosen for the human YY1 data.</p>
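            <p>The background selection for the human data can be illustrated with a short Python sketch. It is a sketch under stated assumptions rather than the code used in this work: the correlation measure is assumed to be the Pearson correlation, cluster members are assumed to be excluded from the candidate background, and the names pearson, select_background and the toy profiles are hypothetical.</p>
            <p>
```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is flat)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_background(profiles, cluster_genes, ratio=5):
    """Pick the genes least correlated with the cluster's mean profile.

    profiles maps gene name to a list of expression values; ratio is the
    background size as a multiple of the cluster size (five in the text).
    """
    cluster = set(cluster_genes)
    members = [profiles[g] for g in cluster]
    mean_profile = [sum(column) / len(members) for column in zip(*members)]
    # Assumption: cluster members themselves are not eligible as background.
    candidates = [g for g in profiles if g not in cluster]
    # Ascending order puts anti-correlated and uncorrelated genes first.
    ranked = sorted(candidates, key=lambda g: pearson(profiles[g], mean_profile))
    return ranked[: ratio * len(cluster)]


# Toy data: two cluster genes and three candidates (values are made up).
profiles = {
    "g1": [1.0, 2.0, 3.0],
    "g2": [2.0, 3.0, 4.0],
    "a": [3.0, 2.0, 1.0],   # anti-correlated with the cluster
    "b": [1.0, 2.0, 3.5],   # strongly correlated with the cluster
    "c": [1.0, 3.0, 2.0],   # weakly correlated with the cluster
}
background = select_background(profiles, ["g1", "g2"], ratio=1)
```
            </p>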
         </sec>
         <sec>
            <st>
               <p>Finding enriched co-factors using Fisher's exact test</p>
            </st>
            <p>Fisher's exact test evaluates the significance of the association between two categorical variables <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>. The test is carried out on a 2 &#215; 2 contingency table. When testing the significance of motif enrichment for a cluster, we constructed the contingency table (Table <tblr tid="T3">3</tblr>) as follows:</p>
            <tbl id="T3">
               <title>
                  <p>Table 3</p>
               </title>
               <caption>
                  <p>Contingency table for testing motif enrichment in a cluster</p>
               </caption>
               <tblbdy cols="4">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>Within-cluster</p>
                     </c>
                     <c ca="center">
                        <p>Outside-cluster</p>
                     </c>
                     <c ca="center">
                        <p>Total</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="4">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Match motif</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>a</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>b</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p><it>a </it>+ <it>b</it></p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Non-match</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>c</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>d</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p><it>c </it>+ <it>d</it></p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Total</p>
                     </c>
                     <c ca="center">
                        <p><it>a </it>+ <it>c</it></p>
                     </c>
                     <c ca="center">
                        <p><it>b </it>+ <it>d</it></p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>n</it>
                        </p>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
            <p>where <it>a</it>, <it>b</it>, <it>c</it>, <it>d </it>are the numbers of genes in each category; <it>n </it>= <it>a</it>+<it>b</it>+<it>c</it>+<it>d </it>is the total number of genes under consideration. The same criterion as described above was used to select candidate motifs.</p>
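            <p>As an illustration of how a <it>p</it>-value can be obtained from the counts <it>a</it>, <it>b</it>, <it>c </it>and <it>d </it>in the table above, the following Python sketch sums the hypergeometric tail directly. It assumes a one-sided test for enrichment, which is not stated explicitly here, and the function name fisher_enrichment_p is hypothetical. It requires Python 3.8 or later for math.comb.</p>
            <p>
```python
from math import comb


def fisher_enrichment_p(a, b, c, d):
    """One-sided Fisher's exact test for enrichment in a 2 x 2 table.

    a: within-cluster genes matching the motif
    b: outside-cluster genes matching the motif
    c: within-cluster genes not matching
    d: outside-cluster genes not matching
    Returns the probability of a or more within-cluster matches when the
    row and column totals are held fixed (hypergeometric null).
    """
    n = a + b + c + d
    matches = a + b          # row total: genes matching the motif
    in_cluster = a + c       # column total: genes within the cluster
    denominator = comb(n, in_cluster)
    numerator = 0
    for x in range(a, min(matches, in_cluster) + 1):
        # x matching genes inside the cluster, the rest of the cluster non-matching
        numerator += comb(matches, x) * comb(n - matches, in_cluster - x)
    return numerator / denominator


# Toy counts: 3 of 4 cluster genes match, 1 of 4 outside genes match.
p_value = fisher_enrichment_p(3, 1, 1, 3)
```
            </p>
            <p>For these toy counts the tail contains two tables (three or four within-cluster matches), giving 17 favourable arrangements out of 70.</p>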
         </sec>
         <sec>
            <st>
               <p>The YY1 ChIP-chip data</p>
            </st>
            <p>The YY1 ChIP-chip data were obtained from our previous study <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>. Briefly, the whole-genome promoter array was designed and synthesized by Nimblegen. The array represented 24134 promoters from human genome build 35 (hg17). A 1500 bp sequence of each promoter (1300 bp upstream and 200 bp downstream of the TSS) was covered by 15 oligonucleotide probes, each 50 bp long. Two replicate experiments were conducted in HeLa cells. Data were collected and processed as described previously <abbrgrp><abbr bid="B37">37</abbr><abbr bid="B38">38</abbr></abbrgrp>.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>LS and WW conceived the experiments and wrote the manuscript. LS carried out the data analysis and implemented the software. JL participated in the data analysis. All authors read and approved the final manuscript.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>We thank Michael Beer, Olivier Elemento and Saeed Tavazoie for answering our questions on the details of their methods. We also thank Andrew McCammon for his support. This work was partially supported by an NIH grant to WW (GM072856). LS was partially supported by a postdoctoral fellowship from La Jolla Interfaces in Science.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Assessing computational tools for the discovery of transcription factor binding sites</p>
            </title>
            <aug>
               <au>
                  <snm>Tompa</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Bailey</snm>
                  <fnm>TL</fnm>
               </au>
               <au>
                  <snm>Church</snm>
                  <fnm>GM</fnm>
               </au>
               <au>
                  <snm>De Moor</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Eskin</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Favorov</snm>
                  <fnm>AV</fnm>
               </au>
               <au>
                  <snm>Frith</snm>
                  <fnm>MC</fnm>
               </au>
               <au>
                  <snm>Fu</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Kent</snm>
                  <fnm>WJ</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nat Biotechnol</source>
            <pubdate>2005</pubdate>
            <volume>23</volume>
            <issue>1</issue>
            <fpage>137</fpage>
            <lpage>144</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nbt1053</pubid>
                  <pubid idtype="pmpid" link="fulltext">15637633</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Rapid analysis of the DNA-binding specificities of transcription factors with DNA microarrays</p>
            </title>
            <aug>
               <au>
                  <snm>Mukherjee</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Berger</snm>
                  <fnm>MF</fnm>
               </au>
               <au>
                  <snm>Jona</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>XS</fnm>
               </au>
               <au>
                  <snm>Muzzey</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Snyder</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Young</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Bulyk</snm>
                  <fnm>ML</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2004</pubdate>
            <volume>36</volume>
            <issue>12</issue>
            <fpage>1331</fpage>
            <lpage>1339</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng1473</pubid>
                  <pubid idtype="pmpid" link="fulltext">15543148</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Transcription regulation and animal diversity</p>
            </title>
            <aug>
               <au>
                  <snm>Levine</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Tjian</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2003</pubdate>
            <volume>424</volume>
            <fpage>147</fpage>
            <lpage>151</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature01763</pubid>
                  <pubid idtype="pmpid" link="fulltext">12853946</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Genome-wide prediction and characterization of interactions between transcription factors in <it>Saccharomyces cerevisiae</it></p>
            </title>
            <aug>
               <au>
                  <snm>Yu</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Lin</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Masuda</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Esumi</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Zack</snm>
                  <fnm>DJ</fnm>
               </au>
               <au>
                  <snm>Qian</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2006</pubdate>
            <volume>34</volume>
            <issue>3</issue>
            <fpage>917</fpage>
            <lpage>927</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1361616</pubid>
                  <pubid idtype="pmpid" link="fulltext">16464824</pubid>
                  <pubid idtype="doi">10.1093/nar/gkj487</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Computational analysis of tissue-specific combinatorial gene regulation: predicting interaction between transcription factors in human tissues</p>
            </title>
            <aug>
               <au>
                  <snm>Yu</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Lin</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Zack</snm>
                  <fnm>DJ</fnm>
               </au>
               <au>
                  <snm>Qian</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2006</pubdate>
            <volume>34</volume>
            <issue>17</issue>
            <fpage>4925</fpage>
            <lpage>4936</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1635265</pubid>
                  <pubid idtype="pmpid" link="fulltext">16982645</pubid>
                  <pubid idtype="doi">10.1093/nar/gkl595</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>A motif co-occurrence approach for genome-wide prediction of transcription-factor-binding sites in Escherichia coli</p>
            </title>
            <aug>
               <au>
                  <snm>Bulyk</snm>
                  <fnm>ML</fnm>
               </au>
               <au>
                  <snm>McGuire</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Masuda</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Church</snm>
                  <fnm>GM</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2004</pubdate>
            <volume>14</volume>
            <issue>2</issue>
            <fpage>201</fpage>
            <lpage>208</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">327095</pubid>
                  <pubid idtype="pmpid" link="fulltext">14762058</pubid>
                  <pubid idtype="doi">10.1101/gr.1448004</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Predicting transcription factor synergism</p>
            </title>
            <aug>
               <au>
                  <snm>Hannenhalli</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Levy</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2002</pubdate>
            <volume>30</volume>
            <issue>19</issue>
            <fpage>4278</fpage>
            <lpage>4284</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">140535</pubid>
                  <pubid idtype="pmpid" link="fulltext">12364607</pubid>
                  <pubid idtype="doi">10.1093/nar/gkf535</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Identifying regulatory networks by combinatorial analysis of promoter elements</p>
            </title>
            <aug>
               <au>
                  <snm>Pilpel</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Sudarsanam</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Church</snm>
                  <fnm>GM</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2001</pubdate>
            <volume>29</volume>
            <issue>2</issue>
            <fpage>153</fpage>
            <lpage>159</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng724</pubid>
                  <pubid idtype="pmpid" link="fulltext">11547334</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Discovering functional transcription-factor combinations in the human cell cycle</p>
            </title>
            <aug>
               <au>
                  <snm>Zhu</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Shendure</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Church</snm>
                  <fnm>GM</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2005</pubdate>
            <volume>15</volume>
            <fpage>848</fpage>
            <lpage>855</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1142475</pubid>
                  <pubid idtype="pmpid" link="fulltext">15930495</pubid>
                  <pubid idtype="doi">10.1101/gr.3394405</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Identifying cooperativity among transcription factors controlling the cell cycle in yeast</p>
            </title>
            <aug>
               <au>
                  <snm>Banerjee</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>MQ</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2003</pubdate>
            <volume>31</volume>
            <issue>23</issue>
            <fpage>7024</fpage>
            <lpage>7031</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">290262</pubid>
                  <pubid idtype="pmpid" link="fulltext">14627835</pubid>
                  <pubid idtype="doi">10.1093/nar/gkg894</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Identifying combinatorial regulation of transcription factors and binding motifs</p>
            </title>
            <aug>
               <au>
                  <snm>Kato</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Hata</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Banerjee</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Futcher</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>MQ</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2004</pubdate>
            <volume>5</volume>
            <issue>8</issue>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pubmed">15287978</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Interacting models of cooperative gene regulation</p>
            </title>
            <aug>
               <au>
                  <snm>Das</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Banerjee</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>MQ</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2004</pubdate>
            <volume>101</volume>
            <issue>46</issue>
            <fpage>16234</fpage>
            <lpage>16239</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">528978</pubid>
                  <pubid idtype="pmpid" link="fulltext">15534222</pubid>
                  <pubid idtype="doi">10.1073/pnas.0407365101</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Module networks: identifying regulatory modules and their condition-specific regulators from gene expression data</p>
            </title>
            <aug>
               <au>
                  <snm>Segal</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Shapira</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Regev</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Pe'er</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Botstein</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Koller</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Friedman</snm>
                  <fnm>N</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2003</pubdate>
            <volume>34</volume>
            <issue>2</issue>
            <fpage>166</fpage>
            <lpage>176</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng1165</pubid>
                  <pubid idtype="pmpid" link="fulltext">12740579</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>cis-regulatory logic of short-range transcriptional repression in Drosophila melanogaster</p>
            </title>
            <aug>
               <au>
                  <snm>Kulkarni</snm>
                  <fnm>MM</fnm>
               </au>
               <au>
                  <snm>Arnosti</snm>
                  <fnm>DN</fnm>
               </au>
            </aug>
            <source>Mol Cell Biol</source>
            <pubdate>2005</pubdate>
            <volume>25</volume>
            <issue>9</issue>
            <fpage>3411</fpage>
            <lpage>3420</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1084297</pubid>
                  <pubid idtype="pmpid" link="fulltext">15831448</pubid>
                  <pubid idtype="doi">10.1128/MCB.25.9.3411-3420.2005</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>A universal framework for regulatory element discovery across all genomes and data types</p>
            </title>
            <aug>
               <au>
                  <snm>Elemento</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Slonim</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Tavazoie</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Mol Cell</source>
            <pubdate>2007</pubdate>
            <volume>28</volume>
            <issue>2</issue>
            <fpage>337</fpage>
            <lpage>350</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.molcel.2007.09.027</pubid>
                  <pubid idtype="pmpid" link="fulltext">17964271</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Predicting gene expression from sequence</p>
            </title>
            <aug>
               <au>
                  <snm>Beer</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Tavazoie</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>2004</pubdate>
            <volume>117</volume>
            <fpage>185</fpage>
            <lpage>198</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0092-8674(04)00304-6</pubid>
                  <pubid idtype="pmpid" link="fulltext">15084257</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Predicting Gene Expression from Sequence: A Reexamination</p>
            </title>
            <aug>
               <au>
                  <snm>Yuan</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Guo</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Shen</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>JS</fnm>
               </au>
            </aug>
            <source>PLoS Comput Biol</source>
            <pubdate>2007</pubdate>
            <volume>3</volume>
            <issue>11</issue>
            <fpage>e243</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">2098866</pubid>
                  <pubid idtype="pmpid" link="fulltext">18052544</pubid>
                  <pubid idtype="doi">10.1371/journal.pcbi.0030243</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Characterization of the Transcriptional Regulator YY1</p>
            </title>
            <aug>
               <au>
                  <snm>Austen</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Luscher</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Luscher-Firzlaff</snm>
                  <fnm>JM</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>1997</pubdate>
            <volume>272</volume>
            <issue>3</issue>
            <fpage>1709</fpage>
            <lpage>1717</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1074/jbc.272.3.1709</pubid>
                  <pubid idtype="pmpid" link="fulltext">8999850</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>Everything you have ever wanted to know about Yin Yang 1</p>
            </title>
            <aug>
               <au>
                  <snm>Shi</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Lee</snm>
                  <fnm>J-S</fnm>
               </au>
               <au>
                  <snm>Galvin</snm>
                  <fnm>KM</fnm>
               </au>
            </aug>
            <source>Biochimica et Biophysica Acta (BBA) &#8211; Reviews on Cancer</source>
            <pubdate>1997</pubdate>
            <volume>1332</volume>
            <issue>2</issue>
            <fpage>F49</fpage>
            <lpage>F66</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pubmed">9141463</pubid>
                  <pubid idtype="doi">10.1016/S0304-419X(96)00044-3</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>Transcription factor YY1: structure, function, and therapeutic implications in cancer biology</p>
            </title>
            <aug>
               <au>
                  <snm>Gordon</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Akopyan</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Garban</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Bonavida</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Oncogene</source>
            <pubdate>2005</pubdate>
            <volume>25</volume>
            <issue>8</issue>
            <fpage>1125</fpage>
            <lpage>1142</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pubmed">16314846</pubid>
                  <pubid idtype="doi">10.1038/sj.onc.1209080</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Rewiring of the YY1 transcriptional program in human and mouse</p>
            </title>
            <aug>
               <au>
                  <snm>Shen</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Affar</snm>
                  <fnm>EB</fnm>
               </au>
               <au>
                  <snm>Shi</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Ren</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>W</fnm>
               </au>
            </aug>
            <pubdate>2008</pubdate>
            <inpress/>
         </bibl>
         <bibl id="B22">
            <title>
               <p>Inference of combinatorial regulation in yeast transcriptional networks: A case study of sporulation</p>
            </title>
            <aug>
               <au>
                  <snm>Wang</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Cherry</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Nochomovitz</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Jolly</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Botstein</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2005</pubdate>
            <volume>102</volume>
            <issue>6</issue>
            <fpage>1998</fpage>
            <lpage>2003</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pubmed">15684073</pubid>
                  <pubid idtype="doi">10.1073/pnas.0405537102</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>HG_U133A/GNF1H and GNF1M Tissue Atlas Datasets</p>
            </title>
            <aug>
               <au>
                  <snm>Su</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Wiltshire</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Batalov</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Lapp</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Ching</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Block</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Soden</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Hayakawa</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Kreiman</snm>
                  <fnm>G</fnm>
               </au>
               <etal/>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2004</pubdate>
            <volume>101</volume>
            <issue>16</issue>
            <fpage>6062</fpage>
            <lpage>6067</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">395923</pubid>
                  <pubid idtype="pmpid" link="fulltext">15075390</pubid>
                  <pubid idtype="doi">10.1073/pnas.0400782101</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>Cluster analysis and display of genome-wide expression patterns</p>
            </title>
            <aug>
               <au>
                  <snm>Eisen</snm>
                  <fnm>MB</fnm>
               </au>
               <au>
                  <snm>Spellman</snm>
                  <fnm>PT</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>PO</fnm>
               </au>
               <au>
                  <snm>Botstein</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1998</pubdate>
            <volume>95</volume>
            <fpage>14863</fpage>
            <lpage>14868</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">24541</pubid>
                  <pubid idtype="pmpid" link="fulltext">9843981</pubid>
                  <pubid idtype="doi">10.1073/pnas.95.25.14863</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>TRANSFAC: transcriptional regulation, from patterns to profiles</p>
            </title>
            <aug>
               <au>
                  <snm>Matys</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Fricke</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Geffers</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Gossling</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Haubrock</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Hehl</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Hornischer</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Karas</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Kel</snm>
                  <fnm>AE</fnm>
               </au>
               <au>
                  <snm>Kel-Margoulis</snm>
                  <fnm>OV</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2003</pubdate>
            <volume>31</volume>
            <issue>1</issue>
            <fpage>374</fpage>
            <lpage>378</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">165555</pubid>
                  <pubid idtype="pmpid" link="fulltext">12520026</pubid>
                  <pubid idtype="doi">10.1093/nar/gkg108</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>Statistical Methods for Research Workers</p>
            </title>
            <aug>
               <au>
                  <snm>Fisher</snm>
                  <fnm>RA</fnm>
               </au>
            </aug>
            <source>Oliver and Boyd</source>
            <pubdate>1954</pubdate>
         </bibl>
         <bibl id="B27">
            <title>
               <p>BioProspector: discovering conserved DNA motifs in upstream regulatory regions of co-expressed genes</p>
            </title>
            <aug>
               <au>
                  <snm>Liu</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Brutlag</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Pac Symp Biocomput</source>
            <pubdate>2001</pubdate>
            <fpage>127</fpage>
            <lpage>138</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pubmed">11262934</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>Functional redundancy of transcription factor-binding sites in the killer cell Ig-like receptor (KIR) gene promoter</p>
            </title>
            <aug>
               <au>
                  <snm>Presnell</snm>
                  <fnm>SR</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Ramilo</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Chan</snm>
                  <fnm>HW</fnm>
               </au>
               <au>
                  <snm>Lutz</snm>
                  <fnm>CT</fnm>
               </au>
            </aug>
            <source>Int Immunol</source>
            <pubdate>2006</pubdate>
            <volume>18</volume>
            <issue>8</issue>
            <fpage>1221</fpage>
            <lpage>1232</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/intimm/dxl043</pubid>
                  <pubid idtype="pmpid" link="fulltext">16818466</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>Identification of functional elements in the murine Gabp alpha/ATP synthase coupling factor 6 bi-directional promoter</p>
            </title>
            <aug>
               <au>
                  <snm>Patton</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Block</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Coombs</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Martin</snm>
                  <fnm>ME</fnm>
               </au>
            </aug>
            <source>Gene</source>
            <pubdate>2006</pubdate>
            <volume>369</volume>
            <fpage>35</fpage>
            <lpage>44</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.gene.2005.10.009</pubid>
                  <pubid idtype="pmpid" link="fulltext">16309857</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>Interaction of YY1 with E2Fs, mediated by RYBP, provides a mechanism for specificity of E2F function</p>
            </title>
            <aug>
               <au>
                  <snm>Schlisio</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Halperin</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Vidal</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Nevins</snm>
                  <fnm>JR</fnm>
               </au>
            </aug>
            <source>EMBO J</source>
            <pubdate>2002</pubdate>
            <volume>21</volume>
            <issue>21</issue>
            <fpage>5775</fpage>
            <lpage>5786</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">131074</pubid>
                  <pubid idtype="pmpid">12411495</pubid>
                  <pubid idtype="doi">10.1093/emboj/cdf577</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>Cooperative regulation of p73 promoter by Yin Yang 1 and E2F1</p>
            </title>
            <aug>
               <au>
                  <snm>Wu</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Murai</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kataoka</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Miyagishi</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Symp Ser (Oxf)</source>
            <pubdate>2007</pubdate>
            <volume>51</volume>
            <fpage>347</fpage>
            <lpage>348</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/nass/nrm174</pubid>
                  <pubid idtype="pmpid" link="fulltext">18029729</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>Positive and negative regulatory elements in the upstream region of the rat Cu/Zn-superoxide dismutase gene</p>
            </title>
            <aug>
               <au>
                  <snm>Chang</snm>
                  <fnm>MS</fnm>
               </au>
               <au>
                  <snm>Yoo</snm>
                  <fnm>HY</fnm>
               </au>
               <au>
                  <snm>Rho</snm>
                  <fnm>HM</fnm>
               </au>
            </aug>
            <source>Biochem J</source>
            <pubdate>1999</pubdate>
            <volume>339</volume>
            <issue>Pt 2</issue>
            <fpage>335</fpage>
            <lpage>341</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1220162</pubid>
                  <pubid idtype="pmpid" link="fulltext">10191264</pubid>
                  <pubid idtype="doi">10.1042/0264-6021:3390335</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B33">
            <title>
               <p>Downregulation of CuZn-superoxide dismutase contributes to β-adrenergic receptor-mediated oxidative stress in the heart</p>
            </title>
            <aug>
               <au>
                  <snm>Srivastava</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Chandrasekar</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Gu</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Luo</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Hamid</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Hill</snm>
                  <fnm>BG</fnm>
               </au>
               <au>
                  <snm>Prabhu</snm>
                  <fnm>SD</fnm>
               </au>
            </aug>
            <source>Cardiovasc Res</source>
            <pubdate>2007</pubdate>
            <volume>74</volume>
            <issue>3</issue>
            <fpage>445</fpage>
            <lpage>455</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.cardiores.2007.02.016</pubid>
                  <pubid idtype="pmpid" link="fulltext">17362897</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B34">
            <title>
               <p>A comprehensive ChIP-chip analysis of E2F1, E2F4, and E2F6 in normal and tumor cells reveals interchangeable roles of E2F family members</p>
            </title>
            <aug>
               <au>
                  <snm>Xu</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Bieda</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Jin</snm>
                  <fnm>VX</fnm>
               </au>
               <au>
                  <snm>Rabinovich</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Oberley</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Green</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Farnham</snm>
                  <fnm>PJ</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2007</pubdate>
            <note>gr.6783507</note>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">2045138</pubid>
                  <pubid idtype="pmpid" link="fulltext">17908821</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B35">
            <title>
               <p>Distinct and predictive chromatin signatures of transcriptional promoters and enhancers in the human genome</p>
            </title>
            <aug>
               <au>
                  <snm>Heintzman</snm>
                  <fnm>ND</fnm>
               </au>
               <au>
                  <snm>Stuart</snm>
                  <fnm>RK</fnm>
               </au>
               <au>
                  <snm>Hon</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Fu</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Ching</snm>
                  <fnm>CW</fnm>
               </au>
               <au>
                  <snm>Hawkins</snm>
                  <fnm>RD</fnm>
               </au>
               <au>
                  <snm>Barrera</snm>
                  <fnm>LO</fnm>
               </au>
               <au>
                  <snm>Van Calcar</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Qu</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Ching</snm>
                  <fnm>KA</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2007</pubdate>
            <volume>39</volume>
            <issue>3</issue>
            <fpage>311</fpage>
            <lpage>318</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng1966</pubid>
                  <pubid idtype="pmpid" link="fulltext">17277777</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B36">
            <title>
               <p>Genome-wide prediction of mammalian enhancers based on analysis of transcription-factor binding affinity</p>
            </title>
            <aug>
               <au>
                  <snm>Hallikas</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Palin</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Sinjushina</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Rautiainen</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Partanen</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Ukkonen</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Taipale</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>2006</pubdate>
            <volume>124</volume>
            <issue>1</issue>
            <fpage>47</fpage>
            <lpage>59</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.cell.2005.10.042</pubid>
                  <pubid idtype="pmpid" link="fulltext">16413481</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B37">
            <title>
               <p>A high-resolution map of active promoters in the human genome</p>
            </title>
            <aug>
               <au>
                  <snm>Kim</snm>
                  <fnm>TH</fnm>
               </au>
               <au>
                  <snm>Barrera</snm>
                  <fnm>LO</fnm>
               </au>
               <au>
                  <snm>Zheng</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Qu</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Singer</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Richmond</snm>
                  <fnm>TA</fnm>
               </au>
               <au>
                  <snm>Wu</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Green</snm>
                  <fnm>RD</fnm>
               </au>
               <au>
                  <snm>Ren</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2005</pubdate>
            <volume>436</volume>
            <issue>7052</issue>
            <fpage>876</fpage>
            <lpage>880</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1895599</pubid>
                  <pubid idtype="pmpid" link="fulltext">15988478</pubid>
                  <pubid idtype="doi">10.1038/nature03877</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B38">
            <title>
               <p>ChIP-chip for Genome-Wide Analysis of Protein Binding in Mammalian Cells</p>
            </title>
            <aug>
               <au>
                  <snm>Kim</snm>
                  <fnm>TH</fnm>
               </au>
               <au>
                  <snm>Barrera</snm>
                  <fnm>LO</fnm>
               </au>
               <au>
                  <snm>Ren</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Curr Protoc Mol Biol</source>
            <pubdate>2007</pubdate>
            <volume>79</volume>
            <fpage>21</fpage>
            <lpage>21</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pubmed">18265397</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>
