<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2164-10-9</ui>
   <ji>1471-2164</ji>
   <fm>
      <dochead>Research article</dochead>
      <bibl>
         <title>
            <p>Complex organizational structure of the genome revealed by genome-wide analysis of single and alternative promoters in <it>Drosophila melanogaster</it></p>
         </title>
         <aug>
            <au id="A1">
               <snm>Zhu</snm>
               <fnm>Qianqian</fnm>
               <insr iid="I1"/>
               <insr iid="I2"/>
               <email>qzhu@buffalo.edu</email>
            </au>
            <au id="A2" ca="yes">
               <snm>Halfon</snm>
               <mi>S</mi>
               <fnm>Marc</fnm>
               <insr iid="I1"/>
               <insr iid="I3"/>
               <insr iid="I4"/>
               <insr iid="I5"/>
               <email>mshalfon@buffalo.edu</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Department of Biochemistry, Buffalo, NY 14214, USA</p>
            </ins>
            <ins id="I2">
               <p>Department of Biostatistics, Buffalo, NY 14214, USA</p>
            </ins>
            <ins id="I3">
               <p>Department of Biological Sciences, State University of New York at Buffalo, Buffalo, NY 14214, USA</p>
            </ins>
            <ins id="I4">
               <p>New York State Center of Excellence in Bioinformatics and the Life Sciences, Buffalo, NY 14203, USA</p>
            </ins>
            <ins id="I5">
               <p>Department of Molecular and Cellular Biology, Roswell Park Cancer Institute, Buffalo, NY 14263, USA</p>
            </ins>
         </insg>
         <source>BMC Genomics</source>
         <issn>1471-2164</issn>
         <pubdate>2009</pubdate>
         <volume>10</volume>
         <issue>1</issue>
         <fpage>9</fpage>
         <url>http://www.biomedcentral.com/1471-2164/10/9</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">19128496</pubid>
               <pubid idtype="doi">10.1186/1471-2164-10-9</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>13</day>
               <month>10</month>
               <year>2008</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>07</day>
               <month>1</month>
               <year>2009</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>07</day>
               <month>1</month>
               <year>2009</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2009</year>
         <collab>Zhu and Halfon; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>The promoter is a critical transcriptional <it>cis</it>-regulatory element. In addition to its role as an assembly site for the basal transcriptional apparatus, the promoter plays a key part in mediating temporal and spatial aspects of gene expression through differential binding of transcription factors and selective interaction with distal enhancers. Although many genes have multiple promoters, little attention has been focused on how these relate to one another; nor has much study been directed at relationships between promoters of adjacent genes.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>We have undertaken a systematic investigation of <it>Drosophila </it>promoters. We divided promoters into three groups: unique promoters, first alternative promoters (the most 5' of a gene's multiple promoters), and downstream alternative promoters (the remaining alternative promoters 3' to the first). We observed distinct nucleotide distribution and sequence motif preferences among these three classes. We also investigated the promoters of neighboring genes and found that a greater than expected number of adjacent genes have similar sequence motif profiles, which may allow the genes to be regulated in a coordinated fashion. Consistent with this, there is a positive correlation between similar promoter motifs and related gene expression profiles for these genes.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusions</p>
               </st>
               <p>Our results suggest that different regulatory mechanisms may apply to each of the three promoter classes, and provide a mechanism for "gene expression neighborhoods," local clusters of co-expressed genes. As a whole, our data reveal an unexpected complexity of genomic organization at the promoter level with respect to both alternative and neighboring promoters.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="bmc" subtype="user_supplied_xml" id="endnote"/>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>Coordinated regulation of gene expression is a fundamental process that depends on the binding of transcription factors to a gene's <it>cis</it>-regulatory sequences. Absolutely required for transcription initiation of metazoan protein-coding genes is the core promoter, the region of DNA 35&#8211;40 bp upstream and downstream of the transcription start site (TSS) <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. The core promoter contains sequence elements, referred to as "core promoter motifs," which interact with the basal transcription machinery, including RNA polymerase II and the TFIID complex <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>. In recent years, it has become clear that the core promoter, rather than playing a passive role in the spatial and temporal regulation of gene expression, is an important active partner in these events <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr></abbrgrp>. For instance, different promoter sequences are found preferentially associated with certain functional classes of genes, with genes expressed at particular developmental stages, and with genes expressed in the germ line versus the soma <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr></abbrgrp>. Various tissue-specific members of the TATA box-binding protein (TBP) family, such as the TBP-related factors (TRFs), bind preferentially to certain core promoters <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>. There is also substantial evidence for preferred or specific promoter-enhancer interactions, whereby a distal <it>cis</it>-regulatory module (CRM, or "enhancer") can stimulate activity from one promoter, but not another <abbrgrp><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr></abbrgrp>.</p>
         <p>A number of mechanisms have been demonstrated to restrict the activity of a CRM to a particular promoter, including insulator elements <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>, insulator-bypass or promoter targeting elements <abbrgrp><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr></abbrgrp>, short-range repression <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>, chromatin-mediated silencing <abbrgrp><abbr bid="B11">11</abbr><abbr bid="B15">15</abbr></abbrgrp>, and preferential interaction with promoters containing certain core promoter motifs <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B16">16</abbr><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr></abbrgrp>. The relative prevalence of each of these mechanisms is unknown, and in most cases a detailed understanding of how they function is also lacking. In particular, the molecular basis underlying core promoter preference has not been clearly defined.</p>
         <p>The existence of CRM-promoter specificity is all the more remarkable given that it is maintained despite the fact that there are sometimes other promoters closer to, or even interposed between, a CRM and its target. In fact, the latter may be a much more common scenario than typically credited, as it can occur not only with respect to the regulation of different genes, but also with respect to alternative promoters of the same gene. In humans, it is estimated that upwards of 50% of all genes have at least one alternative promoter <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B20">20</abbr></abbrgrp>, and there is growing evidence that alternative promoter usage plays important roles in both development and disease <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>. It is unknown how frequently such alternative promoters are regulated by distinct CRMs, but the number could be large; Kimura <it>et al</it>. <abbrgrp><abbr bid="B20">20</abbr></abbrgrp> suggest that over 1800 sets of alternative promoters are regulated in a tissue-specific fashion.</p>
         <p>Except for the case of bidirectional promoters (those that regulate divergently transcribed genes; <abbrgrp><abbr bid="B22">22</abbr><abbr bid="B23">23</abbr></abbrgrp>), few studies have focused specifically on promoters of neighboring genes or on alternative promoters, and little is known about the mechanisms that direct promoter usage choice. Baek <it>et al</it>. <abbrgrp><abbr bid="B24">24</abbr></abbrgrp> recently analyzed a subset of human promoters by dividing them into the four categories of CpG-island containing and non-containing single and alternative promoters, and observed differences in sequence properties, evolutionary conservation, biological roles, and degree of usage. Their data suggest that there may be differences among promoters depending on their relative position in the gene, with more upstream promoters being more highly expressed and more CpG-rich than the more downstream promoters. Interestingly, they found that the TATA box and DPE core promoter motifs were more common in single than in alternative promoters. However, a similar study by Kimura <it>et al</it>. <abbrgrp><abbr bid="B20">20</abbr></abbrgrp> found little difference in the frequency of the TATA box between the two groups, although they observed a large difference in the prevalence of CpG islands. Differences in the full set of promoters used and in how the promoters were grouped&#8211;the latter study did not look separately at the CpG-containing and non-containing promoters&#8211;may account for the discrepancies in the reported results. A number of other sequence motifs, of unknown functional significance, were also seen to be differentially represented among the promoter classes <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>. These studies suggest that there might be fundamental differences in the structure and function of single versus alternative promoters that could have broad implications for understanding how transcription is coordinated within the genome.</p>
         <p>To estimate how important the promoter sequence might be in dictating promoter usage choice and in mediating CRM-promoter specificity, and as a prelude to experimental studies of the mechanisms of CRM-promoter interactions, we undertook a global bioinformatics analysis of <it>Drosophila melanogaster </it>promoters. We found that there are marked differences in nucleotide composition and motif prevalence between single promoters and alternative promoters, and between the most 5' alternative promoters and more downstream alternative promoters. We also observed that adjacent genes on the chromosome are more likely than expected to have promoters with a similar motif profile, and that this similarity in promoter configuration correlates with co-regulated gene expression. Our results suggest that promoter composition may play a larger-than-appreciated role in coordinating gene expression both between nearby genes and between multiple transcripts of the same gene.</p>
      </sec>
      <sec>
         <st>
            <p>Results</p>
         </st>
         <p>To conduct a comprehensive genome-wide study of promoters, we looked at all protein-coding genes annotated in the <it>Drosophila </it>genome annotation release 5.5 <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>. We considered all annotated TSSs to be true TSSs. Multiple TSSs that were less than 18 bp apart were considered as sharing the same promoter (see Methods). Altogether, we obtained 16,469 promoters from the genome, and separated these into three mutually exclusive classes (Fig. <figr fid="F1">1A</figr>; see Methods): <it>unique promoters </it>(UPs), i.e., promoters for genes having only a single promoter; <it>first alternative promoters </it>(FAPs), defined as the most 5' of a gene's multiple promoters (with respect to the coding strand); and <it>downstream alternative promoters </it>(DAPs), which are any alternative promoters 3' to the FAPs. There were 11,660 UPs, 1,955 FAPs, and 2,854 DAPs (Fig. <figr fid="F1">1B</figr>). Approximately 14% of genes had alternative promoters, with an average of 2.46 promoters/gene (range 2&#8211;13) for those with more than one promoter.</p>
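         <p>The grouping described above can be illustrated with a minimal sketch. It is not the procedure from the Methods: it assumes that a TSS joins a promoter when it lies fewer than 18 bp from the previous TSS of the same gene, and that the most 5' TSS represents each promoter. The function names, the input layout, and the toy coordinates are hypothetical.</p>
         <p><![CDATA[
```python
MERGE_DISTANCE = 18  # TSSs closer than this share one promoter (from the text)


def cluster_tss(positions, merge_distance=MERGE_DISTANCE):
    """Group TSS coordinates into promoters.

    Assumed rule: a TSS joins the current cluster when it lies fewer than
    merge_distance bp from the previous TSS in sorted order.
    """
    clusters = []
    for pos in sorted(positions):
        if clusters and pos - clusters[-1][-1] < merge_distance:
            clusters[-1].append(pos)
        else:
            clusters.append([pos])
    return clusters


def classify_promoters(tss_by_gene):
    """Assign each promoter to UP, FAP, or DAP.

    tss_by_gene maps a gene name to (strand, [TSS coordinates]).
    Returns a list of (gene, representative TSS, promoter class).
    """
    result = []
    for gene, (strand, positions) in tss_by_gene.items():
        clusters = cluster_tss(positions)
        # Order promoters 5' to 3' with respect to the coding strand and
        # take the most 5' TSS of each cluster as its representative.
        if strand == "-":
            reps = [max(c) for c in clusters[::-1]]
        else:
            reps = [min(c) for c in clusters]
        if len(reps) == 1:
            result.append((gene, reps[0], "UP"))
        else:
            result.append((gene, reps[0], "FAP"))
            result.extend((gene, rep, "DAP") for rep in reps[1:])
    return result


# Hypothetical toy genes, not taken from the annotation.
example = {
    "geneA": ("+", [1000, 1010]),        # 10 bp apart: one promoter, a UP
    "geneB": ("+", [5000, 5400, 5410]),  # two promoters: FAP then DAP
    "geneC": ("-", [9000, 9500]),        # minus strand: 9500 is most 5'
}
calls = classify_promoters(example)
```
]]></p>
         <p>The reported counts are internally consistent under these definitions: 11,660 + 1,955 = 13,615 genes, of which 1,955 (about 14%) have alternative promoters, and (1,955 + 2,854)/1,955 gives 2.46 promoters per such gene.</p>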
         <fig id="F1">
            <title>
               <p>Figure 1</p>
            </title>
            <caption>
               <p>Promoter sets used for this study</p>
            </caption>
            <text>
               <p><b>Promoter sets used for this study</b>. (A) Schematic view of the three types of promoters: unique promoters (UPs), first alternative promoters (FAPs), and downstream alternative promoters (DAPs). (B) Number of promoters in each class when using the "all promoters" data set. (C) Number of promoters in each class when using the "high quality" data set. (D) Number of promoters in each class when using the "cap-supported" data set.</p>
            </text>
            <graphic file="1471-2164-10-9-1"/>
         </fig>
         <p>Although there are undoubtedly errors in the genome annotation with respect to TSSs [e.g., <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>], we reasoned that these errors would be few relative to the total number of annotated genes and that therefore, while they might contribute noise to our analysis, they would not mask any clearly significant results. However, to ensure the robustness of our results, we also generated two higher-confidence sets of promoters: a "high quality" set of 9,898 promoters based on FlyBase transcript evidence annotations and a "cap-supported" set of 5,389 promoters in which all transcripts have their 5' ends accurately mapped based on cDNA isolation using a 5' cap-dependent method (see Methods). We separated these smaller promoter sets into UPs, FAPs, and DAPs, as we did with the full promoter set, and performed all analyses in parallel with the three sets (Fig. <figr fid="F1">1C, 1D</figr>). In all cases, the results were essentially the same (in a minority of cases with the two smaller high-confidence datasets, values fell below the conservative statistical thresholds we had set, but trends were consistently maintained; Figure S1 [see Additional file <supplr sid="S1">1</supplr>], Tables S2 and S3 [see Additional file <supplr sid="S2">2</supplr>], and data not shown).</p>
         <suppl id="S1">
            <title>
               <p>Additional file 1</p>
            </title>
            <text>
               <p><b>Figures S1 and S2</b>. Mononucleotide distributions of the three groups of promoters when using "high quality" fly promoters (S1A), cap-supported fly promoters (S1B) and human promoter sets (S2).</p>
            </text>
            <file name="1471-2164-10-9-S1.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S2">
            <title>
               <p>Additional file 2</p>
            </title>
            <text>
               <p><b>Tables S1-S4. </b>Table S1: <it>Drosophila </it>promoter motifs used in this study; Table S2: Motif preferences when using different motif position requirements; Table S3: Motif preferences when using regular expressions versus position weight matrices to represent motifs; Table S4: Human promoter motifs used in this study.</p>
            </text>
            <file name="1471-2164-10-9-S2.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <sec>
            <st>
               <p>Nucleotide distribution differs among the three promoter classes</p>
            </st>
            <p>As a starting point for comparisons among the three classes of promoters, we analyzed the distribution of the four nucleotides in each class by calculating the mean frequency of each nucleotide in a 10 bp sliding window across the TSS region, ranging from -500 bp (outside the proximal and core promoter regions) to +100 bp (downstream of the core promoter). Promoters whose sequences overlapped that of an adjacent promoter of a different class were excluded from analysis. Similar to FitzGerald <it>et al</it>. <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>, who looked only at UPs, we observed an AT peak around -200 bp. However, although this peak is clearly present for both UPs and FAPs, it is absent for the DAPs, which have a markedly higher GC level between -300 bp and -100 bp. Although the nucleotide distributions of UPs and FAPs are similar to one another from -500 bp to -200 bp, they begin to separate afterward: around the TSSs, UPs have more A but less C than the FAPs, and downstream of the TSS, the UPs are more GC rich than the FAPs (Fig. <figr fid="F2">2</figr>, Fig. S1). The point at which the promoter classes begin to diverge is consistent with previous findings for human genes that the region beginning at approximately -350 bp comprises an extended promoter region with an important positive regulatory role <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>.</p>
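            <p>The sliding-window calculation can be sketched as follows. This is a minimal illustration rather than the code used for the analysis: it assumes equal-length sequences already aligned on the TSS, pools base counts across sequences within each window (equivalent to the mean per-sequence frequency when no ambiguous bases are present), and uses a hypothetical function name and toy input.</p>
            <p><![CDATA[
```python
def sliding_window_frequencies(sequences, window=10, step=1):
    """Frequency of A, C, G, and T in each window across aligned sequences.

    Returns one dict per window start, in order along the sequence.
    Bases other than A, C, G, T are ignored.
    """
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned and of equal length")
    profile = []
    for start in range(0, length - window + 1, step):
        counts = dict.fromkeys("ACGT", 0)
        for seq in sequences:
            for base in seq[start:start + window].upper():
                if base in counts:
                    counts[base] += 1
        total = sum(counts.values())
        profile.append(
            {b: (counts[b] / total if total else 0.0) for b in "ACGT"}
        )
    return profile


# Toy input: three 12 bp sequences; the real regions span -500 to +100 bp.
toy = ["AAAAAAAAAACC", "AAAAATTTTTGG", "ACGTACGTACGT"]
profile = sliding_window_frequencies(toy)
```
]]></p>
            <p>Running the function separately on each promoter class and plotting each nucleotide against window position would give curves of the kind compared in the text.</p>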
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>The mononucleotide distribution of the three different classes of fly promoters</p>
               </caption>
               <text>
                  <p><b>The mononucleotide distribution of the three different classes of fly promoters</b>. Shown are unique promoters (black), first alternative promoters (red), and downstream alternative promoters (green). The mean frequency of each nucleotide was calculated in a 10 bp window sliding across the promoter region in 1 bp steps.</p>
               </text>
               <graphic file="1471-2164-10-9-2"/>
            </fig>
            <p>Consistent with the trends we observed in the distribution plots, we found that the GC content differs significantly among all three classes of promoters with DAP > FAP > UP (DAP versus UP: mean 0.413 (standard deviation (SD) 0.060) versus 0.379 (0.062), Kolmogorov-Smirnov test <it>P </it>&#8776; 0; DAP versus FAP: 0.413 (0.060) versus 0.397 (0.065), Kolmogorov-Smirnov test <it>P </it>&#8776; 7.46e-11; FAP versus UP: 0.397 (0.065) versus 0.379 (0.062), Kolmogorov-Smirnov test <it>P </it>&#8776; 4.44e-16). Note that the GC content was calculated after masking the coding regions in the promoters to avoid bias caused by the higher GC content of coding sequences <abbrgrp><abbr bid="B28">28</abbr><abbr bid="B29">29</abbr><abbr bid="B30">30</abbr></abbrgrp>. The nucleotide distributions indicate that all three promoter classes have distinct characteristics, which are most pronounced when comparing the DAPs to the others.</p>
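            <p>A minimal sketch of this comparison follows, assuming that coding regions are supplied as half-open (start, end) intervals relative to each promoter sequence. It computes only the two-sample Kolmogorov-Smirnov statistic D, not the P value; a library routine such as SciPy's ks_2samp could supply the latter. The function names and interval convention are illustrative assumptions, not the implementation used in the study.</p>
            <p><![CDATA[
```python
def gc_content(seq, masked_intervals=()):
    """GC fraction of seq, skipping masked positions.

    masked_intervals holds half-open (start, end) pairs, for example the
    coding regions that are excluded to avoid their higher GC content.
    """
    masked = set()
    for start, end in masked_intervals:
        masked.update(range(start, end))
    bases = [
        b for i, b in enumerate(seq.upper())
        if i not in masked and b in "ACGT"
    ]
    if not bases:
        return float("nan")
    return sum(b in "GC" for b in bases) / len(bases)


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic D.

    D is the largest absolute gap between the two empirical cumulative
    distribution functions. This direct version is quadratic in sample
    size, which is acceptable for an illustration.
    """
    d = 0.0
    for x in sorted(set(sample_a) | set(sample_b)):
        cdf_a = sum(v <= x for v in sample_a) / len(sample_a)
        cdf_b = sum(v <= x for v in sample_b) / len(sample_b)
        d = max(d, abs(cdf_a - cdf_b))
    return d
```
]]></p>
            <p>In use, one would compute gc_content for every promoter in two classes and pass the two lists of values to ks_statistic.</p>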
         </sec>
         <sec>
            <st>
               <p>Distribution of promoter motifs in the different promoter classes</p>
            </st>
            <p>The functional units of the core promoter are the "core promoter elements," the sequences that mediate binding of the general transcription factors to the promoter. Like most transcription factor binding sites, the sequences of these elements form a family of related subsequences known as a sequence "motif." For brevity, and because in this study we do not explicitly evaluate binding but deal only with DNA sequence, we refer here to both the sequence motifs themselves and to instances of these subsequences in promoter regions as "promoter motifs." Although the relationship between various core promoter motifs and tissue- and stage-specific gene regulation has been studied previously, as have possible associations between certain motifs and genes of particular functional classes <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr></abbrgrp>, there has been no systematic investigation of motif distributions in single versus alternative promoters, or among differently positioned alternative promoters. Therefore, we searched all 16,469 promoters for the presence of the 15 promoter motifs that FitzGerald <it>et al</it>. <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>, in their analysis of <it>Drosophila </it>UPs, identified as overrepresented in core promoters or in the extended promoter region up to -130 bp (Table S1 [see Additional file <supplr sid="S2">2</supplr>]; see Methods). A full listing of the genomic coordinates of the mapped motifs and their sequences is provided [see Additional files <supplr sid="S3">3</supplr>, <supplr sid="S4">4</supplr>, <supplr sid="S5">5</supplr>].</p>
            <suppl id="S3">
               <title>
                  <p>Additional file 3</p>
               </title>
               <text>
                  <p><b>Promoter motif coordinates from our primary study</b>. This file contains all of the mapped motifs from our primary study formatted for upload to the Generic Genome Browser (GBrowse) as a custom annotation.</p>
               </text>
               <file name="1471-2164-10-9-S3.txt">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <suppl id="S4">
               <title>
                  <p>Additional file 4</p>
               </title>
               <text>
                  <p><b>Promoter motif coordinates using "combined range"</b>. This file contains all of the mapped motifs using the motif definitions and position requirements "combined range" formatted for upload to the Generic Genome Browser (GBrowse) as a custom annotation.</p>
               </text>
               <file name="1471-2164-10-9-S4.txt">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <suppl id="S5">
               <title>
                  <p>Additional file 5</p>
               </title>
               <text>
                  <p><b>Promoter motif coordinates from Patser</b>. This file contains all of the mapped motifs using Patser, as described in the text, formatted for upload to the Generic Genome Browser (GBrowse) as a custom annotation.</p>
               </text>
               <file name="1471-2164-10-9-S5.txt">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <p>For each of the 15 motifs, we looked at its relative distribution in UPs, FAPs, and DAPs; 13 of them show a significant occurrence bias among the three promoter classes (Table <tblr tid="T1">1</tblr>). We see differences between unique and alternative promoters, between 5' promoters (UP or FAP) and downstream promoters, and between first alternative and more downstream alternative promoters. For example, TATA/DMp1 is found in 5.8% of UPs, 5.1-fold and 6.1-fold higher than its occurrence in the two classes of alternative promoters, FAPs and DAPs, respectively. NDM2 is significantly underrepresented in UPs as compared to the two classes of alternative promoters. DMv1, DMv2, DMv3, NDM3 and E-box/NDM5 are underrepresented in the DAPs but do not differ significantly between the UPs and FAPs. INR/DMp2 is more prevalent in DAPs than in UPs, whereas INR1/DMp3 and DPE1/DMp5 are less common in UPs than in FAPs. The presence of DMv4 and DRE/NDM4 differs significantly among all three classes of promoters with FAP > UP > DAP. The occurrence of GAGA/NDM1 is also significantly different among the three classes with DAP > FAP > UP. The occurrence biases we observed do not appear to correlate with the stage-specific motif usage noted by FitzGerald <it>et al</it>. <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>. For instance, TATA and GAGA are both preferentially associated with adult-expressed genes, but while TATA/DMp1 is overrepresented in UPs, GAGA/NDM1 is more prevalent in DAPs.</p>
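            <p>The fold differences quoted for TATA/DMp1 follow directly from the occurrence percentages in Table 1, as this short check shows. The variable and function names are illustrative only.</p>
            <p><![CDATA[
```python
# Occurrence percentages for TATA/DMp1 as reported in Table 1.
tata = {"UP": 5.77, "FAP": 1.13, "DAP": 0.95}


def fold_difference(percentages, numerator, denominator):
    """Ratio of two occurrence percentages."""
    return percentages[numerator] / percentages[denominator]


fold_vs_fap = fold_difference(tata, "UP", "FAP")
fold_vs_dap = fold_difference(tata, "UP", "DAP")
print(f"UP/FAP = {fold_vs_fap:.1f}-fold, UP/DAP = {fold_vs_dap:.1f}-fold")
# prints: UP/FAP = 5.1-fold, UP/DAP = 6.1-fold
```
]]></p>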
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>Presence of the 15 fly promoter motifs in the three promoter classes</p>
               </caption>
               <tblbdy cols="7">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c cspan="3" ca="center">
                        <p>Occurrence percentage<sup>a</sup></p>
                     </c>
                     <c cspan="3" ca="center">
                        <p>p value<sup>b</sup></p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Motif</p>
                     </c>
                     <c ca="right">
                        <p>UPs</p>
                     </c>
                     <c ca="right">
                        <p>FAPs</p>
                     </c>
                     <c ca="center">
                        <p>DAPs</p>
                     </c>
                     <c ca="center">
                        <p>UPs vs. FAPs</p>
                     </c>
                     <c ca="center">
                        <p>UPs vs. DAPs</p>
                     </c>
                     <c ca="center">
                        <p>FAPs vs. DAPs</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>TATA/DMp1</p>
                     </c>
                     <c ca="right">
                        <p>5.77</p>
                     </c>
                     <c ca="right">
                        <p>1.13</p>
                     </c>
                     <c ca="center">
                        <p>0.95</p>
                     </c>
                     <c ca="center">
                        <p>9.73e-24</p>
                     </c>
                     <c ca="center">
                        <p>5.21e-36</p>
                     </c>
                     <c ca="center">
                        <p>ns</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>INR/DMp2</p>
                     </c>
                     <c ca="right">
                        <p>10.22</p>
                     </c>
                     <c ca="right">
                        <p>11.97</p>
                     </c>
                     <c ca="center">
                        <p>12.58</p>
                     </c>
                     <c ca="center">
                        <p>ns</p>
                     </c>
                     <c ca="center">
                        <p>3.36e-04</p>
                     </c>
                     <c ca="center">
                        <p>ns</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>INR1/DMp3</p>
                     </c>
                     <c ca="right">
                        <p>0.52</p>
                     </c>
                     <c ca="right">
                        <p>1.43</p>
                     </c>
                     <c ca="center">
                        <p>0.74</p>
                     </c>
                     <c ca="center">
                        <p>4.54e-05</p>
                     </c>
                     <c ca="center">
                        <p>ns</p>
                     </c>
                     <c ca="center">
                        <p>ns</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>DPE/DMp4</p>
                     </c>
                     <c ca="right">
                        <p>0.69</p>
                     </c>
                     <c ca="right">
                        <p>0.92</p>
                     </c>
                     <c ca="center">
                        <p>1.02</p>
                     </c>
                     <c ca="center">
                        <p>ns</p>
                     </c>
                     <c ca="center">
                        <p>ns</p>
                     </c>
                     <c ca="center">
                        <p>ns</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>DPE1/DMp5</p>
                     </c>
                     <c ca="right">
                        <p>0.39</p>
                     </c>
                     <c ca="right">
                        <p>1.07</p>
                     </c>
                     <c ca="center">
                        <p>0.81</p>
                     </c>
                     <c ca="center">
                        <p>3.32e-04</p>
                     </c>
                     <c ca="center">
                        <p>ns</p>
                     </c>
                     <c ca="center">
                        <p>ns</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>DMv1</p>
                     </c>
                     <c ca="right">
                        <p>1.56</p>
                     </c>
                     <c ca="right">
                        <p>2.46</p>
                     </c>
                     <c ca="center">
                        <p>0.70</p>
                     </c>
                     <c ca="center">
                        <p>ns</p>
                     </c>
                     <c ca="center">
                        <p>2.31e-04</p>
                     </c>
                     <c ca="center">
                        <p>6.23e-07</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>DMv2</p>
                     </c>
                     <c ca="right">
                        <p>0.85</p>
                     </c>
                     <c ca="right">
                        <p>0.87</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>ns</p>
                     </c>
                     <c ca="center">
                        <p>7.26e-10</p>
                     </c>
                     <c ca="center">
                        <p>2.17e-07</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>DMv3</p>
                     </c>
                     <c ca="right">
                        <p>3.17</p>
                     </c>
                     <c ca="right">
                        <p>3.48</p>
                     </c>
                     <c ca="center">
                        <p>1.65</p>
                     </c>
                     <c ca="center">
                        <p>ns</p>
                     </c>
                     <c ca="center">
                        <p>4.46e-06</p>
                     </c>
                     <c ca="center">
                        <p>6.86e-05</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>DMv4</p>
                     </c>
                     <c ca="right">
                        <p>3.39</p>
                     </c>
                     <c ca="right">
                        <p>5.58</p>
                     </c>
                     <c ca="center">
                        <p>1.23</p>
                     </c>
                     <c ca="center">
                        <p>7.10e-06</p>
                     </c>
                     <c ca="center">
                        <p>2.69e-11</p>
                     </c>
                     <c ca="center">
                        <p>5.58e-18</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>DMv5</p>
                     </c>
                     <c ca="right">
                        <p>1.41</p>
                     </c>
                     <c ca="right">
                        <p>0.97</p>
                     </c>
                     <c ca="center">
                        <p>0.88</p>
                     </c>
                     <c ca="center">
                        <p>ns</p>
                     </c>
                     <c ca="center">
                        <p>ns</p>
                     </c>
                     <c ca="center">
                        <p>ns</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>GAGA/NDM1</p>
                     </c>
                     <c ca="right">
                        <p>1.23</p>
                     </c>
                     <c ca="right">
                        <p>2.86</p>
                     </c>
                     <c ca="center">
                        <p>4.87</p>
                     </c>
                     <c ca="center">
                        <p>4.14e-07</p>
                     </c>
                     <c ca="center">
                        <p>2.68e-29</p>
                     </c>
                     <c ca="center">
                        <p>4.55e-04</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>NDM2</p>
                     </c>
                     <c ca="right">
                        <p>2.92</p>
                     </c>
                     <c ca="right">
                        <p>4.76</p>
                     </c>
                     <c ca="center">
                        <p>6.03</p>
                     </c>
                     <c ca="center">
                        <p>4.87e-05</p>
                     </c>
                     <c ca="center">
                        <p>3.87e-14</p>
                     </c>
                     <c ca="center">
                        <p>ns</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>NDM3</p>
                     </c>
                     <c ca="right">
                        <p>1.11</p>
                     </c>
                     <c ca="right">
                        <p>0.97</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>ns</p>
                     </c>
                     <c ca="center">
                        <p>8.40e-13</p>
                     </c>
                     <c ca="center">
                        <p>3.55e-08</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>DRE/NDM4</p>
                     </c>
                     <c ca="right">
                        <p>8.34</p>
                     </c>
                     <c ca="right">
                        <p>14.22</p>
                     </c>
                     <c ca="center">
                        <p>4.59</p>
                     </c>
                     <c ca="center">
                        <p>3.16e-15</p>
                     </c>
                     <c ca="center">
                        <p>1.04e-12</p>
                     </c>
                     <c ca="center">
                        <p>2.35e-31</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>E-box/NDM5</p>
                     </c>
                     <c ca="right">
                        <p>5.44</p>
                     </c>
                     <c ca="right">
                        <p>4.45</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>ns</p>
                     </c>
                     <c ca="center">
                        <p>2.73e-62</p>
                     </c>
                     <c ca="center">
                        <p>3.07e-35</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>None<sup>c</sup></p>
                     </c>
                     <c ca="right">
                        <p>61.81</p>
                     </c>
                     <c ca="right">
                        <p>54.78</p>
                     </c>
                     <c ca="center">
                        <p>69.90</p>
                     </c>
                     <c ca="center">
                        <p>5.27e-09</p>
                     </c>
                     <c ca="center">
                        <p>4.73e-16</p>
                     </c>
                     <c ca="center">
                        <p>1.61e-26</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p><sup>a</sup>percentage of the promoters in a class having the specific motif.</p>
                  <p><sup>b</sup>calculated using Fisher's exact test. Holm's method was used for multiple hypothesis testing. ns, not significant.</p>
                  <p><sup>c</sup>percentage of promoters without any known motifs.</p>
               </tblfn>
            </tbl>
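            <p>As an illustration only, the following Python sketch shows how a two-sided Fisher's exact test and a Holm adjustment of the kind named in the table footnote could be computed for one motif. The promoter counts are hypothetical placeholders rather than the data behind the table, the function names are introduced here for the example, and applying the Holm correction to only the three pairwise comparisons of a single motif is an assumption; the family of tests actually corrected may be larger.</p>

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are promoter classes; columns are promoters with / without the motif.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability that x promoters in row 1 carry the motif.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum every possible table that is no more probable than the observed one.
    total = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p_observed * (1 + 1e-9) >= p
    )
    return min(1.0, total)


def holm_adjust(pvalues):
    """Holm step-down adjusted p values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p value is multiplied by m, the next by m - 1, and so on;
        # the running maximum keeps the adjusted values monotone.
        running_max = max(running_max, min(1.0, (m - rank) * pvalues[idx]))
        adjusted[idx] = running_max
    return adjusted


# Hypothetical counts (with motif, without motif) per class; NOT the study's data.
counts = {"UPs": (30, 970), "FAPs": (45, 955), "DAPs": (10, 990)}
pairs = [("UPs", "FAPs"), ("UPs", "DAPs"), ("FAPs", "DAPs")]
raw = [fisher_exact_two_sided(*counts[x], *counts[y]) for x, y in pairs]
for (x, y), p in zip(pairs, holm_adjust(raw)):
    print(f"{x} vs. {y}: adjusted p = {p:.3g}")
```

            <p>An adjusted value above the chosen significance level would correspond to an "ns" entry in the table.</p>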
            <p>As certain combinations of promoter motifs have been shown to preferentially co-occur (for example, TATA and INR, and INR and DPE <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B31">31</abbr><abbr bid="B32">32</abbr></abbrgrp>), we also looked at the presence of motif combinations in the different promoter classes (see Methods). Of the 14 significantly enriched motif combinations, six differ significantly in occurrence among the three promoter classes (Table <tblr tid="T2">2</tblr>). Overall, the pattern of motif combinations correlates with the occurrence of individual motifs. For example, DMv3, DRE/NDM4 and E-box/NDM5 are significantly underrepresented in the DAPs, and the interactions between DMv3 and DRE/NDM4 and between DRE/NDM4 and E-box/NDM5 are also underrepresented in DAPs.</p>
            <tbl id="T2">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p>Presence of significant motif interactions in the three fly promoter classes</p>
               </caption>
               <tblbdy cols="7">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c cspan="3" ca="center">
                        <p>Occurrence percentage<sup>a</sup></p>
                     </c>
                     <c cspan="3" ca="center">
                        <p>p value<sup>b</sup></p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Motif interaction</p>
                     </c>
                     <c ca="center">
                        <p>UPs</p>
                     </c>
                     <c ca="center">
                        <p>FAPs</p>
                     </c>
                     <c ca="center">
                        <p>DAPs</p>
                     </c>
                     <c ca="center">
                        <p>UPs vs. FAPs</p>
                     </c>
                     <c ca="center">
                        <p>UPs vs. DAPs</p>
                     </c>
                     <c ca="center">
                        <p>FAPs vs. DAPs</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>TATA_INR/DMp1_DMp2</p>
                     </c>
                     <c ca="center">
                        <p>1.03</p>
                     </c>
                     <c ca="center">
                        <p>0.36</p>
                     </c>
                     <c ca="center">
                        <p>0.07</p>
                     </c>
                     <c ca="center">
                        <p>ns</p>
                     </c>
                     <c ca="center">
                        <p>1.57e-09</p>
                     </c>
                     <c ca="center">
                        <p>ns</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>INR_GAGA/DMp2_NDM1</p>
                     </c>
                     <c ca="center">
                        <p>0.28</p>
                     </c>
                     <c ca="center">
                        <p>0.82</p>
                     </c>
                     <c ca="center">
                        <p>0.91</p>
                     </c>
                     <c ca="center">
                        <p>1.43e-03</p>
                     </c>
                     <c ca="center">
                        <p>1.92e-05</p>
                     </c>
                     <c ca="center">
                        <p>ns</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>INR_NDM2/DMp2_NDM2</p>
                     </c>
                     <c ca="center">
                        <p>0.56</p>
                     </c>
                     <c ca="center">
                        <p>1.13</p>
                     </c>
                     <c ca="center">
                        <p>1.65</p>
                     </c>
                     <c ca="center">
                        <p>ns</p>
                     </c>
                     <c ca="center">
                        <p>5.84e-08</p>
                     </c>
                     <c ca="center">
                        <p>ns</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>DMv1_DRE/DMv1_NDM4</p>
                     </c>
                     <c ca="center">
                        <p>0.25</p>
                     </c>
                     <c ca="center">
                        <p>0.87</p>
                     </c>
                     <c ca="center">
                        <p>0.18</p>
                     </c>
                     <c ca="center">
                        <p>1.20e-04</p>
                     </c>
                     <c ca="center">
                        <p>ns</p>
                     </c>
                     <c ca="center">
                        <p>6.83e-04</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>DMv3_DRE/DMv3_NDM4</p>
                     </c>
                     <c ca="center">
                        <p>1.30</p>
                     </c>
                     <c ca="center">
                        <p>2.10</p>
                     </c>
                     <c ca="center">
                        <p>0.56</p>
                     </c>
                     <c ca="center">
                        <p>ns</p>
                     </c>
                     <c ca="center">
                        <p>5.66e-04</p>
                     </c>
                     <c ca="center">
                        <p>1.98e-06</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>DRE_E-box/NDM4_NDM5</p>
                     </c>
                     <c ca="center">
                        <p>0.97</p>
                     </c>
                     <c ca="center">
                        <p>0.77</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>ns</p>
                     </c>
                     <c ca="center">
                        <p>2.95e-11</p>
                     </c>
                     <c ca="center">
                        <p>1.33e-06</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p><sup>a</sup>percentage of the promoters in a class having the specific motif interaction.</p>
                  <p><sup>b</sup>calculated using Fisher's exact test. Holm's method was used for multiple hypothesis testing. ns, not significant.</p>
               </tblfn>
            </tbl>
            <p>One concern in this type of analysis is that the definition of, or method of searching for, the motifs may affect the results by altering the number and locations of the identified sequences. We therefore repeated our analysis using two alternative methods for calculating the position requirement for the motifs (Table S1 [see Additional file <supplr sid="S2">2</supplr>]), and also by locating motif instances using Patser <abbrgrp><abbr bid="B33">33</abbr></abbrgrp> and the position weight matrices defined by Ohler <it>et al</it>. <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>. (Although we use the weight matrices defined by Ohler, this should not be confused with use of the promoter predictions obtained in <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>. We find limited correspondence between these predictions and the promoter positions obtained from the <it>Drosophila </it>annotation when a prediction is required to fall &lt; 18 bp from an annotated TSS, our cutoff for matching a promoter to a TSS, rather than within the more relaxed 500 bp window allowed by Ohler; this is true even for the "high-quality" and "cap-supported" data.) All three methods gave qualitatively similar results with respect to the occurrence biases of individual motifs among the three different classes of promoters (Tables S2 and S3 [see Additional file <supplr sid="S2">2</supplr>]). That is, although the different methods occasionally found different absolute numbers of motif instances (e.g., Patser found more TATA boxes than the regular expression methods), the motif occurrence biases among the respective promoter classes were the same.</p>
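            <p>The difference between the two search strategies can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reimplementation of Patser or of the regular expression searches used in the study: the count matrix, the uniform background, the score threshold and all function names are hypothetical, and the example sequence is invented.</p>

```python
import math
import re

# Hypothetical 4-column count matrix for a TATA-like motif (not the Ohler matrices).
COUNTS = {
    "A": [2, 18, 1, 17],
    "C": [1, 0, 1, 1],
    "G": [1, 1, 0, 1],
    "T": [16, 1, 18, 1],
}
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}  # assumed uniform


def log_odds_matrix(counts, background, pseudocount=1.0):
    """Convert per-position base counts into log2 odds scores."""
    width = len(counts["A"])
    matrix = []
    for pos in range(width):
        column_total = sum(counts[b][pos] for b in "ACGT") + 4 * pseudocount
        matrix.append({
            b: math.log2((counts[b][pos] + pseudocount) / column_total / background[b])
            for b in "ACGT"
        })
    return matrix


def pwm_hits(sequence, matrix, threshold):
    """Return (start, score) for every window scoring at or above the threshold.

    The sequence is expected to contain only the upper-case letters A, C, G, T.
    """
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, round(score, 2)))
    return hits


def regex_hits(sequence, pattern):
    """Return start positions of all (possibly overlapping) pattern matches."""
    return [m.start() for m in re.finditer("(?=" + pattern + ")", sequence)]


sequence = "GGTATACCTATA"  # invented example sequence
print(regex_hits(sequence, "TATA"))
print(pwm_hits(sequence, log_odds_matrix(COUNTS, BACKGROUND), threshold=5.0))
```

            <p>A weight matrix with a permissive threshold accepts near-matches that an exact pattern rejects, which is one way the absolute number of motif instances can differ between methods while the relative biases among promoter classes stay the same.</p>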
         </sec>
         <sec>
            <st>
               <p>Human promoter classes also have distinct nucleotide compositions and motif preferences</p>
            </st>
            <p>We performed an analysis similar to the one described for <it>Drosophila </it>using 4506 validated human promoters <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>. Nucleotide distribution plots show that both T and C levels vary among the three sets of promoters over the length of the promoter region, and A and G levels vary near the TSSs (Figure S2 [see Additional file <supplr sid="S1">1</supplr>]). The average GC content in human UPs is significantly higher than that in FAPs and DAPs (0.595 (0.101) versus 0.581 (0.123), Kolmogorov-Smirnov test <it>P </it>= 2.99e-06; 0.595 (0.101) versus 0.560 (0.127), Kolmogorov-Smirnov test <it>P </it>= 0). Thus while the specific nucleotide compositions of the promoters are different between flies and humans, in both species there are pronounced differences in nucleotide preference among the three promoter classes.</p>
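            <p>For readers who wish to see the shape of such a comparison, the sketch below computes per-promoter GC content and the two-sample Kolmogorov-Smirnov statistic (the largest gap between two empirical cumulative distributions). The sequences are invented and the function names are hypothetical; no p value is computed here, although a library routine such as <it>scipy.stats.ks_2samp</it> could supply one.</p>

```python
from bisect import bisect_right


def gc_content(sequence):
    """Fraction of G and C bases in a non-empty sequence."""
    sequence = sequence.upper()
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic D.

    D is the maximum absolute difference between the two empirical
    cumulative distribution functions, evaluated at every observed value.
    """
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in sorted(set(a + b)):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Invented sequences standing in for two promoter classes; NOT the study's data.
class_one = ["GCGCGGCCAT", "GGCGCCTAGC", "CCGGGCGCAA"]
class_two = ["ATATGCATTA", "TTAGCATATA", "GCATATTTAA"]
gc_one = [gc_content(s) for s in class_one]
gc_two = [gc_content(s) for s in class_two]
print(f"D = {ks_statistic(gc_one, gc_two):.2f}")
```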
            <p>To determine whether human promoters also display motif preferences among the three promoter classes, as we observed in <it>Drosophila</it>, we mapped the locations of eight motifs previously identified as being overrepresented in human promoter sequences <abbrgrp><abbr bid="B35">35</abbr></abbrgrp> (Table S4 [see Additional file <supplr sid="S2">2</supplr>]). Once again, we found significant differences in motif usage among the promoter classes (Table <tblr tid="T3">3</tblr>). In contrast to the fly promoter motifs, which collectively are distributed evenly among the promoter classes, most of the human promoter motifs have a higher occurrence frequency in UPs, and the four motifs showing occurrence bias in different classes of promoters all differ significantly between unique and alternative promoters. This compares well with the results of Baek <it>et al</it>. <abbrgrp><abbr bid="B24">24</abbr></abbrgrp> showing a six-fold higher frequency of the TATA box in CpG-poor UPs. Our results suggest not only that motif usage differs among human promoter classes, but also that the motifs preferred in alternative promoters have not yet been identified in humans. Consistent with this idea, we note that a much greater proportion of alternative promoters than of unique promoters lack any of the identified motifs we focused on in this study (53% vs. 34%).</p>
            <tbl id="T3">
               <title>
                  <p>Table 3</p>
               </title>
               <caption>
                  <p>Presence of the eight human promoter motifs in the three promoter classes</p>
               </caption>
               <tblbdy cols="7">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c cspan="3" ca="center">
                        <p>Occurrence percentage<sup>a</sup></p>
                     </c>
                     <c cspan="3" ca="center">
                        <p>p value<sup>b</sup></p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Motif</p>
                     </c>
                     <c ca="right">
                        <p>UPs</p>
                     </c>
                     <c ca="right">
                        <p>FAPs</p>
                     </c>
                     <c ca="right">
                        <p>DAPs</p>
                     </c>
                     <c ca="center">
                        <p>UPs vs. FAPs</p>
                     </c>
                     <c ca="center">
                        <p>UPs vs. DAPs</p>
                     </c>
                     <c ca="center">
                        <p>FAPs vs. DAPs</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>CCAAT</p>
                     </c>
                     <c ca="right">
                        <p>12.25</p>
                     </c>
                     <c ca="right">
                        <p>10.16</p>
                     </c>
                     <c ca="right">
                        <p>9.78</p>
                     </c>
                     <c ca="center">
                        <p>ns</p>
                     </c>
                     <c ca="center">
                        <p>ns</p>
                     </c>
                     <c ca="center">
                        <p>ns</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>SP1</p>
                     </c>
                     <c ca="right">
                        <p>37.91</p>
                     </c>
                     <c ca="right">
                        <p>34.35</p>
                     </c>
                     <c ca="right">
                        <p>30.69</p>
                     </c>
                     <c ca="center">
                        <p>ns</p>
                     </c>
                     <c ca="center">
                        <p>1.33e-04</p>
                     </c>
                     <c ca="center">
                        <p>ns</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>CLUS1</p>
                     </c>
                     <c ca="right">
                        <p>1.82</p>
                     </c>
                     <c ca="right">
                        <p>0.81</p>
                     </c>
                     <c ca="right">
                        <p>0.50</p>
                     </c>
                     <c ca="center">
                        <p>ns</p>
                     </c>
                     <c ca="center">
                        <p>ns</p>
                     </c>
                     <c ca="center">
                        <p>ns</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>USF</p>
                     </c>
                     <c ca="right">
                        <p>2.76</p>
                     </c>
                     <c ca="right">
                        <p>1.94</p>
                     </c>
                     <c ca="right">
                        <p>1.61</p>
                     </c>
                     <c ca="center">
                        <p>ns</p>
                     </c>
                     <c ca="center">
                        <p>ns</p>
                     </c>
                     <c ca="center">
                        <p>ns</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>CREB</p>
                     </c>
                     <c ca="right">
                        <p>3.02</p>
                     </c>
                     <c ca="right">
                        <p>2.42</p>
                     </c>
                     <c ca="right">
                        <p>3.47</p>
                     </c>
                     <c ca="center">
                        <p>ns</p>
                     </c>
                     <c ca="center">
                        <p>ns</p>
                     </c>
                     <c ca="center">
                        <p>ns</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>TATA</p>
                     </c>
                     <c ca="right">
                        <p>6.60</p>
                     </c>
                     <c ca="right">
                        <p>1.94</p>
                     </c>
                     <c ca="right">
                        <p>3.22</p>
                     </c>
                     <c ca="center">
                        <p>6.97e-07</p>
                     </c>
                     <c ca="center">
                        <p>1.47e-04</p>
                     </c>
                     <c ca="center">
                        <p>ns</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>NRF-1</p>
                     </c>
                     <c ca="right">
                        <p>10.23</p>
                     </c>
                     <c ca="right">
                        <p>3.71</p>
                     </c>
                     <c ca="right">
                        <p>3.34</p>
                     </c>
                     <c ca="center">
                        <p>2.65e-08</p>
                     </c>
                     <c ca="center">
                        <p>1.89e-11</p>
                     </c>
                     <c ca="center">
                        <p>ns</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>ETS</p>
                     </c>
                     <c ca="right">
                        <p>17.28</p>
                     </c>
                     <c ca="right">
                        <p>5.97</p>
                     </c>
                     <c ca="right">
                        <p>6.44</p>
                     </c>
                     <c ca="center">
                        <p>1.07e-14</p>
                     </c>
                     <c ca="center">
                        <p>1.26e-16</p>
                     </c>
                     <c ca="center">
                        <p>ns</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>None<sup>c</sup></p>
                     </c>
                     <c ca="right">
                        <p>33.59</p>
                     </c>
                     <c ca="right">
                        <p>51.77</p>
                     </c>
                     <c ca="right">
                        <p>53.59</p>
                     </c>
                     <c ca="center">
                        <p>3.75e-17</p>
                     </c>
                     <c ca="center">
                        <p>9.32e-25</p>
                     </c>
                     <c ca="center">
                        <p>ns</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p><sup>a</sup>percentage of the promoters in a class having the specific motif.</p>
                  <p><sup>b</sup>calculated using Fisher's exact test. Holm's method was used to correct for multiple hypothesis testing. ns, not significant.</p>
                  <p><sup>c</sup>percentage of promoters without any known motifs.</p>
               </tblfn>
            </tbl>
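            <p>The Holm correction named in the table footnote can be sketched as follows. This is a minimal illustration of the standard step-down procedure, not the code used for the analysis; the function name holm_adjust and the example p-values are hypothetical.</p>

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values, in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, index in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[index])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[index] = running_max
    return adjusted


# Hypothetical raw p-values, for illustration only (not taken from the tables).
example = [0.01, 0.04, 0.03]
print(holm_adjust(example))
```

            <p>An adjusted value at or above the chosen significance threshold would be reported as not significant.</p>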
         </sec>
         <sec>
            <st>
               <p>Promoter motif usage differs based on promoter position within a gene</p>
            </st>
            <p>The preceding analyses provide a general picture of promoter motif distributions among the different promoter classes, but do not tell us how motif profiles vary among the alternative promoters of the same gene. Such information could provide important insights into alternative promoter use and evolution. For example, the presence of highly similar sets of motifs would suggest the possibility of extensive co-regulation of the alternative promoters. Conversely, very different motif compositions would be more consistent with differential regulation in which a distal regulatory element could interact with only one or a subset of the promoters. Landry <it>et al</it>. <abbrgrp><abbr bid="B36">36</abbr></abbrgrp> suggest a number of ways that alternative promoters may have evolved, one of which is through local sequence duplication. This scenario would likely be reflected in higher-than-expected motif similarity, even given subsequent mutation away from the original sequence.</p>
            <p>To evaluate the degree of relationship among the alternative promoters of individual genes, we compared which of the 15 promoter motifs are present in the first and second promoters of all <it>Drosophila </it>genes with alternative promoters. Each pair of promoters was then assigned one of three levels of motif similarity: <it>similar</it>, <it>intermediate</it>, or <it>different </it>(see Methods). Genes for which one or both of the two promoters contained no known motifs were excluded from the analysis.</p>
            <p>In general, we find that the motif compositions of first and second alternative promoters are highly dissimilar. However, in comparing alternative promoter pairs, we noticed that in some genes the two promoters are close enough in position that our motif mapping rules assign the same motif to both promoters. Without experimental data, we are unable to determine whether such motifs, which we refer to as "putatively shared motifs," are actually used by both promoters or by just one of the pair. If we assume that putatively shared motifs are in fact used by both promoters, we find that although the first two promoters differ in motif composition for the majority of genes, more promoter pairs are similar than we would expect at random (Table <tblr tid="T4">4</tblr>, "shared motifs allowed" and Figure S3 [see Additional file <supplr sid="S6">6</supplr>]). If we instead assume that a putatively shared motif is used by only one of the two promoters, the motif profiles of two alternative promoters of the same gene still differ from the random expectation, but the statistical support for this conclusion is weaker and is not uniformly significant across the three promoter datasets (Table <tblr tid="T4">4</tblr>, "shared motifs not allowed" and Figure S3 [see Additional file <supplr sid="S6">6</supplr>]). Thus, while the motif compositions of alternative promoters of the same gene are dissimilar overall, the extent to which this dissimilarity is significantly less than what we would expect for a random pair of promoters depends on how the putatively shared motifs are actually used in the respective promoters. The answer to this question must await detailed experimental investigation.</p>
            <tbl id="T4">
               <title>
                  <p>Table 4</p>
               </title>
               <caption>
                  <p>Motif similarity between the first and second promoters of the same gene<sup>a</sup></p>
               </caption>
               <tblbdy cols="16">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c cspan="3" ca="center">
                        <p>Similar<sup>b</sup></p>
                     </c>
                     <c cspan="3" ca="center">
                        <p>Intermediate<sup>b</sup></p>
                     </c>
                     <c cspan="3" ca="center">
                        <p>Different<sup>b</sup></p>
                     </c>
                     <c cspan="3" ca="center">
                        <p># of used pairs</p>
                     </c>
                     <c cspan="3" ca="center">
                        <p># of omitted pairs</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="16">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>all<sup>c</sup></p>
                     </c>
                     <c ca="center">
                        <p>high quality<sup>d</sup></p>
                     </c>
                     <c ca="center">
                        <p>cap<sup>e</sup></p>
                     </c>
                     <c ca="center">
                        <p>all</p>
                     </c>
                     <c ca="center">
                        <p>high quality</p>
                     </c>
                     <c ca="center">
                        <p>cap</p>
                     </c>
                     <c ca="center">
                        <p>all</p>
                     </c>
                     <c ca="center">
                        <p>high quality</p>
                     </c>
                     <c ca="center">
                        <p>cap</p>
                     </c>
                     <c ca="center">
                        <p>all</p>
                     </c>
                     <c ca="center">
                        <p>high quality</p>
                     </c>
                     <c ca="center">
                        <p>cap</p>
                     </c>
                     <c ca="center">
                        <p>all</p>
                     </c>
                     <c ca="center">
                        <p>high quality</p>
                     </c>
                     <c ca="center">
                        <p>cap</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="16">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Observed value (shared motifs allowed)</p>
                     </c>
                     <c ca="center">
                        <p>34.07</p>
                     </c>
                     <c ca="center">
                        <p>31.52</p>
                     </c>
                     <c ca="center">
                        <p>36.04</p>
                     </c>
                     <c ca="center">
                        <p>17.70</p>
                     </c>
                     <c ca="center">
                        <p>16.30</p>
                     </c>
                     <c ca="center">
                        <p>23.42</p>
                     </c>
                     <c ca="center">
                        <p>48.23</p>
                     </c>
                     <c ca="center">
                        <p>52.17</p>
                     </c>
                     <c ca="center">
                        <p>40.54</p>
                     </c>
                     <c ca="center">
                        <p>226</p>
                     </c>
                     <c ca="center">
                        <p>92</p>
                     </c>
                     <c ca="center">
                        <p>111</p>
                     </c>
                     <c ca="center">
                        <p>1182</p>
                     </c>
                     <c ca="center">
                        <p>445</p>
                     </c>
                     <c ca="center">
                        <p>278</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="16">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Observed value (shared motifs not allowed)</p>
                     </c>
                     <c ca="center">
                        <p>12.65</p>
                     </c>
                     <c ca="center">
                        <p>11.59</p>
                     </c>
                     <c ca="center">
                        <p>15.19</p>
                     </c>
                     <c ca="center">
                        <p>15.66</p>
                     </c>
                     <c ca="center">
                        <p>15.94</p>
                     </c>
                     <c ca="center">
                        <p>21.52</p>
                     </c>
                     <c ca="center">
                        <p>71.69</p>
                     </c>
                     <c ca="center">
                        <p>72.46</p>
                     </c>
                     <c ca="center">
                        <p>63.29</p>
                     </c>
                     <c ca="center">
                        <p>166</p>
                     </c>
                     <c ca="center">
                        <p>69</p>
                     </c>
                     <c ca="center">
                        <p>79</p>
                     </c>
                     <c ca="center">
                        <p>1242</p>
                     </c>
                     <c ca="center">
                        <p>468</p>
                     </c>
                     <c ca="center">
                        <p>310</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="16">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Random mean<sup>f</sup></p>
                     </c>
                     <c ca="center">
                        <p>9.50</p>
                     </c>
                     <c ca="center">
                        <p>8.34</p>
                     </c>
                     <c ca="center">
                        <p>8.67</p>
                     </c>
                     <c ca="center">
                        <p>10.83</p>
                     </c>
                     <c ca="center">
                        <p>12.51</p>
                     </c>
                     <c ca="center">
                        <p>14.31</p>
                     </c>
                     <c ca="center">
                        <p>79.66</p>
                     </c>
                     <c ca="center">
                        <p>79.15</p>
                     </c>
                     <c ca="center">
                        <p>77.03</p>
                     </c>
                     <c ca="center">
                        <p>226</p>
                     </c>
                     <c ca="center">
                        <p>92</p>
                     </c>
                     <c ca="center">
                        <p>111</p>
                     </c>
                     <c ca="center">
                        <p>1182</p>
                     </c>
                     <c ca="center">
                        <p>445</p>
                     </c>
                     <c ca="center">
                        <p>278</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="16">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>p-value<sup>g</sup> (random &gt;= observed value, shared motifs allowed)</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>2e-04</p>
                     </c>
                     <c ca="center">
                        <p>0.1399</p>
                     </c>
                     <c ca="center">
                        <p>0.0021</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="16">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>p-value<sup>g</sup> (random &lt;= observed value, shared motifs allowed)</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0.9227</p>
                     </c>
                     <c ca="center">
                        <p>0.9991</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="16">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>p-value<sup>g</sup> (random &gt;= observed value, shared motifs not allowed)</p>
                     </c>
                     <c ca="center">
                        <p>0.0438</p>
                     </c>
                     <c ca="center">
                        <p>0.1286</p>
                     </c>
                     <c ca="center">
                        <p>0.0078</p>
                     </c>
                     <c ca="center">
                        <p>0.0041</p>
                     </c>
                     <c ca="center">
                        <p>0.1399</p>
                     </c>
                     <c ca="center">
                        <p>0.0108</p>
                     </c>
                     <c ca="center">
                        <p>0.9991</p>
                     </c>
                     <c ca="center">
                        <p>0.9622</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="16">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>p-value<sup>g</sup> (random &lt;= observed value, shared motifs not allowed)</p>
                     </c>
                     <c ca="center">
                        <p>0.9562</p>
                     </c>
                     <c ca="center">
                        <p>0.8714</p>
                     </c>
                     <c ca="center">
                        <p>0.9922</p>
                     </c>
                     <c ca="center">
                        <p>0.9959</p>
                     </c>
                     <c ca="center">
                        <p>0.8601</p>
                     </c>
                     <c ca="center">
                        <p>0.9892</p>
                     </c>
                     <c ca="center">
                        <p>9e-04</p>
                     </c>
                     <c ca="center">
                        <p>0.0378</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p><sup>a</sup>genes with exactly two alternative promoters. Similar results were obtained for genes with multiple alternative promoters (data not shown).</p>
                  <p><sup>b</sup>percentage of the alternative promoter pairs defined as similar, intermediate, and different (see Methods).</p>
                  <p><sup>c</sup>The promoter set contains all promoters from <it>Drosophila </it>genome annotation release 5.5.</p>
                  <p><sup>d</sup>The promoter set contains only high quality promoters in the fly genome; see Methods.</p>
                  <p><sup>e</sup>The promoter set contains only cap-supported promoters in the fly genome; see Methods.</p>
                  <p><sup>f</sup>mean of 10,000 randomizations.</p>
                  <p><sup>g</sup>empirical p-value calculated as the proportion of sampled randomizations in which the percentage of promoter pairs in each of the three similarity levels was no less than/no greater than the observed value.</p>
               </tblfn>
            </tbl>
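            <p>The empirical p-values in Table 4 are described as the proportion of randomizations in which the randomized percentage was no less than, or no greater than, the observed percentage. A minimal sketch of that calculation is given below; the function name empirical_p_values and the short list of randomized percentages are hypothetical stand-ins, and the randomization procedure itself (see Methods) is not reproduced.</p>

```python
def empirical_p_values(observed, randomized):
    """Return (p_ge, p_le) for one similarity level.

    p_ge: proportion of randomized values no less than the observed value.
    p_le: proportion of randomized values no greater than the observed value.
    """
    n = len(randomized)
    if n == 0:
        raise ValueError("at least one randomization is required")
    p_ge = sum(1 for value in randomized if value >= observed) / n
    p_le = sum(1 for value in randomized if observed >= value) / n
    return p_ge, p_le


# Hypothetical percentages from five randomizations (illustration only;
# the table reports means over 10,000 randomizations).
randomized_percentages = [8.0, 9.5, 10.0, 12.65, 15.0]
p_ge, p_le = empirical_p_values(12.65, randomized_percentages)
print(f"P(random >= observed) = {p_ge}")
print(f"P(random no greater than observed) = {p_le}")
```

            <p>Because a randomized value equal to the observed value is counted in both proportions, the two p-values need not sum to exactly 1.</p>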
            <suppl id="S6">
               <title>
                  <p>Additional file 6</p>
               </title>
               <text>
                  <p><b>Figure S3</b>. Motif dissimilarity distributions between alternative promoters of the same gene and between neighboring unique promoters.</p>
               </text>
               <file name="1471-2164-10-9-S6.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <p>As mentioned above, promoter pairs in which at least one promoter did not contain any of the 15 motifs, approximately 84% of the potential promoter pairs (Table <tblr tid="T4">4</tblr>), were omitted because we had no basis for determining how similar or different two such promoters were. Moreover, given the limited number of known promoter motifs, it is likely that for any of the promoter pairs there are additional but unidentified relevant sequence motifs. To get around these twin difficulties, we compared how similar the promoters in the full set of promoter pairs were to one another using <it>D2z </it>scores. The <it>D2z </it>score is an alignment-free sequence comparison metric that compares <it>k</it>-mer word distributions between two sequences <abbrgrp><abbr bid="B37">37</abbr></abbrgrp>. The alternative promoter pairs of genes with exactly two promoters were more likely to have a higher <it>D2z </it>score than random expectation (odds ratio for having a <it>D2z </it>score higher than the 85th percentile of random expectation = 1.44; one-sided Fisher's exact <it>P </it>= 1.72e-05). Similar results were obtained for genes with multiple promoters (data not shown). This is not merely a result of the score distribution being driven by the subset of promoters with known motifs; breaking down the data into pairs for which both promoters contained known motifs, one promoter contained known motifs, or neither promoter contained known motifs revealed no differences in the score distribution among the subsets (Fisher's exact <it>P </it>= 0.3594). These results indicate that for the full promoter data set, even without explicitly considering defined promoter motifs, alternative promoters of the same gene are more similar than random expectation, consistent with the results of our motif-based analysis of the smaller, motif-containing promoter subset.</p>
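            <p>To illustrate what an alignment-free comparison of <it>k</it>-mer word distributions involves, the sketch below computes a raw D2 statistic, the sum over all words of the product of their counts in the two sequences. As we understand it, the D2z score is a standardized version of this quantity relative to a background model; that standardization is not reproduced here, and the function names, the toy sequences, and the choice of k are assumptions made for illustration only.</p>

```python
from collections import Counter


def kmer_counts(sequence, k):
    """Count every overlapping word of length k in the sequence."""
    sequence = sequence.upper()
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def d2_score(seq_a, seq_b, k):
    """Raw D2 statistic: inner product of the two k-mer count vectors."""
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    # A Counter returns 0 for words absent from seq_b, so unshared words add nothing.
    return sum(count * counts_b[word] for word, count in counts_a.items())


# Toy sequences, not real promoter sequences.
print(d2_score("TATAAAGGC", "GGTATAAAC", 3))
print(d2_score("TATAAAGGC", "CCCCGGGGC", 3))
```

            <p>Two sequences that share many words receive a higher raw score than two that share few, without any alignment being computed.</p>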
         </sec>
         <sec>
            <st>
               <p>Similarity in motif composition between promoters of neighboring genes correlates with gene co-expression</p>
            </st>
            <p>Not only is the analysis of alternative promoter pairs complicated by the possibility of shared motifs, but in most cases we also lack good data on promoter-specific gene expression. On the other hand, questions about alternative promoter usage are paralleled by those with respect to neighboring genes: how do more distal regulatory elements select the proper promoter to activate when confronted by two or more promoters in relative proximity to one another? Unlike for alternative promoters, in the case of neighboring genes we have both unambiguous core promoter motif assignments and extensive gene expression data. Therefore, in addition to comparing promoter motif profiles among alternative promoters of the same gene, we looked at the motif composition of the promoters of neighboring genes. So as not to confound our analysis with choices as to which alternative promoters to consider, we focused on neighboring genes with single promoters only [see Additional file <supplr sid="S7">7</supplr>], and compared the profiles of the 15 promoter motifs in the same way that we analyzed the motif composition of alternative promoter pairs. Although the motif profiles for the majority of neighboring genes are different, we found that the number of promoter pairs with similar motifs is significantly higher than the random expectation (Table <tblr tid="T5">5</tblr> and Figure S3 [see Additional file <supplr sid="S6">6</supplr>]; 270/1516 vs. 129/1516, <it>P </it>&#8776; 0). A small number of neighboring unique promoters, most of them bi-directional, could potentially share motifs, as we saw with some of the alternative promoters. However, removing the putatively shared motifs from one or the other of the neighboring promoters did not significantly change the result (Table <tblr tid="T5">5</tblr> and Figure S3 [see Additional file <supplr sid="S6">6</supplr>]). In other words, the promoters of neighboring genes are more closely related to one another than we would expect to see by chance alone.</p>
            <suppl id="S7">
               <title>
                  <p>Additional file 7</p>
               </title>
               <text>
                  <p><b>Neighboring single-promoter genes</b>. This file contains the names and coordinates of the pairs of neighboring single-promoter genes used for the analysis presented in Table <tblr tid="T5">5</tblr> and Figure <figr fid="F3">3</figr>.</p>
               </text>
               <file name="1471-2164-10-9-S7.txt">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <tbl id="T5">
               <title>
                  <p>Table 5</p>
               </title>
               <caption>
                  <p>Motif similarity between neighboring unique promoters</p>
               </caption>
               <tblbdy cols="16">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c cspan="3" ca="center">
                        <p>Similar<sup>a</sup></p>
                     </c>
                     <c cspan="3" ca="center">
                        <p>Intermediate<sup>a</sup></p>
                     </c>
                     <c cspan="3" ca="center">
                        <p>Different<sup>a</sup></p>
                     </c>
                     <c cspan="3" ca="center">
                        <p># of used pairs</p>
                     </c>
                     <c cspan="3" ca="center">
                        <p># of omitted pairs</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="16">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>all<sup>b</sup></p>
                     </c>
                     <c ca="center">
                        <p>high quality<sup>c</sup></p>
                     </c>
                     <c ca="center">
                        <p>cap<sup>d</sup></p>
                     </c>
                     <c ca="center">
                        <p>all</p>
                     </c>
                     <c ca="center">
                        <p>high quality</p>
                     </c>
                     <c ca="center">
                        <p>cap</p>
                     </c>
                     <c ca="center">
                        <p>all</p>
                     </c>
                     <c ca="center">
                        <p>high quality</p>
                     </c>
                     <c ca="center">
                        <p>cap</p>
                     </c>
                     <c ca="center">
                        <p>all</p>
                     </c>
                     <c ca="center">
                        <p>high quality</p>
                     </c>
                     <c ca="center">
                        <p>cap</p>
                     </c>
                     <c ca="center">
                        <p>all</p>
                     </c>
                     <c ca="center">
                        <p>high quality</p>
                     </c>
                     <c ca="center">
                        <p>cap</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="16">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Observed value (shared motifs allowed)</p>
                     </c>
                     <c ca="center">
                        <p>17.81</p>
                     </c>
                     <c ca="center">
                        <p>16.44</p>
                     </c>
                     <c ca="center">
                        <p>15.47</p>
                     </c>
                     <c ca="center">
                        <p>16.36</p>
                     </c>
                     <c ca="center">
                        <p>16.90</p>
                     </c>
                     <c ca="center">
                        <p>21.13</p>
                     </c>
                     <c ca="center">
                        <p>65.83</p>
                     </c>
                     <c ca="center">
                        <p>66.67</p>
                     </c>
                     <c ca="center">
                        <p>63.40</p>
                     </c>
                     <c ca="center">
                        <p>1516</p>
                     </c>
                     <c ca="center">
                        <p>1083</p>
                     </c>
                     <c ca="center">
                        <p>530</p>
                     </c>
                     <c ca="center">
                        <p>8224</p>
                     </c>
                     <c ca="center">
                        <p>4200</p>
                     </c>
                     <c ca="center">
                        <p>983</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="16">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Observed value (shared motifs not allowed)</p>
                     </c>
                     <c ca="center">
                        <p>16.64</p>
                     </c>
                     <c ca="center">
                        <p>15.06</p>
                     </c>
                     <c ca="center">
                        <p>14.34</p>
                     </c>
                     <c ca="center">
                        <p>15.50</p>
                     </c>
                     <c ca="center">
                        <p>15.91</p>
                     </c>
                     <c ca="center">
                        <p>19.96</p>
                     </c>
                     <c ca="center">
                        <p>67.86</p>
                     </c>
                     <c ca="center">
                        <p>69.03</p>
                     </c>
                     <c ca="center">
                        <p>65.70</p>
                     </c>
                     <c ca="center">
                        <p>1484</p>
                     </c>
                     <c ca="center">
                        <p>1056</p>
                     </c>
                     <c ca="center">
                        <p>516</p>
                     </c>
                     <c ca="center">
                        <p>8256</p>
                     </c>
                     <c ca="center">
                        <p>4227</p>
                     </c>
                     <c ca="center">
                        <p>997</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="16">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Random mean<sup>e</sup></p>
                     </c>
                     <c ca="center">
                        <p>8.48</p>
                     </c>
                     <c ca="center">
                        <p>8.25</p>
                     </c>
                     <c ca="center">
                        <p>7.60</p>
                     </c>
                     <c ca="center">
                        <p>10.45</p>
                     </c>
                     <c ca="center">
                        <p>10.96</p>
                     </c>
                     <c ca="center">
                        <p>11.96</p>
                     </c>
                     <c ca="center">
                        <p>81.07</p>
                     </c>
                     <c ca="center">
                        <p>80.79</p>
                     </c>
                     <c ca="center">
                        <p>80.44</p>
                     </c>
                     <c ca="center">
                        <p>1516</p>
                     </c>
                     <c ca="center">
                        <p>1083</p>
                     </c>
                     <c ca="center">
                        <p>530</p>
                     </c>
                     <c ca="center">
                        <p>8224</p>
                     </c>
                     <c ca="center">
                        <p>4200</p>
                     </c>
                     <c ca="center">
                        <p>983</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="16">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>p-value<sup>f </sup>(random >= observed value, shared motifs allowed)</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="16">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>p-value<sup>f </sup>(random &lt;= observed value, shared motifs allowed)</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="16">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>p-value<sup>f </sup>(random >= observed value, shared motifs not allowed)</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="16">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>p-value<sup>f </sup>(random &lt;= observed value, shared motifs not allowed)</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                     <c ca="center">
                        <p>--</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p><sup>a</sup>percentage of promoter pairs defined as similar, intermediate, and different (see Methods).</p>
                  <p><sup>b</sup>The promoter set contains all promoters from <it>Drosophila </it>genome annotation release 5.5.</p>
                  <p><sup>c</sup>The promoter set contains only high quality promoters in the fly genome; see Methods.</p>
                  <p><sup>d</sup>The promoter set contains only cap-supported promoters in the fly genome; see Methods.</p>
                  <p><sup>e</sup>mean of 10,000 randomizations.</p>
                  <p><sup>f</sup>empirical p-value calculated as the proportion of sampled randomizations where the percentage of the promoter pairs in each of the three similarity levels was no less than/no greater than the observed values.</p>
               </tblfn>
            </tbl>
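            <p>The empirical p-values defined in footnote f can be illustrated with a minimal Python sketch. The function name empirical_p_values and the toy numbers are hypothetical and are not taken from the analysis itself; the sketch only shows how the two one-sided proportions (random no less than observed, random no greater than observed) would be counted over a set of randomizations.</p>

```python
def empirical_p_values(observed, randomized):
    """Return (p_ge, p_le) for one observed percentage.

    p_ge: fraction of randomized values that are no less than the observed value.
    p_le: fraction of randomized values that are no greater than the observed value.
    """
    n = len(randomized)
    if n == 0:
        raise ValueError("need at least one randomization")
    p_ge = sum(1 for value in randomized if value >= observed) / n
    p_le = sum(1 for value in randomized if observed >= value) / n
    return p_ge, p_le


# Hypothetical toy input: an observed percentage and four randomized percentages.
p_ge, p_le = empirical_p_values(3, [1, 2, 3, 4])
```

            <p>With the toy input above, two of the four randomized values are no less than 3 and three are no greater than 3, so the sketch returns 0.5 and 0.75. In the table, a p-value of 0 means that none of the 10,000 randomizations met the stated condition.</p>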
            <p>One possible reason for neighboring genes to have similar promoter organizations is that the two genes arose from a local sequence duplication event. Indeed, we observed higher sequence similarity both in the transcribed regions and in the promoter regions from -130 bp to +50 bp (in which all 15 motifs reside; Table S1 [see Additional file <supplr sid="S2">2</supplr>]) for neighboring unique genes that have similar motifs than for those with different motifs (Fig. <figr fid="F3">3</figr>). Among pairs with similar motif profiles, 20.5% of genes and 14.1% of promoters have greater than 65% sequence alignment, versus 2.2% and 0.5%, respectively, for pairs with different profiles (one-sided Fisher's exact <it>P </it>values = 1.52e-22 and 1.18e-21). For neighboring unique promoters with similar motifs, but not for those with disparate motifs, promoter sequence similarity is highly correlated with gene sequence similarity (<it>r </it>= 0.70, versus <it>r </it>= 0.07), indicating that not just the known motifs but the sequences of the promoter region in general are highly related. Note that the majority of genes (79%) with similar promoter motif profiles do not appear to be the result of gene duplication and that, even where duplication is apparent, promoter sequences have diverged much more rapidly than the gene sequences. Thus, while potentially a contributing factor, gene duplication by itself cannot explain the unexpectedly high incidence of similar promoter motif profiles in neighboring genes.</p>
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>Sequence similarity of neighboring genes and their promoters</p>
               </caption>
               <text>
                  <p><b>Sequence similarity of neighboring genes and their promoters</b>. (A) The gene sequence similarity of neighboring unique gene pairs whose motifs are similar (black bars) or different (white bars). (B) The promoter sequence (-130 to +50 bp) similarity of neighboring unique gene pairs whose motifs are similar (black bars) or different (white bars).</p>
               </text>
               <graphic file="1471-2164-10-9-3"/>
            </fig>
            <p>We wondered whether the high number of neighboring genes with similar promoter make-ups might be indicative of their being coordinately regulated, e.g., by interaction of their promoters with the same enhancer. We separated the neighboring gene pairs into two groups, those for which the promoters have a similar motif profile, and those whose promoters contain a different set of motifs. We then compared the expression patterns of the genes belonging to the two different groups using gene expression data from 13 tissue-specific gene expression profiles contained in FlyAtlas <abbrgrp><abbr bid="B38">38</abbr></abbrgrp>. For each gene pair, we calculated the degree of co-expression over the 13 tissues and found that neighboring genes whose promoters have a similar motif composition are significantly more likely to be co-expressed than neighboring gene pairs with different motifs (odds ratio of having degree of co-expression greater than 0.5 = 2.18; one-sided Fisher's exact <it>P </it>= 1.04e-04).</p>
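            <p>As an illustration of the kind of test reported here, the following Python sketch computes a sample odds ratio and a one-sided Fisher's exact P value for a 2x2 table (rows: similar versus different motif profiles; columns: co-expressed versus not). The function name fisher_one_sided and the counts in the example are hypothetical; the source does not state which odds ratio estimator or software was used, so this is a sketch under those assumptions rather than a reproduction of the analysis.</p>

```python
from math import comb


def fisher_one_sided(a, b, c, d):
    """Sample odds ratio and one-sided (enrichment) P value for [[a, b], [c, d]].

    The P value is the hypergeometric probability of seeing a or more
    counts in the top-left cell when all margins are held fixed.
    """
    row1, col1, total = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(
        comb(row1, k) * comb(total - row1, col1 - k)
        for k in range(a, upper + 1)
    )
    p_value = tail / comb(total, col1)
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return odds_ratio, p_value


# Hypothetical counts, chosen only to keep the arithmetic easy to check.
odds_ratio, p_value = fisher_one_sided(3, 1, 1, 3)
```

            <p>For the toy table above, the odds ratio is 9 and the one-sided P value is 17/70 (about 0.24).</p>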
            <p>We also calculated how well correlated the <it>level </it>of expression was for genes of both groups. For each tissue, the genes were ranked according to their mean expression level, and the Pearson correlation coefficient of the expression ranks across all 13 tissues for any two genes was obtained. Neighboring genes whose promoters have similar motif profiles were more highly correlated in their expression level than neighboring genes having different motifs (odds ratio for having a correlation coefficient in the upper quartile (>0.5) = 1.93; one-sided Fisher's exact <it>P </it>= 2.97e-04).</p>
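            <p>The rank-based correlation described above can be sketched in Python as follows. All function names, the tie-breaking rule (by gene name), and the three-tissue toy data are assumptions made for illustration; the source specifies only that genes were ranked within each tissue and that the Pearson correlation of the ranks was taken across tissues.</p>

```python
from math import sqrt


def rank_within_tissue(levels):
    """Map gene -> rank (1 = lowest expression) for one tissue.

    Ties are broken by gene name, which is an assumption of this sketch.
    """
    ordered = sorted(levels, key=lambda gene: (levels[gene], gene))
    return {gene: position + 1 for position, gene in enumerate(ordered)}


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences with non-zero variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def rank_correlation(expression, gene_a, gene_b):
    """expression: list of per-tissue dicts mapping gene -> mean expression level."""
    ranks = [rank_within_tissue(tissue) for tissue in expression]
    return pearson([r[gene_a] for r in ranks], [r[gene_b] for r in ranks])


# Hypothetical data: three tissues, three genes.
toy = [
    {"g1": 1.0, "g2": 2.0, "g3": 3.0},
    {"g1": 5.0, "g2": 6.0, "g3": 1.0},
    {"g1": 9.0, "g2": 10.0, "g3": 11.0},
]
r_similar = rank_correlation(toy, "g1", "g2")
r_opposed = rank_correlation(toy, "g1", "g3")
```

            <p>In the toy data, g1 and g2 rise and fall together in rank across tissues, so their correlation is close to 1, whereas g1 and g3 move in opposite directions and give a correlation close to -1.</p>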
            <p>In order to be certain that these observed correlations between gene co-expression and promoter motif composition are not an artifact resulting from their high sequence similarity, we repeated our analysis using just those neighboring gene pairs with less than 65% gene sequence similarity. In this group as well, gene pairs with similar promoter motif profiles are more likely to be co-expressed in the same tissues and are more highly correlated in expression level than those with different motifs (one-sided Fisher's exact <it>P </it>= 1.90e-04 and 0.002, respectively).</p>
            <p>Our finding that approximately 18% of neighboring genes with unique promoters have similar promoter motif compositions, and that these genes tend to have correlated expression, is consistent with previous data on co-regulation of physically nearby <it>Drosophila </it>genes <abbrgrp><abbr bid="B39">39</abbr><abbr bid="B40">40</abbr><abbr bid="B41">41</abbr><abbr bid="B42">42</abbr><abbr bid="B43">43</abbr></abbrgrp>. In particular, our results are reminiscent of the finding by Spellman and Rubin <abbrgrp><abbr bid="B43">43</abbr></abbrgrp> that roughly 20% of <it>Drosophila </it>genes fall into clusters of adjacent genes with similar expression profiles. We find that neighboring genes within one of these clusters are 63% more likely to have similar promoter motif profiles than those not within clusters (odds ratio = 1.82; one-sided Fisher's exact <it>P </it>= 0.001). Thus at least some of the phenomenon described by Spellman and Rubin <abbrgrp><abbr bid="B43">43</abbr></abbrgrp> may be attributable to similarity in core promoter motifs among the genes in a co-expression cluster.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Discussion</p>
         </st>
         <sec>
            <st>
               <p>Genome organization at the promoter level</p>
            </st>
            <p>We have systematically compared the nucleotide composition and promoter motif profiles of UPs, FAPs, and DAPs throughout the <it>Drosophila </it>genome. Our results demonstrate that the three types of promoters have distinct sequence and motif preferences. Intriguingly, although consistent with results from human promoters reported by Baek <it>et al</it>. <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>, we observe clear differences based not only on whether a promoter is single or alternative, but also on the relative position of an alternative promoter among all of the alternative promoters. Eight of the 15 core and proximal promoter motifs we looked at occur differentially between FAPs and DAPs, and the DAPs have a distinct nucleotide profile. Thus, the genome appears to distinguish promoter types and positions. Our results suggest that each class may be subject to different modes of regulation or interact with differently constituted basal transcription complexes, and demonstrate that the genome has a complex organizational structure at the promoter level.</p>
            <p>It is worth noting that there is no universally accepted method for accurately identifying genomic subsequences as being relevant instances of a particular defined sequence motif, and as such, a strictly in silico analysis will always be dependent on choices of method and parameters. To guard against this, we used two different motif identification methods and three choices of range parameter for inclusion of a motif as part of the promoter. We also used three different promoter sets (four for some analyses which included the EPD set), none of which is likely to be completely accurate and which will tend to variously include too many non-promoters (false positives) or too few real promoters (false negatives). Reassuringly, we found few substantive differences in our results when considering different methods and datasets; moreover, trends tended to be clearly preserved, with the main differences appearing to stem from diminished statistical power when using the smaller data sets. Overall, we find our conclusions to be robust to the choice of motif identification method, search parameters, and promoter data set.</p>
            <p>Our analysis of a more limited number of validated human promoters revealed that just as in <it>Drosophila</it>, the different classes of promoters have distinct characteristics, and suggests that our observations point to a general metazoan organizational principle. For example, similar to what we found in the fly genome, the TATA motif is significantly overrepresented in human UPs (this study; <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>). In fact, all four significant human motifs are overrepresented in UPs, and there is a high percentage of alternative promoters that do not contain any known motifs. Thus promoter motifs in the human genome in general, and those specific to alternative promoters in particular, appear to be still awaiting identification. One reason that these motifs have proven difficult to define may lie in the abundance of alternative promoters in the human genome; the differences in motif composition and GC content among UPs, FAPs, and DAPs demonstrated by our results could contribute considerable noise to computational motif discovery attempts unless each promoter class is considered separately (see below).</p>
         </sec>
         <sec>
            <st>
               <p>Finding additional promoter motifs</p>
            </st>
            <p>More than half of the promoters in the fly genome do not contain any of the 15 motifs we used for this study (Table <tblr tid="T1">1</tblr> and Figure S4 [see Additional file <supplr sid="S8">8</supplr>]). These promoters are problematic for purposes of promoter comparison, and some fraction of them may not in fact be true promoters, but rather might represent errors in the genome annotation that we used. However, four lines of evidence suggest that the majority of these are genuine promoters. One, we see a similar fraction of promoters lacking the known motifs when we use our more selective "high-quality" and "cap-supported" promoter sets. Two, analysis of the experimentally-verified <it>Drosophila </it>promoters in the Eukaryotic Promoter Database (EPD) <abbrgrp><abbr bid="B44">44</abbr></abbrgrp> also reveals a high proportion without known motifs (Figure S4 [see Additional file <supplr sid="S8">8</supplr>]; because the EPD dataset is significantly smaller than the other three datasets we used and does not provide representative coverage of the entire genome, we did not use it for most of the analyses reported here). Three, we see similar results overall for our analysis of human promoters, which relies exclusively on experimentally-verified promoter sequences. Four, our results using the <it>D2z </it>score demonstrate that the promoters without known motifs behave identically to those that have known motifs. As a result, we believe that these represent true promoters and that a considerable number of promoter motifs remain to be identified. Indeed, a recent computational study has identified additional candidate <it>Drosophila </it>upstream promoter motifs <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>, and it will be interesting to determine how these distribute with respect to the known motifs and to one another, and among the various promoter classes. 
Notably, all of the promoter motif discovery conducted to date has been performed on UPs <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B34">34</abbr></abbrgrp> or on UPs and FAPs jointly <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>. As our data show that promoter motifs vary among the different single and alternative promoter classes, motifs specific for FAPs and DAPs may therefore be underrepresented among those that are known. Preliminary studies in our laboratory suggest that targeting motif discovery efforts to specific promoter subsets will be an effective strategy for identifying new motifs (J. Spix, QZ, and MSH, unpublished results).</p>
            <suppl id="S8">
               <title>
                  <p>Additional file 8</p>
               </title>
               <text>
                  <p><b>Figure S4</b>. Distribution of the number of mapped motifs in individual promoters.</p>
               </text>
               <file name="1471-2164-10-9-S8.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
         </sec>
         <sec>
            <st>
               <p>Promoters and gene expression neighborhoods</p>
            </st>
            <p>The majority of adjacent promoters, either from neighboring genes or from alternative promoters of the same gene, have highly dissimilar motif profiles. As promoter motifs have been implicated in helping to mediate specific enhancer-promoter interactions <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B16">16</abbr><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr></abbrgrp>, these differences are likely to represent one of the mechanisms used by the genome to prevent inappropriate gene activation by nearby CRMs. Nevertheless, neighboring genes are significantly more likely than expected to have highly similar promoters, with a strong correlation between motif similarity and strength of gene expression. Thus, a key role of the promoter may be in regulating levels of gene expression. Neighboring genes with similar promoters also show a concomitant increase in tissue co-expression, raising the possibility that they are coordinately regulated either by shared CRMs [e.g. <abbrgrp><abbr bid="B45">45</abbr></abbrgrp>] or by individual CRMs that bind a similar complement of transcription factors <abbrgrp><abbr bid="B46">46</abbr></abbrgrp>. Neighborhoods of co-regulated genes have been observed in many eukaryotes, including yeast, worm, fly, and human <abbrgrp><abbr bid="B39">39</abbr><abbr bid="B40">40</abbr><abbr bid="B41">41</abbr><abbr bid="B42">42</abbr><abbr bid="B43">43</abbr><abbr bid="B45">45</abbr><abbr bid="B46">46</abbr><abbr bid="B47">47</abbr><abbr bid="B48">48</abbr><abbr bid="B49">49</abbr><abbr bid="B50">50</abbr><abbr bid="B51">51</abbr></abbrgrp>. 
A frequently proposed mechanism for this phenomenon is the presence of local chromatin domains; that is, that a local "open" chromatin configuration favorable for transcription&#8211;perhaps due to strong activation of one of the genes in the neighborhood&#8211;leads to spurious activation of other nearby genes, as opposed to regulated activation of each gene <abbrgrp><abbr bid="B43">43</abbr></abbrgrp>. In contrast, our data suggest that promoter sequences constitute a significant component of gene co-expression neighborhoods and point to a higher degree of genomic organization and regulation than seen for a chromatin-centric model. Note that the two mechanisms need not be mutually exclusive; for instance, the related promoters might all contribute to a strong local change in chromatin conformation that would affect even those neighboring genes with different promoter make-ups. Detailed experimental investigation along with careful mapping of chromatin modifications throughout the co-expression neighborhoods will be required to tease apart the various contributions of individual regulatory <it>cis</it>-<it>trans </it>interactions and epigenetic modifications. However, our results suggest a much greater role for the promoter in mediating locally coordinated gene expression than heretofore appreciated.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Conclusions</p>
         </st>
         <p>Our systematic investigation of <it>Drosophila </it>promoters demonstrates that there are distinct sequence characteristics among unique, first alternative, and downstream alternative promoters, suggesting that different regulatory mechanisms may act preferentially on each class. We also show that neighboring genes are unexpectedly likely to have similar promoter compositions, which correlates with an increased degree of gene co-expression and suggests a mechanism for the previously observed phenomenon of gene co-expression neighborhoods. Taken together, these data reveal a high degree of complex genome organization at the level of promoter sequences.</p>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <sec>
            <st>
               <p>Promoter datasets</p>
            </st>
            <p><it>Drosophila </it>release 5.5 genomic sequences and annotation were downloaded from FlyBase <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>. The given start position of each mRNA that has strand information was considered as the position of the TSS, and the sequences from -500 bp to +100 bp were extracted as the extended promoter regions. The upstream region of one TSS was shorter than 500 bp and was therefore not included in the dataset. We considered transcripts within 18 bp of one another to correspond to the same promoter based on an analysis of the spacing between multiple TSSs of individual genes (Figure S5A [see Additional file <supplr sid="S9">9</supplr>]). In such cases, the mean position of the multiple TSSs was used to define a representative TSS, and the extended promoter region was defined as -500 bp to +100 bp around the representative TSS. Promoters were separated into three different groups based on their relative positions in the gene. As no significant differences were observed between downstream alternative promoters (DAPs) irrespective of their order following the most 5' promoter (data not shown), we grouped these promoters together into a single class in order to take advantage of the increased statistical power of the larger sample size. Genes whose promoters could not be assigned unambiguously to one of the three groups were removed.</p>
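            <p>The merging of closely spaced TSSs into a representative TSS can be sketched in Python as below. The function name merge_tss is hypothetical, and the sketch assumes that "within 18 bp" is inclusive and that TSSs are chained by the spacing between consecutive sorted positions; the source does not specify either detail, and strand and chromosome handling are omitted.</p>

```python
def merge_tss(positions, max_gap=18):
    """Group TSS positions of one gene and return the mean position of each group.

    Consecutive sorted positions that are at most max_gap bp apart are placed
    in the same group (an assumption of this sketch).
    """
    groups = []
    for pos in sorted(positions):
        if groups and max_gap >= pos - groups[-1][-1]:
            groups[-1].append(pos)
        else:
            groups.append([pos])
    return [sum(group) / len(group) for group in groups]


# Hypothetical coordinates: three nearby TSSs and one distant TSS.
representatives = merge_tss([100, 110, 120, 500])
```

            <p>For the toy coordinates above, the three nearby TSSs collapse to a representative TSS at 110.0 and the distant one remains at 500.0, giving two promoters for the gene.</p>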
            <suppl id="S9">
               <title>
                  <p>Additional file 9</p>
               </title>
               <text>
                  <p><b>Figure S5</b>. Histogram of distances between alternative TSSs for the same gene in the fly genome when all promoters in the genome (A) or only cap-supported promoters (B) were considered.</p>
               </text>
               <file name="1471-2164-10-9-S9.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <p>To obtain the "high quality" <it>Drosophila </it>promoter set, we used the transcript evidence rank provided by FlyBase in annotation release 5.5. We considered a transcript, and thus its TSS, to be strongly supported if it had a FlyBase evidence score of nine or more. In order to achieve such a score, a transcript must have one or more aligned cDNA sequences that are fully consistent with the annotation, plus at least one of the following: one or more consistent aligned EST sequences; intersection of an annotated exon with a region of aligned protein similarity; or a gene prediction fully consistent with the annotation. We extracted the promoters only from these strongly supported transcripts. To make sure the separation of promoters into UPs, FAPs and DAPs was reliable, we also required that at least the first two transcripts of a gene were strongly supported.</p>
            <p>To obtain the cap-supported <it>Drosophila </it>promoters, we used cDNA and 5' EST sequences from four 5' cap-trapped cDNA libraries (RE, RH, TA, and TB) <abbrgrp><abbr bid="B52">52</abbr><abbr bid="B53">53</abbr></abbrgrp>. The genomic coordinates of the 5' ends of these cDNAs and 5' ESTs were extracted from the fly genome annotation by requiring the first 5 bp of the 5' end to map to the genome. We considered a transcript, and thus its TSS, to be cap-supported if the TSS was within 10 bp of the 5' end coordinates of a cap-trapped cDNA or EST. To make sure the separation of promoters into UPs, FAPs and DAPs was reliable, we also required that at least the first two transcripts of a gene were strongly supported. Transcripts within 17 bp of one another were considered as corresponding to the same promoter based on an analysis of the spacing between multiple cap-supported TSSs of individual genes (Figure S5B [see Additional file <supplr sid="S9">9</supplr>]).</p>
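            <p>A minimal Python sketch of the cap-support criterion follows. The function name is_cap_supported and the coordinates are hypothetical, the 10 bp limit is assumed to be inclusive, and the sketch assumes that the 5' end coordinates have already been restricted to the same chromosome arm and strand as the TSS.</p>

```python
from bisect import bisect_left


def is_cap_supported(tss, cap_ends_sorted, max_distance=10):
    """True if any cap-trapped 5' end lies within max_distance bp of the TSS.

    cap_ends_sorted must be sorted in ascending order; only the two
    coordinates flanking the TSS need to be inspected.
    """
    i = bisect_left(cap_ends_sorted, tss)
    neighbors = cap_ends_sorted[max(i - 1, 0):i + 1]
    return any(max_distance >= abs(tss - end) for end in neighbors)


# Hypothetical 5' end coordinates of cap-trapped cDNAs or ESTs.
cap_ends = [100, 250, 400]
supported = is_cap_supported(258, cap_ends)
unsupported = is_cap_supported(261, cap_ends)
```

            <p>In the toy example, a TSS at 258 lies 8 bp from the 5' end at 250 and is counted as cap-supported, whereas a TSS at 261 lies 11 bp away and is not.</p>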
            <p>The coordinates of the single and alternative human promoters identified by Baek <it>et al</it>. <abbrgrp><abbr bid="B24">24</abbr></abbrgrp> were obtained by mapping the corresponding transcripts to EST, mRNA and RefGene of the human genome annotation (hg17) in the UCSC Genome Browser <abbrgrp><abbr bid="B54">54</abbr></abbrgrp>. We grouped transcripts that belonged to the same gene according to Baek <it>et al</it>. to get all transcripts of one gene. We removed transcripts that were inconsistent between Baek <it>et al</it>. <abbrgrp><abbr bid="B24">24</abbr></abbrgrp> and the current human annotation. Sequences from -500 bp to +100 bp of the first nucleotide of the relevant transcripts were defined as the promoter regions. A total of 4506 human promoters were obtained.</p>
         </sec>
         <sec>
            <st>
               <p>Mononucleotide distribution and GC content</p>
            </st>
            <p>The program <it>freak </it>from the EMBOSS software suite <abbrgrp><abbr bid="B55">55</abbr></abbrgrp> was used to calculate the frequency of each nucleotide in a 10 bp window moving along the promoter sequence in 1 bp steps. GC content was calculated using the EMBOSS <it>geecee </it>program. Coding regions were masked so that only non-coding sequences were considered. For both analyses, any promoters whose sequences overlapped were excluded, leaving a total of n = 11633 UPs, 1058 FAPs, and 1850 DAPs for <it>Drosophila </it>promoters and n = 3078 UPs, 509 FAPs, and 696 DAPs for human promoters.</p>
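            <p>The windowed calculation can be sketched in a few lines of Python. This is an illustration of the idea only and does not reproduce the EMBOSS programs: the function names are hypothetical, each window is assumed to be divided by its full length, and masked bases (written here as N) are assumed to be left out of the GC denominator.</p>

```python
def window_frequencies(sequence, window=10, step=1, letters="ACGT"):
    """Return one dict of base frequencies per window along the sequence."""
    seq = sequence.upper()
    rows = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        # Assumption: frequencies are fractions of the full window length.
        rows.append({base: chunk.count(base) / window for base in letters})
    return rows


def gc_content(sequence):
    """Fraction of G and C among called bases (assumption: N is ignored)."""
    seq = sequence.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / called


# A 12 bp toy sequence yields three 10 bp windows at 1 bp steps.
profile = window_frequencies("AAAAACCCCCGG")
```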
         </sec>
         <sec>
            <st>
               <p>Promoter Motifs</p>
            </st>
            <p>The consensus sequences and strand specificities of the 15 <it>Drosophila </it>promoter motifs are given by FitzGerald <it>et al</it>. <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>. Both strands of the promoter sequence were searched for the degenerate consensus sequences using the EMBOSS program <it>fuzznuc</it>, allowing zero mismatches. We computed the distribution of each motif relative to the TSS in 20 bp bins using R <abbrgrp><abbr bid="B56">56</abbr></abbrgrp>. The valid range of each motif was taken to be the bins whose frequencies were more than two standard deviations above the mean frequency of all 30 bins along the promoter region <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B35">35</abbr></abbrgrp>. We calculated the valid ranges ("subset range") of every motif on each of the three types of promoters (UPs/FAPs/DAPs). When no valid range for a motif in a promoter class could be detected (i.e., frequency was equal to background), the motif was considered to be absent from the promoter class and the valid range was set to zero. Where two adjacent bins, or two bins with a single intervening bin, satisfied the above criterion, the entire region was used as the valid range. Because the numbers of FAPs and DAPs were relatively small for some of the promoter data sets, ranges were also calculated by considering all three classes of promoters together, for purposes of comparison ("combined range"). We also used the position requirements for the motifs from FitzGerald <it>et al</it>. <abbrgrp><abbr bid="B8">8</abbr></abbrgrp> ("literature-based"). For each analysis, hits that matched the consensus sequence but were on the wrong strand, or that fell outside of the valid range, were considered false positives and were excluded from analysis.</p>
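            <p>The valid-range rule described above can be sketched as follows. The sketch is written in Python rather than R and is not the original analysis code: the function name valid_ranges is hypothetical, the sample standard deviation is an assumption (the text does not say which estimator was used), and ranges are returned as pairs of bin indices rather than genomic coordinates.</p>

```python
from statistics import mean, stdev


def valid_ranges(bin_counts, n_sd=2.0, max_skip=1):
    """Return (first_bin, last_bin) ranges of enriched bins.

    A bin is enriched when its count exceeds mean + n_sd * SD over all
    bins. Enriched bins that are adjacent, or separated by at most
    max_skip intervening bins, are merged into one range.
    """
    threshold = mean(bin_counts) + n_sd * stdev(bin_counts)
    enriched = [i for i, count in enumerate(bin_counts) if count > threshold]
    ranges = []
    for i in enriched:
        # Merge when the gap to the previous enriched bin is small enough.
        if ranges and max_skip + 1 >= i - ranges[-1][1]:
            ranges[-1] = (ranges[-1][0], i)
        else:
            ranges.append((i, i))
    return ranges


# Toy profile over 30 bins of 20 bp: flat background with two peaks that
# are separated by a single intervening bin.
counts = [5] * 30
counts[23] = 40
counts[25] = 38
peak = valid_ranges(counts)
# peak is [(23, 25)]; an empty list means the motif is treated as absent.
```

            <p>With 20 bp bins starting at -500 bp, bin index i would span -500 + 20i to -500 + 20(i + 1), so the toy range above would correspond to -40 to +20 under that numbering assumption.</p>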
            <p>For weight-matrix based motif searching, position-specific scoring matrices (PSSMs) corresponding to the ten motifs identified by Ohler <it>et al</it>. <abbrgrp><abbr bid="B34">34</abbr></abbrgrp> were used to scan the promoter sequences with Patser <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>. Motif ten was removed from further analysis because it did not match any of the 15 motifs identified by FitzGerald <it>et al</it>. <abbrgrp><abbr bid="B8">8</abbr></abbrgrp> (Table S1 [see Additional file <supplr sid="S2">2</supplr>]). We imposed strand and position requirements as described above for determining whether to accept identified motifs as true hits. Cutoff values for Patser were chosen as follows. We first converted the PSSMs from Ohler <it>et al</it>. <abbrgrp><abbr bid="B34">34</abbr></abbrgrp> to position-specific probability matrices (PSPMs). We also generated a PSPM from each set of sequences found by Patser at each cutoff value from e-03 to e-12. The PSPMs for each motif were clustered based on Euclidean distance using the PAM function in R <abbrgrp><abbr bid="B56">56</abbr></abbrgrp>, and the clustering result with the maximum average silhouette was chosen. The highest Patser P-value that fell within the same cluster as the original PSPM from Ohler <it>et al</it>. <abbrgrp><abbr bid="B34">34</abbr></abbrgrp> was used as the cutoff for that specific motif (Table S1 [see Additional file <supplr sid="S2">2</supplr>]).</p>
            <p>The eight human promoter motifs were originally identified by FitzGerald <it>et al</it>. <abbrgrp><abbr bid="B35">35</abbr></abbrgrp>. Strand and range information for these motifs was taken directly from the literature. The consensus sequences of these eight motifs were taken from Fig. 9 of FitzGerald <it>et al</it>. <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>.</p>
         </sec>
         <sec>
            <st>
               <p>Promoter motif interactions</p>
            </st>
            <p>For every possible pair of fly promoter motifs, we counted the promoters in the genome that contain both motifs, the promoters that contain only one of the two motifs (counted separately for each motif), and the promoters that contain neither motif, to make a 2 &#215; 2 contingency table. Fisher's exact test was then used on each of the contingency tables to test the association between the two corresponding motifs. Fourteen of the 105 possible motif pairs showed significant positive associations after correction for multiple hypothesis testing using Holm's method.</p>
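            <p>A self-contained Python sketch of this procedure is given here for illustration. It is not the code used in the study: the function names are hypothetical, and the one-sided ("greater") form of Fisher's exact test is an assumption made because only positive associations are reported; the text does not state which alternative was used.</p>

```python
from math import comb


def fisher_greater(both, only_a, only_b, neither):
    """One-sided Fisher's exact P-value for positive association.

    The table rows are motif A present/absent and the columns are
    motif B present/absent. The P-value is the hypergeometric
    probability of seeing at least `both` co-occurrences.
    """
    with_a = both + only_a
    with_b = both + only_b
    total = both + only_a + only_b + neither
    upper = min(with_a, with_b)
    favourable = sum(
        comb(with_a, k) * comb(total - with_a, with_b - k)
        for k in range(both, upper + 1)
    )
    return favourable / comb(total, with_b)


def holm_adjust(p_values):
    """Holm step-down adjusted P-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # Multiply the k-th smallest P-value by (m - k + 1), keep monotone.
        running_max = max(running_max, (m - rank) * p_values[idx])
        adjusted[idx] = min(1.0, running_max)
    return adjusted


# Toy table: 3 promoters with both motifs, 3 with neither, none with one.
p_toy = fisher_greater(3, 0, 0, 3)
```

            <p>For 15 motifs there are 15 &#215; 14 / 2 = 105 pairs, so the list passed to the Holm adjustment would hold 105 P-values, and a pair would be called significant when its adjusted value falls below the chosen threshold.</p>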
         </sec>
         <sec>
            <st>
               <p>Motif profile similarity</p>
            </st>
            <p>Motifs in all promoters were compiled into a 16469 &#215; 15 matrix in which each row represented a promoter and each column, a motif. Motif presence was indicated by "one," absence by "zero," so that the motif occurrence matrix contained only binary values. We calculated the distance between the motif profiles of any two promoters using the <it>dist </it>function in R <abbrgrp><abbr bid="B56">56</abbr></abbrgrp> with the asymmetric binary distance measure. The distance between any two promoters is therefore defined as the number of motifs that occur in only one of the two promoters divided by the total number of motifs that occur in at least one of the two promoters. When the distance was less than 0.2, the motifs in the two promoters were called <it>similar; </it>distances of greater than or equal to 0.8 were defined as <it>different</it>. Promoter pairs in which either of the promoters contained none of the 15 identified motifs were omitted from the analysis.</p>
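            <p>The distance definition given above can be written out directly. The following Python sketch mirrors that definition rather than calling R; the function names and the "intermediate" label for distances between the two thresholds are hypothetical and are used only for illustration.</p>

```python
def binary_distance(profile_a, profile_b):
    """Asymmetric binary distance between two 0/1 motif profiles.

    Motifs found in exactly one promoter are divided by motifs found in
    at least one promoter. Returns None when neither promoter has a motif.
    """
    pairs = list(zip(profile_a, profile_b))
    in_either = sum(1 for a, b in pairs if a or b)
    if in_either == 0:
        return None
    in_one_only = sum(1 for a, b in pairs if bool(a) != bool(b))
    return in_one_only / in_either


def classify_pair(distance, similar_below=0.2, different_from=0.8):
    """Label a promoter pair using the thresholds quoted in the text."""
    if distance is None:
        return "undefined"
    if similar_below > distance:
        return "similar"
    if distance >= different_from:
        return "different"
    return "intermediate"  # hypothetical label for the middle band


# Two toy promoters sharing one motif and differing in two others.
d_toy = binary_distance([1, 1, 0, 0], [1, 0, 1, 0])
```

            <p>Because shared absences are ignored, two promoters that both lack a motif gain no similarity from it, which is the property that distinguishes this measure from a simple matching coefficient.</p>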
            <p>When two alternative promoters are very close in position, our motif mapping rules could sometimes assign the same motif to both promoters. To obtain results for which such shared motifs were not allowed, we ignored the occurrence of the shared motifs on the first of a pair of alternative promoters and compared the motif profiles as described above. Similar results were obtained when ignoring the shared motifs on the second promoter of each pair (data not shown).</p>
            <p>To calculate random expectations, we broke the pairing between alternative promoters of the same gene, or neighboring unique promoters, and randomized the pairing 10,000 times for each type.</p>
         </sec>
         <sec>
            <st>
               <p>D2z score</p>
            </st>
            <p><it>D2z </it>scores were calculated using a promoter region of -300 bp to +100 bp from TSSs, as the <it>D2z </it>score has a standard normal distribution when the sequence length is at least 400 bp <abbrgrp><abbr bid="B37">37</abbr></abbrgrp>. The <it>D2z </it>score between promoter sequences was calculated using word length five and background Markov model order zero, the parameter setting that performed best in distinguishing functionally related regulatory sequences from unrelated sequences <abbrgrp><abbr bid="B37">37</abbr></abbrgrp>. If the sequences of an alternative promoter pair belonging to the same gene overlapped in position, the pair was removed from analysis. The <it>D2z </it>scores of the incorrectly paired promoter sequences were used as the random expectation.</p>
         </sec>
         <sec>
            <st>
               <p>Gene expression data</p>
            </st>
            <p>Gene expression data for 13 tissues were obtained from FlyAtlas <abbrgrp><abbr bid="B38">38</abbr></abbrgrp>. We considered a gene to be expressed in a tissue if it was called present in greater than 50% of the replicate microarrays reported in FlyAtlas. The tissue expression similarity of two genes was defined as the fraction of the 13 tissues in which the genes are either both present or both absent.</p>
            <p>To calculate expression level correlations, we first ranked all genes according to their mean expression level in each of the 13 tissues using the <it>rank </it>function in R (with the rank of ties equal to average rank) and then computed the Pearson correlation coefficient of the ranks for the two genes being compared.</p>
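            <p>Both expression measures can be sketched in plain Python. The sketch is illustrative and is not the R code used in the study: all function names are hypothetical, the input is assumed to be a dictionary mapping each gene to its list of per-tissue mean expression levels, and the ranks of each gene are assumed to vary across tissues so that the correlation is defined.</p>

```python
def tissue_similarity(present_a, present_b):
    """Fraction of tissues in which two genes are both present or both absent."""
    agree = sum(1 for a, b in zip(present_a, present_b) if bool(a) == bool(b))
    return agree / len(present_a)


def average_ranks(values):
    """Ascending 1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while len(values) > start:
        end = start
        while len(values) > end + 1 and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of ranks start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def rank_correlation(expression, gene_a, gene_b):
    """Rank all genes within each tissue, then correlate two genes' ranks."""
    genes = list(expression)
    n_tissues = len(expression[gene_a])
    ranks = {gene: [] for gene in genes}
    for t in range(n_tissues):
        tissue_ranks = average_ranks([expression[g][t] for g in genes])
        for gene, rank in zip(genes, tissue_ranks):
            ranks[gene].append(rank)
    return pearson(ranks[gene_a], ranks[gene_b])


# Toy data: three hypothetical genes measured in three tissues.
toy = {"geneA": [1, 5, 9], "geneB": [2, 6, 10], "geneC": [9, 5, 1]}
r_toy = rank_correlation(toy, "geneA", "geneC")
```

            <p>Note that the ranking here runs across genes within a tissue, so each gene ends up with one rank per tissue, and the correlation is taken over those 13 ranks in the full data set.</p>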
         </sec>
         <sec>
            <st>
               <p>Sequence similarity</p>
            </st>
            <p>Sequence similarity was assessed using Dialign <abbrgrp><abbr bid="B57">57</abbr></abbrgrp> and reported as the maximum fraction of aligned residues between the two sequences.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Abbreviations</p>
         </st>
         <p>TSS: transcription start site; TBP: TATA box-binding protein; TRFs: TBP-related factors; CRM: <it>cis</it>-regulatory module; UPs: unique promoters; FAPs: first alternative promoters; DAPs: downstream alternative promoters; EPD: the Eukaryotic Promoter Database; PSSM: Position-specific scoring matrix; PSPM: position-specific probability matrix.</p>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>Both authors conceived and designed the study and wrote the manuscript. QZ performed the analysis.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>We thank Dr. Jeffery Miecznikowski and Dr. Daniel Gaile for advice on statistics, Dr. Long Li for collecting the human promoters, and Dr. Michael Buck for comments on the manuscript. The randomization study utilized the high-performance Bioinformatics cluster at the University at Buffalo Center for Computational Research. This work was supported in part by NIH grant K22 HG002489 to MSH.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>The RNA Polymerase II Core Promoter</p>
            </title>
            <aug>
               <au>
                  <snm>Smale</snm>
                  <fnm>ST</fnm>
               </au>
               <au>
                  <snm>Kadonaga</snm>
                  <fnm>JT</fnm>
               </au>
            </aug>
            <source>Annual Review of Biochemistry</source>
            <pubdate>2003</pubdate>
            <volume>72</volume>
            <issue>1</issue>
            <fpage>449</fpage>
            <lpage>479</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1146/annurev.biochem.72.121801.161520</pubid>
                  <pubid idtype="pmpid" link="fulltext">12651739</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>The General Transcription Machinery and General Cofactors</p>
            </title>
            <aug>
               <au>
                  <snm>Thomas</snm>
                  <fnm>MC</fnm>
               </au>
               <au>
                  <snm>Chiang</snm>
                  <fnm>C-M</fnm>
               </au>
            </aug>
            <source>Crit Rev Biochem Mol Biol</source>
            <pubdate>2006</pubdate>
            <volume>41</volume>
            <issue>3</issue>
            <fpage>105</fpage>
            <lpage>178</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1080/10409230600648736</pubid>
                  <pubid idtype="pmpid">16858867</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Core promoters: active contributors to combinatorial gene regulation</p>
            </title>
            <aug>
               <au>
                  <snm>Smale</snm>
                  <fnm>ST</fnm>
               </au>
            </aug>
            <source>Genes Dev</source>
            <pubdate>2001</pubdate>
            <volume>15</volume>
            <issue>19</issue>
            <fpage>2503</fpage>
            <lpage>2508</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gad.937701</pubid>
                  <pubid idtype="pmpid" link="fulltext">11581155</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Diversified transcription initiation complexes expand promoter selectivity and tissue-specific gene expression</p>
            </title>
            <aug>
               <au>
                  <snm>Hochheimer</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Tjian</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Genes Dev</source>
            <pubdate>2003</pubdate>
            <volume>17</volume>
            <issue>11</issue>
            <fpage>1309</fpage>
            <lpage>1320</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gad.1099903</pubid>
                  <pubid idtype="pmpid" link="fulltext">12782648</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Large-Scale Discovery of Promoter Motifs in Drosophila melanogaster</p>
            </title>
            <aug>
               <au>
                  <snm>Down</snm>
                  <fnm>TA</fnm>
               </au>
               <au>
                  <snm>Bergman</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Su</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Hubbard</snm>
                  <fnm>TJP</fnm>
               </au>
            </aug>
            <source>PLoS Computational Biology</source>
            <pubdate>2007</pubdate>
            <volume>3</volume>
            <issue>1</issue>
            <fpage>e7</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1779301</pubid>
                  <pubid idtype="pmpid" link="fulltext">17238282</pubid>
                  <pubid idtype="doi">10.1371/journal.pcbi.0030007</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Promoter features related to tissue specificity as measured by Shannon entropy</p>
            </title>
            <aug>
               <au>
                  <snm>Schug</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Schuller</snm>
                  <fnm>W-P</fnm>
               </au>
               <au>
                  <snm>Kappen</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Salbaum</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Bucan</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Stoeckert</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>Genome Biology</source>
            <pubdate>2005</pubdate>
            <volume>6</volume>
            <issue>4</issue>
            <fpage>R33</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1088961</pubid>
                  <pubid idtype="pmpid" link="fulltext">15833120</pubid>
                  <pubid idtype="doi">10.1186/gb-2005-6-4-r33</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Genome-wide analysis of mammalian promoter architecture and evolution</p>
            </title>
            <aug>
               <au>
                  <snm>Carninci</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Sandelin</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Lenhard</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Katayama</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Shimokawa</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Ponjavic</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Semple</snm>
                  <fnm>CAM</fnm>
               </au>
               <au>
                  <snm>Taylor</snm>
                  <fnm>MS</fnm>
               </au>
               <au>
                  <snm>Engstrom</snm>
                  <fnm>PG</fnm>
               </au>
               <au>
                  <snm>Frith</snm>
                  <fnm>MC</fnm>
               </au>
               <au>
                  <snm>Forrest</snm>
                  <fnm>ARR</fnm>
               </au>
               <au>
                  <snm>Alkema</snm>
                  <fnm>WB</fnm>
               </au>
               <au>
                  <snm>Tan</snm>
                  <fnm>SL</fnm>
               </au>
               <au>
                  <snm>Plessy</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Kodzius</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Ravasi</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Kasukawa</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Fukuda</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kanamori-Katayama</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Kitazume</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Kawaji</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Kai</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Nakamura</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Konno</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Nakano</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Mottagui-Tabar</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Arner</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Chesi</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Gustincich</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Persichetti</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Suzuki</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Grimmond</snm>
                  <fnm>SM</fnm>
               </au>
               <au>
                  <snm>Wells</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Orlando</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Wahlestedt</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>ET</fnm>
               </au>
               <au>
                  <snm>Harbers</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Kawai</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Bajic</snm>
                  <fnm>VB</fnm>
               </au>
               <au>
                  <snm>Hume</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Hayashizaki</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2006</pubdate>
            <volume>38</volume>
            <issue>6</issue>
            <fpage>626</fpage>
            <lpage>635</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng1789</pubid>
                  <pubid idtype="pmpid" link="fulltext">16645617</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Comparative genomics of Drosophila and human core promoters</p>
            </title>
            <aug>
               <au>
                  <snm>FitzGerald</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Sturgill</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Shyakhtenko</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Oliver</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Vinson</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>Genome Biology</source>
            <pubdate>2006</pubdate>
            <volume>7</volume>
            <issue>7</issue>
            <fpage>R53</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1779564</pubid>
                  <pubid idtype="pmpid" link="fulltext">16827941</pubid>
                  <pubid idtype="doi">10.1186/gb-2006-7-7-r53</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Distant liaisons: long-range enhancer-promoter interactions in Drosophila</p>
            </title>
            <aug>
               <au>
                  <snm>Dorsett</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Current Opinion in Genetics &amp; Development</source>
            <pubdate>1999</pubdate>
            <volume>9</volume>
            <issue>5</issue>
            <fpage>505</fpage>
            <lpage>514</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0959-437X(99)00002-7</pubid>
                  <pubid idtype="pmpid" link="fulltext">10508687</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Enhancer-promoter specificity mediated by DPE or TATA core promoter motifs</p>
            </title>
            <aug>
               <au>
                  <snm>Butler</snm>
                  <fnm>JEF</fnm>
               </au>
               <au>
                  <snm>Kadonaga</snm>
                  <fnm>JT</fnm>
               </au>
            </aug>
            <source>Genes Dev</source>
            <pubdate>2001</pubdate>
            <volume>15</volume>
            <issue>19</issue>
            <fpage>2515</fpage>
            <lpage>2519</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">312797</pubid>
                  <pubid idtype="pmpid" link="fulltext">11581157</pubid>
                  <pubid idtype="doi">10.1101/gad.924301</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Insulators: exploiting transcriptional and epigenetic mechanisms</p>
            </title>
            <aug>
               <au>
                  <snm>Gaszner</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Felsenfeld</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>Nat Rev Genet</source>
            <pubdate>2006</pubdate>
            <volume>7</volume>
            <issue>9</issue>
            <fpage>703</fpage>
            <lpage>713</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nrg1925</pubid>
                  <pubid idtype="pmpid" link="fulltext">16909129</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>The promoter targeting sequence facilitates and restricts a distant enhancer to a single promoter in the Drosophila embryo</p>
            </title>
            <aug>
               <au>
                  <snm>Lin</snm>
                  <fnm>Q</fnm>
               </au>
               <au>
                  <snm>Wu</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Zhou</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Development</source>
            <pubdate>2003</pubdate>
            <volume>130</volume>
            <issue>3</issue>
            <fpage>519</fpage>
            <lpage>526</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1242/dev.00227</pubid>
                  <pubid idtype="pmpid" link="fulltext">12490558</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>The Promoter Targeting Sequence mediates epigenetically heritable transcription memory</p>
            </title>
            <aug>
               <au>
                  <snm>Lin</snm>
                  <fnm>Q</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>Q</fnm>
               </au>
               <au>
                  <snm>Lin</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Zhou</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Genes Dev</source>
            <pubdate>2004</pubdate>
            <volume>18</volume>
            <issue>21</issue>
            <fpage>2639</fpage>
            <lpage>2651</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">525544</pubid>
                  <pubid idtype="pmpid" link="fulltext">15520283</pubid>
                  <pubid idtype="doi">10.1101/gad.1230004</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Short-range transcriptional repressors mediate both quenching and direct repression within complex loci in Drosophila</p>
            </title>
            <aug>
               <au>
                  <snm>Gray</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Levine</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Genes Dev</source>
            <pubdate>1996</pubdate>
            <volume>10</volume>
            <issue>6</issue>
            <fpage>700</fpage>
            <lpage>710</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gad.10.6.700</pubid>
                  <pubid idtype="pmpid" link="fulltext">8598297</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Epigenetic Silencing Mechanisms in Budding Yeast and Fruit Fly: Different Paths, Same Destinations</p>
            </title>
            <aug>
               <au>
                  <snm>Pirrotta</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Gross</snm>
                  <fnm>DS</fnm>
               </au>
            </aug>
            <source>Molecular Cell</source>
            <pubdate>2005</pubdate>
            <volume>18</volume>
            <issue>4</issue>
            <fpage>395</fpage>
            <lpage>398</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.molcel.2005.04.013</pubid>
                  <pubid idtype="pmpid" link="fulltext">15893722</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Different core promoters possess distinct regulatory activities in the Drosophila embryo</p>
            </title>
            <aug>
               <au>
                  <snm>Ohtsuki</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Levine</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Cai</snm>
                  <fnm>HN</fnm>
               </au>
            </aug>
            <source>Genes Dev</source>
            <pubdate>1998</pubdate>
            <volume>12</volume>
            <issue>4</issue>
            <fpage>547</fpage>
            <lpage>556</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">316525</pubid>
                  <pubid idtype="pmpid" link="fulltext">9472023</pubid>
                  <pubid idtype="doi">10.1101/gad.12.4.547</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Compatibility between enhancers and promoters determines the transcriptional specificity of gooseberry and gooseberry neuro in the Drosophila embryo</p>
            </title>
            <aug>
               <au>
                  <snm>Li</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Noll</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>EMBO J</source>
            <pubdate>1994</pubdate>
            <volume>13</volume>
            <issue>2</issue>
            <fpage>400</fpage>
            <lpage>406</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">394821</pubid>
                  <pubid idtype="pmpid">8313885</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Promoter specificity mediates the independent regulation of neighboring genes</p>
            </title>
            <aug>
               <au>
                  <snm>Merli</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Bergstrom</snm>
                  <fnm>DE</fnm>
               </au>
               <au>
                  <snm>Cygan</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Blackman</snm>
                  <fnm>RK</fnm>
               </au>
            </aug>
            <source>Genes Dev</source>
            <pubdate>1996</pubdate>
            <volume>10</volume>
            <issue>10</issue>
            <fpage>1260</fpage>
            <lpage>1270</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gad.10.10.1260</pubid>
                  <pubid idtype="pmpid" link="fulltext">8675012</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>Enhancer-Promoter Communication at the yellow Gene of Drosophila melanogaster: Diverse Promoters Participate in and Regulate trans Interactions</p>
            </title>
            <aug>
               <au>
                  <snm>Lee</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Wu</snm>
                  <fnm>C-t</fnm>
               </au>
            </aug>
            <source>Genetics</source>
            <pubdate>2006</pubdate>
            <volume>174</volume>
            <issue>4</issue>
            <fpage>1867</fpage>
            <lpage>1880</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1698615</pubid>
                  <pubid idtype="pmpid" link="fulltext">17057235</pubid>
                  <pubid idtype="doi">10.1534/genetics.106.064121</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>Diversification of transcriptional modulation: Large-scale identification and characterization of putative alternative promoters of human genes</p>
            </title>
            <aug>
               <au>
                  <snm>Kimura</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Wakamatsu</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Suzuki</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Ota</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Nishikawa</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Yamashita</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Yamamoto</snm>
                  <fnm>J-i</fnm>
               </au>
               <au>
                  <snm>Sekine</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Tsuritani</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Wakaguri</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Ishii</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Sugiyama</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Saito</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Isono</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Irie</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Kushida</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Yoneyama</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Otsuka</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Kanda</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Yokoi</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Kondo</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Wagatsuma</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Murakawa</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Ishida</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Ishibashi</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Takahashi-Fujii</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Tanase</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Nagai</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Kikuchi</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Nakai</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Isogai</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Sugano</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2006</pubdate>
            <volume>16</volume>
            <issue>1</issue>
            <fpage>55</fpage>
            <lpage>65</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1356129</pubid>
                  <pubid idtype="pmpid" link="fulltext">16344560</pubid>
                  <pubid idtype="doi">10.1101/gr.4039406</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>The functional consequences of alternative promoter use in mammalian genomes</p>
            </title>
            <aug>
               <au>
                  <snm>Davuluri</snm>
                  <fnm>RV</fnm>
               </au>
               <au>
                  <snm>Suzuki</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Sugano</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Plass</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Huang</snm>
                  <fnm>THM</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>2008</pubdate>
            <volume>24</volume>
            <issue>4</issue>
            <fpage>167</fpage>
            <lpage>177</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.tig.2008.01.008</pubid>
                  <pubid idtype="pmpid" link="fulltext">18329129</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>Transcription factor binding and modified histones in human bidirectional promoters</p>
            </title>
            <aug>
               <au>
                  <snm>Lin</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Collins</snm>
                  <fnm>PJ</fnm>
               </au>
               <au>
                  <snm>Trinklein</snm>
                  <fnm>ND</fnm>
               </au>
               <au>
                  <snm>Fu</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Xi</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Myers</snm>
                  <fnm>RM</fnm>
               </au>
               <au>
                  <snm>Weng</snm>
                  <fnm>Z</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2007</pubdate>
            <volume>17</volume>
            <issue>6</issue>
            <fpage>818</fpage>
            <lpage>827</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1891341</pubid>
                  <pubid idtype="pmpid" link="fulltext">17568000</pubid>
                  <pubid idtype="doi">10.1101/gr.5623407</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>Comprehensive Annotation of Bidirectional Promoters Identifies Co-Regulation among Breast and Ovarian Cancer Genes</p>
            </title>
            <aug>
               <au>
                  <snm>Yang</snm>
                  <fnm>MQ</fnm>
               </au>
               <au>
                  <snm>Koehly</snm>
                  <fnm>LM</fnm>
               </au>
               <au>
                  <snm>Elnitski</snm>
                  <fnm>LL</fnm>
               </au>
            </aug>
            <source>PLoS Comput Biol</source>
            <pubdate>2007</pubdate>
            <volume>3</volume>
            <issue>4</issue>
            <fpage>e72</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1853124</pubid>
                  <pubid idtype="pmpid" link="fulltext">17447839</pubid>
                  <pubid idtype="doi">10.1371/journal.pcbi.0030072</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>Characterization and predictive discovery of evolutionarily conserved mammalian alternative promoters</p>
            </title>
            <aug>
               <au>
                  <snm>Baek</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Davis</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Ewing</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Gordon</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Green</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2007</pubdate>
            <volume>17</volume>
            <issue>2</issue>
            <fpage>145</fpage>
            <lpage>155</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1781346</pubid>
                  <pubid idtype="pmpid" link="fulltext">17210929</pubid>
                  <pubid idtype="doi">10.1101/gr.5872707</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>FlyBase: integration and improvements to query tools</p>
            </title>
            <aug>
               <au>
                  <snm>Wilson</snm>
                  <fnm>RJ</fnm>
               </au>
               <au>
                  <snm>Goodman</snm>
                  <fnm>JL</fnm>
               </au>
               <au>
                  <snm>Strelets</snm>
                  <fnm>VB</fnm>
               </au>
               <au>
                  <cnm>The FlyBase Consortium</cnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2008</pubdate>
            <volume>36</volume>
            <issue>Database issue</issue>
            <fpage>D588</fpage>
            <lpage>D593</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">2238994</pubid>
                  <pubid idtype="pmpid" link="fulltext">18160408</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>Biological function of unannotated transcription during the early development of Drosophila melanogaster</p>
            </title>
            <aug>
               <au>
                  <snm>Manak</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Dike</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Sementchenko</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Kapranov</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Biemar</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Long</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Cheng</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Bell</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Ghosh</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Piccolboni</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Gingeras</snm>
                  <fnm>TR</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2006</pubdate>
            <volume>38</volume>
            <issue>10</issue>
            <fpage>1151</fpage>
            <lpage>1158</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng1875</pubid>
                  <pubid idtype="pmpid" link="fulltext">16951679</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>Comprehensive analysis of transcriptional promoter structure and function in 1% of the human genome</p>
            </title>
            <aug>
               <au>
                  <snm>Cooper</snm>
                  <fnm>SJ</fnm>
               </au>
               <au>
                  <snm>Trinklein</snm>
                  <fnm>ND</fnm>
               </au>
               <au>
                  <snm>Anton</snm>
                  <fnm>ED</fnm>
               </au>
               <au>
                  <snm>Nguyen</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Myers</snm>
                  <fnm>RM</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2006</pubdate>
            <volume>16</volume>
            <issue>1</issue>
            <fpage>1</fpage>
            <lpage>10</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1356123</pubid>
                  <pubid idtype="pmpid" link="fulltext">16344566</pubid>
                  <pubid idtype="doi">10.1101/gr.4222606</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>Codon Usage Bias and Base Composition of Nuclear Genes in Drosophila</p>
            </title>
            <aug>
               <au>
                  <snm>Moriyama</snm>
                  <fnm>EN</fnm>
               </au>
               <au>
                  <snm>Hartl</snm>
                  <fnm>DL</fnm>
               </au>
            </aug>
            <source>Genetics</source>
            <pubdate>1993</pubdate>
            <volume>134</volume>
            <issue>3</issue>
            <fpage>847</fpage>
            <lpage>858</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1205521</pubid>
                  <pubid idtype="pmpid" link="fulltext">8349115</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>Assessing the impact of comparative genomic sequence data on the functional annotation of the Drosophila genome</p>
            </title>
            <aug>
               <au>
                  <snm>Bergman</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Pfeiffer</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Rincon-Limas</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Hoskins</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Gnirke</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Mungall</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Kronmiller</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Pacleb</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Park</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Stapleton</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Wan</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>George</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>de Jong</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Botas</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Rubin</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Celniker</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2002</pubdate>
            <volume>3</volume>
            <issue>12</issue>
            <fpage>research0086.1</fpage>
            <lpage>research0086.20</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">12537575</pubid>
                  <pubid idtype="doi">10.1186/gb-2002-3-12-research0086</pubid>
                  <pubid idtype="pmcid">151188</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>Large-scale analysis of transcriptional cis-regulatory modules reveals both common features and distinct subclasses</p>
            </title>
            <aug>
               <au>
                  <snm>Li</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Zhu</snm>
                  <fnm>Q</fnm>
               </au>
               <au>
                  <snm>He</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Sinha</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Halfon</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2007</pubdate>
            <volume>8</volume>
            <issue>6</issue>
            <fpage>R101</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">2394749</pubid>
                  <pubid idtype="pmpid" link="fulltext">17550599</pubid>
                  <pubid idtype="doi">10.1186/gb-2007-8-6-r101</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>The features of Drosophila core promoters revealed by statistical analysis</p>
            </title>
            <aug>
               <au>
                  <snm>Gershenzon</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Trifonov</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Ioshikhes</snm>
                  <fnm>I</fnm>
               </au>
            </aug>
            <source>BMC Genomics</source>
            <pubdate>2006</pubdate>
            <volume>7</volume>
            <issue>1</issue>
            <fpage>161</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1538597</pubid>
                  <pubid idtype="pmpid" link="fulltext">16790048</pubid>
                  <pubid idtype="doi">10.1186/1471-2164-7-161</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>Identification of core promoter modules in Drosophila and their application in accurate transcription start site prediction</p>
            </title>
            <aug>
               <au>
                  <snm>Ohler</snm>
                  <fnm>U</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2006</pubdate>
            <volume>34</volume>
            <issue>20</issue>
            <fpage>5943</fpage>
            <lpage>5950</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1635271</pubid>
                  <pubid idtype="pmpid" link="fulltext">17068082</pubid>
                  <pubid idtype="doi">10.1093/nar/gkl608</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B33">
            <title>
               <p>Identifying DNA and protein patterns with statistically significant alignments of multiple sequences</p>
            </title>
            <aug>
               <au>
                  <snm>Hertz</snm>
                  <fnm>GZ</fnm>
               </au>
               <au>
                  <snm>Stormo</snm>
                  <fnm>GD</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>1999</pubdate>
            <volume>15</volume>
            <issue>7</issue>
            <fpage>563</fpage>
            <lpage>577</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/15.7.563</pubid>
                  <pubid idtype="pmpid" link="fulltext">10487864</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B34">
            <title>
               <p>Computational analysis of core promoters in the Drosophila genome</p>
            </title>
            <aug>
               <au>
                  <snm>Ohler</snm>
                  <fnm>U</fnm>
               </au>
               <au>
                  <snm>Liao</snm>
                  <fnm>G-c</fnm>
               </au>
               <au>
                  <snm>Niemann</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Rubin</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2002</pubdate>
            <volume>3</volume>
            <issue>12</issue>
            <fpage>research0087.1</fpage>
            <lpage>research0087.12</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">12537576</pubid>
                  <pubid idtype="doi">10.1186/gb-2002-3-12-research0087</pubid>
                  <pubid idtype="pmcid">151189</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B35">
            <title>
               <p>Clustering of DNA Sequences in Human Promoters</p>
            </title>
            <aug>
               <au>
                  <snm>FitzGerald</snm>
                  <fnm>PC</fnm>
               </au>
               <au>
                  <snm>Shlyakhtenko</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Mir</snm>
                  <fnm>AA</fnm>
               </au>
               <au>
                  <snm>Vinson</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2004</pubdate>
            <volume>14</volume>
            <fpage>1562</fpage>
            <lpage>1574</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">509265</pubid>
                  <pubid idtype="pmpid" link="fulltext">15256515</pubid>
                  <pubid idtype="doi">10.1101/gr.1953904</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B36">
            <title>
               <p>Complex controls: the role of alternative promoters in mammalian genomes</p>
            </title>
            <aug>
               <au>
                  <snm>Landry</snm>
                  <fnm>J-R</fnm>
               </au>
               <au>
                  <snm>Mager</snm>
                  <fnm>DL</fnm>
               </au>
               <au>
                  <snm>Wilhelm</snm>
                  <fnm>BT</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <issue>11</issue>
            <fpage>640</fpage>
            <lpage>648</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.tig.2003.09.014</pubid>
                  <pubid idtype="pmpid" link="fulltext">14585616</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B37">
            <title>
               <p>A statistical method for alignment-free comparison of regulatory sequences</p>
            </title>
            <aug>
               <au>
                  <snm>Kantorovitz</snm>
                  <fnm>MR</fnm>
               </au>
               <au>
                  <snm>Robinson</snm>
                  <fnm>GE</fnm>
               </au>
               <au>
                  <snm>Sinha</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2007</pubdate>
            <volume>23</volume>
            <issue>13</issue>
            <fpage>i249</fpage>
            <lpage>i255</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btm211</pubid>
                  <pubid idtype="pmpid" link="fulltext">17646303</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B38">
            <title>
               <p>Using FlyAtlas to identify better Drosophila melanogaster models of human disease</p>
            </title>
            <aug>
               <au>
                  <snm>Chintapalli</snm>
                  <fnm>VR</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Dow</snm>
                  <fnm>JAT</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2007</pubdate>
            <volume>39</volume>
            <issue>6</issue>
            <fpage>715</fpage>
            <lpage>720</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng2049</pubid>
                  <pubid idtype="pmpid" link="fulltext">17534367</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B39">
            <title>
               <p>Large clusters of co-expressed genes in the Drosophila genome</p>
            </title>
            <aug>
               <au>
                  <snm>Boutanaev</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Kalmykova</snm>
                  <fnm>AI</fnm>
               </au>
               <au>
                  <snm>Shevelyov</snm>
                  <fnm>YY</fnm>
               </au>
               <au>
                  <snm>Nurminsky</snm>
                  <fnm>DI</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2002</pubdate>
            <volume>420</volume>
            <issue>6916</issue>
            <fpage>666</fpage>
            <lpage>669</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature01216</pubid>
                  <pubid idtype="pmpid" link="fulltext">12478293</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B40">
            <title>
               <p>Modelling the correlation between the activities of adjacent genes in drosophila</p>
            </title>
            <aug>
               <au>
                  <snm>Thygesen</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Zwinderman</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2005</pubdate>
            <volume>6</volume>
            <issue>1</issue>
            <fpage>10</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">547897</pubid>
                  <pubid idtype="pmpid" link="fulltext">15659243</pubid>
                  <pubid idtype="doi">10.1186/1471-2105-6-10</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B41">
            <title>
               <p>A Gene Expression Map for the Euchromatic Genome of Drosophila melanogaster</p>
            </title>
            <aug>
               <au>
                  <snm>Stolc</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Gauhar</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Mason</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Halasz</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>van Batenburg</snm>
                  <fnm>MF</fnm>
               </au>
               <au>
                  <snm>Rifkin</snm>
                  <fnm>SA</fnm>
               </au>
               <au>
                  <snm>Hua</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Herreman</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Tongprasit</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Barbano</snm>
                  <fnm>PE</fnm>
               </au>
               <au>
                  <snm>Bussemaker</snm>
                  <fnm>HJ</fnm>
               </au>
               <au>
                  <snm>White</snm>
                  <fnm>KP</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2004</pubdate>
            <volume>306</volume>
            <issue>5696</issue>
            <fpage>655</fpage>
            <lpage>660</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1101312</pubid>
                  <pubid idtype="pmpid" link="fulltext">15499012</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B42">
            <title>
               <p>A survey of ovary-, testis-, and soma-biased gene expression in Drosophila melanogaster adults</p>
            </title>
            <aug>
               <au>
                  <snm>Parisi</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Nuttall</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Edwards</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Minor</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Naiman</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Lu</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Doctolero</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Vainer</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Chan</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Malley</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Eastman</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Oliver</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Genome Biology</source>
            <pubdate>2004</pubdate>
            <volume>5</volume>
            <issue>6</issue>
            <fpage>R40</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">463073</pubid>
                  <pubid idtype="pmpid" link="fulltext">15186491</pubid>
                  <pubid idtype="doi">10.1186/gb-2004-5-6-r40</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B43">
            <title>
               <p>Evidence for large domains of similarly expressed genes in the Drosophila genome</p>
            </title>
            <aug>
               <au>
                  <snm>Spellman</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Rubin</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>Journal of Biology</source>
            <pubdate>2002</pubdate>
            <volume>1</volume>
            <issue>1</issue>
            <fpage>5</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">117248</pubid>
                  <pubid idtype="pmpid" link="fulltext">12144710</pubid>
                  <pubid idtype="doi">10.1186/1475-4924-1-5</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B44">
            <title>
               <p>EPD in its twentieth year: towards complete promoter coverage of selected model organisms</p>
            </title>
            <aug>
               <au>
                  <snm>Schmid</snm>
                  <fnm>CD</fnm>
               </au>
               <au>
                  <snm>Perier</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Praz</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Bucher</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2006</pubdate>
            <volume>34</volume>
            <issue>suppl_1</issue>
            <fpage>D82</fpage>
            <lpage>D85</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">16381980</pubid>
                  <pubid idtype="doi">10.1093/nar/gkj146</pubid>
                  <pubid idtype="pmcid">1347508</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B45">
            <title>
               <p>A computational analysis of whole-genome expression data reveals chromosomal domains of gene expression</p>
            </title>
            <aug>
               <au>
                  <snm>Cohen</snm>
                  <fnm>BA</fnm>
               </au>
               <au>
                  <snm>Mitra</snm>
                  <fnm>RD</fnm>
               </au>
               <au>
                  <snm>Hughes</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Church</snm>
                  <fnm>GM</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2000</pubdate>
            <volume>26</volume>
            <issue>2</issue>
            <fpage>183</fpage>
            <lpage>186</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/79896</pubid>
                  <pubid idtype="pmpid" link="fulltext">11017073</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B46">
            <title>
               <p>Chromosomal clustering of a human transcriptome reveals regulatory background</p>
            </title>
            <aug>
               <au>
                  <snm>Vogel</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>von Heydebreck</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Purmann</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Sperling</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2005</pubdate>
            <volume>6</volume>
            <issue>1</issue>
            <fpage>230</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1261156</pubid>
                  <pubid idtype="pmpid" link="fulltext">16171528</pubid>
                  <pubid idtype="doi">10.1186/1471-2105-6-230</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B47">
            <title>
               <p>Transcriptome coexpression map of human embryonic stem cells</p>
            </title>
            <aug>
               <au>
                  <snm>Li</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Shin</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Sun</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Loring</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Mattson</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Rao</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Zhan</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>BMC Genomics</source>
            <pubdate>2006</pubdate>
            <volume>7</volume>
            <issue>1</issue>
            <fpage>103</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1523211</pubid>
                  <pubid idtype="pmpid" link="fulltext">16670017</pubid>
                  <pubid idtype="doi">10.1186/1471-2164-7-103</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B48">
            <title>
               <p>Chromosomal clustering of muscle-expressed genes in Caenorhabditis elegans</p>
            </title>
            <aug>
               <au>
                  <snm>Roy</snm>
                  <fnm>PJ</fnm>
               </au>
               <au>
                  <snm>Stuart</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Lund</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Kim</snm>
                  <fnm>SK</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2002</pubdate>
            <volume>418</volume>
            <issue>6901</issue>
            <fpage>975</fpage>
            <lpage>979</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12214599</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B49">
            <title>
               <p>Clustering of housekeeping genes provides a unified model of gene order in the human genome</p>
            </title>
            <aug>
               <au>
                  <snm>Lercher</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Urrutia</snm>
                  <fnm>AO</fnm>
               </au>
               <au>
                  <snm>Hurst</snm>
                  <fnm>LD</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2002</pubdate>
            <volume>31</volume>
            <issue>2</issue>
            <fpage>180</fpage>
            <lpage>183</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng887</pubid>
                  <pubid idtype="pmpid" link="fulltext">11992122</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B50">
            <title>
               <p>The Human Transcriptome Map: Clustering of Highly Expressed Genes in Chromosomal Domains</p>
            </title>
            <aug>
               <au>
                  <snm>Caron</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>van Schaik</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>van der Mee</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Baas</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Riggins</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>van Sluis</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Hermus</snm>
                  <fnm>M-C</fnm>
               </au>
               <au>
                  <snm>van Asperen</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Boon</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Voute</snm>
                  <fnm>PA</fnm>
               </au>
               <au>
                  <snm>Heisterkamp</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>van Kampen</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Versteeg</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2001</pubdate>
            <volume>291</volume>
            <issue>5507</issue>
            <fpage>1289</fpage>
            <lpage>1292</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1056794</pubid>
                  <pubid idtype="pmpid" link="fulltext">11181992</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B51">
            <title>
               <p>Domain-wide regulation of gene expression in the human genome</p>
            </title>
            <aug>
               <au>
                  <snm>Gierman</snm>
                  <fnm>HJ</fnm>
               </au>
               <au>
                  <snm>Indemans</snm>
                  <fnm>MHG</fnm>
               </au>
               <au>
                  <snm>Koster</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Goetze</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Seppen</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Geerts</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>van Driel</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Versteeg</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Genome Research</source>
            <pubdate>2007</pubdate>
            <volume>17</volume>
            <issue>9</issue>
            <fpage>1286</fpage>
            <lpage>1295</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1950897</pubid>
                  <pubid idtype="pmpid" link="fulltext">17693573</pubid>
                  <pubid idtype="doi">10.1101/gr.6276007</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B52">
            <title>
               <p>A Drosophila full-length cDNA resource</p>
            </title>
            <aug>
               <au>
                  <snm>Stapleton</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Carlson</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Brokstein</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Yu</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Champe</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>George</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Guarin</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Kronmiller</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Pacleb</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Park</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Wan</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Rubin</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Celniker</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Genome Biology</source>
            <pubdate>2002</pubdate>
            <volume>3</volume>
            <issue>12</issue>
            <fpage>research0080.1</fpage>
            <lpage>research0080.8</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">12537569</pubid>
                  <pubid idtype="doi">10.1186/gb-2002-3-12-research0080</pubid>
                  <pubid idtype="pmcid">151182</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B53">
            <title>
               <p>The Drosophila Gene Collection: Identification of Putative Full-Length cDNAs for 70% of D. melanogaster Genes</p>
            </title>
            <aug>
               <au>
                  <snm>Stapleton</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Liao</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Brokstein</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Hong</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Carninci</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Shiraki</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Hayashizaki</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Champe</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Pacleb</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Wan</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Yu</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Carlson</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>George</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Celniker</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Rubin</snm>
                  <fnm>GM</fnm>
               </au>
            </aug>
            <source>Genome Research</source>
            <pubdate>2002</pubdate>
            <volume>12</volume>
            <issue>8</issue>
            <fpage>1294</fpage>
            <lpage>1300</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">186637</pubid>
                  <pubid idtype="pmpid" link="fulltext">12176937</pubid>
                  <pubid idtype="doi">10.1101/gr.269102</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B54">
            <title>
               <p>The UCSC Genome Browser Database</p>
            </title>
            <aug>
               <au>
                  <snm>Karolchik</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Baertsch</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Diekhans</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Furey</snm>
                  <fnm>TS</fnm>
               </au>
               <au>
                  <snm>Hinrichs</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Lu</snm>
                  <fnm>YT</fnm>
               </au>
               <au>
                  <snm>Roskin</snm>
                  <fnm>KM</fnm>
               </au>
               <au>
                  <snm>Schwartz</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Sugnet</snm>
                  <fnm>CW</fnm>
               </au>
               <au>
                  <snm>Thomas</snm>
                  <fnm>DJ</fnm>
               </au>
               <au>
                  <snm>Weber</snm>
                  <fnm>RJ</fnm>
               </au>
               <au>
                  <snm>Haussler</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Kent</snm>
                  <fnm>WJ</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2003</pubdate>
            <volume>31</volume>
            <issue>1</issue>
            <fpage>51</fpage>
            <lpage>54</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">165576</pubid>
                  <pubid idtype="pmpid" link="fulltext">12519945</pubid>
                  <pubid idtype="doi">10.1093/nar/gkg129</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B55">
            <title>
               <p>EMBOSS: The European Molecular Biology Open Software Suite</p>
            </title>
            <aug>
               <au>
                  <snm>Rice</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Longden</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Bleasby</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Trends in Genetics</source>
            <pubdate>2000</pubdate>
            <volume>16</volume>
            <issue>6</issue>
            <fpage>276</fpage>
            <lpage>277</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0168-9525(00)02024-2</pubid>
                  <pubid idtype="pmpid" link="fulltext">10827456</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B56">
            <title>
               <p>R: A language and environment for statistical computing</p>
            </title>
            <aug>
               <au>
                  <cnm>R Development Core Team</cnm>
               </au>
            </aug>
            <source>R Foundation for Statistical Computing</source>
            <pubdate>2006</pubdate>
         </bibl>
         <bibl id="B57">
            <title>
               <p>DIALIGN 2: improvement of the segment-to-segment approach to multiple sequence alignment</p>
            </title>
            <aug>
               <au>
                  <snm>Morgenstern</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>1999</pubdate>
            <volume>15</volume>
            <issue>3</issue>
            <fpage>211</fpage>
            <lpage>218</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/15.3.211</pubid>
                  <pubid idtype="pmpid" link="fulltext">10222408</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>
