<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2164-8-67</ui>
   <ji>1471-2164</ji>
   <fm>
      <dochead>Research article</dochead>
      <bibl>
         <title>
            <p>Identification of plant promoter constituents by analysis of local distribution of short sequences</p>
         </title>
         <aug>
            <au id="A1" ca="yes">
               <snm>Yamamoto</snm>
               <mi>Y</mi>
               <fnm>Yoshiharu</fnm>
               <insr iid="I1"/>
               <insr iid="I2"/>
               <email>yyoshi@gene.nagoya-u.ac.jp</email>
            </au>
            <au id="A2">
               <snm>Ichida</snm>
               <fnm>Hiroyuki</fnm>
               <insr iid="I1"/>
               <insr iid="I3"/>
               <email>ichida@riken.jp</email>
            </au>
            <au id="A3">
               <snm>Matsui</snm>
               <fnm>Minami</fnm>
               <insr iid="I4"/>
               <email>minami@riken.jp</email>
            </au>
            <au id="A4">
               <snm>Obokata</snm>
               <fnm>Junichi</fnm>
               <insr iid="I2"/>
               <email>obokata@gene.nagoya-u.ac.jp</email>
            </au>
            <au id="A5">
               <snm>Sakurai</snm>
               <fnm>Tetsuya</fnm>
               <insr iid="I5"/>
               <email>stetsuya@psc.riken.jp</email>
            </au>
            <au id="A6">
               <snm>Satou</snm>
               <fnm>Masakazu</fnm>
               <insr iid="I3"/>
               <email>msatou@gsc.riken.go.jp</email>
            </au>
            <au id="A7">
               <snm>Seki</snm>
               <fnm>Motoaki</fnm>
               <insr iid="I3"/>
               <email>mseki@rtc.riken.go.jp</email>
            </au>
            <au id="A8">
               <snm>Shinozaki</snm>
               <fnm>Kazuo</fnm>
               <insr iid="I5"/>
               <email>sinozaki@rtc.riken.go.jp</email>
            </au>
            <au id="A9">
               <snm>Abe</snm>
               <fnm>Tomoko</fnm>
               <insr iid="I1"/>
               <email>tomoabe@riken.jp</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Application and Development Group, RIKEN FRS, Hirosawa 2-1, Wako, Saitama 351-0198, Japan</p>
            </ins>
            <ins id="I2">
               <p>Center for Gene Research, Nagoya University, Furo-cho, Chikusa-ku, Nagoya, Aichi 464-8602, Japan</p>
            </ins>
            <ins id="I3">
               <p>Graduate School of Science and Technology, Chiba University, Matsudo 648, Matsudo, Chiba 271-8510, Japan</p>
            </ins>
            <ins id="I4">
               <p>RIKEN Genomic Sciences Center, Suehirocho 1-7-22, Tsurumiku, Yokohama, Kanagawa 230-0045, Japan</p>
            </ins>
            <ins id="I5">
               <p>RIKEN Plant Science Center, Suehirocho 1-7-22, Tsurumiku, Yokohama, Kanagawa 230-0045, Japan</p>
            </ins>
         </insg>
         <source>BMC Genomics</source>
         <issn>1471-2164</issn>
         <pubdate>2007</pubdate>
         <volume>8</volume>
         <issue>1</issue>
         <fpage>67</fpage>
         <url>http://www.biomedcentral.com/1471-2164/8/67</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">17346352</pubid>
               <pubid idtype="doi">10.1186/1471-2164-8-67</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>21</day>
               <month>11</month>
               <year>2006</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>08</day>
               <month>3</month>
               <year>2007</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>08</day>
               <month>3</month>
               <year>2007</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2007</year>
         <collab>Yamamoto et al; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Plant promoter architecture is important for understanding the regulation and evolution of promoters, but our current knowledge of plant promoter structure, especially with respect to the core promoter, is insufficient. Several promoter elements, including the TATA box and several types of transcriptional regulatory elements, have been found to show local distribution within promoters, and this feature has been successfully utilized for extraction of promoter constituents from the human genome.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>LDSS (Local Distribution of Short Sequences) profiles of short sequences along the plant promoter have been analyzed <it>in silico</it>, and hundreds of hexamer and octamer sequences have been identified as having localized distributions within promoters of <it>Arabidopsis thaliana </it>and rice. Based on their localization patterns, the identified sequences could be classified into three groups, pyrimidine patch (Y Patch), TATA box, and REG (Regulatory Element Group). Sequences of the TATA box group are consistent with the ones reported in previous studies. The REG group includes more than 200 sequences, and half of them correspond to known <it>cis</it>-elements. The other REG subgroups, together with about a hundred uncategorized sequences, are suggested to be novel <it>cis</it>-regulatory elements. Comparison of LDSS-positive sequences between <it>Arabidopsis </it>and rice has revealed moderate conservation of elements and common promoter architecture. In addition, a dimer motif named the YR Rule (C/T A/G) has been identified at the transcription start site (-1/+1). This rule also fits both <it>Arabidopsis </it>and rice promoters.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>LDSS was successfully applied to plant genomes, and hundreds of putative promoter elements have been extracted as LDSS-positive octamers. The identified promoter architectures of monocots and dicots are well conserved, but there are moderate variations in the utilized sequences.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="bmc" subtype="user_supplied_xml" id="endnote"/>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>The determination of complete genome sequences has allowed analysis by various statistical methods that have furthered understanding of the function of genomes. Analysis of promoter structure is one of the most important issues. Understanding of promoter structure allows predictions concerning promoter positions and expression profiles, and sheds light on hidden transcriptional networks.</p>
         <p>Several functional elements have been identified as promoter constituents for precise and regulated transcriptional initiation: the TATA box, the Initiator (Inr) motif, the Downstream Promoter Element (DPE, found in <it>Drosophila</it>), the TFIIB-Recognition Element (BRE), and so-called <it>cis</it>-regulatory elements <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr></abbrgrp>. In addition, some mammalian promoters are associated with CpG islands <abbrgrp><abbr bid="B4">4</abbr><abbr bid="B5">5</abbr></abbrgrp>, which are related to the Sp1 recognition site <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and have some relationship with gene regulation by DNA methylation <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B7">7</abbr></abbrgrp>. Human transcriptional regulatory elements are reported to form clusters (modules) at the promoter region as well as at the 3' end of a gene <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>. Transcription start sites (TSS) in plant promoters have a CG-compositional strand bias, or GC-skew, where C is more frequently observed in the (+) strand than G <abbrgrp><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr></abbrgrp>. Some of these features are well understood and some are not, but all of them are useful for understanding individual promoters. Some of the above features have been utilized for promoter prediction <abbrgrp><abbr bid="B11">11</abbr><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr></abbrgrp>. Although these studies have achieved some success, our current knowledge of promoters is still insufficient <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>.</p>
         <p>The availability of microarray data on co-regulated gene expression on a genomic scale has enabled the prediction of novel <it>cis</it>-elements involved in gene regulation. Several approaches have been developed for the detection of consensus sequences in a co-regulated promoter set (Gibbs Motif Sampling <abbrgrp><abbr bid="B14">14</abbr><abbr bid="B15">15</abbr></abbrgrp>, MEME <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>), and for the detection of over-represented sequences in co-regulated promoters by comparison with a set of reference sequences <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr></abbrgrp>. These approaches are also applicable to chromatin immunoprecipitation (ChIP) data <abbrgrp><abbr bid="B19">19</abbr><abbr bid="B20">20</abbr></abbrgrp>. In addition, identification of conserved promoter sequences by comparative genomics supports the prediction of regulatory elements <abbrgrp><abbr bid="B21">21</abbr><abbr bid="B22">22</abbr><abbr bid="B23">23</abbr><abbr bid="B24">24</abbr></abbrgrp>.</p>
         <p>Studies on plant transcription factors and functional <it>cis</it>-regulatory elements have been summarized in several databases, and the collective information on <it>cis</it>-elements and/or transfactor-binding DNA sequences is utilized for interpretation of plant promoters (PLACE: <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>, AGRIS: <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>, AthaMap: <abbrgrp><abbr bid="B27">27</abbr><abbr bid="B28">28</abbr></abbrgrp>). These databases are based on published articles reporting analyses of individual promoters or transfactors, rather than on large-scale genomic analyses. Therefore, the lack of large-scale functional analyses of transcription factors in plant science is reflected in these databases as well.</p>
         <p>In contrast to the above fact-based approaches, <it>in silico </it>prediction of plant promoter elements by survey of the <it>Arabidopsis </it>genome has also been reported. Molina and Grotewold applied the MEME and Gibbs sampling methods to <it>Arabidopsis </it>core promoter regions on a genomic scale, and detected several motifs including a plant TATA motif and microsatellites <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>.</p>
         <p>Recent studies on mammalian promoter elements have revealed that some of them have localized appearance along the promoter region, exemplified by the TATA box <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>, and binding sites for NRF-1, Sp1, CREB, ATF, and E2F <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>. These studies evoke the idea that localized distribution is a signature of a functional element of the promoter. Recently, this feature was successfully utilized for extraction of functional sequences from human promoters <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>. Large-scale deletion analysis of human promoters suggested that there is some relationship between presence of functional elements and distance from TSS <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>.</p>
         <p>In this report, we have detected hundreds of short sequences showing localized distribution in plant promoters by comprehensive analyses of short sequences. The extracted sequences are referred to as "LDSS (Local Distribution of Short Sequence)-positive" in this work. These sequences include TATA boxes, various regulatory sequences identified in previous studies, a novel sequence group that would be a general component of a core promoter, and also many novel sequences that share many characteristics with regulatory sequences. Our analyses have also revealed conservation of the promoter architecture between monocot and dicot plants.</p>
      </sec>
      <sec>
         <st>
            <p>Results</p>
         </st>
         <sec>
            <st>
               <p>Patterns of distribution of peaks</p>
            </st>
            <p>Typically, a DNA element recognized by a protein (complex) is 5 to 15 bp long <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>. Within this range, we decided to analyze localization patterns of hexamer and octamer sequences. Our results suggest that sequences longer than 9 bp would not occur often enough to support statistical analysis.</p>
            <p>For each hexamer sequence, a distribution profile in relation to distance from the TSS was analyzed for <it>Arabidopsis thaliana</it>. Looking through all the distribution profiles, we noticed that there are several distinct patterns. Most sequences have a flat distribution profile with no special tendency (Fig. <figr fid="F1">1</figr>, GAAGAG). Sometimes the base line has a slight slope with a higher frequency toward the TSS. There are also groups with peaks, and they can be classified according to the peak position. We refer to these sequences as LDSS-positive.</p>
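            <p>As a minimal sketch of how such a distribution profile could be tabulated, the following Python code counts the start positions of a short sequence across a set of equal-length promoter sequences and smooths the counts with a moving average. The function names, the list-of-strings input, and the unweighted window are assumptions made for illustration; they are not taken from the analysis pipeline used in this work.</p>
            <p><![CDATA[
```python
def position_profile(promoters, kmer):
    """Count how often kmer starts at each offset across all promoters.

    Assumption: every promoter string has the same length and the same
    alignment to the TSS, so index i means the same position everywhere.
    """
    length = len(promoters[0])
    k = len(kmer)
    counts = [0] * (length - k + 1)
    for seq in promoters:
        start = seq.find(kmer)
        while start != -1:
            counts[start] += 1
            # Step by one so that overlapping occurrences are also counted.
            start = seq.find(kmer, start + 1)
    return counts


def moving_average(values, window=15):
    """Unweighted moving average; positions where the window does not fit are dropped.

    Assumption: len(values) is at least as large as window.
    """
    total = sum(values[:window])
    smoothed = [total / window]
    for i in range(window, len(values)):
        total += values[i] - values[i - window]
        smoothed.append(total / window)
    return smoothed
```
]]></p>
            <p>In this sketch, the raw counts correspond to the gray line of a profile and the smoothed values to the solid line, under the assumption that the 15-bin average is a simple sliding mean.</p>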
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>Examples of distribution of peaks</p>
               </caption>
               <text>
                  <p><b>Examples of distribution of peaks</b>. Several examples of hexamer analysis against <it>Arabidopsis </it>promoters are shown. The vertical axis indicates the total count across the whole promoter database. Gray and solid lines show the raw counts and the average over a 15-bin window, respectively. As negative controls, a set of 3,000 random fragments of 1 kb length from the <it>Arabidopsis </it>genome was used for the occurrence analysis instead of the promoter database (shown as "random genome" in the bottom columns).</p>
               </text>
               <graphic file="1471-2164-8-67-1"/>
            </fig>
            <p>One example of an LDSS-positive sequence (Fig. <figr fid="F1">1</figr>, CTCTTC) has a peak of appearance at the TSS. Its complementary sequence (Fig. <figr fid="F1">1</figr>, GAAGAG) has a distinct distribution profile, showing that its appearance is sensitive to the direction of transcription. Although hexamers with this type of distribution profile tend to have only C and T in the sequence (see later), there seems to be a weak sequence preference, and not all the sequences composed of C and T show a peak-positive distribution (Fig. <figr fid="F1">1</figr>, CCTTTT is a peak-negative example).</p>
            <p>A second example (Fig. <figr fid="F1">1</figr>, CTATAA) is a TATA box-related sequence. This has a peak around -35 bp, and the peak is very sharp. The complementary sequence shows a different pattern with no peak (Fig. <figr fid="F1">1</figr>, TTATAG).</p>
            <p>A third example (Fig. <figr fid="F1">1</figr>, TGGGCC) has a relatively wide and low peak. Its complementary sequence shows the same peak (Fig. <figr fid="F1">1</figr>, GGCCCA). Peak position and direction-insensitivity suggest that sequences with this type of distribution profile are so-called <it>cis</it>-regulatory sequences involved in transcriptional regulation <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>. In fact, TGGGCC in Figure <figr fid="F1">1</figr> is reported to be necessary for meristematic expression in <it>Arabidopsis</it>, and mutation to TG<ul>AA</ul>CC abolished the expression (Element II of <it>Arabidopsis </it>PCNA-2, <abbrgrp><abbr bid="B35">35</abbr></abbrgrp>). Interestingly, the distribution of the mutated sequence does not have any peaks (Fig. <figr fid="F1">1</figr>, TGAACC), demonstrating a good correlation between functionality and peak distribution. In addition, a single base substitution, TG<ul>A</ul>GCC, also caused the loss of the peak (Fig. <figr fid="F1">1</figr>). It is common for a single base substitution to change the distribution profile drastically (data not shown).</p>
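            <p>The "complementary sequence" in the three examples above is the reverse complement, that is, the same site read from the opposite strand. A short Python sketch (the helper name is hypothetical) makes the pairing explicit:</p>
            <p><![CDATA[
```python
# Map each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Pairs taken from the Figure 1 examples discussed in the text.
for forward in ("CTCTTC", "CTATAA", "TGGGCC"):
    print(forward, reverse_complement(forward))
```
]]></p>
            <p>A direction-sensitive element has different profiles for the two members of a pair, whereas a direction-insensitive element has similar profiles for both.</p>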
            <p>As controls, a set of random genomic sequences of 1 kb length was used for the distribution analysis instead of the promoter database. When peak-positive sequences were subjected to this analysis, they were found to have no peaks in the random genome fragments (Fig. <figr fid="F1">1</figr>, CTCTTC/random genome, CTATAA/random genome, TGGGCC/random genome).</p>
            <p>Besides LDSS-positive elements, there are many LDSS-negative sequences. Among them, sequences observed more frequently than the theoretical occurrence rate (0.24 per 1 kb region) are rich in AT and might contribute to promoter context, whereas rare sequences are rich in GC and might disturb promoter function when located within the promoter region. Therefore, it might be possible to utilize these LDSS-negative sequences as well for evaluation of promoter context.</p>
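            <p>The theoretical occurrence rate quoted above can be reproduced with a short calculation. The sketch below assumes that the four bases are equally frequent and independent, and that a region of length L offers L - k + 1 start positions for a sequence of length k. These assumptions are ours and may differ in detail from the calculation used in this work.</p>
            <p><![CDATA[
```python
def expected_occurrences(k, region_length=1000):
    """Expected count of one specific k-mer on one strand of a random region.

    Assumption: uniform, independent base frequencies (0.25 each).
    """
    windows = region_length - k + 1
    return windows / 4 ** k


print(round(expected_occurrences(6), 2))  # hexamer in a 1 kb region
```
]]></p>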
         </sec>
         <sec>
            <st>
               <p>Parameters for peak evaluation</p>
            </st>
            <p>Figure <figr fid="F2">2A</figr> shows a close-up of a typical distribution profile of the regulator type. In order to detect peak-positive sequences, we calculated several parameters. Curve fitting with a Gaussian did not give good results (data not shown), because the peak shape is not symmetrical, as seen in the figure. Through analysis of distribution profiles of all the hexamers, we noticed that all of the observed peaks were located downstream of -200 bp. This enabled a base line to be established (Base in the figure) as the average occurrence between -1,000 and -500 bp. We then calculated the Relative Peak Height (RPH) and Relative Peak Area (RPA) for evaluation of peak strength. Fluctuation around the base line between -1,000 and -500 bp was also evaluated (see figure legend).</p>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>Parameters for peak detection</p>
               </caption>
               <text>
                  <p><b>Parameters for peak detection</b>. (A) The graph is a distribution profile of CACGTG in <it>Arabidopsis </it>promoters. The average over a 15-bin window is shown. The dotted line indicates the Base Line, which is the average from -1,000 to -500. The light grey area shows the Peak Area. The dark grey area is &#916;area, an indication of the fluctuation from the Base Line from -1,000 to -500. In addition, the following parameters have been defined: Relative Peak Area (RPA) = Peak Area/total area; Relative Peak Height (RPH) = peak height/Base Line; Peak Area/basal fluctuation = Peak Area/&#916;area per peak width; Peak height/SD = peak height/standard deviation of occurrence from -1,000 to -500. Several parameters of this graph are shown in Table 2 (CACGTG). (B) All the hexamers were analyzed to obtain various parameters, and (Peak Area/basal fluctuation) and peak position were calculated. The graph shows the results. Each dot shows the data for an individual hexamer. Among the 4,096 hexamers (grey dots), 247 peak-positive hexamers have been selected (solid dots). The graph demonstrates that hexamers with a significant value have a peak position from -200 to -13 (the most downstream position after smoothing).</p>
               </text>
               <graphic file="1471-2164-8-67-2"/>
            </fig>
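            <p>The parameters defined above can be illustrated with a minimal Python sketch. Several details are assumptions made here because the text does not fix them: peak height is taken as the maximum of the smoothed profile measured from zero, Peak Area is taken as the summed excess over the Base Line downstream of -500 bp, and total area is the sum of the whole profile. The function and argument names are hypothetical.</p>
            <p><![CDATA[
```python
def peak_parameters(profile, positions, base_from=-1000, base_to=-500):
    """Compute Base Line, peak position, RPH and RPA for one smoothed profile.

    profile[i] is the smoothed count at positions[i] (bp relative to the TSS).
    Assumptions: peak height is measured from zero, and Peak Area is the
    excess over the Base Line summed over positions downstream of base_to.
    """
    base_values = [v for v, p in zip(profile, positions)
                   if base_from <= p <= base_to]
    baseline = sum(base_values) / len(base_values)

    # Index of the highest point of the profile (first one on ties).
    peak_index = max(range(len(profile)), key=profile.__getitem__)
    peak_height = profile[peak_index]

    peak_area = sum(max(v - baseline, 0.0)
                    for v, p in zip(profile, positions) if p > base_to)
    total_area = sum(profile)

    return {
        "baseline": baseline,
        "peak_position": positions[peak_index],
        "rph": peak_height / baseline,
        "rpa": peak_area / total_area,
    }
```
]]></p>
            <p>For a flat profile of 2.0 with a single value of 10.0 at -35, this sketch returns a Base Line of 2.0, a peak position of -35, and an RPH of 5.0. The fluctuation-based parameters of the legend are omitted because their exact normalization is not specified here.</p>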
            <p>Figure <figr fid="F2">2B</figr> shows the relationship between peak position and a parameter of peak strength. As shown, all the strong peaks are located downstream of -200 bp, while weak peaks are scattered throughout the promoters. One important point of the figure is the continuous distribution of hexamers along the vertical axis. The same continuity was observed when RPH or RPA was plotted on the vertical axis (data not shown). These results mean that there is no clear boundary separating the peaked and flat groups. In this study, we adopted the strategy of listing sequences with strong peaks, leaving out the flat group and the group with ambiguous peaks.</p>
            <p>Considering peak height, peak area, and fluctuation from the base line, we selected 247 sequences from all the hexamers as peak positive (Fig. <figr fid="F2">2B</figr>, black dots, Table S1 [see Additional file <supplr sid="S1">1</supplr>]).</p>
            <suppl id="S1">
               <title>
                  <p>Additional file 1</p>
               </title>
               <text>
                  <p>Complete list of LDSS-positive hexamers of Arabidopsis (Table S1.pdf). Contains hexamer sequences and parameters.</p>
               </text>
               <file name="1471-2164-8-67-S1.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
         </sec>
         <sec>
            <st>
               <p>Peak-positive hexamers can be classified according to their peak position</p>
            </st>
            <p>The LDSS-positive hexamers identified were then classified into three major groups as mentioned above. The first group, which includes CTCTTC in Figure <figr fid="F1">1</figr>, localizes from -100 to -13 bp. These sequences typically have a peak at the most downstream region of the promoter (position -13, Table <tblr tid="T1">1</tblr>), but peak positions range from -13 to -60. Because most of these sequences are composed of only C and T, we refer to this group as Y Patch (Y for pyrimidine). As shown in the table, Y Patch sequences are found in the majority of <it>Arabidopsis </it>promoters.</p>
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>Y Patch and TATA Box identified from <it>Arabidopsis </it>hexamer analysis</p>
               </caption>
               <tblbdy cols="6">
                  <r>
                     <c ca="center">
                        <p>Sequence</p>
                     </c>
                     <c ca="center">
                        <p>Peak position<sup>1 </sup>(bp)</p>
                     </c>
                     <c ca="center">
                        <p>Peak width<sup>2 </sup>(bp)</p>
                     </c>
                     <c ca="center">
                        <p>#promoter<sup>3</sup></p>
                     </c>
                     <c ca="center">
                        <p>Relative Peak Height (RPH)</p>
                     </c>
                     <c ca="center">
                        <p>Relative Peak Area (RPA)</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Y Patch</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>TCTCTC</p>
                     </c>
                     <c ca="center">
                        <p>-13</p>
                     </c>
                     <c ca="center">
                        <p>158</p>
                     </c>
                     <c ca="center">
                        <p>6,741</p>
                     </c>
                     <c ca="center">
                        <p>10.96</p>
                     </c>
                     <c ca="center">
                        <p>0.25</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>CCTCTC</p>
                     </c>
                     <c ca="center">
                        <p>-13</p>
                     </c>
                     <c ca="center">
                        <p>107</p>
                     </c>
                     <c ca="center">
                        <p>3,106</p>
                     </c>
                     <c ca="center">
                        <p>8.13</p>
                     </c>
                     <c ca="center">
                        <p>0.20</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>CTTCTC</p>
                     </c>
                     <c ca="center">
                        <p>-13</p>
                     </c>
                     <c ca="center">
                        <p>88</p>
                     </c>
                     <c ca="center">
                        <p>5,916</p>
                     </c>
                     <c ca="center">
                        <p>7.64</p>
                     </c>
                     <c ca="center">
                        <p>0.15</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>CTCCTC</p>
                     </c>
                     <c ca="center">
                        <p>-13</p>
                     </c>
                     <c ca="center">
                        <p>81</p>
                     </c>
                     <c ca="center">
                        <p>3,180</p>
                     </c>
                     <c ca="center">
                        <p>7.23</p>
                     </c>
                     <c ca="center">
                        <p>0.12</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>CTCTTC</p>
                     </c>
                     <c ca="center">
                        <p>-13</p>
                     </c>
                     <c ca="center">
                        <p>91</p>
                     </c>
                     <c ca="center">
                        <p>5,393</p>
                     </c>
                     <c ca="center">
                        <p>7.02</p>
                     </c>
                     <c ca="center">
                        <p>0.14</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>CTCTCC</p>
                     </c>
                     <c ca="center">
                        <p>-13</p>
                     </c>
                     <c ca="center">
                        <p>108</p>
                     </c>
                     <c ca="center">
                        <p>3,153</p>
                     </c>
                     <c ca="center">
                        <p>6.95</p>
                     </c>
                     <c ca="center">
                        <p>0.16</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>TCCCTC</p>
                     </c>
                     <c ca="center">
                        <p>-13</p>
                     </c>
                     <c ca="center">
                        <p>93</p>
                     </c>
                     <c ca="center">
                        <p>2,140</p>
                     </c>
                     <c ca="center">
                        <p>6.13</p>
                     </c>
                     <c ca="center">
                        <p>0.15</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>TTCTTC</p>
                     </c>
                     <c ca="center">
                        <p>-13</p>
                     </c>
                     <c ca="center">
                        <p>75</p>
                     </c>
                     <c ca="center">
                        <p>8,829</p>
                     </c>
                     <c ca="center">
                        <p>5.78</p>
                     </c>
                     <c ca="center">
                        <p>0.11</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>TTCTCT</p>
                     </c>
                     <c ca="center">
                        <p>-13</p>
                     </c>
                     <c ca="center">
                        <p>109</p>
                     </c>
                     <c ca="center">
                        <p>8,314</p>
                     </c>
                     <c ca="center">
                        <p>5.77</p>
                     </c>
                     <c ca="center">
                        <p>0.12</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>TATA Box</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>TATAAA</p>
                     </c>
                     <c ca="center">
                        <p>-35</p>
                     </c>
                     <c ca="center">
                        <p>30</p>
                     </c>
                     <c ca="center">
                        <p>10,704</p>
                     </c>
                     <c ca="center">
                        <p>9.0</p>
                     </c>
                     <c ca="center">
                        <p>0.10</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>TATATA</p>
                     </c>
                     <c ca="center">
                        <p>-36</p>
                     </c>
                     <c ca="center">
                        <p>27</p>
                     </c>
                     <c ca="center">
                        <p>10,315</p>
                     </c>
                     <c ca="center">
                        <p>6.38</p>
                     </c>
                     <c ca="center">
                        <p>0.07</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>ATATAA</p>
                     </c>
                     <c ca="center">
                        <p>-35</p>
                     </c>
                     <c ca="center">
                        <p>27</p>
                     </c>
                     <c ca="center">
                        <p>10,062</p>
                     </c>
                     <c ca="center">
                        <p>6.14</p>
                     </c>
                     <c ca="center">
                        <p>0.07</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>ATAAAT</p>
                     </c>
                     <c ca="center">
                        <p>-35</p>
                     </c>
                     <c ca="center">
                        <p>27</p>
                     </c>
                     <c ca="center">
                        <p>10,572</p>
                     </c>
                     <c ca="center">
                        <p>5.14</p>
                     </c>
                     <c ca="center">
                        <p>0.05</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>TAAATA</p>
                     </c>
                     <c ca="center">
                        <p>-34</p>
                     </c>
                     <c ca="center">
                        <p>25</p>
                     </c>
                     <c ca="center">
                        <p>9,801</p>
                     </c>
                     <c ca="center">
                        <p>4.65</p>
                     </c>
                     <c ca="center">
                        <p>0.04</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>ATATAT</p>
                     </c>
                     <c ca="center">
                        <p>-35</p>
                     </c>
                     <c ca="center">
                        <p>24</p>
                     </c>
                     <c ca="center">
                        <p>10,412</p>
                     </c>
                     <c ca="center">
                        <p>3.84</p>
                     </c>
                     <c ca="center">
                        <p>0.04</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>TTATAA</p>
                     </c>
                     <c ca="center">
                        <p>-36</p>
                     </c>
                     <c ca="center">
                        <p>23</p>
                     </c>
                     <c ca="center">
                        <p>9,172</p>
                     </c>
                     <c ca="center">
                        <p>3.36</p>
                     </c>
                     <c ca="center">
                        <p>0.03</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>TTATAT</p>
                     </c>
                     <c ca="center">
                        <p>-36</p>
                     </c>
                     <c ca="center">
                        <p>23</p>
                     </c>
                     <c ca="center">
                        <p>9,639</p>
                     </c>
                     <c ca="center">
                        <p>3.10</p>
                     </c>
                     <c ca="center">
                        <p>0.03</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p><sup>1</sup>In this analysis, -13 is the position for an average from -20 to -6 that covers a region from -20 to -1, so -13 is the most downstream position.</p>
                  <p><sup>2</sup>Peak width at the bottom of the peak.</p>
                  <p><sup>3</sup>Number of promoters containing the element out of 15,607 <it>Arabidopsis </it>promoters (-1,000 to -1). Number of promoters containing an element within the peak area can be roughly estimated by <ul>#promoters &#215; RPA</ul>. For example, TATAAA is found in approx. 1,070 promoters within the peak area (10,704 &#215; 0.10).</p>
               </tblfn>
            </tbl>
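            <p>The rough estimate described in footnote 3 of Table 1 can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name promoters_in_peak is hypothetical, and the input values are copied from the TATA Box rows of Table 1.</p>

```python
def promoters_in_peak(n_promoters, rpa):
    """Rough estimate from Table 1, footnote 3: #promoters multiplied by RPA."""
    return n_promoters * rpa


# (#promoter, RPA) pairs copied from the TATA Box rows of Table 1.
rows = {
    "TATAAA": (10704, 0.10),
    "TATATA": (10315, 0.07),
    "ATATAA": (10062, 0.07),
}

# Estimated number of promoters carrying each hexamer inside its peak area.
estimates = {seq: promoters_in_peak(n, rpa) for seq, (n, rpa) in rows.items()}

for seq, est in estimates.items():
    print(seq, round(est))
```

            <p>Because RPA is reported to two decimal places, such estimates are approximate and are best read as orders of magnitude.</p>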
            <p>The second group contains TATA box-related sequences. An example is shown as CTATAA in Figure <figr fid="F1">1</figr>. The characteristics of this group are high peak height, narrow peak width, and stringent peak position (Table <tblr tid="T1">1</tblr>, TATA Box). Like the Y Patch sequences, the TATA box group sequences are found in the majority of <it>Arabidopsis </it>promoters, although the number of promoters with a TATA box sequence within the peak area is only about 1,000 or fewer for each sequence.</p>
            <p>The third group, including TGGGCC in Figure <figr fid="F1">1</figr>, is referred to as REG, for Regulatory Element Group, in this study. The peak positions of this group are located around -80 bp, and the peaks are wide in comparison with those of the TATA box group (Table <tblr tid="T2">2</tblr>). Another feature of the group is the high ratio of Peak Area to total area, which means high specificity of localization within a promoter. As shown by the Relative Peak Area (RPA) in the table, around 30 to 50% of the occurrences of a REG sequence are found in the peak area. These ratios are much higher than those of the Y Patch (10 to 25%) or TATA box (5 to 11%) groups. In contrast, the number of promoters containing a given REG sequence is smaller than for these two groups, consistent with the idea that each REG is not a component of the general core promoter but a specific regulator of gene expression. In fact, Table <tblr tid="T2">2</tblr> contains several known <it>cis</it>-regulatory elements, including Element II of <it>Arabidopsis </it>PCNA-2 (GGCCCA, TGGGCC, and AGCCCA) <abbrgrp><abbr bid="B35">35</abbr></abbrgrp> and G-box/ABRE (CACGTG, CGTGGC, CCACGT, and GCCACG) <abbrgrp><abbr bid="B36">36</abbr></abbrgrp>.</p>
            <tbl id="T2">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p>REGs identified from <it>Arabidopsis </it>hexamer analysis</p>
               </caption>
               <tblbdy cols="6">
                  <r>
                     <c ca="center">
                        <p>Sequence</p>
                     </c>
                     <c ca="center">
                        <p>Peak position (bp)</p>
                     </c>
                     <c ca="center">
                        <p>Peak width (bp)</p>
                     </c>
                     <c ca="center">
                        <p>#promoter</p>
                     </c>
                     <c ca="center">
                        <p>Relative Peak Height (RPH)</p>
                     </c>
                     <c ca="center">
                        <p>Relative Peak Area (RPA)</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>AGGCCC</p>
                     </c>
                     <c ca="center">
                        <p>-76</p>
                     </c>
                     <c ca="center">
                        <p>326</p>
                     </c>
                     <c ca="center">
                        <p>2,005</p>
                     </c>
                     <c ca="center">
                        <p>14.78</p>
                     </c>
                     <c ca="center">
                        <p>0.54</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>GGCCCA</p>
                     </c>
                     <c ca="center">
                        <p>-73</p>
                     </c>
                     <c ca="center">
                        <p>347</p>
                     </c>
                     <c ca="center">
                        <p>1,225</p>
                     </c>
                     <c ca="center">
                        <p>12.26</p>
                     </c>
                     <c ca="center">
                        <p>0.53</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>GGGCCT</p>
                     </c>
                     <c ca="center">
                        <p>-106</p>
                     </c>
                     <c ca="center">
                        <p>240</p>
                     </c>
                     <c ca="center">
                        <p>1,764</p>
                     </c>
                     <c ca="center">
                        <p>10.31</p>
                     </c>
                     <c ca="center">
                        <p>0.47</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>TGGGCC</p>
                     </c>
                     <c ca="center">
                        <p>-107</p>
                     </c>
                     <c ca="center">
                        <p>262</p>
                     </c>
                     <c ca="center">
                        <p>2,867</p>
                     </c>
                     <c ca="center">
                        <p>9.29</p>
                     </c>
                     <c ca="center">
                        <p>0.46</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>GGGCCC</p>
                     </c>
                     <c ca="center">
                        <p>-91</p>
                     </c>
                     <c ca="center">
                        <p>256</p>
                     </c>
                     <c ca="center">
                        <p>711</p>
                     </c>
                     <c ca="center">
                        <p>9.51</p>
                     </c>
                     <c ca="center">
                        <p>0.44</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>GCCCAT</p>
                     </c>
                     <c ca="center">
                        <p>-76</p>
                     </c>
                     <c ca="center">
                        <p>320</p>
                     </c>
                     <c ca="center">
                        <p>2,925</p>
                     </c>
                     <c ca="center">
                        <p>8.41</p>
                     </c>
                     <c ca="center">
                        <p>0.43</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>GCCCAA</p>
                     </c>
                     <c ca="center">
                        <p>-72</p>
                     </c>
                     <c ca="center">
                        <p>366</p>
                     </c>
                     <c ca="center">
                        <p>3,068</p>
                     </c>
                     <c ca="center">
                        <p>7.78</p>
                     </c>
                     <c ca="center">
                        <p>0.42</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>AGCCCA</p>
                     </c>
                     <c ca="center">
                        <p>-85</p>
                     </c>
                     <c ca="center">
                        <p>284</p>
                     </c>
                     <c ca="center">
                        <p>2,963</p>
                     </c>
                     <c ca="center">
                        <p>7.53</p>
                     </c>
                     <c ca="center">
                        <p>0.39</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>CACGTG</p>
                     </c>
                     <c ca="center">
                        <p>-80</p>
                     </c>
                     <c ca="center">
                        <p>273</p>
                     </c>
                     <c ca="center">
                        <p>3,039</p>
                     </c>
                     <c ca="center">
                        <p>6.85</p>
                     </c>
                     <c ca="center">
                        <p>0.38</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>AAGCCC</p>
                     </c>
                     <c ca="center">
                        <p>-86</p>
                     </c>
                     <c ca="center">
                        <p>299</p>
                     </c>
                     <c ca="center">
                        <p>2,593</p>
                     </c>
                     <c ca="center">
                        <p>7.48</p>
                     </c>
                     <c ca="center">
                        <p>0.37</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>CGGCCC</p>
                     </c>
                     <c ca="center">
                        <p>-62</p>
                     </c>
                     <c ca="center">
                        <p>189</p>
                     </c>
                     <c ca="center">
                        <p>732</p>
                     </c>
                     <c ca="center">
                        <p>7.66</p>
                     </c>
                     <c ca="center">
                        <p>0.36</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>CCACGT</p>
                     </c>
                     <c ca="center">
                        <p>-83</p>
                     </c>
                     <c ca="center">
                        <p>260</p>
                     </c>
                     <c ca="center">
                        <p>2,367</p>
                     </c>
                     <c ca="center">
                        <p>5.66</p>
                     </c>
                     <c ca="center">
                        <p>0.35</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>ATGGGC</p>
                     </c>
                     <c ca="center">
                        <p>-97</p>
                     </c>
                     <c ca="center">
                        <p>295</p>
                     </c>
                     <c ca="center">
                        <p>2,836</p>
                     </c>
                     <c ca="center">
                        <p>6.29</p>
                     </c>
                     <c ca="center">
                        <p>0.35</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>CGTGGC</p>
                     </c>
                     <c ca="center">
                        <p>-97</p>
                     </c>
                     <c ca="center">
                        <p>251</p>
                     </c>
                     <c ca="center">
                        <p>1,459</p>
                     </c>
                     <c ca="center">
                        <p>5.96</p>
                     </c>
                     <c ca="center">
                        <p>0.35</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>TAGGCC</p>
                     </c>
                     <c ca="center">
                        <p>-75</p>
                     </c>
                     <c ca="center">
                        <p>311</p>
                     </c>
                     <c ca="center">
                        <p>1,435</p>
                     </c>
                     <c ca="center">
                        <p>6.18</p>
                     </c>
                     <c ca="center">
                        <p>0.34</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>CGTGTC</p>
                     </c>
                     <c ca="center">
                        <p>-79</p>
                     </c>
                     <c ca="center">
                        <p>289</p>
                     </c>
                     <c ca="center">
                        <p>1,909</p>
                     </c>
                     <c ca="center">
                        <p>5.57</p>
                     </c>
                     <c ca="center">
                        <p>0.33</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>AAGGCC</p>
                     </c>
                     <c ca="center">
                        <p>-77</p>
                     </c>
                     <c ca="center">
                        <p>287</p>
                     </c>
                     <c ca="center">
                        <p>1,935</p>
                     </c>
                     <c ca="center">
                        <p>6.27</p>
                     </c>
                     <c ca="center">
                        <p>0.33</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>GCGCGT</p>
                     </c>
                     <c ca="center">
                        <p>-59</p>
                     </c>
                     <c ca="center">
                        <p>244</p>
                     </c>
                     <c ca="center">
                        <p>632</p>
                     </c>
                     <c ca="center">
                        <p>5.56</p>
                     </c>
                     <c ca="center">
                        <p>0.32</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>GCCACG</p>
                     </c>
                     <c ca="center">
                        <p>-83</p>
                     </c>
                     <c ca="center">
                        <p>215</p>
                     </c>
                     <c ca="center">
                        <p>1,411</p>
                     </c>
                     <c ca="center">
                        <p>6.64</p>
                     </c>
                     <c ca="center">
                        <p>0.31</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>ACGCGC</p>
                     </c>
                     <c ca="center">
                        <p>-65</p>
                     </c>
                     <c ca="center">
                        <p>190</p>
                     </c>
                     <c ca="center">
                        <p>655</p>
                     </c>
                     <c ca="center">
                        <p>5.08</p>
                     </c>
                     <c ca="center">
                        <p>0.31</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>GGGCCG</p>
                     </c>
                     <c ca="center">
                        <p>-85</p>
                     </c>
                     <c ca="center">
                        <p>196</p>
                     </c>
                     <c ca="center">
                        <p>711</p>
                     </c>
                     <c ca="center">
                        <p>6.01</p>
                     </c>
                     <c ca="center">
                        <p>0.30</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>CACGCG</p>
                     </c>
                     <c ca="center">
                        <p>-138</p>
                     </c>
                     <c ca="center">
                        <p>182</p>
                     </c>
                     <c ca="center">
                        <p>884</p>
                     </c>
                     <c ca="center">
                        <p>5.22</p>
                     </c>
                     <c ca="center">
                        <p>0.30</p>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
            <p>In addition to these three groups, there is also a small number of exceptional hexamers with peak positions in the core promoter (-13 to -60). They might constitute one or more minor types within the core promoter (Table S1, "others" [see Additional file <supplr sid="S1">1</supplr>]; see also Tables S2 and S3 for these elements). The complete list of the extracted sequences is shown in Table S1. The table shows 103 Y Patch, 39 TATA-related, 38 REG, and 22 unclassified hexamer sequences.</p>
         </sec>
         <sec>
            <st>
               <p>Directional preference relative to transcription</p>
            </st>
            <p>Subsequently, we examined whether the orientation of the hexamers is critical. Each identified hexamer was tested to determine whether its complementary sequence was also among the identified hexamers. If the complementary sequence was also found in this positive group, the original sequence was considered direction-insensitive, and if not, direction-sensitive. As shown in Figure <figr fid="F3">3</figr>, the region downstream of -50, which is known to be the core promoter region <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> and includes the Y Patch and TATA box groups, is occupied by direction-sensitive sequences ("uniq" in the figure), while the upstream region, containing the REG group, is rich in direction-insensitive sequences ("comp" in the figure). These findings are consistent with the established idea that the core promoter determines the position and direction of transcription, and that <it>cis</it>-elements are direction-insensitive. They further support the idea that the Y Patch and TATA box sequences are core promoter elements and that REG sequences are <it>cis</it>-elements <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>.</p>
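            <p>The orientation test described above can be sketched as follows. This is a minimal illustration, not the original implementation: it assumes that "complementary sequence" means the reverse complement (for example, GGCCCA and TGGGCC in Table 2 are reverse complements of each other), and the names reverse_complement and classify_direction are hypothetical.</p>

```python
# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def classify_direction(positive):
    """Label each sequence 'comp' if its reverse complement is also in the
    positive set (direction-insensitive), otherwise 'uniq'."""
    positive = set(positive)
    return {
        seq: "comp" if reverse_complement(seq) in positive else "uniq"
        for seq in positive
    }


# Illustrative subset of hexamers taken from Tables 1 and 2,
# not the full list of LDSS-positive hexamers.
subset = ["TATAAA", "GGCCCA", "TGGGCC", "CACGTG", "AGGCCC", "GGGCCT"]
labels = classify_direction(subset)

for seq in subset:
    print(seq, labels[seq])
```

            <p>Note that a label depends on the full positive set: a hexamer counted as "uniq" in this small subset could be "comp" when the complete list in Table S1 is used. Palindromic hexamers such as CACGTG are their own reverse complement and are therefore always "comp" under this assumption.</p>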
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>Directional preference of LDSS-positive hexamers</p>
               </caption>
               <text>
                  <p><b>Directional preference of LDSS-positive hexamers</b>. When the corresponding complementary sequence was not found in the LDSS-positive group, the hexamer was counted as "uniq", which means direction-sensitive. When found, the sequence was counted as "comp", meaning direction-insensitive. The numbers of both types of hexamers were counted according to the peak position from the TSS and summarized in a bar graph. The inset graph is an enlargement to show more detail around the TSS.</p>
               </text>
               <graphic file="1471-2164-8-67-3"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>Comparison of <it>Arabidopsis </it>and rice promoters</p>
            </st>
            <p>Subsequently, we analyzed the distribution of octamer sequences. The average appearance rate of octamers is 15.7-fold lower than that of hexamers, consistent with the mathematical expectation of a 16-fold difference (data not shown). Because rare sequences tend to show more fluctuation by chance, statistical evaluation was more critical for the octamer analysis. We prepared random distribution populations and used them for statistical evaluation of each octamer (Figure S1 [see Additional file <supplr sid="S2">2</supplr>]). In this study, we set a p value of 1 &#215; 10<sup>-5 </sup>as the threshold. In addition, data for the complementary sequences were merged, only for REG detection, to increase the total count of an octamer in the database. Through the octamer analyses, we identified 350 and 418 LDSS-positive core elements (Table S2 [see Additional file <supplr sid="S3">3</supplr>] and S3 [see Additional file <supplr sid="S4">4</supplr>]), and 308 and 242 REG sequences from <it>Arabidopsis </it>and rice, respectively (Table S4 [see Additional file <supplr sid="S5">5</supplr>] and S5 [see Additional file <supplr sid="S6">6</supplr>]). The sum of the p values for all the extracted octamers of each species was around 1 &#215; 10<sup>-3</sup>, so false-positive sequences arising from purely random distribution are not likely to be included in the lists.</p>
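            <p>The expected 16-fold difference follows from simple counting, under the simplifying assumption that all four bases are equally likely at every position. A short sketch, with the hypothetical helper expected_rate:</p>

```python
def expected_rate(k, alphabet_size=4):
    """Expected appearance rate of one specific k-mer at a given position,
    assuming equally likely bases (a simplifying assumption)."""
    return 1 / alphabet_size ** k


# Hexamers versus octamers: the ratio of the expected rates.
ratio = expected_rate(6) / expected_rate(8)
print(ratio)
```

            <p>Real promoter sequences have a biased base composition, which is one possible reason why the observed value (15.7) deviates slightly from this idealized expectation.</p>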
            <suppl id="S2">
               <title>
                  <p>Additional file 2</p>
               </title>
               <text>
                  <p>Characteristics of random distribution (FigS1.pdf). Contains graphs showing the relationship between an LDSS parameter and the size of the population (Total Area).</p>
               </text>
               <file name="1471-2164-8-67-S2.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <suppl id="S3">
               <title>
                  <p>Additional file 3</p>
               </title>
               <text>
                  <p>Arabidopsis core octamers (Table S2.pdf). Contains octamer sequences and parameters.</p>
               </text>
               <file name="1471-2164-8-67-S3.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <suppl id="S4">
               <title>
                  <p>Additional file 4</p>
               </title>
               <text>
                  <p>Rice core octamers (Table S3.pdf). Contains octamer sequences and parameters.</p>
               </text>
               <file name="1471-2164-8-67-S4.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <suppl id="S5">
               <title>
                  <p>Additional file 5</p>
               </title>
               <text>
                  <p>Arabidopsis REG octamers (Table S4.pdf). Contains octamer sequences and parameters.</p>
               </text>
               <file name="1471-2164-8-67-S5.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <suppl id="S6">
               <title>
                  <p>Additional file 6</p>
               </title>
               <text>
                  <p>Rice REG octamers (Table S5.pdf). Contains octamer sequences and parameters.</p>
               </text>
               <file name="1471-2164-8-67-S6.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <p>For comparison of <it>Arabidopsis </it>and rice elements, the Relative Peak Height (RPH) values of all the octamers that are positive in either of the two promoter databases were plotted (Fig. <figr fid="F4">4A</figr>). If a sequence has the same RPH value in the <it>Arabidopsis </it>and rice databases, its dot appears on the diagonal line. As shown in the figure, RPH values are moderately conserved between <it>Arabidopsis </it>and rice (Fig. <figr fid="F4">4A</figr>). The figure also indicates that a considerable number of the sequences differ greatly in this parameter between <it>Arabidopsis </it>and rice. Figure <figr fid="F4">4B</figr> shows a Venn diagram of the number of positive octamers in <it>Arabidopsis </it>and rice. As shown in the figure, approximately 30 to 50% of the octamers are conserved between <it>Arabidopsis </it>and rice, both for the core groups (Y Patch and TATA box) and for the REG group. The presence of all three categories in <it>Arabidopsis </it>and rice, together with the sequence conservation shown in the figure, indicates that the promoter architecture of these plant species is essentially conserved. On the other hand, divergence of the positive sequences might reflect differentiation of the corresponding <it>trans</it>-factors between these species.</p>
            <fig id="F4">
               <title>
                  <p>Figure 4</p>
               </title>
               <caption>
                  <p>Comparison of <it>Arabidopsis </it>and rice octamers</p>
               </caption>
               <text>
                  <p><b>Comparison of <it>Arabidopsis </it>and rice octamers</b>. (A) 987 octamers that are LDSS-positive in either <it>Arabidopsis </it>or rice promoters were selected, and their Relative Peak Height (RPH) values were compared and expressed as a scatter plot. Each dot represents an individual octamer sequence. (B) LDSS-positive octamer sequences of <it>Arabidopsis </it>and rice were compared, and common sequences found in both sets were identified. The figure shows the number of octamer sequences. Classification into the Y and TATA groups was done based on distribution profiles as shown in Figure 5. The REG group has a peak position between -51 and -200.</p>
               </text>
               <graphic file="1471-2164-8-67-4"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>Classification of <it>Arabidopsis </it>LDSS-positive octamers by distribution profiles</p>
            </st>
            <p>All the LDSS-positive sequences from <it>Arabidopsis </it>were subjected to clustering analysis according to their distribution profiles. As expected from the preceding hexamer analyses, the major clusters are REGs, TATA box, and Y Patch (Fig. <figr fid="F5">5</figr>). As shown in the figure, distribution profiles within each group (cluster) are quite similar, suggesting functional conservation within each group. The clear classification of the LDSS-positive sequences shown in Figure <figr fid="F5">5</figr> suggests that local distribution is a useful feature for extracting putative functional elements in the promoter.</p>
            <fig id="F5">
               <title>
                  <p>Figure 5</p>
               </title>
               <caption>
                  <p>Clustering of LDSS-positive sequences based on distribution profiles</p>
               </caption>
               <text>
                  <p><b>Clustering of LDSS-positive sequences based on distribution profiles</b>. Distribution profiles of each LDSS-positive octamer of <it>Arabidopsis </it>were subjected to hierarchical clustering. Three major clusters are shown.</p>
               </text>
               <graphic file="1471-2164-8-67-5"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>Clustering of <it>Arabidopsis </it>REGs based on presence and absence in promoters</p>
            </st>
            <p>Subsequently, we classified the 308 <it>Arabidopsis </it>REGs with the aid of the promoter database. For each promoter, the number of occurrences of each REG was scored, and two-dimensional REG-promoter clustering was performed. This REG-promoter association revealed that 10,334 of the 12,951 <it>Arabidopsis </it>promoters have at least one REG in the region from -400 to -40 bp. This high coverage (80%) reflects the long list of REG sequences.</p>
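            <p>The scoring step and the coverage figure can be sketched in Python as follows. This is an illustration under stated assumptions, not the pipeline used in this study: the promoter strings are hypothetical toy sequences, each string is assumed to end at position -1 relative to the transcription start site, only the forward strand is scanned, and an occurrence is counted only if the whole octamer lies inside the window.</p>

```python
def count_regs(promoter, regs, start=-400, end=-40):
    """Count occurrences of each REG between positions start and end.

    Assumes the last base of `promoter` is at position -1 relative to the TSS.
    """
    n = len(promoter)
    window = promoter[max(0, n + start): max(0, n + end + 1)]
    counts = {}
    for reg in regs:
        k = len(reg)
        counts[reg] = sum(
            1 for i in range(len(window) - k + 1) if window[i:i + k] == reg
        )
    return counts


regs = ["CACGTGGA", "AGGCCCAA"]  # two octamers taken from Table 4
toy_promoters = {                # hypothetical sequences, 500 bases each
    "toy1": "T" * 300 + "CACGTGGA" + "T" * 192,  # octamer at -200 to -193
    "toy2": "T" * 480 + "CACGTGGA" + "T" * 12,   # octamer at -20 to -13
    "toy3": "T" * 500,                           # no REG at all
}

matrix = {name: count_regs(seq, regs) for name, seq in toy_promoters.items()}
covered = [name for name, row in matrix.items() if any(row.values())]
print(matrix)
print(covered)

# Coverage reported in the text: 10,334 of 12,951 promoters.
print(round(100 * 10334 / 12951))
```

            <p>Only the first toy promoter is counted as covered, because the octamer in the second one lies downstream of -40.</p>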
            <p>This 2D clustering places co-localized REGs next to each other, and promoters with similar REG compositions also come close together. Two promoter clusters are shown in Figure <figr fid="F6">6A</figr> and <figr fid="F6">6B</figr>. One cluster of promoters (A) is rich in GCCCA-containing REGs, and the other (B) is rich in ACGT-containing REGs. GCCCA-containing REGs are of the same kind as TGGGCC (Figure <figr fid="F1">1</figr>) and are known to be associated with cell cycle-dependent and meristematic expression (Group 1, Table <tblr tid="T3">3</tblr>). Interestingly, this promoter group is rich in ribosomal protein genes: as many as 38% (6 out of 16) of the annotated promoters in Figure <figr fid="F6">6A</figr> belong to ribosomal proteins (Fig. <figr fid="F6">6A</figr>, blue). In contrast, no ribosomal protein promoters are included in the ACGT-containing promoter cluster (Fig. <figr fid="F6">6B</figr>). Instead, the latter cluster is rich in photosynthesis-related genes and stress-responsive genes, both of which would be expected to respond to the environment. In fact, as many as 34 of the 38 genes in this cluster that have expression data are responsive to light (Fig. <figr fid="F6">6B</figr>, green) or to abiotic stress including salt, drought, and cold (Fig. <figr fid="F6">6B</figr>, red and orange), according to public microarray data <abbrgrp><abbr bid="B37">37</abbr><abbr bid="B38">38</abbr></abbrgrp>. Although this clustering is not accurate enough to distinguish between light and stress responses, it does group genes according to their expression with a certain degree of accuracy. The results are reasonable because the <it>cis</it>-elements for light response (G-box: CACGTG, <abbrgrp><abbr bid="B36">36</abbr></abbrgrp>) and stress response (ABRE: ACGTGTC, <abbrgrp><abbr bid="B39">39</abbr></abbrgrp>) are related sequences, both of which belong to the ACGT motif for environmental responses (Group 2, Table <tblr tid="T3">3</tblr>). The clustering of promoters therefore appears sound, although its accuracy may not be sufficient for precise prediction of gene function.</p>
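            <p>The idea of two-dimensional clustering, in which the rows (promoters) and the columns (REGs) of the count matrix are each reordered so that similar vectors become neighbours, can be sketched in plain Python. The distance measure and linkage rule are not stated in this section, so Manhattan distance and average linkage are assumptions made for illustration, and the matrix is hypothetical toy data.</p>

```python
def manhattan(a, b):
    """Sum of absolute differences between two count vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def leaf_order(vectors):
    """Average-linkage agglomerative clustering; returns indices in merge order."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
                d = sum(manhattan(vectors[i], vectors[j]) for i, j in pairs)
                d = d / len(pairs)
                if best is None or best[0] > d:
                    best = (d, a, b)
        _, a, b = best
        merged = clusters[a] + clusters[b]
        rest = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters = rest + [merged]
    return clusters[0]


# Toy matrix (hypothetical): rows are promoters, columns are REGs, cells are counts.
matrix = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
]
row_order = leaf_order(matrix)                           # promoters
col_order = leaf_order([list(c) for c in zip(*matrix)])  # REGs

# Print the reordered matrix: similar promoters and co-occurring REGs are adjacent.
for r in row_order:
    print([matrix[r][c] for c in col_order])
```

            <p>In the reordered toy matrix, promoters 0 and 2 become adjacent rows and REGs 0 and 1 adjacent columns, which is the kind of block structure visible in Figure 6.</p>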
            <fig id="F6">
               <title>
                  <p>Figure 6</p>
               </title>
               <caption>
                  <p>REG-promoter clustering</p>
               </caption>
               <text>
                  <p><b>REG-promoter clustering</b>. For each <it>Arabidopsis </it>promoter, the number of occurrences of each octamer REG within the region from -400 to -40 bp was scored and subjected to 2D hierarchical clustering. The vertical axis shows promoters and the horizontal axis shows REGs. Each cell of the matrix indicates the number of REG sequences. Two small promoter clusters are shown together with all of the REGs. (A) Part of a promoter cluster rich in the GCCCA motif for meristematic expression. Ribosomal proteins are shown in blue. (B) Part of a promoter cluster rich in the ACGT motif for environmental response. Promoter names are colored according to expression data from AtGenExpress. Red: abiotic stress-positive, orange: abiotic stress-negative, green: light-positive, black: no response to abiotic stress or light, grey: no expression data found. (C) An example of clustered REGs. Part of the ACGT cluster shown at the top of Panel A is enlarged. ACGT in the octamers is highlighted in orange.</p>
               </text>
               <graphic file="1471-2164-8-67-6"/>
            </fig>
            <tbl id="T3">
               <title>
                  <p>Table 3</p>
               </title>
               <caption>
                  <p>Classification of octamer REGs</p>
               </caption>
               <tblbdy cols="10">
                  <r>
                     <c ca="center">
                        <p>Group</p>
                     </c>
                     <c ca="center">
                        <p>Motif<sup>1</sup></p>
                     </c>
                     <c ca="center">
                        <p>Motif name</p>
                     </c>
                     <c ca="center">
                        <p>Comment</p>
                     </c>
                     <c ca="center">
                        <p>Trans factor</p>
                     </c>
                     <c ca="center">
                        <p>Expression</p>
                     </c>
                     <c ca="center">
                        <p>Reference</p>
                     </c>
                     <c ca="center">
                        <p>At<sup>1</sup></p>
                     </c>
                     <c ca="center">
                        <p>Rice<sup>1</sup></p>
                     </c>
                     <c ca="center">
                        <p>At &amp; Rice<sup>2</sup></p>
                     </c>
                  </r>
                  <r>
                     <c cspan="10">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>GCCCA</p>
                     </c>
                     <c ca="left">
                        <p>Element II of <it>Arabidopsis </it>PCNA-2, Site IIa of rice PCNA</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>PCF1, PCF2, TCP20</p>
                     </c>
                     <c ca="left">
                        <p>cell cycle/meristematic expression</p>
                     </c>
                     <c ca="left">
                        <p>[35, 60]</p>
                     </c>
                     <c ca="center">
                        <p>36</p>
                     </c>
                     <c ca="center">
                        <p>68</p>
                     </c>
                     <c ca="center">
                        <p>71</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="left">
                        <p>ACGT</p>
                     </c>
                     <c ca="left">
                        <p>"ACGT Core", G-box, ABRE,</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>bZIP family (GBF, TGA1, etc.), PIF3</p>
                     </c>
                     <c ca="left">
                        <p>environmental response (light, UV, drought, ABA)</p>
                     </c>
                     <c ca="left">
                        <p>[36, 61]</p>
                     </c>
                     <c ca="center">
                        <p>33</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="left">
                        <p>ACGCGC</p>
                     </c>
                     <c ca="left">
                        <p>CGCG box</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>AtSR1(CaMBP)</p>
                     </c>
                     <c ca="left">
                        <p>stress response?</p>
                     </c>
                     <c ca="left">
                        <p>[62]</p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="left">
                        <p>CCGAC</p>
                     </c>
                     <c ca="left">
                        <p>DRE</p>
                     </c>
                     <c ca="left">
                        <p>DRE core</p>
                     </c>
                     <c ca="center">
                        <p>DREB/CBF</p>
                     </c>
                     <c ca="left">
                        <p>stress response</p>
                     </c>
                     <c ca="left">
                        <p>[39]</p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="left">
                        <p>AACCG(G/A)</p>
                     </c>
                     <c ca="left">
                        <p>novel</p>
                     </c>
                     <c ca="left">
                        <p>overlapping with GT1 box (TTAACC)</p>
                     </c>
                     <c ca="center">
                        <p>?</p>
                     </c>
                     <c ca="left">
                        <p>not known</p>
                     </c>
                     <c ca="left">
                        <p>this study</p>
                     </c>
                     <c ca="center">
                        <p>36</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="left">
                        <p>AAACG(C/G)</p>
                     </c>
                     <c ca="left">
                        <p>novel</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>?</p>
                     </c>
                     <c ca="left">
                        <p>not known</p>
                     </c>
                     <c ca="left">
                        <p>this study</p>
                     </c>
                     <c ca="center">
                        <p>13</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="left">
                        <p>ACCCCT</p>
                     </c>
                     <c ca="left">
                        <p>novel</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>?</p>
                     </c>
                     <c ca="left">
                        <p>not known</p>
                     </c>
                     <c ca="left">
                        <p>this study</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>8</p>
                     </c>
                     <c ca="left">
                        <p>ACCCT</p>
                     </c>
                     <c ca="left">
                        <p>novel</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>?</p>
                     </c>
                     <c ca="left">
                        <p>not known</p>
                     </c>
                     <c ca="left">
                        <p>this study</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>9</p>
                     </c>
                     <c ca="left">
                        <p>ACGGGC</p>
                     </c>
                     <c ca="left">
                        <p>novel</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>?</p>
                     </c>
                     <c ca="left">
                        <p>not known</p>
                     </c>
                     <c ca="left">
                        <p>this study</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>10</p>
                     </c>
                     <c ca="left">
                        <p>CCATGG</p>
                     </c>
                     <c ca="left">
                        <p>novel</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>?</p>
                     </c>
                     <c ca="left">
                        <p>not known</p>
                     </c>
                     <c ca="left">
                        <p>this study</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>11</p>
                     </c>
                     <c ca="left">
                        <p>CCAACGG</p>
                     </c>
                     <c ca="left">
                        <p>novel</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>?</p>
                     </c>
                     <c ca="left">
                        <p>not known</p>
                     </c>
                     <c ca="left">
                        <p>this study</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>12</p>
                     </c>
                     <c ca="left">
                        <p>GGGACCC</p>
                     </c>
                     <c ca="left">
                        <p>novel</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>?</p>
                     </c>
                     <c ca="left">
                        <p>not known</p>
                     </c>
                     <c ca="left">
                        <p>this study</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Rest</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>74</p>
                     </c>
                     <c ca="center">
                        <p>66</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Total</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>308</p>
                     </c>
                     <c ca="center">
                        <p>242</p>
                     </c>
                     <c ca="center">
                        <p>90</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p><sup>1</sup>Number of octamer sequences. This classification is not completely mutually exclusive.</p>
               </tblfn>
            </tbl>
            <p>Clustering of REGs also turned out to be reliable, and is thus useful for REG classification. In this 2D method, overlapping REGs (e.g., C<ul>ACGTGGA</ul> and <ul>ACGTGGA</ul>T, Fig. <figr fid="F6">6C</figr>) are inherently biased toward co-occurrence, because a single stretch of sequence can contain both. However, similar but mutually exclusive sequences (e.g., <ul>ACGTGGA</ul>T and <ul>ACGTGGA</ul>A, Fig. <figr fid="F6">6C</figr>) are also clustered into the same group, suggesting that REGs with the same role are clustered together. This can be explained by the presence, within a single promoter, of multiple copies of the same kind of <it>cis</it>-element that appear as different octamers. Figure <figr fid="F7">7</figr> shows the whole tree of <it>Arabidopsis </it>REGs. This figure demonstrates that REGs with related sequences are clustered together with high reliability. Based on these results, 12 motifs were extracted from the <it>Arabidopsis </it>REGs (Fig. <figr fid="F7">7</figr>) and are summarized in Table <tblr tid="T3">3</tblr>.</p>
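            <p>The distinction between overlapping and mutually exclusive octamers can be made concrete with a short Python sketch that uses the three example sequences from the text. The function names are illustrative, only the forward strand is considered, and the one-base shift is the only kind of overlap that is checked.</p>

```python
def overlaps_by_shift(a, b):
    """True if b starts one base downstream of a, sharing seven bases."""
    return a[1:] == b[:-1]


def alternatives_at_same_site(a, b):
    """True if a and b have equal length and differ at exactly one base."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1


# One nine-base site, CACGTGGAT, contains both overlapping octamers.
print(overlaps_by_shift("CACGTGGA", "ACGTGGAT"))

# These two cannot start at the same position of the same strand,
# so their clustering together is not a trivial effect of overlap.
print(alternatives_at_same_site("ACGTGGAT", "ACGTGGAA"))
print(overlaps_by_shift("ACGTGGAT", "ACGTGGAA"))
```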
            <fig id="F7">
               <title>
                  <p>Figure 7</p>
               </title>
               <caption>
                  <p>Clustering of REGs</p>
               </caption>
               <text>
                  <p><b>Clustering of REGs</b>. Aided by REG-promoter clustering, <it>Arabidopsis </it>REGs were subjected to classification. Colored dots in the figure indicate the presence of the corresponding motif in the REG sequence. The tree is the same as the one in Figure 6A.</p>
               </text>
               <graphic file="1471-2164-8-67-7"/>
            </fig>
            <p>One group has a GGCCCA core sequence that is known as Site IIa or Element II (Group 1, Table <tblr tid="T3">3</tblr>). Element II is necessary for cell cycle-related expression and for meristematic expression <abbrgrp><abbr bid="B35">35</abbr></abbrgrp>. Many sequences containing GGCCCA in the center of an octamer were found in the REG groups of both <it>Arabidopsis </it>and rice (Table <tblr tid="T4">4</tblr>). As seen in the table, this group is a good example of conservation between the two species.</p>
            <tbl id="T4">
               <title>
                  <p>Table 4</p>
               </title>
               <caption>
                  <p>Several REG groups were identified from <it>Arabidopsis </it>and rice octamer analysis</p>
               </caption>
               <tblbdy cols="2">
                  <r>
                     <c ca="center">
                        <p>
                           <it>Arabidopsis</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>Rice</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="2">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>*GGCCCA*</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>AGGCCCAA#</p>
                     </c>
                     <c ca="left">
                        <p>AGGCCCAA#</p>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>AGGCCCAC#</p>
                     </c>
                     <c ca="left">
                        <p>AGGCCCAC#</p>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>AGGCCCAG#</p>
                     </c>
                     <c ca="left">
                        <p>AGGCCCAG#</p>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>AGGCCCAT#</p>
                     </c>
                     <c ca="left">
                        <p>AGGCCCAT#</p>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>CGGCCCAA#</p>
                     </c>
                     <c ca="left">
                        <p>CGGCCCAA#</p>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>CGGCCCAT#</p>
                     </c>
                     <c ca="left">
                        <p>CGGCCCAC</p>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>GGGCCCAA#</p>
                     </c>
                     <c ca="left">
                        <p>CGGCCCAG</p>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>GGGCCCAG#</p>
                     </c>
                     <c ca="left">
                        <p>CGGCCCAT#</p>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>GGGCCCAT#</p>
                     </c>
                     <c ca="left">
                        <p>GGGCCCAA#</p>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>TGGCCCAA</p>
                     </c>
                     <c ca="left">
                        <p>GGGCCCAC</p>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>TGGCCCAG#</p>
                     </c>
                     <c ca="left">
                        <p>GGGCCCAG#</p>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>TGGCCCAT#</p>
                     </c>
                     <c ca="left">
                        <p>GGGCCCAT#</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>TGGCCCAC</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>TGGCCCAG#</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>TGGCCCAT#</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>**ACGT**, *ACGT***</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>ACACGTCA</p>
                     </c>
                     <c ca="left">
                        <p>ACACGTGG#</p>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>ACACGTGA</p>
                     </c>
                     <c ca="left">
                        <p>CACGTCAC#</p>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>ACACGTGG#</p>
                     </c>
                     <c ca="left">
                        <p>CACGTCTC#</p>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>CACGTCAC#</p>
                     </c>
                     <c ca="left">
                        <p>CACGTGGC#</p>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>CACGTCAG</p>
                     </c>
                     <c ca="left">
                        <p>CACGTGGG#</p>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>CACGTCAT</p>
                     </c>
                     <c ca="left">
                        <p>CACGTGTC#</p>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>CACGTCTC#</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>CACGTGAC</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>CACGTGCG</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>CACGTGGA</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>CACGTGGC#</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>CACGTGGG#</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>CACGTGGT</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>CACGTGTA</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>CACGTGTC#</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>CACGTGTG</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>CACGTGTT</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>CCACGTAG</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>CCACGTCA</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>CCACGTCG</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>GACGTCGT</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>REGs found in both <it>Arabidopsis </it>and rice are indicated with a sharp (hash) symbol. An asterisk indicates any base and is used to restrict the position of the motif in the octamer sequence.</p>
               </tblfn>
            </tbl>
            <p>Another group shown in the table has the bZIP protein-binding motif containing the ACGT core sequence. This group mediates various environmental signals <abbrgrp><abbr bid="B36">36</abbr></abbrgrp>. Both species have this group in common, but <it>Arabidopsis </it>shows wider variation than rice (Table <tblr tid="T4">4</tblr>).</p>
            <p>Classification of <it>Arabidopsis </it>and rice REGs is shown in Table <tblr tid="T3">3</tblr>. The largest group is Group 1, which includes Element II of the <it>Arabidopsis </it>PCNA-2 involved in cell cycle-related expression, as mentioned above. As shown in the table, this group is well conserved between <it>Arabidopsis </it>and rice and has many members in both species. There are several other REG groups, some of which are abundant only in <it>Arabidopsis </it>and others of which are found in both species (several examples are given in Table <tblr tid="T4">4</tblr> and summarized in Table <tblr tid="T3">3</tblr>). Comparison between <it>Arabidopsis </it>and rice suggests that both conserved and differentiated types of REGs exist.</p>
            <p>The identified <it>Arabidopsis </it>REG sequences were compared against the PLACE database, a collection of reported plant <it>cis</it>-regulatory elements <abbrgrp><abbr bid="B40">40</abbr></abbrgrp>. The comparison revealed that 155 out of 308 <it>Arabidopsis </it>REGs show a 100% match with at least one of the <it>Arabidopsis </it>PLACE entries, giving an estimate that 50% of the REGs correspond to established <it>cis</it>-regulatory elements (Table S6 [see Additional file <supplr sid="S7">7</supplr>]). These results again provide strong evidence that the LDSS method extracts biologically meaningful sequences. Conversely, 21 out of 48 <it>Arabidopsis </it>PLACE entries were found in the REG list (Table <tblr tid="T5">5</tblr>). Comparison with another <it>cis</it>-element database, AGRIS <abbrgrp><abbr bid="B41">41</abbr></abbrgrp>, gave a lower match rate (27%) than PLACE among <it>Arabidopsis </it>motif entries shorter than 9 bp (data not shown). These results suggest that not all <it>cis</it>-regulatory elements are detected by the LDSS strategy. One valuable finding of this analysis is the identification of a large number of novel REGs.</p>
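            <p>The containment check behind such a comparison can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes that a REG octamer counts as matching a PLACE entry when the entry's motif, with IUPAC ambiguity codes expanded, occurs in full on either strand of the octamer. The function names and the three-entry motif dictionary (sequences taken from Table 5) are hypothetical choices made for the example.</p>

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "K": "[GT]", "M": "[AC]",
    "S": "[CG]", "W": "[AT]", "N": "[ACGT]",
}


def motif_to_regex(motif):
    """Compile a motif written with IUPAC codes into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in motif))


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def matching_motifs(octamer, motifs):
    """Return names of motifs fully contained in the octamer on either strand.

    Assumption: searching both strands reflects orientation-insensitive
    matching; the criterion used in the study may differ.
    """
    strands = (octamer, reverse_complement(octamer))
    hits = []
    for name, motif in motifs.items():
        pattern = motif_to_regex(motif)
        if any(pattern.search(strand) for strand in strands):
            hits.append(name)
    return hits


# Hypothetical mini-database built from three Table 5 entries.
motifs = {
    "CACGTGMOTIF": "CACGTG",
    "ACGTABREMOTIFA2OSEM": "ACGTGKC",
    "DRECRTCOREAT": "RCCGAC",
}

print(matching_motifs("CACGTGTC", motifs))
```

            <p>For the octamer CACGTGTC, the sketch reports both CACGTGMOTIF and ACGTABREMOTIFA2OSEM, because CACGTG and ACGTGTC (K = T) each occur within it, whereas RCCGAC occurs on neither strand.</p>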
            <suppl id="S7">
               <title>
                  <p>Additional file 7</p>
               </title>
               <text>
                  <p>Relationship between Arabidopsis REG and PLACE entry (Table S6.xls). A table showing which octamer REG corresponds to which PLACE entry, and vice versa.</p>
               </text>
               <file name="1471-2164-8-67-S7.xls">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <tbl id="T5">
               <title>
                  <p>Table 5</p>
               </title>
               <caption>
                  <p>PLACE <it>cis</it>-elements found and not found in <it>Arabidopsis </it>REGs</p>
               </caption>
               <tblbdy cols="3">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <b>PLACE description</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>sequence</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="3">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>found</b>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p><b>ACGTATERD1 </b>ACGT sequence required for etiolation-induced expression of erd1 (early responsive to dehydration) in Arabidopsis;</p>
                     </c>
                     <c ca="center">
                        <p>ACGT</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="left">
                        <p><b>ABRELATERD1 </b>ABRE-like sequence (from -199 to -195) required for etiolation-induced expression of erd1 (early responsive to dehydration) in Arabidopsis;</p>
                     </c>
                     <c ca="center">
                        <p>ACGTG</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="left">
                        <p><b>LTRECOREATCOR15 </b>Core of low temperature responsive element (LTRE) of cor15a gene in Arabidopsis;</p>
                     </c>
                     <c ca="center">
                        <p>CCGAC</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="left">
                        <p><b>SORLIP1AT </b>one of the "Sequences Over-Represented in Light-Induced Promoters" (SORLIPs) in Arabidopsis; Computationally identified phyA-induced motifs;</p>
                     </c>
                     <c ca="center">
                        <p>GCCAC</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="left">
                        <p><b>SORLIP2AT </b>one of the "Sequences Over-Represented in Light-Induced Promoters" (SORLIPs) in Arabidopsis; Computationally identified phyA-induced motifs;</p>
                     </c>
                     <c ca="center">
                        <p>GGGCC</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="left">
                        <p><b>WBOXATNPR1 </b>"W-box" found in promoter of Arabidopsis NPR1 gene; They were recognized specifically by salicylic acid (SA)-induced WRKY DNA binding proteins;</p>
                     </c>
                     <c ca="center">
                        <p>TTGAC</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="left">
                        <p><b>CACGTGMOTIF </b>"CACGTG motif"; "G-box"; Binding site of Arabidopsis GBF4;</p>
                     </c>
                     <c ca="center">
                        <p>CACGTG</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>8</p>
                     </c>
                     <c ca="left">
                        <p><b>MYB2CONSENSUSAT </b>MYB recognition site found in the promoters of the dehydration-responsive gene rd22 and many other genes in Arabidopsis; Y = C/T; K = G/T;</p>
                     </c>
                     <c ca="center">
                        <p>YAACKG</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>9</p>
                     </c>
                     <c ca="left">
                        <p><b>MYBCORE </b>Binding site for all animal MYB and at least two plant MYB proteins ATMYB1 and ATMYB2, both isolated from Arabidopsis; ATMYB2 is involved in regulation of genes that are responsive to water stress in Arabidopsis;</p>
                     </c>
                     <c ca="center">
                        <p>CNGTTR</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>10</p>
                     </c>
                     <c ca="left">
                        <p><b>SITEIIATCYTC </b>"Site II element" found in the promoter regions of cytochrome genes (Cytc-1, Cytc-2) in Arabidopsis; Y = C/T;</p>
                     </c>
                     <c ca="center">
                        <p>TGGGCY</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>11</p>
                     </c>
                     <c ca="left">
                        <p><b>ACGTABREMOTIFA2OSEM </b>Experimentally determined sequence requirement of ACGT-core of motif A in ABRE of the rice gene, OSEM; DRE and ABRE are interdependent in the ABA-responsive expression of the rd29A in Arabidopsis; K = G/T;</p>
                     </c>
                     <c ca="center">
                        <p>ACGTGKC</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>12</p>
                     </c>
                     <c ca="left">
                        <p><b>DPBFCOREDCDC3 </b>A novel class of bZIP transcription factors, DPBF-1 and 2 (Dc3 promoter-binding factor-1 and 2) binding core sequence; Found in the carrot Dc3 gene promoter; Dc3 expression is normally embryo-specific, and also can be induced by ABA; The Arabidopsis abscisic acid response gene ABI5 encodes a bZIP transcription factor; abi5 mutants have pleiotropic defects in ABA response; ABI5 regulates a subset of late embryogenesis-abundant genes; GIA1 (growth-insensitivity to ABA) is identical to ABI5;</p>
                     </c>
                     <c ca="center">
                        <p>ACACNNG</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>13</p>
                     </c>
                     <c ca="left">
                        <p><b>GADOWNAT </b>Sequence present in 24 genes in the GA-down regulated d1 cluster found in Arabidopsis seed germination;</p>
                     </c>
                     <c ca="center">
                        <p>ACGTGTC</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>14</p>
                     </c>
                     <c ca="left">
                        <p><b>WUSATAg </b>Target sequence of WUS in the intron of AGAMOUS gene in Arabidopsis;</p>
                     </c>
                     <c ca="center">
                        <p>TTAATGG</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>15</p>
                     </c>
                     <c ca="left">
                        <p><b>CDA1ATCAB2 </b>CDA-1 (CAB2 DET1-associated factor 1) binding site in DtRE (dark response element) f of chlorophyll a/b-binding protein2 (CAB2) gene in Arabidopsis;</p>
                     </c>
                     <c ca="center">
                        <p>CAAAACGC</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>16</p>
                     </c>
                     <c ca="left">
                        <p><b>EMBP1TAEM </b>Binding site of trans-acting factor EMBP-1; wheat Em gene; Binding site of ABFs; ABFs (ABRE binding factors) were isolated from Arabidopsis by a yeast one-hybrid screening system; Involved in ABA-mediated stress-signaling pathway;</p>
                     </c>
                     <c ca="center">
                        <p>CACGTGGC</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>17</p>
                     </c>
                     <c ca="left">
                        <p><b>HEXAT </b>"Hex motif"; Binding site of Arabidopsis bZIP protein TGA1 and G box binding factor GBF1; G-Box-like element;</p>
                     </c>
                     <c ca="center">
                        <p>TGACGTGG</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>18</p>
                     </c>
                     <c ca="left">
                        <p><b>UPRMOTIFIAT </b>"Motif I" in the conserved UPR (unfolded protein response) cis-acting element in Arabidopsis genes coding for SAR1B, HSP-90, SBR-like, Ca-ATPase 4, CNX1, PDI, etc.;</p>
                     </c>
                     <c ca="center">
                        <p>CCACGTCA</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>19</p>
                     </c>
                     <c ca="left">
                        <p><b>RAV1AAT </b>Binding consensus sequence of Arabidopsis transcription factor, RAV1; The expression level of RAV1 was relatively high in rosette leaves and roots;</p>
                     </c>
                     <c ca="center">
                        <p>CAACA</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>20</p>
                     </c>
                     <c ca="left">
                        <p><b>DRECRTCOREAT </b>Core motif of DRE/CRT (dehydration-responsive element/C-repeat) cis-acting element found in many genes in Arabidopsis and in rice; R = G/A;</p>
                     </c>
                     <c ca="center">
                        <p>RCCGAC</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>21</p>
                     </c>
                     <c ca="left">
                        <p><b>ELRECOREPCRP1 </b>ElRE (Elicitor Responsive Element) core of parsley (P.c.) PR1 genes; consensus sequence of elements W1 and W2 of parsley PR1-1 and PR1-2 promoters; Box W1 and W2 are the binding site of WRKY1 and WRKY2, respectively; W-box found in thioredoxin h5 gene in Arabidopsis (Laloi et al.);</p>
                     </c>
                     <c ca="center">
                        <p>TTGACC</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>not found</b>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>22</p>
                     </c>
                     <c ca="left">
                        <p><b>ARR1AT </b>"ARR1-binding element" found in Arabidopsis; ARR1 is a response regulator; N = G/A/C/T;</p>
                     </c>
                     <c ca="center">
                        <p>NGATT</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>23</p>
                     </c>
                     <c ca="left">
                        <p><b>ARFAT </b>ARF (auxin response factor) binding site found in the promoters of primary/early auxin response genes of Arabidopsis; AuxRE; Binding site of Arabidopsis ARF1 (Auxin response factor1);</p>
                     </c>
                     <c ca="center">
                        <p>TGTCTC</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>24</p>
                     </c>
                     <c ca="left">
                        <p><b>HEXAMERATH4 </b>hexamer motif of Arabidopsis histone H4 promoter;</p>
                     </c>
                     <c ca="center">
                        <p>CCGTCG</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>25</p>
                     </c>
                     <c ca="left">
                        <p><b>IBOX </b>"I box"; "I-box"; Conserved sequence upstream of light-regulated genes; Sequence found in the promoter region of rbcS of tomato and Arabidopsis;</p>
                     </c>
                     <c ca="center">
                        <p>GATAAG</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>26</p>
                     </c>
                     <c ca="left">
                        <p><b>MYB1AT </b>MYB recognition site found in the promoters of the dehydration-responsive gene rd22 and many other genes in Arabidopsis; W = A/T;</p>
                     </c>
                     <c ca="center">
                        <p>WAACCA</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>27</p>
                     </c>
                     <c ca="left">
                        <p><b>MYB2AT </b>Binding site for ATMYB2, an Arabidopsis MYB homolog; ATMYB2 is involved in regulation of genes that are responsive to water stress in Arabidopsis;</p>
                     </c>
                     <c ca="center">
                        <p>TAACTG</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>28</p>
                     </c>
                     <c ca="left">
                        <p><b>MYCATERD1 </b>MYC recognition sequence necessary for expression of erd1 (early responsive to dehydration) in dehydrated Arabidopsis; NAC protein bound specifically to the CATGTG motif (Tran et al., 2004);</p>
                     </c>
                     <c ca="center">
                        <p>CATGTG</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>29</p>
                     </c>
                     <c ca="left">
                        <p><b>MYCATRD22 </b>Binding site for MYC (rd22BP1) in Arabidopsis dehydration-responsive gene, rd22; MYC binding site in rd22 gene of Arabidopsis; ABA-induction;</p>
                     </c>
                     <c ca="center">
                        <p>CACATG</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>30</p>
                     </c>
                     <c ca="left">
                        <p><b>PREATPRODH </b>"PRE" (Pro- or hypoosmolarity-responsive element) found in the promoter region of proline dehydrogenase (ProDH) gene in Arabidopsis;</p>
                     </c>
                     <c ca="center">
                        <p>ACTCAT</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>31</p>
                     </c>
                     <c ca="left">
                        <p><b>RAV1BAT </b>Binding consensus sequence of an Arabidopsis transcription factor, RAV1; The expression level of RAV1 was relatively high in rosette leaves and roots;</p>
                     </c>
                     <c ca="center">
                        <p>CACCTG</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>32</p>
                     </c>
                     <c ca="left">
                        <p><b>SREATMSD </b>"sugar-repressive element (SRE)" found in 272 of the 1592 down-regulated genes after main stem decapitation in Arabidopsis;</p>
                     </c>
                     <c ca="center">
                        <p>TTATCC</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>33</p>
                     </c>
                     <c ca="left">
                        <p><b>TBOXATGAPB </b>"Tbox" found in the Arabidopsis GAPB gene promoter; Mutations in the "Tbox" resulted in reductions of light-activated gene transcription;</p>
                     </c>
                     <c ca="center">
                        <p>ACTTTG</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>34</p>
                     </c>
                     <c ca="left">
                        <p><b>AGCBOXNPGLB </b>"AGC box" repeated twice in a 61 bp enhancer element in tobacco (N.p.) class I beta-1,3-glucanase (GLB) gene; "GCC-box"; Binding sequence of Arabidopsis AtERFs;</p>
                     </c>
                     <c ca="center">
                        <p>AGCCGCC</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>35</p>
                     </c>
                     <c ca="left">
                        <p><b>GAREAT </b>GARE (GA-responsive element); Occurrence of GARE in GA-inducible, GA-responsible, and GA-nonresponsive genes found in Arabidopsis seed germination was 20, 18, and 12%, respectively;</p>
                     </c>
                     <c ca="center">
                        <p>TAACAAR</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>36</p>
                     </c>
                     <c ca="left">
                        <p><b>LEAFYATAG </b>Target sequence of LEAFY in the intron of AGAMOUS gene in Arabidopsis;</p>
                     </c>
                     <c ca="center">
                        <p>CCAATGT</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>37</p>
                     </c>
                     <c ca="left">
                        <p><b>LTREATLTI78 </b>Putative low temperature responsive element (LTRE); Found in Arabidopsis low-temperature-induced (lti) genes, lti78/cor78/rd29A and lti65;</p>
                     </c>
                     <c ca="center">
                        <p>ACCGACA</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>38</p>
                     </c>
                     <c ca="left">
                        <p><b>MYBATRD22 </b>Binding site for MYB (ATMYB2) in dehydration-responsive gene, rd22; MYB binding site in rd22 gene of Arabidopsis thaliana; ABA-induction;</p>
                     </c>
                     <c ca="center">
                        <p>CTAACCA</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>39</p>
                     </c>
                     <c ca="left">
                        <p><b>SORLIP5AT </b>one of the "Sequences Over-Represented in Light-Induced Promoters" (SORLIPs) in Arabidopsis; Computationally identified phyA-induced motifs;</p>
                     </c>
                     <c ca="center">
                        <p>GAGTGAG</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>40</p>
                     </c>
                     <c ca="left">
                        <p><b>ABREZMRAB28 </b>ABRE; ABA and water-stress responses; Found in maize (Z.m.) rab28; maize rab28 is ABA-inducible in embryos and vegetative tissues; Found in the Arabidopsis alcohol dehydrogenase (Adh) gene promoter;</p>
                     </c>
                     <c ca="center">
                        <p>CCACGTGG</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>41</p>
                     </c>
                     <c ca="left">
                        <p><b>CCA1ATLHCB1 </b>CCA1 binding site; CCA1 protein (myb-related transcription factor) interacts with two imperfect repeats of AAMAATCT in the Lhcb1*3 gene of Arabidopsis; Related to regulation by phytochrome;</p>
                     </c>
                     <c ca="center">
                        <p>AAMAATCT</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>42</p>
                     </c>
                     <c ca="left">
                        <p><b>E2FANTRNR </b>"E2Fa element" found in the tobacco RNR (Ribonucleotide reductase) gene promoter and in the Arabidopsis CDC6 gene promoter; Binding site of tobacco and Arabidopsis E2F; Involved in upregulation of the promoter at G1/S transition;</p>
                     </c>
                     <c ca="center">
                        <p>TTTCCCGC</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>43</p>
                     </c>
                     <c ca="left">
                        <p><b>L1BOXATPDF1 </b>"L1 box" found in promoter of Arabidopsis PROTODERMAL FACTOR1 (PDF1) gene; Y = C/T;</p>
                     </c>
                     <c ca="center">
                        <p>TAAATGYA</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>44</p>
                     </c>
                     <c ca="left">
                        <p><b>OCTAMERMOTIFTAH3H4 </b>"Octamer motif" found in promoter of wheat histone genes H3 and H4, and corn histone genes H3 and H4; Arabidopsis histone H4; "histone-specific octamer";</p>
                     </c>
                     <c ca="center">
                        <p>CGCGGATC</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>45</p>
                     </c>
                     <c ca="left">
                        <p><b>PIATGAPB </b>"PI" found in the Arabidopsis GAPB gene promoter; Mutations in the "PI" resulted in reductions of light-activated gene transcription;</p>
                     </c>
                     <c ca="center">
                        <p>GTGATCAC</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>46</p>
                     </c>
                     <c ca="left">
                        <p><b>RYREPEATVFLEB4 </b>"RY repeat motif"; quantitative seed expression; Gene: Vicia faba LeB4; Soybean glycinin (Gy2); other dicot and monocot seed protein genes; Binding site of Arabidopsis B3-domain-containing transcription factor FUS3;</p>
                     </c>
                     <c ca="center">
                        <p>CATGCATG</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>47</p>
                     </c>
                     <c ca="left">
                        <p><b>UP2ATMSD </b>"Up2" motif found in 193 of the 1184 up-regulated genes after main stem decapitation in Arabidopsis; W = A/T;</p>
                     </c>
                     <c ca="center">
                        <p>AAACCCTA</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>48</p>
                     </c>
                     <c ca="left">
                        <p><b>ZDNAFORMINGATCAB1 </b>"Z-DNA-forming sequence" found in the Arabidopsis chlorophyll a/b binding protein gene (cab1) promoter; Involved in light-dependent developmental expression of the gene; "Z-box";</p>
                     </c>
                     <c ca="center">
                        <p>ATACGTGT</p>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
         </sec>
         <sec>
            <st>
               <p>Characterization of transcription start site</p>
            </st>
            <p>We then analyzed sequence characteristics around the TSS. In this region, the Initiator motif (Inr: YY<ul>A</ul>N(T/A)YY, TSS is underlined) is known in some mammalian promoters <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>, and it is also functional in plants <abbrgrp><abbr bid="B42">42</abbr></abbrgrp>. A survey of <it>Arabidopsis </it>TSSs revealed that only a limited number of promoters (less than 10%) have the Inr motif around the TSS. Thus, we looked for a more general rule. We surveyed which bases are preferred at the -1/+1 position among <it>Arabidopsis </it>TSSs. The most frequently observed sequence was C<ul>A</ul> (TSS is underlined), and T<ul>A</ul> was the second. As summarized in Figure <figr fid="F8">8A</figr>, there is a strong preference for particular dimer sequences at the -1/+1 position. The graph clearly shows that most TSSs are A or G, and that the -1 position is likely to be C or T. This "YR Rule" (Y<ul>R</ul>, TSS underlined, Y: C or T, R: A or G) applies to as many as 77% of the <it>Arabidopsis </it>promoters, a much higher frequency than expected by chance (25%). Similar analysis for the -2/-1 and +1/+2 positions did not reveal a clear extension of the rule. When the YR Rule was applied to the -6/-5 to +4/+5 positions, we found that the proportion of YR Rule-positive sites is highest at the -1/+1 position in the local region examined (Fig. <figr fid="F8">8B</figr>, <it>Arabidopsis</it>). The figure shows that this rule is also applicable to rice TSSs (Fig. <figr fid="F8">8B</figr>, rice). These analyses have revealed that sequence preference at the TSS is well conserved between <it>Arabidopsis </it>and rice.</p>
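            <p>The dinucleotide tally described above can be sketched as follows. This is a minimal illustration under stated assumptions: each promoter is supplied as a sequence string together with the 0-based index of its +1 base, with enough flanking sequence for the offsets scanned. The function names and the four toy sequences are hypothetical and are not data from this study.</p>

```python
from collections import Counter

PYRIMIDINES = set("CT")  # Y
PURINES = set("AG")      # R


def is_yr(dimer):
    """True if the dimer is a pyrimidine followed by a purine."""
    return dimer[0] in PYRIMIDINES and dimer[1] in PURINES


def dimer_at(seq, tss_index, offset=0):
    """Return the dimer whose downstream base lies `offset` bases from +1.

    tss_index is the 0-based index of the +1 base, so offset=0 gives the
    -1/+1 dimer and offset=-1 gives the -2/-1 dimer.
    """
    start = tss_index - 1 + offset
    return seq[start:start + 2]


def yr_fraction(promoters, offset=0):
    """Fraction of promoters whose dimer at the given offset fits the YR Rule."""
    dimers = [dimer_at(seq, tss, offset) for seq, tss in promoters]
    return sum(is_yr(d) for d in dimers) / len(dimers)


# Toy input (hypothetical): (sequence, index of the +1 base).
promoters = [("TTCATT", 3), ("GGTACC", 3), ("AACGAA", 3), ("CCAATT", 3)]

# Tally of -1/+1 dimers, analogous to a per-dimer count.
print(Counter(dimer_at(seq, tss) for seq, tss in promoters))

# YR-positive fraction at neighbouring positions, analogous to a positional scan.
for offset in (-1, 0, 1):
    print(offset, yr_fraction(promoters, offset))
```

            <p>With equal base frequencies, a pyrimidine followed by a purine is expected at a fraction 0.5 × 0.5 = 0.25 of positions, which is the random expectation quoted above.</p>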
            <fig id="F8">
               <title>
                  <p>Figure 8</p>
               </title>
               <caption>
                  <p>Identification of YR Rule</p>
               </caption>
               <text>
                  <p><b>Identification of YR Rule</b>. (A) Dinucleotide sequences at the -1/+1 position relative to <it>Arabidopsis </it>TSSs, determined from the fl-cDNA information, were counted. As shown, most of the TSSs have (C/T)(A/G), and this YR Rule applies to 77% of the analyzed TSSs. (B) The frequency of dinucleotide sequences fitting the YR Rule was scanned from -5 to +5 of <it>Arabidopsis </it>and rice TSSs. The position of the downstream base of the dimer is shown. For example, the -1/+1 position is indicated as "1". The theoretical frequency of YR in an unbiased sequence is 0.25.</p>
               </text>
               <graphic file="1471-2164-8-67-8"/>
            </fig>
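            <p>The dinucleotide count behind the YR Rule can be sketched as follows. This is a minimal illustration, not the program used in this study: the function name <it>yr_fraction</it>, the parameter <it>tss_index</it>, and the toy sequences are hypothetical. The 25% expectation follows from assuming that the four bases are equally frequent, so that Y (2 of 4 bases) followed by R (2 of 4 bases) occurs with probability 0.5 x 0.5 = 0.25.</p>

```python
from collections import Counter

PYRIMIDINES = set("CT")  # Y
PURINES = set("AG")      # R


def yr_fraction(sequences, tss_index):
    """Count dimers at the -1/+1 position and return the YR fraction.

    sequences : iterable of strings around the TSS
    tss_index : 0-based index of the +1 base (the TSS) in every string;
                this layout is an assumption of the sketch.
    """
    counts = Counter()
    for seq in sequences:
        # The dimer consists of the -1 base followed by the +1 base.
        dimer = seq[tss_index - 1:tss_index + 1].upper()
        if len(dimer) == 2:
            counts[dimer] += 1
    total = sum(counts.values())
    yr = sum(n for d, n in counts.items()
             if d[0] in PYRIMIDINES and d[1] in PURINES)
    return counts, (yr / total if total else 0.0)


# Made-up sequences for illustration only; the +1 base is at index 3.
toy = ["GGCATT", "AATAGG", "CCGGTT", "TTCGAA"]
counts, frac = yr_fraction(toy, tss_index=3)
print(frac)
```

            <p>Calling the same function with shifted values of <it>tss_index</it> gives a positional scan of the kind plotted in Figure <figr fid="F8">8B</figr>.</p>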
         </sec>
         <sec>
            <st>
               <p>An example of <it>Arabidopsis </it>promoter</p>
            </st>
            <p>Our simple LDSS analysis has successfully revealed three distinct groups consisting of hundreds of short sequences. Figure <figr fid="F9">9A</figr> illustrates the architecture of plant promoters based on these findings.</p>
            <fig id="F9">
               <title>
                  <p>Figure 9</p>
               </title>
               <caption>
                  <p>Illustration of YR Rule, Y Patch, TATA box, and REG</p>
               </caption>
               <text>
                  <p><b>Illustration of YR Rule, Y Patch, TATA box, and REG</b>. (A) Expected appearance positions relative to the TSS are as follows: YR Rule (-1/+1), Y Patch (-100 to -1), TATA box (-50 to -20), REG (-20 to -400). Among them, only the REG is orientation-insensitive, and the other groups are sensitive. In many cases the Y Patch locates between the TATA boxes and the TSS, but it is also observed upstream of the TATA boxes. (B) An example of an <it>Arabidopsis </it>promoter that has a Y Patch and TATA box. At1g10960 is one of the promoters clustered in Figure 6B. The promoter sequence from -100 to +1 is shown together with octamer motifs. Marks on the sequence are the same as illustrated in (A).</p>
               </text>
               <graphic file="1471-2164-8-67-9"/>
            </fig>
            <p>Tight positioning of the TATA boxes relative to the TSS fits with the general idea that the TATA boxes determine the position of the TSS. In addition, the YR Rule of <it>Arabidopsis </it>would be another important determinant as well. The Y Patches locate between the TATA boxes and the TSS, but they can be upstream of the TATA boxes, considering the wide distribution profiles (Figure <figr fid="F5">5</figr>). The role of the Y Patch is not known. The above three elements are orientation-sensitive, and constituents of a core promoter. REGs appear upstream of the TATA box, and they exist in an orientation-insensitive manner. Rice promoters share the above characteristics, showing architectural conservation between dicots and monocots.</p>
            <p>An example of an <it>Arabidopsis </it>promoter that has the Y Patch and TATA box is shown in Figure <figr fid="F9">9B</figr>. Octamer analysis of the promoter revealed one cluster of Group 2 REGs (Table <tblr tid="T3">3</tblr>), one cluster of Y Patches, one cluster of TATA boxes, and a YR Rule-positive TSS. An interesting feature of the figure is that multiple overlapping octamers hit the same locus, together marking an element longer than eight bases. This demonstrates that octamer analysis can detect long functional units as clusters of octamers.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Discussion</p>
         </st>
         <sec>
            <st>
               <p>Characteristics of LDSS analysis</p>
            </st>
            <p>In this study, we have identified hundreds of novel sequences solely based on local distribution in the promoter region of <it>Arabidopsis </it>and rice. Biological information, such as microarray data, was not used at all for sequence extraction, and it becomes useful only during interpretation of the extracted sequence. This method is equally sensitive in detection of major and minor motifs in a promoter population as demonstrated by simultaneous detection of major TATA elements and minor REG elements. This feature is an advantage of the LDSS method over other methods of detection of consensus sequences among promoter populations, such as Gibbs Sampling method. We successfully applied the LDSS method to <it>Arabidopsis </it>and rice promoters, and of course, it is applicable to bacterial and mammalian research as well.</p>
            <p>The observed localized distribution is a direct result of the selection pressure. While the localization is an indication of a beneficial role for the organism, the relationship between local distribution of a sequence and its functionality is indirect. Therefore, the question arises if all regulatory elements can be picked up by the LDSS strategy.</p>
            <p>When we compared REG sequences with established <it>cis</it>-elements in the PLACE database, we found that 27 out of 48 <it>Arabidopsis </it>PLACE entries are absent from the extracted REGs (Table <tblr tid="T5">5</tblr>). These results indicate that not all functional elements are LDSS-positive, and thus some would not be detected by this method. There are two possible explanations for the presence of <it>cis</it>-elements that do not show local distribution. One possibility is that these elements are relatively "new", so that selection pressure has not acted for a long enough period. Another possibility is that there has not been any selection pressure on position because of functional differences from the LDSS-positive elements. The latter idea suggests localization-insensitive classes of regulatory elements that are distinct from REGs. So-called long-range regulators <abbrgrp><abbr bid="B43">43</abbr><abbr bid="B44">44</abbr></abbrgrp> might be one of these classes.</p>
            <p>Generally, any functional sequences in the genome are recognized by <it>trans</it>-acting factors that are DNA-binding proteins. Promoter elements and their <it>trans-</it>factors have a relationship of co-evolution. Therefore, differentiation of REGs in the two species would reflect a different status of the corresponding <it>trans</it>-factors. Functional comparison of DNA-binding proteins of <it>Arabidopsis </it>and rice is expected to give some answers as to why these two species have differentiated REG sequences. As for the conserved REGs, it is reasonable that cell cycle-related elements (Group 1, Table <tblr tid="T3">3</tblr>) comprise the most conserved group, because the cell cycle is one of the most conserved activities in organisms.</p>
            <p>REG sequences can be extracted from mammalian promoters as well. However, our preliminary analyses suggest that the LDSS method detects far fewer REGs in mammals than in plants (YYY and JO, unpublished results). This may reflect differences in promoter architecture between plants and animals.</p>
         </sec>
         <sec>
            <st>
               <p>Y Patch</p>
            </st>
            <p>The discovery that the Y Patch is conserved in monocots and dicots is one of the major achievements of this study. A related motif is reported by Molina and Grotewold from <it>Arabidopsis </it>core promoter analysis using the Gibbs-sampling method (Motif 1 with a typical sequence, TTCTTCTTC, <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>). The biochemical role of Y Patch is not known, but its position, direction sensitivity, and its abundant nature strongly suggest that it is a general component of the core promoter. Our LDSS analyses suggest that human and mouse do not share this element with plants and thus this is a plant-specific core element (YYY and JO, unpublished results).</p>
         </sec>
         <sec>
            <st>
               <p>YR Rule</p>
            </st>
            <p>At the TSS, the Initiator (Inr) motif (Y Y <ul>A</ul> N T/A Y Y, TSS is underlined) is known as a recognition site for TFIID <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>. Compared with this consensus, the YR Rule can be considered a less stringent form of the Inr. From this point of view, the YR Rule might be recognized by TFIID. The high coverage of the YR Rule is a useful feature for prediction of TSSs. Recently, Carninci et al. reported that the same rule is applicable to mouse and human promoters as well <abbrgrp><abbr bid="B45">45</abbr></abbrgrp>, revealing conservation of the YR Rule between plants and mammals.</p>
            <p>This rule is not an artifact of the Cap-Trapper method, which is the basis of TSS mapping in this study and in the mammalian studies mentioned above <abbrgrp><abbr bid="B45">45</abbr></abbrgrp>, because it is also applicable to human TSSs determined by another method (Oligo-Cap method, <abbrgrp><abbr bid="B46">46</abbr></abbrgrp>) (YYY and JO, unpublished results).</p>
            <p>A plant consensus around TSS (A/T n T/a C/t <ul>A/c</ul> a/t, TSS is underlined) is reported by Shahmuradov <it>et al </it>based on 217 dicot promoters (actual consensus is expressed by a matrix, <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>). This consensus also largely overlaps with YR Rule.</p>
            <p>The TFIIB-Recognition Element (BRE) is another core promoter element of animal genes. It is located just upstream of the TATA box and has a GC-rich sequence, (G/C)(G/C)(G/C)CGCC <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B48">48</abbr></abbrgrp>. Our analysis did not detect the BRE as a LDSS-positive element, although CC is preferred at the neighboring sequence of the TATA box at the upstream side in both <it>Arabidopsis </it>and rice promoters (Table S2 [see Additional file <supplr sid="S3">3</supplr>] and S3 [see Additional file <supplr sid="S4">4</supplr>]).</p>
         </sec>
         <sec>
            <st>
               <p>LDSS analysis provides useful information toward precise promoter prediction</p>
            </st>
            <p>The hundreds of octamer sequences identified by the LDSS analysis can be used for promoter prediction. The presence of the TATA box is an important feature of a promoter, but there are many false positives in the genome. For example, even the TATA octamer with the highest specific localization is found within the peak area in only 30% of its occurrences in the promoter region, meaning that 70% are found outside of the peak area. This is essentially consistent with a previous study, in which more than 200,000 putative TBP-binding sites were detected in the <it>Arabidopsis </it>genome <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>. Use of the preferred sequences around the TATA box, and of coexistence with the Y Patch and REGs, is expected to improve the accuracy of prediction. Although such a combinatorial approach is incorporated into several promoter prediction programs <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>, the motifs to be detected have been limited so far. Our long list of LDSS-positive octamers is expected to serve as a thick dictionary for precise interpretation of plant genomes.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>In this report, we showed that LDSS can be applied to plant genomes. We have successfully extracted hundreds of promoter elements as LDSS-positive octamers. All the observed behaviors of the isolated elements suggest functionality of these elements. Promoter architectures of monocot and dicot revealed in this study are well conserved, but there are moderate variations in the utilized sequences.</p>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <sec>
            <st>
               <p>Preparation of promoter databases</p>
            </st>
            <p>Cap-Trapper <abbrgrp><abbr bid="B49">49</abbr></abbrgrp> is one of the most reliable methods for identification of the 5' end of mRNA and is thus suitable for determination of TSSs. So-called full-length (fl) cDNAs of <it>Arabidopsis </it>and rice were made by the Cap-Trapper method, and ten to twenty thousand non-redundant fl-cDNA clones for each species have been completely sequenced <abbrgrp><abbr bid="B50">50</abbr><abbr bid="B51">51</abbr></abbrgrp>. Therefore, we decided to use the information from the fl-cDNAs for positioning of promoters. Genome sequences of promoter regions from -1,000 to -1 bp were prepared using the 5' end information of the fl-cDNAs of <it>Arabidopsis </it><abbrgrp><abbr bid="B50">50</abbr><abbr bid="B52">52</abbr></abbrgrp> and rice <abbrgrp><abbr bid="B51">51</abbr></abbrgrp>. The established <it>Arabidopsis </it>promoter database <abbrgrp><abbr bid="B50">50</abbr><abbr bid="B53">53</abbr></abbrgrp> and a rice database with 11,370 promoters, prepared in this study, were utilized for our analysis.</p>
            <p>Positions of rice fl-cDNA clones <abbrgrp><abbr bid="B51">51</abbr></abbrgrp> were mapped onto the corresponding BAC clones according to the description in "MappingData.txt" obtained from the KOME web site <abbrgrp><abbr bid="B54">54</abbr></abbrgrp>, and promoter regions from -1 kb to +200 bp relative to the TSS (1.2 kbp long) were collected. BAC and fl-cDNA sequences were obtained from DDBJ. Special care was taken with the 5' ends of the fl-cDNA sequences, and only those with less than 2 bp of mismatch to the corresponding genomic sequence were used for promoter mapping. In this way, sequences of 11,370 non-redundant rice promoters were prepared. For analyses of the TSS region, as shown in Figure <figr fid="F6">6</figr>, rice fl-cDNA sequences with no mismatch at the 5' end (6,209 promoters) were used. Establishment of the <it>Arabidopsis </it>promoter database is described elsewhere <abbrgrp><abbr bid="B50">50</abbr><abbr bid="B53">53</abbr></abbrgrp>. Earlier analyses with <it>Arabidopsis </it>hexamers were done using the distributed database containing 15,607 promoters. This database is based on distinct TSSs and allows multiple promoters to belong to a single gene. A smaller set of 12,951 promoters was re-selected from the 15,607-promoter version so as to pick one promoter per gene, and was used for octamer analyses. For preparation of random genomic fragments, non-overlapping <it>Arabidopsis </it>BAC clones were selected by consulting the TAIR web site <abbrgrp><abbr bid="B55">55</abbr></abbrgrp>; they were successively cut into 1 kb pieces, and serial numbers were given to the fragments. Sequences corresponding to 3,000 serial numbers chosen at random with the Mersenne Twister method <abbrgrp><abbr bid="B56">56</abbr></abbrgrp> were used as random genomic fragments of 1 kb length.</p>
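            <p>The preparation of random genomic fragments can be sketched as below. This is an illustration under stated assumptions rather than the original program: the function name <it>random_fragments</it> and its parameters are hypothetical, trailing pieces shorter than 1 kb are dropped, and serial numbers are drawn without replacement, neither of which is specified in the text. The standard <it>random</it> module of CPython is itself based on the Mersenne Twister generator.</p>

```python
import random


def random_fragments(bac_sequences, piece_len=1000, n_samples=3000, seed=0):
    """Cut sequences into consecutive pieces and pick some at random.

    bac_sequences : list of non-overlapping genomic sequences (strings)
    piece_len     : fragment length in bp (1 kb in the text)
    n_samples     : number of fragments to draw (3,000 in the text)
    seed          : hypothetical seed, fixed here for reproducibility
    """
    pieces = []
    for seq in bac_sequences:
        # Successive, non-overlapping pieces; a short trailing piece is
        # dropped (an assumption of this sketch).
        for start in range(0, len(seq) - piece_len + 1, piece_len):
            pieces.append(seq[start:start + piece_len])
    # The list index of each piece plays the role of its serial number.
    rng = random.Random(seed)  # Mersenne Twister (MT19937) in CPython
    n = min(n_samples, len(pieces))
    chosen = rng.sample(range(len(pieces)), n)  # without replacement
    return [pieces[i] for i in chosen]


# Toy input: one made-up 2.5 kb "BAC" yields two 1 kb pieces.
fragments = random_fragments(["ACGT" * 625], n_samples=2)
print(len(fragments), len(fragments[0]))
```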
            <p>The programs used in this study will be freely provided upon request for non-profit purposes. A searchable web site to obtain results in this work will be released.</p>
         </sec>
         <sec>
            <st>
               <p>Generation of random distribution</p>
            </st>
            <p>Random distribution samples were generated with respect to Total Area, which indicates the total count in a promoter database. For each Total Area, 1,000 samples were prepared, and their RPA values were subjected to statistical analysis. The average and standard deviation are functions of Total Area (Figure S1 [Additional File <supplr sid="S2">2</supplr>]) and are affected by the smoothing window. Model RPA populations of random distribution were calculated using the following equations:</p>
            <p>REG detection (smoothing with a 21-bin window, and Total Area &lt; 2,000): log<sub>10</sub>(average) = -0.1861 ln(Total Area) &#8211; 0.5329, SD = 0.17</p>
            <p>CORE detection (smoothing with a 3-bin window, and Total Area &lt; 10,000): log<sub>10</sub>(average) = -0.1784 ln(Total Area) &#8211; 0.8026, SD = 0.13</p>
            <p>These models were utilized for estimation of p value for each octamer distribution.</p>
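            <p>One way such a model could be turned into a p value is sketched below. The coefficients are those given above; everything else is an assumption made for illustration: that the SD describes the spread of log<sub>10</sub>(RPA), that the random population is approximately normal on that scale, and that a one-sided upper-tail probability is wanted. The names <it>expected_log10_rpa</it> and <it>p_value</it> are hypothetical.</p>

```python
import math

# Coefficients of the two random-distribution models given in the text.
MODELS = {
    "REG": {"slope": -0.1861, "intercept": -0.5329, "sd": 0.17},
    "CORE": {"slope": -0.1784, "intercept": -0.8026, "sd": 0.13},
}


def expected_log10_rpa(total_area, kind):
    """log10(average RPA) of the random model; ln is the natural log."""
    m = MODELS[kind]
    return m["slope"] * math.log(total_area) + m["intercept"]


def p_value(rpa, total_area, kind):
    """Upper-tail probability of an observed RPA under the random model.

    Assumes log10(RPA) of random samples is normally distributed with
    the model mean and SD; the text does not state this explicitly.
    """
    m = MODELS[kind]
    z = (math.log10(rpa) - expected_log10_rpa(total_area, kind)) / m["sd"]
    # Survival function of the standard normal distribution.
    return 0.5 * math.erfc(z / math.sqrt(2))


# Hypothetical octamer with Total Area 1,000 and observed RPA 0.2.
print(expected_log10_rpa(1000, "REG"), p_value(0.2, 1000, "REG"))
```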
         </sec>
         <sec>
            <st>
               <p>Sequence analysis</p>
            </st>
            <p>Sequence analysis was carried out with a combination of in-house Perl and C<sup>++ </sup>programs and Excel software (Microsoft Japan, Tokyo). The first step of the analysis was the preparation of index files for each promoter with all possible 4,096 hexamer and 65,536 octamer sequences. Information in the index files was then rearranged for each hexamer and octamer sequence, and the occurrence of the short sequences was summarized according to promoter position. The summarized distribution data of each hexamer were then subjected to smoothing with a bin of 15 bp. Generally, smoothing with a wide bin lowers the height of a sharp peak, whereas a narrow bin does not always capture a wide and low peak. Considering these tendencies, a bin of 21 bp was used for identification of octamer REGs, and a bin of 3 bp was used for octamer core elements. Octamer REGs were extracted after merging the distribution data of the complementary sequence to increase the count of occurrences. For extraction of octamer core elements, which are orientation-sensitive, merging was avoided. Positions of octamers and hexamers were counted from the first base of the sequence. For example, the position of a hexamer sequence that is located from -6 to -1 is expressed as -6. Positions of average values after smoothing are indicated at the centre of the bin. Therefore, the positions closest to the TSS also vary depending on the bin length.</p>
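            <p>The position counting and smoothing described above can be sketched as follows. This is a minimal illustration and not the Perl or C<sup>++ </sup>programs of this study: the function names are hypothetical, promoter strings are assumed to have equal length and to end at position -1, the bin is assumed to be odd, and merging of the complementary sequence is shown as a simple element-wise sum of the two count profiles.</p>

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def positional_counts(promoters, kmer):
    """Count occurrences of kmer by the position of its first base.

    promoters : list of equal-length strings whose last base is -1
    Returns (positions, counts); a hexamer covering -6 to -1 is
    reported at position -6, as in the text.
    """
    length = len(promoters[0])
    k = len(kmer)
    counts = [0] * (length - k + 1)
    for seq in promoters:
        for i in range(length - k + 1):
            if seq[i:i + k] == kmer:
                counts[i] += 1
    positions = [i - length for i in range(length - k + 1)]
    return positions, counts


def merged_counts(promoters, kmer):
    """Sum the profiles of kmer and its reverse complement (REG case)."""
    positions, fwd = positional_counts(promoters, kmer)
    _, rev = positional_counts(promoters, reverse_complement(kmer))
    return positions, [a + b for a, b in zip(fwd, rev)]


def smooth(values, window):
    """Centred moving average over an odd window.

    Only fully covered positions are returned, so the positions closest
    to either end are lost, more so for wider windows.
    """
    half = window // 2
    return [sum(values[c - half:c + half + 1]) / window
            for c in range(half, len(values) - half)]


# Made-up 10 bp "promoters" for illustration only.
toy = ["AAAATATAAA", "CCTATAAAGG"]
positions, counts = positional_counts(toy, "TATAAA")
print(list(zip(positions, counts)), smooth(counts, 3))
```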
            <p>Thresholds for distribution of peaks are as follows:</p>
            <p>Hexamer: (peak height/Base Line > 3) &amp; (peak height/SD > 5) &amp; (Peak Area/basal fluctuation > 5),</p>
            <p>Octamer Core: (p value &lt; 10<sup>-4</sup>) &amp; (peak height/Base Line > 5) &amp; (peak height/SD > 10) &amp; (Peak Area/basal fluctuation > 6) &amp; (peak position > -51),</p>
            <p>Octamer REG: (p values &lt; 10<sup>-4</sup>) &amp; (peak height/Base Line > 3) &amp; (Peak Area/total area > 0.1) &amp; (peak height/SD > 5) &amp; (Peak Area/basal fluctuation > 6) &amp; (peak position &lt;-50).</p>
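            <p>The three sets of thresholds can be written as simple predicates, reading "&amp;" as a logical AND. The sketch below is illustrative only: the <it>PeakStats</it> container and its field names are hypothetical, and the values in the usage example are made up. Denominators are assumed to be non-zero.</p>

```python
from dataclasses import dataclass


@dataclass
class PeakStats:
    """Hypothetical container for the parameters named in the text."""
    p_value: float
    peak_height: float
    base_line: float
    sd: float
    peak_area: float
    total_area: float
    basal_fluctuation: float
    peak_position: int


def is_hexamer_positive(s):
    # Hexamer thresholds (no p value is used for hexamers).
    return (s.peak_height / s.base_line > 3
            and s.peak_height / s.sd > 5
            and s.peak_area / s.basal_fluctuation > 5)


def is_octamer_core(s):
    # Core elements: sharp peaks close to the TSS (position above -51).
    return (1e-4 > s.p_value
            and s.peak_height / s.base_line > 5
            and s.peak_height / s.sd > 10
            and s.peak_area / s.basal_fluctuation > 6
            and s.peak_position > -51)


def is_octamer_reg(s):
    # REGs: peaks further upstream (position below -50).
    return (1e-4 > s.p_value
            and s.peak_height / s.base_line > 3
            and s.peak_area / s.total_area > 0.1
            and s.peak_height / s.sd > 5
            and s.peak_area / s.basal_fluctuation > 6
            and -50 > s.peak_position)


# Made-up values for illustration only.
example = PeakStats(p_value=1e-6, peak_height=12.0, base_line=2.0, sd=1.0,
                    peak_area=30.0, total_area=200.0,
                    basal_fluctuation=4.0, peak_position=-30)
print(is_octamer_core(example), is_octamer_reg(example))
```

            <p>Because the position criteria (> -51 and &lt; -50) do not overlap, an octamer profile with a single peak position is assigned to at most one of the two octamer classes.</p>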
            <p>Fitting the distribution data with the Gaussian curve was achieved using Igor Pro (Hulinks, Tokyo). All the LDSS-positive octamers together with above parameters can be viewed at our web site (<abbrgrp><abbr bid="B57">57</abbr></abbrgrp>).</p>
            <p>Clustering analyses were carried out with Cluster <abbrgrp><abbr bid="B58">58</abbr></abbrgrp> and visualized with TreeView <abbrgrp><abbr bid="B59">59</abbr></abbrgrp>. For clustering of LDSS-positive elements based on distribution profiles, the peak value of each profile was adjusted to 5.0. For REG-promoter clustering, the number of occurrences of each REG in the region from -400 to -40 bp was scored for each promoter, and a REG-promoter table was prepared. Among the Cluster options, the hierarchical clustering method (centroid linkage) gave more natural results than the <it>k</it>-means and SOM methods.</p>
            <p>From the PLACE database <abbrgrp><abbr bid="B40">40</abbr></abbrgrp>, 48 entries with definition sequences of 8 bases or less and with descriptions containing "Arabidopsis" were subjected to the REG survey.</p>
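            <p>Construction of a REG-promoter table of the kind passed to the clustering software can be sketched as follows. The function <it>reg_promoter_table</it> and its parameters are hypothetical. The sketch assumes that promoter strings end at position -1, that an occurrence is scored by the position of its first base, and that both orientations are counted because REGs are orientation-insensitive; these details are not specified in the text.</p>

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def reg_promoter_table(promoters, regs, upstream=400, downstream=40):
    """Score REG occurrences per promoter in -upstream .. -downstream.

    promoters : dict mapping promoter name to a sequence ending at -1
    regs      : list of REG octamers (one table column per REG)
    Returns a dict mapping promoter name to a list of counts.
    """
    table = {}
    for name, seq in promoters.items():
        n = len(seq)
        first = max(0, n - upstream)  # index of position -upstream
        last = n - downstream         # index of position -downstream
        row = []
        for reg in regs:
            # A set avoids double counting of palindromic octamers.
            forms = {reg, reverse_complement(reg)}
            k = len(reg)
            row.append(sum(1 for i in range(first, last + 1)
                           if seq[i:i + k] in forms))
        table[name] = row
    return table


# Made-up sequences and a shortened window, for illustration only.
toy = {"p1": "GGACGTGGCCAA", "p2": "TTTTTTTTTTTT"}
print(reg_promoter_table(toy, ["ACGTGG"], upstream=10, downstream=2))
```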
         </sec>
      </sec>
      <sec>
         <st>
            <p>Abbreviations</p>
         </st>
         <p>LDSS &#8211; Local Distribution of Short Sequences</p>
         <p>TSS &#8211; transcription start site</p>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>YYY designed and performed the analyses including writing Perl programs. HI and TA prepared rice promoter database and wrote C<sup>++ </sup>programs. MM, TS, MSatou, MSeki, and KS prepared <it>Arabidopsis </it>promoter database. JO contributed in identification of YR Rule. All authors read and approved the final manuscript.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>This work was supported in part by KAKENHI (Grant-in-Aid for Scientific Research) on Priority Areas "Comparative Genomics" from the Ministry of Education, Culture, Sports, Science and Technology of Japan (to Y.Y.Y. and J.O.).</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Concepts and strategies: I. promoter and the general transcription machinery</p>
            </title>
            <aug>
               <au>
                  <snm>Carey</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Smale</snm>
                  <fnm>ST</fnm>
               </au>
            </aug>
            <source>Transcriptional regulation in eukaryotes</source>
            <publisher>New York , Cold Spring Harbor Laboratory Press</publisher>
            <pubdate>2001</pubdate>
         </bibl>
         <bibl id="B2">
            <title>
               <p>The RNA polymerase II core promoter: a key component in the regulation of gene expression</p>
            </title>
            <aug>
               <au>
                  <snm>Butler</snm>
                  <fnm>JE</fnm>
               </au>
               <au>
                  <snm>Kadonaga</snm>
                  <fnm>JT</fnm>
               </au>
            </aug>
            <source>Genes Dev</source>
            <pubdate>2002</pubdate>
            <volume>16</volume>
            <issue>20</issue>
            <fpage>2583</fpage>
            <lpage>2592</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gad.1026202</pubid>
                  <pubid idtype="pmpid" link="fulltext">12381658</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>The RNA polymerase II core promoter</p>
            </title>
            <aug>
               <au>
                  <snm>Smale</snm>
                  <fnm>ST</fnm>
               </au>
               <au>
                  <snm>Kadonaga</snm>
                  <fnm>JT</fnm>
               </au>
            </aug>
            <source>Annu Rev Biochem</source>
            <pubdate>2003</pubdate>
            <volume>72</volume>
            <fpage>449</fpage>
            <lpage>479</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1146/annurev.biochem.72.121801.161520</pubid>
                  <pubid idtype="pmpid" link="fulltext">12651739</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Number of CpG islands and genes in human and mouse</p>
            </title>
            <aug>
               <au>
                  <snm>Antequera</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Bird</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1993</pubdate>
            <volume>90</volume>
            <issue>24</issue>
            <fpage>11995</fpage>
            <lpage>11999</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">48112</pubid>
                  <pubid idtype="pmpid" link="fulltext">7505451</pubid>
                  <pubid idtype="doi">10.1073/pnas.90.24.11995</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Large-scale human promoter mapping using CpG islands</p>
            </title>
            <aug>
               <au>
                  <snm>Ioshikhes</snm>
                  <fnm>IP</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>MQ</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2000</pubdate>
            <volume>26</volume>
            <issue>1</issue>
            <fpage>61</fpage>
            <lpage>63</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/79189</pubid>
                  <pubid idtype="pmpid" link="fulltext">10973249</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Sequence-specific recognition of DNA by zinc-finger peptides derived from the transcription factor Sp1</p>
            </title>
            <aug>
               <au>
                  <snm>Kriwacki</snm>
                  <fnm>RW</fnm>
               </au>
               <au>
                  <snm>Schultz</snm>
                  <fnm>SC</fnm>
               </au>
               <au>
                  <snm>Steitz</snm>
                  <fnm>TA</fnm>
               </au>
               <au>
                  <snm>Caradonna</snm>
                  <fnm>JP</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1992</pubdate>
            <volume>89</volume>
            <issue>20</issue>
            <fpage>9759</fpage>
            <lpage>9763</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">50212</pubid>
                  <pubid idtype="pmpid" link="fulltext">1329106</pubid>
                  <pubid idtype="doi">10.1073/pnas.89.20.9759</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>DNA methylation patterns and epigenetic memory</p>
            </title>
            <aug>
               <au>
                  <snm>Bird</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Genes Dev</source>
            <pubdate>2002</pubdate>
            <volume>16</volume>
            <issue>1</issue>
            <fpage>6</fpage>
            <lpage>21</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gad.947102</pubid>
                  <pubid idtype="pmpid" link="fulltext">11782440</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Genome-wide computational prediction of transcriptional regulatory modules reveals new insights into human gene expression</p>
            </title>
            <aug>
               <au>
                  <snm>Blanchette</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Bataille</snm>
                  <fnm>AR</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Poitras</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Laganiere</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Lefebvre</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Deblois</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Giguere</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Ferretti</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Bergeron</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Coulombe</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Robert</snm>
                  <fnm>F</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2006</pubdate>
            <volume>16</volume>
            <issue>5</issue>
            <fpage>656</fpage>
            <lpage>668</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1457048</pubid>
                  <pubid idtype="pmpid" link="fulltext">16606704</pubid>
                  <pubid idtype="doi">10.1101/gr.4866006</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Skew in CG content near the transcription start site in Arabidopsis thaliana</p>
            </title>
            <aug>
               <au>
                  <snm>Tatarinova</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Brover</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Troukhan</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Alexandrov</snm>
                  <fnm>N</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2003</pubdate>
            <volume>19 Suppl 1</volume>
            <fpage>i313</fpage>
            <lpage>4</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btg1043</pubid>
                  <pubid idtype="pmpid" link="fulltext">12855475</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>GC-compositional strand bias around transcription start sites in plants and fungi</p>
            </title>
            <aug>
               <au>
                  <snm>Fujimori</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Washio</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Tomita</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>BMC Genomics</source>
            <pubdate>2005</pubdate>
            <volume>6</volume>
            <issue>1</issue>
            <fpage>26</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">555766</pubid>
                  <pubid idtype="pmpid" link="fulltext">15733327</pubid>
                  <pubid idtype="doi">10.1186/1471-2164-6-26</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Performance assessment of promoter predictions on ENCODE regions in the EGASP experiment</p>
            </title>
            <aug>
               <au>
                  <snm>Bajic</snm>
                  <fnm>VB</fnm>
               </au>
               <au>
                  <snm>Brent</snm>
                  <fnm>MR</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>RH</fnm>
               </au>
               <au>
                  <snm>Frankish</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Harrow</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Ohler</snm>
                  <fnm>U</fnm>
               </au>
               <au>
                  <snm>Solovyev</snm>
                  <fnm>VV</fnm>
               </au>
               <au>
                  <snm>Tan</snm>
                  <fnm>SL</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2006</pubdate>
            <volume>7 Suppl 1</volume>
            <fpage>S3.1</fpage>
            <lpage>13</lpage>
         </bibl>
         <bibl id="B12">
            <title>
               <p>ARTS: accurate recognition of transcription starts in human</p>
            </title>
            <aug>
               <au>
                  <snm>Sonnenburg</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Zien</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>R&#228;tsch</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2006</pubdate>
            <volume>22</volume>
            <issue>14</issue>
            <fpage>e472</fpage>
            <lpage>e480</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btl250</pubid>
                  <pubid idtype="pmpid" link="fulltext">16873509</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Promoter prediction analysis on the whole human genome</p>
            </title>
            <aug>
               <au>
                  <snm>Bajic</snm>
                  <fnm>VB</fnm>
               </au>
               <au>
                  <snm>Tan</snm>
                  <fnm>SL</fnm>
               </au>
               <au>
                  <snm>Suzuki</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Sugano</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Nat Biotechnol</source>
            <pubdate>2004</pubdate>
            <volume>22</volume>
            <fpage>1467</fpage>
            <lpage>1473</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nbt1032</pubid>
                  <pubid idtype="pmpid" link="fulltext">15529174</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Detecting subtle sequence signals: a Gibbs sampling strategy for multiple alignment</p>
            </title>
            <aug>
               <au>
                  <snm>Lawrence</snm>
                  <fnm>CE</fnm>
               </au>
               <au>
                  <snm>Altschul</snm>
                  <fnm>SF</fnm>
               </au>
               <au>
                  <snm>Boguski</snm>
                  <fnm>MS</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>JS</fnm>
               </au>
               <au>
                  <snm>Neuwald</snm>
                  <fnm>AF</fnm>
               </au>
               <au>
                  <snm>Wootton</snm>
                  <fnm>JC</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1993</pubdate>
            <volume>262</volume>
            <issue>5131</issue>
            <fpage>208</fpage>
            <lpage>214</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.8211139</pubid>
                  <pubid idtype="pmpid" link="fulltext">8211139</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Finding DNA regulatory motifs within unaligned noncoding sequences clustered by whole-genome mRNA quantitation</p>
            </title>
            <aug>
               <au>
                  <snm>Roth</snm>
                  <fnm>FP</fnm>
               </au>
               <au>
                  <snm>Hughes</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Estep</snm>
                  <fnm>PW</fnm>
               </au>
               <au>
                  <snm>Church</snm>
                  <fnm>GM</fnm>
               </au>
            </aug>
            <source>Nat Biotechnol</source>
            <pubdate>1998</pubdate>
            <volume>16</volume>
            <issue>10</issue>
            <fpage>939</fpage>
            <lpage>945</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nbt1098-939</pubid>
                  <pubid idtype="pmpid" link="fulltext">9788350</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>The value of prior knowledge in discovering motifs with MEME</p>
            </title>
            <aug>
               <au>
                  <snm>Bailey</snm>
                  <fnm>TL</fnm>
               </au>
               <au>
                  <snm>Elkan</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>Proc Int Conf Intell Syst Mol Biol</source>
            <pubdate>1995</pubdate>
            <volume>3</volume>
            <fpage>21</fpage>
            <lpage>29</lpage>
            <xrefbib>
               <pubid idtype="pmpid">7584439</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Extracting regulatory sites from the upstream region of yeast genes by computational analysis of oligonucleotide frequencies</p>
            </title>
            <aug>
               <au>
                  <snm>van Helden</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Andr&#233;</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Collado-Vides</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1998</pubdate>
            <volume>281</volume>
            <issue>5</issue>
            <fpage>827</fpage>
            <lpage>842</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/jmbi.1998.1947</pubid>
                  <pubid idtype="pmpid" link="fulltext">9719638</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Computational identification of cis-regulatory elements associated with groups of functionally related genes in Saccharomyces cerevisiae</p>
            </title>
            <aug>
               <au>
                  <snm>Hughes</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Estep</snm>
                  <fnm>PW</fnm>
               </au>
               <au>
                  <snm>Tavazoie</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Church</snm>
                  <fnm>GM</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>2000</pubdate>
            <volume>296</volume>
            <issue>5</issue>
            <fpage>1205</fpage>
            <lpage>1214</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/jmbi.2000.3519</pubid>
                  <pubid idtype="pmpid" link="fulltext">10698627</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>Genome-wide location and function of DNA binding proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Ren</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Robert</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Wyrick</snm>
                  <fnm>JJ</fnm>
               </au>
               <au>
                  <snm>Aparicio</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Jennings</snm>
                  <fnm>EG</fnm>
               </au>
               <au>
                  <snm>Simon</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Zeitlinger</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Schreiber</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Hannett</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Kanin</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Volkert</snm>
                  <fnm>TL</fnm>
               </au>
               <au>
                  <snm>Wilson</snm>
                  <fnm>CJ</fnm>
               </au>
               <au>
                  <snm>Bell</snm>
                  <fnm>SP</fnm>
               </au>
               <au>
                  <snm>Young</snm>
                  <fnm>RA</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2000</pubdate>
            <volume>290</volume>
            <issue>5500</issue>
            <fpage>2306</fpage>
            <lpage>2309</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.290.5500.2306</pubid>
                  <pubid idtype="pmpid" link="fulltext">11125145</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>Promoter-specific binding of Rap1 revealed by genome-wide maps of protein-DNA association</p>
            </title>
            <aug>
               <au>
                  <snm>Lieb</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Botstein</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>PO</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2001</pubdate>
            <volume>28</volume>
            <issue>4</issue>
            <fpage>327</fpage>
            <lpage>334</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng569</pubid>
                  <pubid idtype="pmpid" link="fulltext">11455386</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Predicting regulons and their cis-regulatory motifs by comparative genomics</p>
            </title>
            <aug>
               <au>
                  <snm>Manson McGuire</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Church</snm>
                  <fnm>GM</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2000</pubdate>
            <volume>28</volume>
            <issue>22</issue>
            <fpage>4523</fpage>
            <lpage>4530</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">113887</pubid>
                  <pubid idtype="pmpid" link="fulltext">11071941</pubid>
                  <pubid idtype="doi">10.1093/nar/28.22.4523</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>Sequencing and comparison of yeast species to identify genes and regulatory elements</p>
            </title>
            <aug>
               <au>
                  <snm>Kellis</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Patterson</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Endrizzi</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Birren</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Lander</snm>
                  <fnm>ES</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2003</pubdate>
            <volume>423</volume>
            <issue>6937</issue>
            <fpage>241</fpage>
            <lpage>254</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature01644</pubid>
                  <pubid idtype="pmpid" link="fulltext">12748633</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>Transcriptional regulatory code of a eukaryotic genome</p>
            </title>
            <aug>
               <au>
                  <snm>Harbison</snm>
                  <fnm>CT</fnm>
               </au>
               <au>
                  <snm>Gordon</snm>
                  <fnm>DB</fnm>
               </au>
               <au>
                  <snm>Lee</snm>
                  <fnm>TI</fnm>
               </au>
               <au>
                  <snm>Rinaldi</snm>
                  <fnm>NJ</fnm>
               </au>
               <au>
                  <snm>MacIsaac</snm>
                  <fnm>KD</fnm>
               </au>
               <au>
                  <snm>Danford</snm>
                  <fnm>TW</fnm>
               </au>
               <au>
                  <snm>Hannett</snm>
                  <fnm>NM</fnm>
               </au>
               <au>
                  <snm>Tagne</snm>
                  <fnm>JB</fnm>
               </au>
               <au>
                  <snm>Reynolds</snm>
                  <fnm>DB</fnm>
               </au>
               <au>
                  <snm>Yoo</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Jennings</snm>
                  <fnm>EG</fnm>
               </au>
               <au>
                  <snm>Zeitlinger</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Pokholok</snm>
                  <fnm>DK</fnm>
               </au>
               <au>
                  <snm>Kellis</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Rolfe</snm>
                  <fnm>PA</fnm>
               </au>
               <au>
                  <snm>Takusagawa</snm>
                  <fnm>KT</fnm>
               </au>
               <au>
                  <snm>Lander</snm>
                  <fnm>ES</fnm>
               </au>
               <au>
                  <snm>Gifford</snm>
                  <fnm>DK</fnm>
               </au>
               <au>
                  <snm>Fraenkel</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Young</snm>
                  <fnm>RA</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2004</pubdate>
            <volume>431</volume>
            <issue>7004</issue>
            <fpage>99</fpage>
            <lpage>104</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature02800</pubid>
                  <pubid idtype="pmpid" link="fulltext">15343339</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>Discovery of regulatory elements in vertebrates through comparative genomics</p>
            </title>
            <aug>
               <au>
                  <snm>Prakash</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Tompa</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Nat Biotechnol</source>
            <pubdate>2005</pubdate>
            <volume>23</volume>
            <issue>10</issue>
            <fpage>1249</fpage>
            <lpage>1256</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nbt1140</pubid>
                  <pubid idtype="pmpid" link="fulltext">16211068</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Plant cis-acting regulatory DNA elements (PLACE) database: 1999</p>
            </title>
            <aug>
               <au>
                  <snm>Higo</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Ugawa</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Iwamoto</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Korenaga</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1999</pubdate>
            <volume>27</volume>
            <issue>1</issue>
            <fpage>297</fpage>
            <lpage>300</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">148163</pubid>
                  <pubid idtype="pmpid" link="fulltext">9847208</pubid>
                  <pubid idtype="doi">10.1093/nar/27.1.297</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>AGRIS: Arabidopsis gene regulatory information server, an information resource of Arabidopsis cis-regulatory elements and transcription factors</p>
            </title>
            <aug>
               <au>
                  <snm>Davuluri</snm>
                  <fnm>RV</fnm>
               </au>
               <au>
                  <snm>Sun</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Palaniswamy</snm>
                  <fnm>SK</fnm>
               </au>
               <au>
                  <snm>Matthews</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Molina</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Kurtz</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Grotewold</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2003</pubdate>
            <volume>4</volume>
            <fpage>25</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">166152</pubid>
                  <pubid idtype="pmpid" link="fulltext">12820902</pubid>
                  <pubid idtype="doi">10.1186/1471-2105-4-25</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>AthaMap: an online resource for in silico transcription factor binding sites in the Arabidopsis thaliana genome</p>
            </title>
            <aug>
               <au>
                  <snm>Steffens</snm>
                  <fnm>NO</fnm>
               </au>
               <au>
                  <snm>Galuschka</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Schindler</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>B&#252;low</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Hehl</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2004</pubdate>
            <volume>32</volume>
            <issue>Database issue</issue>
            <fpage>D368</fpage>
            <lpage>D372</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">308752</pubid>
                  <pubid idtype="pmpid" link="fulltext">14681436</pubid>
                  <pubid idtype="doi">10.1093/nar/gkh017</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>AthaMap: from in silico data to real transcription factor binding sites</p>
            </title>
            <aug>
               <au>
                  <snm>B&#252;low</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Steffens</snm>
                  <fnm>NO</fnm>
               </au>
               <au>
                  <snm>Galuschka</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Schindler</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Hehl</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>In Silico Biol</source>
            <pubdate>2006</pubdate>
            <volume>6</volume>
            <fpage>23</fpage>
            <xrefbib>
               <pubid idtype="pmpid">16789908</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>Genome wide analysis of Arabidopsis core promoters</p>
            </title>
            <aug>
               <au>
                  <snm>Molina</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Grotewold</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>BMC Genomics</source>
            <pubdate>2005</pubdate>
            <volume>6</volume>
            <issue>1</issue>
            <fpage>25</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">554773</pubid>
                  <pubid idtype="pmpid" link="fulltext">15733318</pubid>
                  <pubid idtype="doi">10.1186/1471-2164-6-25</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>Computational analysis of core promoters in the Drosophila genome</p>
            </title>
            <aug>
               <au>
                  <snm>Ohler</snm>
                  <fnm>U</fnm>
               </au>
               <au>
                  <snm>Liao</snm>
                  <fnm>GC</fnm>
               </au>
               <au>
                  <snm>Niemann</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Rubin</snm>
                  <fnm>GM</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2002</pubdate>
            <volume>3</volume>
            <issue>12</issue>
            <fpage>RESEARCH0087</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">151189</pubid>
                  <pubid idtype="pmpid" link="fulltext">12537576</pubid>
                  <pubid idtype="doi">10.1186/gb-2002-3-12-research0087</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>Genome-wide in silico identification of transcriptional regulators controlling the cell cycle in human cells</p>
            </title>
            <aug>
               <au>
                  <snm>Elkon</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Linhart</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Sharan</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Shamir</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Shiloh</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2003</pubdate>
            <volume>13</volume>
            <issue>5</issue>
            <fpage>773</fpage>
            <lpage>780</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">430898</pubid>
                  <pubid idtype="pmpid" link="fulltext">12727897</pubid>
                  <pubid idtype="doi">10.1101/gr.947203</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>Clustering of DNA sequences in human promoters</p>
            </title>
            <aug>
               <au>
                  <snm>FitzGerald</snm>
                  <fnm>PC</fnm>
               </au>
               <au>
                  <snm>Shlyakhtenko</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Mir</snm>
                  <fnm>AA</fnm>
               </au>
               <au>
                  <snm>Vinson</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2004</pubdate>
            <volume>14</volume>
            <issue>8</issue>
            <fpage>1562</fpage>
            <lpage>1574</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">509265</pubid>
                  <pubid idtype="pmpid" link="fulltext">15256515</pubid>
                  <pubid idtype="doi">10.1101/gr.1953904</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B33">
            <title>
               <p>Comprehensive analysis of transcriptional promoter structure and function in 1% of the human genome</p>
            </title>
            <aug>
               <au>
                  <snm>Cooper</snm>
                  <fnm>SJ</fnm>
               </au>
               <au>
                  <snm>Trinklein</snm>
                  <fnm>ND</fnm>
               </au>
               <au>
                  <snm>Anton</snm>
                  <fnm>ED</fnm>
               </au>
               <au>
                  <snm>Nguyen</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Myers</snm>
                  <fnm>RM</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2006</pubdate>
            <volume>16</volume>
            <issue>1</issue>
            <fpage>1</fpage>
            <lpage>10</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1356123</pubid>
                  <pubid idtype="pmpid" link="fulltext">16344566</pubid>
                  <pubid idtype="doi">10.1101/gr.4222606</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B34">
            <title>
               <p>Eukaryotic promoter recognition</p>
            </title>
            <aug>
               <au>
                  <snm>Fickett</snm>
                  <fnm>JW</fnm>
               </au>
               <au>
                  <snm>Hatzigeorgiou</snm>
                  <fnm>AG</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>1997</pubdate>
            <volume>7</volume>
            <issue>9</issue>
            <fpage>861</fpage>
            <lpage>878</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">9314492</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B35">
            <title>
               <p>Internal telomeric repeats and 'TCP domain' protein-binding sites co-operate to regulate gene expression in Arabidopsis thaliana cycling cells</p>
            </title>
            <aug>
               <au>
                  <snm>Tr&#233;mousaygue</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Garnier</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Bardet</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Dabos</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Herv&#233;</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Lescure</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Plant J</source>
            <pubdate>2003</pubdate>
            <volume>33</volume>
            <issue>6</issue>
            <fpage>957</fpage>
            <lpage>966</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1046/j.1365-313X.2003.01682.x</pubid>
                  <pubid idtype="pmpid" link="fulltext">12631321</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B36">
            <title>
               <p>Plant bZIP proteins gather at ACGT elements</p>
            </title>
            <aug>
               <au>
                  <snm>Foster</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Izawa</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Chua</snm>
                  <fnm>NH</fnm>
               </au>
            </aug>
            <source>FASEB J</source>
            <pubdate>1994</pubdate>
            <volume>8</volume>
            <issue>2</issue>
            <fpage>192</fpage>
            <lpage>200</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">8119490</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B37">
            <title>
               <p>AtGenExpress</p>
            </title>
            <url>http://www.arabidopsis.org/info/expression/ATGenExpress.jsp</url>
         </bibl>
         <bibl id="B38">
            <title>
               <p>The Arabidopsis genome: a foundation for plant research</p>
            </title>
            <aug>
               <au>
                  <snm>Bevan</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Walsh</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2005</pubdate>
            <volume>15</volume>
            <issue>12</issue>
            <fpage>1632</fpage>
            <lpage>1642</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gr.3723405</pubid>
                  <pubid idtype="pmpid" link="fulltext">16339360</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B39">
            <title>
               <p>Organization of cis-acting regulatory elements in osmotic- and cold-stress-responsive promoters</p>
            </title>
            <aug>
               <au>
                  <snm>Yamaguchi-Shinozaki</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Shinozaki</snm>
                  <fnm>K</fnm>
               </au>
            </aug>
            <source>Trends Plant Sci</source>
            <pubdate>2005</pubdate>
            <volume>10</volume>
            <issue>2</issue>
            <fpage>88</fpage>
            <lpage>94</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.tplants.2004.12.012</pubid>
                  <pubid idtype="pmpid" link="fulltext">15708346</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B40">
            <title>
               <p>PLACE</p>
            </title>
            <url>http://www.dna.affrc.go.jp/PLACE/</url>
         </bibl>
         <bibl id="B41">
            <title>
               <p>AGRIS</p>
            </title>
            <url>http://arabidopsis.med.ohio-state.edu</url>
         </bibl>
         <bibl id="B42">
            <title>
               <p>Photosynthesis nuclear genes generally lack TATA-boxes: a tobacco photosystem I gene responds to light through an initiator</p>
            </title>
            <aug>
               <au>
                  <snm>Nakamura</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Tsunoda</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Obokata</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Plant J</source>
            <pubdate>2002</pubdate>
            <volume>29</volume>
            <issue>1</issue>
            <fpage>1</fpage>
            <lpage>10</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1046/j.0960-7412.2001.01188.x</pubid>
                  <pubid idtype="pmpid" link="fulltext">12060222</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B43">
            <title>
               <p>Long-range chromatin regulatory interactions in vivo</p>
            </title>
            <aug>
               <au>
                  <snm>Carter</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Chakalova</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Osborne</snm>
                  <fnm>CS</fnm>
               </au>
               <au>
                  <snm>Dai</snm>
                  <fnm>YF</fnm>
               </au>
               <au>
                  <snm>Fraser</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2002</pubdate>
            <volume>32</volume>
            <issue>4</issue>
            <fpage>623</fpage>
            <lpage>626</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng1051</pubid>
                  <pubid idtype="pmpid" link="fulltext">12426570</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B44">
            <title>
               <p>Disruption of a long-range cis-acting regulator for Shh causes preaxial polydactyly</p>
            </title>
            <aug>
               <au>
                  <snm>Lettice</snm>
                  <fnm>LA</fnm>
               </au>
               <au>
                  <snm>Horikoshi</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Heaney</snm>
                  <fnm>SJ</fnm>
               </au>
               <au>
                  <snm>van Baren</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>van der Linde</snm>
                  <fnm>HC</fnm>
               </au>
               <au>
                  <snm>Breedveld</snm>
                  <fnm>GJ</fnm>
               </au>
               <au>
                  <snm>Joosse</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Akarsu</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Oostra</snm>
                  <fnm>BA</fnm>
               </au>
               <au>
                  <snm>Endo</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Shibata</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Suzuki</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Takahashi</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Shinka</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Nakahori</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Ayusawa</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Nakabayashi</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Scherer</snm>
                  <fnm>SW</fnm>
               </au>
               <au>
                  <snm>Heutink</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Hill</snm>
                  <fnm>RE</fnm>
               </au>
               <au>
                  <snm>Noji</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2002</pubdate>
            <volume>99</volume>
            <issue>11</issue>
            <fpage>7548</fpage>
            <lpage>7553</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">124279</pubid>
                  <pubid idtype="pmpid" link="fulltext">12032320</pubid>
                  <pubid idtype="doi">10.1073/pnas.112212199</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B45">
            <title>
               <p>Genome-wide analysis of mammalian promoter architecture and evolution</p>
            </title>
            <aug>
               <au>
                  <snm>Carninci</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Sandelin</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Lenhard</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Katayama</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Shimokawa</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Ponjavic</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Semple</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Taylor</snm>
                  <fnm>MS</fnm>
               </au>
               <au>
                  <snm>Engstrom</snm>
                  <fnm>PG</fnm>
               </au>
               <au>
                  <snm>Frith</snm>
                  <fnm>MC</fnm>
               </au>
               <au>
                  <snm>Forrest</snm>
                  <fnm>AR</fnm>
               </au>
               <au>
                  <snm>Alkema</snm>
                  <fnm>WB</fnm>
               </au>
               <au>
                  <snm>Tan</snm>
                  <fnm>SL</fnm>
               </au>
               <au>
                  <snm>Plessy</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Kodzius</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Ravasi</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Kasukawa</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Fukuda</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kanamori-Katayama</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Kitazume</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Kawaji</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Kai</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Nakamura</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Konno</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Nakano</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Mottagui-Tabar</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Arner</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Chesi</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Gustincich</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Persichetti</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Suzuki</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Grimmond</snm>
                  <fnm>SM</fnm>
               </au>
               <au>
                  <snm>Wells</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Orlando</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Wahlestedt</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>ET</fnm>
               </au>
               <au>
                  <snm>Harbers</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Kawai</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Bajic</snm>
                  <fnm>VB</fnm>
               </au>
               <au>
                  <snm>Hume</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Hayashizaki</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2006</pubdate>
            <volume>38</volume>
            <issue>6</issue>
            <fpage>626</fpage>
            <lpage>635</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng1789</pubid>
                  <pubid idtype="pmpid" link="fulltext">16645617</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B46">
            <title>
               <p>DBTSS: DataBase of Human Transcription Start Sites, progress report 2006</p>
            </title>
            <aug>
               <au>
                  <snm>Yamashita</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Suzuki</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Wakaguri</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Tsuritani</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Nakai</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Sugano</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2006</pubdate>
            <volume>34</volume>
            <issue>Database issue</issue>
            <fpage>D86</fpage>
            <lpage>D89</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1347491</pubid>
                  <pubid idtype="pmpid" link="fulltext">16381981</pubid>
                  <pubid idtype="doi">10.1093/nar/gkj129</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B47">
            <title>
               <p>PlantProm: a database of plant promoter sequences</p>
            </title>
            <aug>
               <au>
                  <snm>Shahmuradov</snm>
                  <fnm>IA</fnm>
               </au>
               <au>
                  <snm>Gammerman</snm>
                  <fnm>AJ</fnm>
               </au>
               <au>
                  <snm>Hancock</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Bramley</snm>
                  <fnm>PM</fnm>
               </au>
               <au>
                  <snm>Solovyev</snm>
                  <fnm>VV</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2003</pubdate>
            <volume>31</volume>
            <issue>1</issue>
            <fpage>114</fpage>
            <lpage>117</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">165488</pubid>
                  <pubid idtype="pmpid" link="fulltext">12519961</pubid>
                  <pubid idtype="doi">10.1093/nar/gkg041</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B48">
            <title>
               <p>New core promoter element in RNA polymerase II-dependent transcription: sequence-specific DNA binding by transcription factor TFIIB</p>
            </title>
            <aug>
               <au>
                  <snm>Lagrange</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Kapanidis</snm>
                  <fnm>AN</fnm>
               </au>
               <au>
                  <snm>Tang</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Reinberg</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Ebright</snm>
                  <fnm>RH</fnm>
               </au>
            </aug>
            <source>Genes Dev</source>
            <pubdate>1998</pubdate>
            <volume>12</volume>
            <fpage>34</fpage>
            <lpage>44</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">316406</pubid>
                  <pubid idtype="pmpid" link="fulltext">9420329</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B49">
            <title>
               <p>High-efficiency full-length cDNA cloning by biotinylated CAP trapper</p>
            </title>
            <aug>
               <au>
                  <snm>Carninci</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Kvam</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Kitamura</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Ohsumi</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Okazaki</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Itoh</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Kamiya</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Shibata</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Sasaki</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Izawa</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Muramatsu</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Hayashizaki</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Schneider</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>Genomics</source>
            <pubdate>1996</pubdate>
            <volume>37</volume>
            <issue>3</issue>
            <fpage>327</fpage>
            <lpage>336</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/geno.1996.0567</pubid>
                  <pubid idtype="pmpid" link="fulltext">8938445</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B50">
            <title>
               <p>Functional annotation of a full-length Arabidopsis cDNA collection</p>
            </title>
            <aug>
               <au>
                  <snm>Seki</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Narusaka</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Kamiya</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Ishida</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Satou</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Sakurai</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Nakajima</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Enju</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Akiyama</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Oono</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Muramatsu</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Hayashizaki</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Kawai</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Carninci</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Itoh</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Ishii</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Arakawa</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Shibata</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Shinagawa</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Shinozaki</snm>
                  <fnm>K</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2002</pubdate>
            <volume>296</volume>
            <issue>5565</issue>
            <fpage>141</fpage>
            <lpage>145</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1071006</pubid>
                  <pubid idtype="pmpid" link="fulltext">11910074</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B51">
            <title>
               <p>Collection, mapping, and annotation of over 28,000 cDNA clones from japonica rice</p>
            </title>
            <aug>
               <au>
                  <snm>Kikuchi</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Satoh</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Nagata</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Kawagashira</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Doi</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Kishimoto</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Yazaki</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Ishikawa</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Yamada</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Ooka</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Hotta</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Kojima</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Namiki</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Ohneda</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Yahagi</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Suzuki</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>CJ</fnm>
               </au>
               <au>
                  <snm>Ohtsuki</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Shishiki</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Otomo</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Murakami</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Iida</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Sugano</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Fujimura</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Suzuki</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Tsunoda</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Kurosaki</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Kodama</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Masuda</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Kobayashi</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Xie</snm>
                  <fnm>Q</fnm>
               </au>
               <au>
                  <snm>Lu</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Narikawa</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Sugiyama</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Mizuno</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Yokomizo</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Niikura</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Ikeda</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Ishibiki</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Kawamata</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Yoshimura</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Miura</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Kusumegi</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Oka</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Ryu</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Ueda</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Matsubara</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Kawai</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Carninci</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Adachi</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Aizawa</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Arakawa</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Fukuda</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Hara</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Hashizume</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Hayatsu</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Imotani</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Ishii</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Itoh</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Kagawa</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Kondo</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Konno</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Miyazaki</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Osato</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Ota</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Saito</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Sasaki</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Sato</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Shibata</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Shinagawa</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Shiraki</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Yoshino</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Hayashizaki</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Yasunishi</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2003</pubdate>
            <volume>301</volume>
            <issue>5631</issue>
            <fpage>376</fpage>
            <lpage>379</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1081288</pubid>
                  <pubid idtype="pmpid" link="fulltext">12869764</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B52">
            <title>
               <p>Empirical analysis of transcriptional activity in the Arabidopsis genome</p>
            </title>
            <aug>
               <au>
                  <snm>Yamada</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Lim</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Dale</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Shinn</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Palm</snm>
                  <fnm>CJ</fnm>
               </au>
               <au>
                  <snm>Southwick</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Wu</snm>
                  <fnm>HC</fnm>
               </au>
               <au>
                  <snm>Kim</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Nguyen</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Pham</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Cheuk</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Karlin-Newmann</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>SX</fnm>
               </au>
               <au>
                  <snm>Lam</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Sakano</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Wu</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Yu</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Miranda</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Quach</snm>
                  <fnm>HL</fnm>
               </au>
               <au>
                  <snm>Tripp</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Chang</snm>
                  <fnm>CH</fnm>
               </au>
               <au>
                  <snm>Lee</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Toriumi</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Chan</snm>
                  <fnm>MM</fnm>
               </au>
               <au>
                  <snm>Tang</snm>
                  <fnm>CC</fnm>
               </au>
               <au>
                  <snm>Onodera</snm>
                  <fnm>CS</fnm>
               </au>
               <au>
                  <snm>Deng</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Akiyama</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Ansari</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Arakawa</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Banh</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Banno</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Bowser</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Brooks</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Carninci</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Chao</snm>
                  <fnm>Q</fnm>
               </au>
               <au>
                  <snm>Choy</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Enju</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Goldsmith</snm>
                  <fnm>AD</fnm>
               </au>
               <au>
                  <snm>Gurjal</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Hansen</snm>
                  <fnm>NF</fnm>
               </au>
               <au>
                  <snm>Hayashizaki</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Johnson-Hopson</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Hsuan</snm>
                  <fnm>VW</fnm>
               </au>
               <au>
                  <snm>Iida</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Karnes</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Khan</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Koesema</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Ishida</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Jiang</snm>
                  <fnm>PX</fnm>
               </au>
               <au>
                  <snm>Jones</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Kawai</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Kamiya</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Meyers</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Nakajima</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Narusaka</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Seki</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Sakurai</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Satou</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Tamse</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Vaysberg</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Wallender</snm>
                  <fnm>EK</fnm>
               </au>
               <au>
                  <snm>Wong</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Yamamura</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Yuan</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Shinozaki</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Davis</snm>
                  <fnm>RW</fnm>
               </au>
               <au>
                  <snm>Theologis</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Ecker</snm>
                  <fnm>JR</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2003</pubdate>
            <volume>302</volume>
            <issue>5646</issue>
            <fpage>842</fpage>
            <lpage>846</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1088305</pubid>
                  <pubid idtype="pmpid" link="fulltext">14593172</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B53">
            <title>
               <p>RARGE: a large-scale database of RIKEN Arabidopsis resources ranging from transcriptome to phenome</p>
            </title>
            <aug>
               <au>
                  <snm>Sakurai</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Satou</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Akiyama</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Iida</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Seki</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Kuromori</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Ito</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Konagaya</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Toyoda</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Shinozaki</snm>
                  <fnm>K</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2005</pubdate>
            <volume>33</volume>
            <issue>Database issue</issue>
            <fpage>D647</fpage>
            <lpage>D650</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">539968</pubid>
                  <pubid idtype="pmpid" link="fulltext">15608280</pubid>
                  <pubid idtype="doi">10.1093/nar/gki014</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B54">
            <title>
               <p>KOME</p>
            </title>
            <url>http://cdna01.dna.affrc.go.jp/cDNA/</url>
         </bibl>
         <bibl id="B55">
            <title>
               <p>TAIR</p>
            </title>
            <url>http://www.arabidopsis.org/</url>
         </bibl>
         <bibl id="B56">
            <title>
               <p>Mersenne twister: a 623-dimensionally equidistributed uniform pseudo-random number generator</p>
            </title>
            <aug>
               <au>
                  <snm>Matsumoto</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Nishimura</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>ACM Transactions on Modeling and Computer Simulation</source>
            <pubdate>1998</pubdate>
            <volume>8</volume>
            <fpage>3</fpage>
            <lpage>30</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1145/272991.272995</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B57">
            <title>
               <p>yamHP</p>
            </title>
            <url>http://www.gene.nagoya-u.ac.jp/~obokata-g/yyy/yamHP.html</url>
         </bibl>
         <bibl id="B58">
            <title>
               <p>Cluster analysis and display of genome-wide expression patterns</p>
            </title>
            <aug>
               <au>
                  <snm>Eisen</snm>
                  <fnm>MB</fnm>
               </au>
               <au>
                  <snm>Spellman</snm>
                  <fnm>PT</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>PO</fnm>
               </au>
               <au>
                  <snm>Botstein</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1998</pubdate>
            <volume>95</volume>
            <issue>25</issue>
            <fpage>14863</fpage>
            <lpage>14868</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">24541</pubid>
                  <pubid idtype="pmpid" link="fulltext">9843981</pubid>
                  <pubid idtype="doi">10.1073/pnas.95.25.14863</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B59">
            <title>
               <p>EisenLab</p>
            </title>
            <url>http://rana.lbl.gov/EisenSoftware.htm</url>
         </bibl>
         <bibl id="B60">
            <title>
               <p>PCF1 and PCF2 specifically bind to cis elements in the rice proliferating cell nuclear antigen gene</p>
            </title>
            <aug>
               <au>
                  <snm>Kosugi</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Ohashi</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>Plant Cell</source>
            <pubdate>1997</pubdate>
            <volume>9</volume>
            <issue>9</issue>
            <fpage>1607</fpage>
            <lpage>1619</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">157037</pubid>
                  <pubid idtype="pmpid" link="fulltext">9338963</pubid>
                  <pubid idtype="doi">10.1105/tpc.9.9.1607</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B61">
            <title>
               <p>Direct targeting of light signals to a promoter element-bound transcription factor</p>
            </title>
            <aug>
               <au>
                  <snm>Martinez-Garcia</snm>
                  <fnm>JF</fnm>
               </au>
               <au>
                  <snm>Huq</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Quail</snm>
                  <fnm>PH</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2000</pubdate>
            <volume>288</volume>
            <issue>5467</issue>
            <fpage>859</fpage>
            <lpage>863</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.288.5467.859</pubid>
                  <pubid idtype="pmpid" link="fulltext">10797009</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B62">
            <title>
               <p>Calcium/calmodulin-mediated signal network in plants</p>
            </title>
            <aug>
               <au>
                  <snm>Yang</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Poovaiah</snm>
                  <fnm>BW</fnm>
               </au>
            </aug>
            <source>Trends Plant Sci</source>
            <pubdate>2003</pubdate>
            <volume>8</volume>
            <issue>10</issue>
            <fpage>505</fpage>
            <lpage>512</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.tplants.2003.09.004</pubid>
                  <pubid idtype="pmpid" link="fulltext">14557048</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>
