<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2164-8-378</ui>
   <ji>1471-2164</ji>
   <fm>
      <dochead>Research article</dochead>
      <bibl>
         <title>
            <p>Short sequence motifs, overrepresented in mammalian conserved non-coding sequences</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Minovitsky</snm>
               <fnm>Simon</fnm>
               <insr iid="I1"/>
               <email>sminovitsky@lbl.gov</email>
            </au>
            <au id="A2">
               <snm>Stegmaier</snm>
               <fnm>Philip</fnm>
               <insr iid="I2"/>
               <email>philip.stegmaier@biobase-international.com</email>
            </au>
            <au id="A3">
               <snm>Kel</snm>
               <fnm>Alexander</fnm>
               <insr iid="I2"/>
               <email>alexander.kel@biobase-international.com</email>
            </au>
            <au id="A4">
               <snm>Kondrashov</snm>
               <mi>S</mi>
               <fnm>Alexey</fnm>
               <insr iid="I3"/>
               <email>kondrash@umich.edu</email>
            </au>
            <au id="A5" ca="yes">
               <snm>Dubchak</snm>
               <fnm>Inna</fnm>
               <insr iid="I1"/>
               <insr iid="I4"/>
               <email>ildubchak@lbl.gov</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA</p>
            </ins>
            <ins id="I2">
               <p>BIOBASE GmbH, Halchtersche Strasse 33, D-38304 Wolfenbuettel, Germany</p>
            </ins>
            <ins id="I3">
               <p>Life Sciences Institute and Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48103, USA</p>
            </ins>
            <ins id="I4">
               <p>DOE Joint Genome Institute, Walnut Creek, CA 94598, USA</p>
            </ins>
         </insg>
         <source>BMC Genomics</source>
         <issn>1471-2164</issn>
         <pubdate>2007</pubdate>
         <volume>8</volume>
         <issue>1</issue>
         <fpage>378</fpage>
         <url>http://www.biomedcentral.com/1471-2164/8/378</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">17945028</pubid>
               <pubid idtype="doi">10.1186/1471-2164-8-378</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>21</day>
               <month>2</month>
               <year>2007</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>18</day>
               <month>10</month>
               <year>2007</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>18</day>
               <month>10</month>
               <year>2007</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2007</year>
         <collab>Minovitsky et al; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>A substantial fraction of non-coding DNA sequences of multicellular eukaryotes is under selective constraint. In particular, ~5% of the human genome consists of conserved non-coding sequences (CNSs). CNSs differ from other genomic sequences in their nucleotide composition and must play important functional roles, which mostly remain obscure.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>We investigated relative abundances of short sequence motifs in all human CNSs present in the human/mouse whole-genome alignments <it>vs</it>. three background sets of sequences: (i) weakly conserved or unconserved non-coding sequences (non-CNSs); (ii) near-promoter sequences (located between nucleotides -500 and -1500, relative to the transcription start site); and (iii) random sequences with the same nucleotide composition as that of CNSs. When compared to non-CNSs and near-promoter sequences, CNSs possess an excess of AT-rich motifs, often containing runs of identical nucleotides. In contrast, when compared to random sequences, CNSs contain an excess of GC-rich motifs which, however, lack CpG dinucleotides. Thus, the abundance of short sequence motifs in human CNSs, taken as a whole, is mostly determined by their overall compositional properties and not by overrepresentation of any specific short motifs. These properties are: (i) high AT-content of CNSs, (ii) a tendency, probably due to context-dependent mutation, of A's and T's to clump, (iii) presence of short GC-rich regions, and (iv) avoidance of CpG contexts, due to their hypermutability. Only a small number of the short motifs overrepresented in all human CNSs are similar to binding sites of transcription factors from the FOX family.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>Human CNSs as a whole appear to be too broad a class of sequences to possess strong footprints of any short sequence-specific functions. Such footprints should be studied at the level of functional subclasses of CNSs, such as those which flank genes with a particular pattern of expression. Overall properties of CNSs are affected by patterns in mutation, suggesting that selection which causes their conservation is not always very strong.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>Genomes of multicellular eukaryotes mostly consist of DNA segments which do not encode proteins. Still, a sizeable fraction of such non-coding DNA is subject to selective constraint and, thus, is conserved between species. Typically, a long intergenic region consists of alternating segments with high and low rates of evolution <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. A variety of terms have been used to refer to slowly-evolving segments <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr></abbrgrp>; here we will call them CNSs (conserved non-coding sequences).</p>
         <p>A majority of mutations in segments which evolve at high rates are presumably selectively neutral or nearly neutral. In contrast, a large fraction of mutations within CNSs must be deleterious enough to be removed by negative selection. Indeed, data on within-population genetic variability indicate that the slow evolution of CNSs is due to negative selection, and not to a locally reduced mutation rate <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>. In multicellular eukaryotes with compact genomes, such as <it>Drosophila melanogaster</it>, a majority of mutations affecting non-coding sequences may be removed by selection <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr></abbrgrp>. For large-genome organisms, such as mammals, the fraction of selectively constrained non-coding sequences is probably between 3% <abbrgrp><abbr bid="B7">7</abbr></abbrgrp> and ~10% <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>.</p>
         <p>Obviously, CNSs must perform important biological functions, but the full range and nature of these functions remain unknown <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>. Still, many CNSs are certainly involved in regulation of transcription, and harbor binding sites of a variety of transcription factors <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>. Thus, we can expect some short sequence motifs to be overrepresented in at least some kinds of CNSs, as is the case for proximal promoters <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>. Indeed, analyses of samples from human CNSs demonstrated overrepresentation of some short sequence motifs <abbrgrp><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr></abbrgrp>.</p>
         <p>New, powerful methods of detecting overrepresented motifs [<it>e.g</it>., <abbrgrp><abbr bid="B14">14</abbr><abbr bid="B15">15</abbr></abbrgrp>] make it possible to analyze the small-scale composition of mammalian CNSs at the genomic level. Such analysis has the potential to reveal short sequence-specific function(s) common to all human CNSs. Here, we report the results of applying the discriminating matrix enumerator (DME) <abbrgrp><abbr bid="B14">14</abbr></abbrgrp> to all strong human CNSs.</p>
      </sec>
      <sec>
         <st>
            <p>Results</p>
         </st>
         <p>We studied the representation of short sequence motifs in all human CNSs against three backgrounds: unconserved or only weakly conserved segments of intergenic regions (non-CNSs), near-promoter non-coding sequences, and randomized sequences with the same nucleotide composition as that of CNSs. CNSs are relatively AT-rich <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>: frequencies of nucleotides A, T, G, and C are 30.7%, 30.7%, 19.3%, and 19.3% in CNSs, 26.3%, 26.4%, 23.6%, and 23.7% in non-CNSs, and 23.7%, 23.7%, 26.3%, and 26.3% in near-promoter sequences. Dinucleotide compositions also differed substantially among the sequence classes (Fig. <figr fid="F1">1</figr>).</p>
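         <p>For illustration only, nucleotide and dinucleotide percentages of the kind reported above could be computed along the following lines. This is a minimal Python sketch and not the pipeline used in this study: the function name composition, the choice to skip windows containing ambiguous symbols, and the toy sequence are all assumptions made for the example.</p>

```python
from collections import Counter


def composition(seq, k=1):
    """Return percentage frequencies of overlapping k-mers made only of A, C, G, T."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Assumption: windows with ambiguous symbols such as N are skipped.
        if set(kmer).issubset("ACGT"):
            counts[kmer] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    # Convert raw counts to percentages of all counted k-mers.
    return {kmer: 100.0 * n / total for kmer, n in counts.items()}


toy = "ATTAATTACGAT"  # hypothetical toy sequence, not data from this study
mono = composition(toy, k=1)  # single-nucleotide percentages
di = composition(toy, k=2)    # dinucleotide percentages
```

         <p>In such a sketch, each class of sequences (CNSs, non-CNSs, near-promoter sequences) would be concatenated or pooled before the call, so that the percentages describe the class as a whole.</p>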
         <fig id="F1">
            <title>
               <p>Figure 1</p>
            </title>
            <caption>
               <p>Percentages of dinucleotide frequencies, in CNSs (red), non-CNSs (green), near-promoters (blue), and random sequences (black)</p>
            </caption>
            <text>
               <p>Percentages of dinucleotide frequencies, in CNSs (red), non-CNSs (green), near-promoters (blue), and random sequences (black).</p>
            </text>
            <graphic file="1471-2164-8-378-1"/>
         </fig>
         <p>CNSs from human chromosomes with odd and even numbers were analyzed separately to check the results for consistency. The overall lengths of CNSs were 27,112,333 nucleotides on odd chromosomes and 24,962,379 nucleotides on even chromosomes. Tables <tblr tid="T1">1</tblr>, <tblr tid="T2">2</tblr>, and <tblr tid="T3">3</tblr> list the top 30 motifs overrepresented within CNSs relative to these three backgrounds. Overrepresentation was calculated as the ratio of the number of occurrences of a motif within CNSs, normalized to their overall length, to the similarly normalized number of occurrences of the motif within the background sequences.</p>
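         <p>The overrepresentation ratio defined above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the DME implementation: motifs are read as standard IUPAC degenerate strings, overlapping occurrences are counted on the forward strand only, and the function names and toy sequences are hypothetical.</p>

```python
import re

# Standard IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def motif_regex(motif):
    """Compile a degenerate motif into a regex that also finds overlapping hits."""
    body = "".join("[" + IUPAC[ch] + "]" for ch in motif.upper())
    # A lookahead consumes no characters, so hits are allowed to overlap.
    return re.compile("(?=" + body + ")")


def count_occurrences(motif, sequences):
    """Count forward-strand occurrences of a motif (assumption: one strand only)."""
    pattern = motif_regex(motif)
    return sum(len(pattern.findall(seq.upper())) for seq in sequences)


def overrepresentation(motif, foreground, background):
    """Ratio of length-normalized motif counts, foreground over background."""
    fg_rate = count_occurrences(motif, foreground) / sum(len(s) for s in foreground)
    bg_rate = count_occurrences(motif, background) / sum(len(s) for s in background)
    if bg_rate == 0:
        # The motif is absent from the background, so the ratio is unbounded.
        return float("inf")
    return fg_rate / bg_rate


# Hypothetical toy sequences, not data from this study.
foreground = ["GCTAATTAGG", "CCTAATTA"]
background = ["GCTAATTAGGACGTACGTAC"]
ratio = overrepresentation("SYTAATTA", foreground, background)
```

         <p>Normalizing each count by the total length of its sequence set makes the ratio comparable between sets of very different sizes, which is needed here because the CNS and background sets differ in overall length.</p>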
         <tbl id="T1">
            <title>
               <p>Table 1</p>
            </title>
            <caption>
               <p>Motifs overrepresented in CNSs over non-CNSs</p>
            </caption>
            <tblbdy cols="6">
               <r>
                  <c cspan="3" ca="center">
                     <p>Odd Chromosomes</p>
                  </c>
                  <c cspan="3" ca="center">
                     <p>Even Chromosomes</p>
                  </c>
               </r>
               <r>
                  <c cspan="6">
                     <hr/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Motif</p>
                  </c>
                  <c ca="left">
                     <p>Number of occurrences</p>
                  </c>
                  <c ca="left">
                     <p>Overrepresentation</p>
                  </c>
                  <c ca="left">
                     <p>Motif</p>
                  </c>
                  <c ca="left">
                     <p>Number of occurrences</p>
                  </c>
                  <c ca="left">
                     <p>Overrepresentation</p>
                  </c>
               </r>
               <r>
                  <c cspan="6">
                     <hr/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>SYTAATTA</p>
                  </c>
                  <c ca="right">
                     <p>10620</p>
                  </c>
                  <c ca="right">
                     <p>3.45</p>
                  </c>
                  <c ca="left">
                     <p>TTAATTAV</p>
                  </c>
                  <c ca="right">
                     <p>12637</p>
                  </c>
                  <c ca="right">
                     <p>3.72</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>CTRATTAS</p>
                  </c>
                  <c ca="right">
                     <p>6152</p>
                  </c>
                  <c ca="right">
                     <p>3.14</p>
                  </c>
                  <c ca="left">
                     <p>TAATTRCW</p>
                  </c>
                  <c ca="right">
                     <p>12019</p>
                  </c>
                  <c ca="right">
                     <p>3.43</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>WGYAATTA</p>
                  </c>
                  <c ca="right">
                     <p>12596</p>
                  </c>
                  <c ca="right">
                     <p>3.09</p>
                  </c>
                  <c ca="left">
                     <p>GYAATTAS</p>
                  </c>
                  <c ca="right">
                     <p>6142</p>
                  </c>
                  <c ca="right">
                     <p>3.39</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>TTAATTAV</p>
                  </c>
                  <c ca="right">
                     <p>13141</p>
                  </c>
                  <c ca="right">
                     <p>3.08</p>
                  </c>
                  <c ca="left">
                     <p>TTTAATBA</p>
                  </c>
                  <c ca="right">
                     <p>15060</p>
                  </c>
                  <c ca="right">
                     <p>3.14</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>STAATTGV</p>
                  </c>
                  <c ca="right">
                     <p>8267</p>
                  </c>
                  <c ca="right">
                     <p>2.89</p>
                  </c>
                  <c ca="left">
                     <p>ATTAATBA</p>
                  </c>
                  <c ca="right">
                     <p>10910</p>
                  </c>
                  <c ca="right">
                     <p>3.07</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>VWGCTAAT</p>
                  </c>
                  <c ca="right">
                     <p>10503</p>
                  </c>
                  <c ca="right">
                     <p>2.84</p>
                  </c>
                  <c ca="left">
                     <p>TAATTWGM</p>
                  </c>
                  <c ca="right">
                     <p>10885</p>
                  </c>
                  <c ca="right">
                     <p>3.04</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>TTTAATBA</p>
                  </c>
                  <c ca="right">
                     <p>15800</p>
                  </c>
                  <c ca="right">
                     <p>2.77</p>
                  </c>
                  <c ca="left">
                     <p>GMWTAATT</p>
                  </c>
                  <c ca="right">
                     <p>9941</p>
                  </c>
                  <c ca="right">
                     <p>2.97</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>GMWTAATT</p>
                  </c>
                  <c ca="right">
                     <p>10290</p>
                  </c>
                  <c ca="right">
                     <p>2.72</p>
                  </c>
                  <c ca="left">
                     <p>CWTAATKA</p>
                  </c>
                  <c ca="right">
                     <p>10028</p>
                  </c>
                  <c ca="right">
                     <p>2.94</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>TAATTATV</p>
                  </c>
                  <c ca="right">
                     <p>10100</p>
                  </c>
                  <c ca="right">
                     <p>2.72</p>
                  </c>
                  <c ca="left">
                     <p>ATTAAWTT</p>
                  </c>
                  <c ca="right">
                     <p>11570</p>
                  </c>
                  <c ca="right">
                     <p>2.85</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>STTAATKG</p>
                  </c>
                  <c ca="right">
                     <p>5905</p>
                  </c>
                  <c ca="right">
                     <p>2.71</p>
                  </c>
                  <c ca="left">
                     <p>TTAATBAT</p>
                  </c>
                  <c ca="right">
                     <p>10115</p>
                  </c>
                  <c ca="right">
                     <p>2.79</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>ATTVAATT</p>
                  </c>
                  <c ca="right">
                     <p>12177</p>
                  </c>
                  <c ca="right">
                     <p>2.68</p>
                  </c>
                  <c ca="left">
                     <p>CWKTAATT</p>
                  </c>
                  <c ca="right">
                     <p>13079</p>
                  </c>
                  <c ca="right">
                     <p>2.75</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>ATTAATBA</p>
                  </c>
                  <c ca="right">
                     <p>11006</p>
                  </c>
                  <c ca="right">
                     <p>2.61</p>
                  </c>
                  <c ca="left">
                     <p>VWGCTAAT</p>
                  </c>
                  <c ca="right">
                     <p>9823</p>
                  </c>
                  <c ca="right">
                     <p>2.71</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>CWKTAATT</p>
                  </c>
                  <c ca="right">
                     <p>13577</p>
                  </c>
                  <c ca="right">
                     <p>2.59</p>
                  </c>
                  <c ca="left">
                     <p>CMATWAAT</p>
                  </c>
                  <c ca="right">
                     <p>10129</p>
                  </c>
                  <c ca="right">
                     <p>2.65</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>ATAATTAV</p>
                  </c>
                  <c ca="right">
                     <p>10536</p>
                  </c>
                  <c ca="right">
                     <p>2.58</p>
                  </c>
                  <c ca="left">
                     <p>ATTTVATT</p>
                  </c>
                  <c ca="right">
                     <p>15715</p>
                  </c>
                  <c ca="right">
                     <p>2.64</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>SMAATTAA</p>
                  </c>
                  <c ca="right">
                     <p>12754</p>
                  </c>
                  <c ca="right">
                     <p>2.57</p>
                  </c>
                  <c ca="left">
                     <p>CAATTRCH</p>
                  </c>
                  <c ca="right">
                     <p>8188</p>
                  </c>
                  <c ca="right">
                     <p>2.61</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>SBTAATGA</p>
                  </c>
                  <c ca="right">
                     <p>8828</p>
                  </c>
                  <c ca="right">
                     <p>2.56</p>
                  </c>
                  <c ca="left">
                     <p>MCWAATTA</p>
                  </c>
                  <c ca="right">
                     <p>9605</p>
                  </c>
                  <c ca="right">
                     <p>2.61</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>VATTWGCA</p>
                  </c>
                  <c ca="right">
                     <p>14265</p>
                  </c>
                  <c ca="right">
                     <p>2.53</p>
                  </c>
                  <c ca="left">
                     <p>ATTWWGCA</p>
                  </c>
                  <c ca="right">
                     <p>9959</p>
                  </c>
                  <c ca="right">
                     <p>2.61</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>TWAATCAR</p>
                  </c>
                  <c ca="right">
                     <p>10639</p>
                  </c>
                  <c ca="right">
                     <p>2.52</p>
                  </c>
                  <c ca="left">
                     <p>GKTAATTW</p>
                  </c>
                  <c ca="right">
                     <p>9019</p>
                  </c>
                  <c ca="right">
                     <p>2.59</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>AATTAVTT</p>
                  </c>
                  <c ca="right">
                     <p>12668</p>
                  </c>
                  <c ca="right">
                     <p>2.51</p>
                  </c>
                  <c ca="left">
                     <p>AATTAMCW</p>
                  </c>
                  <c ca="right">
                     <p>10053</p>
                  </c>
                  <c ca="right">
                     <p>2.58</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>GTAATTMM</p>
                  </c>
                  <c ca="right">
                     <p>7484</p>
                  </c>
                  <c ca="right">
                     <p>2.49</p>
                  </c>
                  <c ca="left">
                     <p>MATTDGCA</p>
                  </c>
                  <c ca="right">
                     <p>13694</p>
                  </c>
                  <c ca="right">
                     <p>2.58</p>
                  </c>
               </r>
            </tblbdy>
         </tbl>
         <tbl id="T2">
            <title>
               <p>Table 2</p>
            </title>
            <caption>
               <p>Motifs overrepresented in CNSs over near-promoter sequences</p>
            </caption>
            <tblbdy cols="6">
               <r>
                  <c cspan="3" ca="center">
                     <p>Odd Chromosomes</p>
                  </c>
                  <c cspan="3" ca="center">
                     <p>Even Chromosomes</p>
                  </c>
               </r>
               <r>
                  <c cspan="6">
                     <hr/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Motif</p>
                  </c>
                  <c ca="left">
                     <p>Number of occurrences</p>
                  </c>
                  <c ca="left">
                     <p>Overrepresentation</p>
                  </c>
                  <c ca="left">
                     <p>Motif</p>
                  </c>
                  <c ca="left">
                     <p>Number of occurrences</p>
                  </c>
                  <c ca="left">
                     <p>Overrepresentation</p>
                  </c>
               </r>
               <r>
                  <c cspan="6">
                     <hr/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>STAATTAS</p>
                  </c>
                  <c ca="right">
                     <p>7576</p>
                  </c>
                  <c ca="right">
                     <p>4.55</p>
                  </c>
                  <c ca="left">
                     <p>SYTAATTA</p>
                  </c>
                  <c ca="right">
                     <p>9852</p>
                  </c>
                  <c ca="right">
                     <p>4.26</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>TTAATKAR</p>
                  </c>
                  <c ca="right">
                     <p>17516</p>
                  </c>
                  <c ca="right">
                     <p>4.33</p>
                  </c>
                  <c ca="left">
                     <p>TTAATTAD</p>
                  </c>
                  <c ca="right">
                     <p>14561</p>
                  </c>
                  <c ca="right">
                     <p>4.07</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>GBTAATKA</p>
                  </c>
                  <c ca="right">
                     <p>12299</p>
                  </c>
                  <c ca="right">
                     <p>3.96</p>
                  </c>
                  <c ca="left">
                     <p>CTRATTAS</p>
                  </c>
                  <c ca="right">
                     <p>5744</p>
                  </c>
                  <c ca="right">
                     <p>3.90</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>VTAATTGM</p>
                  </c>
                  <c ca="right">
                     <p>10174</p>
                  </c>
                  <c ca="right">
                     <p>3.91</p>
                  </c>
                  <c ca="left">
                     <p>ATTAATGN</p>
                  </c>
                  <c ca="right">
                     <p>9762</p>
                  </c>
                  <c ca="right">
                     <p>3.74</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>TTTMATKA</p>
                  </c>
                  <c ca="right">
                     <p>19449</p>
                  </c>
                  <c ca="right">
                     <p>3.86</p>
                  </c>
                  <c ca="left">
                     <p>TAATTATD</p>
                  </c>
                  <c ca="right">
                     <p>11760</p>
                  </c>
                  <c ca="right">
                     <p>3.73</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>MTTMATTA</p>
                  </c>
                  <c ca="right">
                     <p>13688</p>
                  </c>
                  <c ca="right">
                     <p>3.82</p>
                  </c>
                  <c ca="left">
                     <p>TTTAATDA</p>
                  </c>
                  <c ca="right">
                     <p>16633</p>
                  </c>
                  <c ca="right">
                     <p>3.66</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>AATKYAAT</p>
                  </c>
                  <c ca="right">
                     <p>15204</p>
                  </c>
                  <c ca="right">
                     <p>3.73</p>
                  </c>
                  <c ca="left">
                     <p>ATAATTAB</p>
                  </c>
                  <c ca="right">
                     <p>9233</p>
                  </c>
                  <c ca="right">
                     <p>3.62</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>TTAATKGV</p>
                  </c>
                  <c ca="right">
                     <p>12925</p>
                  </c>
                  <c ca="right">
                     <p>3.72</p>
                  </c>
                  <c ca="left">
                     <p>TAATKSAA</p>
                  </c>
                  <c ca="right">
                     <p>10418</p>
                  </c>
                  <c ca="right">
                     <p>3.59</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>RTAATKAA</p>
                  </c>
                  <c ca="right">
                     <p>13613</p>
                  </c>
                  <c ca="right">
                     <p>3.68</p>
                  </c>
                  <c ca="left">
                     <p>STAATTGV</p>
                  </c>
                  <c ca="right">
                     <p>7823</p>
                  </c>
                  <c ca="right">
                     <p>3.55</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>MMTAATTA</p>
                  </c>
                  <c ca="right">
                     <p>12518</p>
                  </c>
                  <c ca="right">
                     <p>3.68</p>
                  </c>
                  <c ca="left">
                     <p>GYAATWAA</p>
                  </c>
                  <c ca="right">
                     <p>10608</p>
                  </c>
                  <c ca="right">
                     <p>3.55</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>TSTAATTW</p>
                  </c>
                  <c ca="right">
                     <p>14964</p>
                  </c>
                  <c ca="right">
                     <p>3.49</p>
                  </c>
                  <c ca="left">
                     <p>TGYAATTW</p>
                  </c>
                  <c ca="right">
                     <p>13322</p>
                  </c>
                  <c ca="right">
                     <p>3.51</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>AATKMATT</p>
                  </c>
                  <c ca="right">
                     <p>18824</p>
                  </c>
                  <c ca="right">
                     <p>3.48</p>
                  </c>
                  <c ca="left">
                     <p>AATGMWTT</p>
                  </c>
                  <c ca="right">
                     <p>15412</p>
                  </c>
                  <c ca="right">
                     <p>3.49</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>TGATWAAW</p>
                  </c>
                  <c ca="right">
                     <p>12898</p>
                  </c>
                  <c ca="right">
                     <p>3.46</p>
                  </c>
                  <c ca="left">
                     <p>AGYAATTW</p>
                  </c>
                  <c ca="right">
                     <p>12585</p>
                  </c>
                  <c ca="right">
                     <p>3.41</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>KATAATKA</p>
                  </c>
                  <c ca="right">
                     <p>10739</p>
                  </c>
                  <c ca="right">
                     <p>3.46</p>
                  </c>
                  <c ca="left">
                     <p>AATTDATT</p>
                  </c>
                  <c ca="right">
                     <p>14693</p>
                  </c>
                  <c ca="right">
                     <p>3.39</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>CATTAAKV</p>
                  </c>
                  <c ca="right">
                     <p>10838</p>
                  </c>
                  <c ca="right">
                     <p>3.42</p>
                  </c>
                  <c ca="left">
                     <p>AATTATAD</p>
                  </c>
                  <c ca="right">
                     <p>10379</p>
                  </c>
                  <c ca="right">
                     <p>3.36</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>CATWAWTT</p>
                  </c>
                  <c ca="right">
                     <p>14599</p>
                  </c>
                  <c ca="right">
                     <p>3.39</p>
                  </c>
                  <c ca="left">
                     <p>TWAATTGR</p>
                  </c>
                  <c ca="right">
                     <p>8896</p>
                  </c>
                  <c ca="right">
                     <p>3.35</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>CATTWAAW</p>
                  </c>
                  <c ca="right">
                     <p>19325</p>
                  </c>
                  <c ca="right">
                     <p>3.37</p>
                  </c>
                  <c ca="left">
                     <p>AWTARCAT</p>
                  </c>
                  <c ca="right">
                     <p>9601</p>
                  </c>
                  <c ca="right">
                     <p>3.35</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>CAATTAKV</p>
                  </c>
                  <c ca="right">
                     <p>9515</p>
                  </c>
                  <c ca="right">
                     <p>3.33</p>
                  </c>
                  <c ca="left">
                     <p>TAATTHAT</p>
                  </c>
                  <c ca="right">
                     <p>12789</p>
                  </c>
                  <c ca="right">
                     <p>3.34</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>ATRATTYA</p>
                  </c>
                  <c ca="right">
                     <p>13356</p>
                  </c>
                  <c ca="right">
                     <p>3.30</p>
                  </c>
                  <c ca="left">
                     <p>CWTTAATR</p>
                  </c>
                  <c ca="right">
                     <p>9114</p>
                  </c>
                  <c ca="right">
                     <p>3.32</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>ATTTYMAT</p>
                  </c>
                  <c ca="right">
                     <p>20983</p>
                  </c>
                  <c ca="right">
                     <p>3.29</p>
                  </c>
                  <c ca="left">
                     <p>ATTSMATT</p>
                  </c>
                  <c ca="right">
                     <p>11547</p>
                  </c>
                  <c ca="right">
                     <p>3.27</p>
                  </c>
               </r>
            </tblbdy>
         </tbl>
         <tbl id="T3">
            <title>
               <p>Table 3</p>
            </title>
            <caption>
               <p>Motifs overrepresented in CNSs over randomized sequences</p>
            </caption>
            <tblbdy cols="6">
               <r>
                  <c cspan="3" ca="center">
                     <p>Odd Chromosomes</p>
                  </c>
                  <c cspan="3" ca="center">
                     <p>Even Chromosomes</p>
                  </c>
               </r>
               <r>
                  <c cspan="6">
                     <hr/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Motif</p>
                  </c>
                  <c ca="left">
                     <p>Number of occurrences</p>
                  </c>
                  <c ca="left">
                     <p>Overrepresentation</p>
                  </c>
                  <c ca="left">
                     <p>Motif</p>
                  </c>
                  <c ca="left">
                     <p>Number of occurrences</p>
                  </c>
                  <c ca="left">
                     <p>Overrepresentation</p>
                  </c>
               </r>
               <r>
                  <c cspan="6">
                     <hr/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>CWGSCWGS</p>
                  </c>
                  <c ca="right">
                     <p>32472</p>
                  </c>
                  <c ca="right">
                     <p>7.50</p>
                  </c>
                  <c ca="left">
                     <p>CWGSCWGV</p>
                  </c>
                  <c ca="right">
                     <p>38927</p>
                  </c>
                  <c ca="right">
                     <p>5.78</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>SCCHGSCH</p>
                  </c>
                  <c ca="right">
                     <p>42207</p>
                  </c>
                  <c ca="right">
                     <p>5.68</p>
                  </c>
                  <c ca="left">
                     <p>SCCWGGSN</p>
                  </c>
                  <c ca="right">
                     <p>33122</p>
                  </c>
                  <c ca="right">
                     <p>5.63</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>GGSWGGSN</p>
                  </c>
                  <c ca="right">
                     <p>39555</p>
                  </c>
                  <c ca="right">
                     <p>5.55</p>
                  </c>
                  <c ca="left">
                     <p>CYCWSCCH</p>
                  </c>
                  <c ca="right">
                     <p>33976</p>
                  </c>
                  <c ca="right">
                     <p>5.50</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>CWGSCCWS</p>
                  </c>
                  <c ca="right">
                     <p>24103</p>
                  </c>
                  <c ca="right">
                     <p>5.52</p>
                  </c>
                  <c ca="left">
                     <p>RGCWGSCH</p>
                  </c>
                  <c ca="right">
                     <p>30738</p>
                  </c>
                  <c ca="right">
                     <p>4.95</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>RGTCCTBY</p>
                  </c>
                  <c ca="right">
                     <p>22100</p>
                  </c>
                  <c ca="right">
                     <p>5.45</p>
                  </c>
                  <c ca="left">
                     <p>GGSDGRGV</p>
                  </c>
                  <c ca="right">
                     <p>34873</p>
                  </c>
                  <c ca="right">
                     <p>4.93</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>GRGSWGRG</p>
                  </c>
                  <c ca="right">
                     <p>25293</p>
                  </c>
                  <c ca="right">
                     <p>5.36</p>
                  </c>
                  <c ca="left">
                     <p>CWGSCYCH</p>
                  </c>
                  <c ca="right">
                     <p>29902</p>
                  </c>
                  <c ca="right">
                     <p>4.78</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>CCYYYCCH</p>
                  </c>
                  <c ca="right">
                     <p>40727</p>
                  </c>
                  <c ca="right">
                     <p>5.22</p>
                  </c>
                  <c ca="left">
                     <p>CWSCWGGV</p>
                  </c>
                  <c ca="right">
                     <p>31840</p>
                  </c>
                  <c ca="right">
                     <p>4.73</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>SCCWGGRV</p>
                  </c>
                  <c ca="right">
                     <p>33839</p>
                  </c>
                  <c ca="right">
                     <p>5.20</p>
                  </c>
                  <c ca="left">
                     <p>SCWGCWGV</p>
                  </c>
                  <c ca="right">
                     <p>30968</p>
                  </c>
                  <c ca="right">
                     <p>4.71</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>CWGSCYCH</p>
                  </c>
                  <c ca="right">
                     <p>36409</p>
                  </c>
                  <c ca="right">
                     <p>5.04</p>
                  </c>
                  <c ca="left">
                     <p>CWGGGRRV</p>
                  </c>
                  <c ca="right">
                     <p>31866</p>
                  </c>
                  <c ca="right">
                     <p>4.64</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>SCWGGGSN</p>
                  </c>
                  <c ca="right">
                     <p>36038</p>
                  </c>
                  <c ca="right">
                     <p>5.03</p>
                  </c>
                  <c ca="left">
                     <p>CWGRGSCH</p>
                  </c>
                  <c ca="right">
                     <p>28886</p>
                  </c>
                  <c ca="right">
                     <p>4.61</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>SCHGSCCH</p>
                  </c>
                  <c ca="right">
                     <p>36013</p>
                  </c>
                  <c ca="right">
                     <p>4.91</p>
                  </c>
                  <c ca="left">
                     <p>CCWGGRRV</p>
                  </c>
                  <c ca="right">
                     <p>31578</p>
                  </c>
                  <c ca="right">
                     <p>4.61</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>CWGRGSCH</p>
                  </c>
                  <c ca="right">
                     <p>35318</p>
                  </c>
                  <c ca="right">
                     <p>4.77</p>
                  </c>
                  <c ca="left">
                     <p>SCHGGSCH</p>
                  </c>
                  <c ca="right">
                     <p>28689</p>
                  </c>
                  <c ca="right">
                     <p>4.50</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>SCYCWGCH</p>
                  </c>
                  <c ca="right">
                     <p>34141</p>
                  </c>
                  <c ca="right">
                     <p>4.56</p>
                  </c>
                  <c ca="left">
                     <p>GGRARGRR</p>
                  </c>
                  <c ca="right">
                     <p>29240</p>
                  </c>
                  <c ca="right">
                     <p>4.47</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>NCAGCTGN</p>
                  </c>
                  <c ca="right">
                     <p>32928</p>
                  </c>
                  <c ca="right">
                     <p>4.52</p>
                  </c>
                  <c ca="left">
                     <p>RRGGCWGV</p>
                  </c>
                  <c ca="right">
                     <p>30772</p>
                  </c>
                  <c ca="right">
                     <p>4.44</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>CAGCTGNN</p>
                  </c>
                  <c ca="right">
                     <p>32867</p>
                  </c>
                  <c ca="right">
                     <p>4.51</p>
                  </c>
                  <c ca="left">
                     <p>RGGGRARR</p>
                  </c>
                  <c ca="right">
                     <p>29828</p>
                  </c>
                  <c ca="right">
                     <p>4.41</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>TWACWGAA</p>
                  </c>
                  <c ca="right">
                     <p>14781</p>
                  </c>
                  <c ca="right">
                     <p>4.48</p>
                  </c>
                  <c ca="left">
                     <p>GVWGGGRR</p>
                  </c>
                  <c ca="right">
                     <p>31019</p>
                  </c>
                  <c ca="right">
                     <p>4.37</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>RGGGRRAR</p>
                  </c>
                  <c ca="right">
                     <p>32929</p>
                  </c>
                  <c ca="right">
                     <p>4.42</p>
                  </c>
                  <c ca="left">
                     <p>CYCYVSCC</p>
                  </c>
                  <c ca="right">
                     <p>19097</p>
                  </c>
                  <c ca="right">
                     <p>4.37</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>CWGSAGSY</p>
                  </c>
                  <c ca="right">
                     <p>24140</p>
                  </c>
                  <c ca="right">
                     <p>4.37</p>
                  </c>
                  <c ca="left">
                     <p>KCCWSCCH</p>
                  </c>
                  <c ca="right">
                     <p>26417</p>
                  </c>
                  <c ca="right">
                     <p>4.33</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>SCWGGRAR</p>
                  </c>
                  <c ca="right">
                     <p>32065</p>
                  </c>
                  <c ca="right">
                     <p>4.37</p>
                  </c>
                  <c ca="left">
                     <p>CAGCYSNG</p>
                  </c>
                  <c ca="right">
                     <p>16617</p>
                  </c>
                  <c ca="right">
                     <p>4.28</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>GGARRGRR</p>
                  </c>
                  <c ca="right">
                     <p>33390</p>
                  </c>
                  <c ca="right">
                     <p>4.37</p>
                  </c>
                  <c ca="left">
                     <p>KKGGCWGV</p>
                  </c>
                  <c ca="right">
                     <p>28051</p>
                  </c>
                  <c ca="right">
                     <p>4.13</p>
                  </c>
               </r>
            </tblbdy>
         </tbl>
         <p>To assess whether the overrepresented CNS motifs resemble known transcription factor (TF) binding sites, we applied our recently developed method m2transfac <abbrgrp><abbr bid="B16">16</abbr></abbrgrp> and compared all motifs found at the previous step with the TRANSFAC library of positional weight matrices (PWMs). Relatively few matches between the motifs and the TF matrices were found. Of the 12000 motifs reported at the previous step as overrepresented in CNSs relative to the three different backgrounds, only 20 matched TF matrices with E-values lower than 0.001 and satisfied the factor class-specific cut-offs (Table <tblr tid="T4">4</tblr>). Most of these matches involved matrices for factors of the "Forkhead DNA-binding domain" class, especially the FOX family, which were found repeatedly against two rather different backgrounds: non-CNSs and randomized sequences. Among the motifs found against the background of near-promoter sequences, only one matched a PWM.</p>
         <tbl id="T4">
            <title>
               <p>Table 4</p>
            </title>
            <caption>
               <p>Motifs found matching transcription factor PWMs from TRANSFAC</p>
            </caption>
            <tblbdy cols="5">
               <r>
                  <c ca="left">
                     <p>
                        <b>Accession</b>
                     </p>
                  </c>
                  <c ca="left">
                     <p>
                        <b>Consensus/ID</b>
                     </p>
                  </c>
                  <c ca="left">
                     <p>
                        <b>Factor class</b>
                     </p>
                  </c>
                  <c ca="left">
                     <p>
                        <b>Taxon</b>
                     </p>
                  </c>
                  <c ca="left">
                     <p>
                        <b>Binding factors</b>
                     </p>
                  </c>
               </r>
               <r>
                  <c cspan="5">
                     <hr/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <b>acns even</b>
                     </p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>DME280</p>
                  </c>
                  <c ca="left">
                     <p>ATAAACAN</p>
                  </c>
                  <c ca="left">
                     <p>Forkhead DNA-binding domain</p>
                  </c>
                  <c ca="left">
                     <p>Vertebrate</p>
                  </c>
                  <c ca="left">
                     <p>FOXI1a,FOXF1,FOXL1,FOXO4</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>DME424</p>
                  </c>
                  <c ca="left">
                     <p>WGTAAAYA</p>
                  </c>
                  <c ca="left">
                     <p>Forkhead DNA-binding domain</p>
                  </c>
                  <c ca="left">
                     <p>Vertebrate</p>
                  </c>
                  <c ca="left">
                     <p>FOXC1,FOXA4a,HNF-3beta</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>DME768</p>
                  </c>
                  <c ca="left">
                     <p>WTGTCATV</p>
                  </c>
                  <c ca="left">
                     <p>Basic region + leucine zipper (bZIP)</p>
                  </c>
                  <c ca="left">
                     <p>Nematode</p>
                  </c>
                  <c ca="left">
                     <p>Skn-1</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>DME1427</p>
                  </c>
                  <c ca="left">
                     <p>WGTCATSM</p>
                  </c>
                  <c ca="left">
                     <p>Basic region + leucine zipper (bZIP)</p>
                  </c>
                  <c ca="left">
                     <p>Nematode</p>
                  </c>
                  <c ca="left">
                     <p>Skn-1</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <b>acns odd</b>
                     </p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>DME27</p>
                  </c>
                  <c ca="left">
                     <p>VATTWGCA</p>
                  </c>
                  <c ca="left">
                     <p>POU</p>
                  </c>
                  <c ca="left">
                     <p>Vertebrate</p>
                  </c>
                  <c ca="left">
                     <p>POU2F1</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>DME349</p>
                  </c>
                  <c ca="left">
                     <p>ATAAACAN</p>
                  </c>
                  <c ca="left">
                     <p>Forkhead DNA-binding domain</p>
                  </c>
                  <c ca="left">
                     <p>Vertebrate</p>
                  </c>
                  <c ca="left">
                     <p>FOXI1a,FOXF1,FOXL1,FOXO4</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>DME1014</p>
                  </c>
                  <c ca="left">
                     <p>GTMAACAD</p>
                  </c>
                  <c ca="left">
                     <p>Forkhead DNA-binding domain</p>
                  </c>
                  <c ca="left">
                     <p>Vertebrate</p>
                  </c>
                  <c ca="left">
                     <p>FOXD1,HNF-3beta,FOXO1a</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>DME1700</p>
                  </c>
                  <c ca="left">
                     <p>CCAATMAB</p>
                  </c>
                  <c ca="left">
                     <p>DNA-binding domain with Histone fold</p>
                  </c>
                  <c ca="left">
                     <p>Fungal</p>
                  </c>
                  <c ca="left">
                     <p>HAP2,HAP3,HAP4</p>
                  </c>
               </r>
               <r>
                  <c cspan="2" ca="left">
                     <p>
                        <b>promoters even</b>
                     </p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
               </r>
               <r>
                  <c cspan="2" ca="left">
                     <p>
                        <b>promoters odd</b>
                     </p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>DME1268</p>
                  </c>
                  <c ca="left">
                     <p>STGASTYA</p>
                  </c>
                  <c ca="left">
                     <p>Basic region + leucine zipper (bZIP)</p>
                  </c>
                  <c ca="left">
                     <p>Vertebrate</p>
                  </c>
                  <c ca="left">
                     <p>NF-E2,AP-1</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <b>random even</b>
                     </p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>DME90</p>
                  </c>
                  <c ca="left">
                     <p>VCAGATGN</p>
                  </c>
                  <c ca="left">
                     <p>Basic region + helix-loop-helix motif</p>
                  </c>
                  <c ca="left">
                     <p>Vertebrate</p>
                  </c>
                  <c ca="left">
                     <p>ITF-2,Tal-1beta</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>DME94</p>
                  </c>
                  <c ca="left">
                     <p>CATCTGBN</p>
                  </c>
                  <c ca="left">
                     <p>Basic region + helix-loop-helix motif</p>
                  </c>
                  <c ca="left">
                     <p>Vertebrate</p>
                  </c>
                  <c ca="left">
                     <p>ITF-2,Tal-1beta,E47</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>DME765</p>
                  </c>
                  <c ca="left">
                     <p>RTGWSTCA</p>
                  </c>
                  <c ca="left">
                     <p>Basic region + leucine zipper (bZIP)</p>
                  </c>
                  <c ca="left">
                     <p>Vertebrate</p>
                  </c>
                  <c ca="left">
                     <p>NF-E2,AP-1,Fos,Jun,Fra</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>DME1106</p>
                  </c>
                  <c ca="left">
                     <p>TGTTBACW</p>
                  </c>
                  <c ca="left">
                     <p>Forkhead DNA-binding domain</p>
                  </c>
                  <c ca="left">
                     <p>Vertebrate</p>
                  </c>
                  <c ca="left">
                     <p>HNF-3beta</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>DME1111</p>
                  </c>
                  <c ca="left">
                     <p>ATAAACAH</p>
                  </c>
                  <c ca="left">
                     <p>Forkhead DNA-binding domain</p>
                  </c>
                  <c ca="left">
                     <p>Vertebrate</p>
                  </c>
                  <c ca="left">
                     <p>FOXI1a,FOXF1,FOXL1,FOXO4</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>DME1920</p>
                  </c>
                  <c ca="left">
                     <p>CCACGTGG</p>
                  </c>
                  <c ca="left">
                     <p>Basic region + helix-loop-helix motif</p>
                  </c>
                  <c ca="left">
                     <p>Plant, Vertebrate</p>
                  </c>
                  <c ca="left">
                     <p>PIF3,c-Myc:Max</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <b>random odd</b>
                     </p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>DME11</p>
                  </c>
                  <c ca="left">
                     <p>CAGCTGNN</p>
                  </c>
                  <c ca="left">
                     <p>Basic region + helix-loop-helix motif</p>
                  </c>
                  <c ca="left">
                     <p>Vertebrate</p>
                  </c>
                  <c ca="left">
                     <p>AP-4</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>DME456</p>
                  </c>
                  <c ca="left">
                     <p>MAYAAACA</p>
                  </c>
                  <c ca="left">
                     <p>Forkhead DNA-binding domain</p>
                  </c>
                  <c ca="left">
                     <p>Vertebrate</p>
                  </c>
                  <c ca="left">
                     <p>FOXF1</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>DME790</p>
                  </c>
                  <c ca="left">
                     <p>TATGVAAA</p>
                  </c>
                  <c ca="left">
                     <p>POU</p>
                  </c>
                  <c ca="left">
                     <p>Vertebrate</p>
                  </c>
                  <c ca="left">
                     <p>POU2F1</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>DME930</p>
                  </c>
                  <c ca="left">
                     <p>ATAAAYAT</p>
                  </c>
                  <c ca="left">
                     <p>Forkhead DNA-binding domain</p>
                  </c>
                  <c ca="left">
                     <p>Vertebrate, Insect</p>
                  </c>
                  <c ca="left">
                     <p>FOXI1a,Croc</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>DME1145</p>
                  </c>
                  <c ca="left">
                     <p>TGTTBACW</p>
                  </c>
                  <c ca="left">
                     <p>Forkhead DNA-binding domain</p>
                  </c>
                  <c ca="left">
                     <p>Vertebrate</p>
                  </c>
                  <c ca="left">
                     <p>HNF-3beta</p>
                  </c>
               </r>
            </tblbdy>
         </tbl>
      </sec>
      <sec>
         <st>
            <p>Discussion</p>
         </st>
         <p>We treated all human CNSs as a single class of sequences. Comparison of this class against three different backgrounds demonstrates that many short sequence motifs are substantially overrepresented within CNSs (Tables <tblr tid="T1">1</tblr>, <tblr tid="T2">2</tblr>, <tblr tid="T3">3</tblr>). CNSs from odd- and from even-numbered human chromosomes show very similar patterns, which is consistent with the lack of any large-scale heterogeneity within CNSs. At first glance, these results may seem to suggest that CNSs as a whole possess some complex sequence pattern(s), with possible implications for their functioning. However, this is probably not the case. Instead, the results can be explained by simple, generic properties of CNSs.</p>
         <p>Indeed, when CNSs are analyzed against a background of non-CNSs (Table <tblr tid="T1">1</tblr>) or of near-promoter sequences (Table <tblr tid="T2">2</tblr>), almost all overrepresented motifs possess two common features: (i) they are AT-rich (consist of 75% or more of A and/or T) and (ii) they contain runs of A's and/or T's. Feature (i) simply reflects the well-known, although poorly understood, fact that CNSs are more AT-rich than the genome as a whole <abbrgrp><abbr bid="B9">9</abbr><abbr bid="B17">17</abbr></abbrgrp> or than these two classes of background sequences. Feature (ii) appears to be due to a general excess of AA and TT dinucleotides in CNSs relative to corresponding random sequences. This tendency of A's and T's to clump is probably due to patterns in mutation, and not to any functional constraint. Indeed, context-dependence of spontaneous mutation in mammals tends to produce runs of A's and T's, because at a site preceded and followed by A's (T's), T>A (A>T) transversions are ~2 times more common than A>T (T>A) transversions <abbrgrp><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr></abbrgrp> (Table <tblr tid="T2">2</tblr>).</p>
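         <p>The two features named above can be checked mechanically for any motif written in IUPAC notation. The following is a minimal sketch, not the procedure used in this study: the function names are hypothetical, each degenerate symbol is assumed to stand for its bases with equal probability, and a run is counted only over exact A or T symbols.</p>

```python
# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def at_fraction(motif):
    """Expected A/T fraction; degenerate codes are treated as uniform (assumption)."""
    total = 0.0
    for symbol in motif.upper():
        bases = IUPAC[symbol]
        total += sum(b in "AT" for b in bases) / len(bases)
    return total / len(motif)


def longest_homopolymer_run(motif, base):
    """Length of the longest run of `base`, counting exact symbol matches only."""
    best = current = 0
    for symbol in motif.upper():
        current = current + 1 if symbol == base else 0
        best = max(best, current)
    return best


def is_at_rich(motif, threshold=0.75):
    """Feature (i): 75% or more of the motif is A and/or T."""
    return at_fraction(motif) >= threshold
```

         <p>Under these assumptions, the forkhead-like motif ATAAACAH from Table 4 passes the 75% threshold and contains a run of three A's, whereas CCACGTGG does not pass.</p>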
         <p>Obviously, it is necessary to consider CNSs against a background of the same nucleotide composition, as otherwise the difference in composition is the leading factor causing overrepresentation of some motifs. When CNSs are analyzed against a background of random sequences of the same, AT-rich, nucleotide composition, the results are very different (Table <tblr tid="T3">3</tblr>), and overrepresented motifs can be naturally subdivided into two classes. The first, larger class contains a variety of GC-rich motifs which, however, are devoid of CpG dinucleotides and are correspondingly enriched with CpA and CpT dinucleotides and with the short motif CWG. The second, smaller class contains several motifs which are either purine- or pyrimidine-rich. Overrepresentation of motifs from the first class appears to be due to two simple factors: i) the presence, within CNSs, of short GC-rich segments and ii) hypermutability of CpG dinucleotides <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>. Indeed, CNSs are depleted of CpG's more than the other two classes of genomic sequences (Fig. <figr fid="F1">1</figr>), which might reflect strong methylation of CNSs. Overrepresentation of motifs of the second class simply reflects a well-known <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>, although poorly understood, abundance of short segments with strong purine/pyrimidine imbalance between the two DNA strands within the human genome.</p>
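         <p>CpG depletion of the kind discussed above is commonly quantified as the ratio of observed to expected CpG dinucleotides. The sketch below shows that common measure only as an illustration; it is not necessarily the statistic plotted in Fig. 1, and the function name is hypothetical.</p>

```python
def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: n_CG * L / (n_C * n_G).

    Values well below 1 indicate CpG depletion relative to what the
    C and G content alone would predict.
    """
    seq = seq.upper()
    n_c = seq.count("C")
    n_g = seq.count("G")
    if n_c == 0 or n_g == 0:
        return 0.0
    n_cg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    return n_cg * len(seq) / (n_c * n_g)
```

         <p>Computing this ratio separately for CNSs, non-CNSs and near-promoter sequences would give one simple way to compare their CpG content.</p>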
         <p>The analysis of all human CNSs does not reveal clear "global" patterns consistent with overrepresentation of specific, functional motifs. A small number of the observed overrepresented motifs are similar to Position Weight Matrices (PWMs) from the TRANSFAC database <abbrgrp><abbr bid="B21">21</abbr></abbrgrp> (Table <tblr tid="T4">4</tblr>). Among them, the strongest similarity was to the PWMs of the FOX and POU families of factors, which are characterized by a specific AT-rich pattern. To test whether the identification of FOX-domain matrices is merely an effect of the general AT-richness of the CNS regions, we carefully checked the alignment results for all other "AT-rich" matrices in TRANSFAC. Approximately 64% of the matrices in TRANSFAC have an overall AT composition higher than 50%. Sixteen of them have the same or even higher AT composition than any of the FOX- and POU-domain matrices (e.g. matrices for such factors as TBP, Lhx3, Evi-1, Nkx3-1 and others). Nevertheless, none of them gave statistically significant alignments with the motifs under study. This confirms that some motifs from the list are similar specifically to the FOX- and POU-domain matrices. The FOX factors are involved in many cellular processes and often control the very first steps of organism development as well as cell cycle and differentiation; e.g. FOXF1 is highly expressed in mouse embryonic extraembryonic and lateral mesoderm <abbrgrp><abbr bid="B22">22</abbr></abbrgrp> and controls murine gut development <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>; FOXD1 is predominantly expressed in embryonic forebrain neuroepithelium, head mesenchyme and adrenal cortex <abbrgrp><abbr bid="B24">24</abbr></abbrgrp> and controls normal brain and kidney morphogenesis and cellularity in the renal capsule <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>; FOXO1 governs cell growth in the heart <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>. 
Factors of other families, such as POU and bZIP, are often involved in regulation of the basic cell cycle machinery; e.g. POU2F1 is a ubiquitous factor involved in stimulation of replication <abbrgrp><abbr bid="B27">27</abbr></abbrgrp> and also participates in early mouse embryogenesis <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>. In summary, it might be tempting to speculate that at least some motifs overrepresented in all CNSs may play a crucial role in organizing the development of vertebrate organisms. However, the number of such motifs is not high. More specific classes of CNSs, such as those adjacent to genes with a particular pattern of expression <abbrgrp><abbr bid="B11">11</abbr><abbr bid="B12">12</abbr></abbrgrp>, should be considered in order to find a larger number of functional motifs.</p>
         <p>In contrast, small-scale composition of human CNSs, considered as a whole, is strongly affected by patterns in mutation &#8211; hypermutability of CpG's and the tendency for A's and T's to form runs. This is unexpected because CNSs must be under negative selection which can overcome any impact of mutation <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>. Apparently, selective constraint on the evolution of individual nucleotide sites can be quite weak even within strongly conserved CNSs.</p>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>Abundance of short sequence motifs in all human CNSs is mostly dictated by their general features: overall AT-richness of CNSs, runs of A's and T's, GC-rich regions, avoidance of CpG's, and local purine/pyrimidine imbalance of the DNA strands. Apparently, CNSs as a whole are too broad a class to display strong overrepresentation of specific motifs. Instead, such motifs must be sought within subclasses of CNSs. In particular, tissue-specificity of expression of the genes adjacent to a CNS must be taken into account.</p>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <p>We used the VISTA pipeline infrastructure <abbrgrp><abbr bid="B29">29</abbr></abbrgrp> with the Shuffle-LAGAN glocal chaining algorithm <abbrgrp><abbr bid="B30">30</abbr></abbrgrp> applied to local alignments produced by translated BLAT <abbrgrp><abbr bid="B31">31</abbr></abbrgrp> to construct a genome-wide pairwise human/mouse alignment. The level of conservation in the alignment was evaluated with the computational algorithm Gumby <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>, which makes minimal assumptions about the statistical features of conserved noncoding regions and treats the sequence alignment as its own training set. Gumby <abbrgrp><abbr bid="B32">32</abbr></abbrgrp> proceeds through five steps:</p>
         <p>1. Noncoding regions in the input alignment are used to estimate the neutral mismatch frequency <it>p</it>N between each pair of aligned sequences. This is done simply by counting the number of mismatches in nonexonic positions and dividing by the number of aligned nonexonic positions.</p>
         <p>2. A log-odds scoring scheme for constrained versus neutral evolution is then independently initialized for the pair of sequences, based on the assumption that the mismatch frequency <it>p</it>C in constrained regions equals <it>p</it>N/<it>R</it>, where the ratio <it>R </it>is an arbitrary parameter. For example, if <it>R </it>= 3/2 (default value), constrained regions are expected to evolve at 2/3 of the neutral rate, until sequence divergence begins to saturate.</p>
         <p>The log-odds mismatch score for the sequence pair is then given by S0 = log((<it>p</it>N/<it>R</it>)/<it>p</it>N) = -log(<it>R</it>), and the match score is S1 = log((1 - <it>p</it>N/<it>R</it>)/(1 - <it>p</it>N)). The default <it>R</it>-ratio (1.5) was selected to optimize the sensitivity-specificity tradeoff in detecting empirically defined regulatory elements in the <it>SCL </it>locus. Gap characters in the alignment are assigned a weighted average of mismatch and match scores: SG = <it>p</it>NS0 + (1 - <it>p</it>N)S1.</p>
         <p>3. Each alignment column is scored as a sum of pairwise log-odds scores. The resulting conservation score fulfills the requirements of Karlin-Altschul statistics, in that positive column scores are possible, though the average column score is negative <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>.</p>
         <p>4. Conserved regions appear as stretches of alignment columns with a high aggregate score.</p>
         <p>5. The aggregate score of the alignment columns in each conserved region is translated into a <it>P</it>-value using Karlin-Altschul statistics. As is the case with the BLAST algorithm <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>, the <it>P</it>-value of a given conserved element varies with the size of the search space, since one is more likely to find a given degree of conservation by random chance in a long alignment than in a short alignment. To make the <it>P</it>-values comparable across alignments of different lengths, Gumby normalizes them to refer to a fictitious fixed-length alignment with the same statistical properties as the true alignment. The 10-kb <it>P</it>-value is related to the expected number of false positives in a 10-kb region (i.e. the 10-kb <it>E</it>-value) as follows: <it>P </it>= 1-exp(-<it>E</it>). When <it>P </it>&lt;&lt; 1, <it>P </it>&#8776; <it>E</it>. Thus, the <it>P</it>-value also doubles as an estimate of the false-positive rate.</p>
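         <p>The scoring arithmetic in steps 1, 2 and 5 can be sketched as follows. This is an illustration under stated assumptions, not the Gumby implementation: the function names are hypothetical, the two input strings are assumed to be already-aligned nonexonic sequences with "-" as the gap character, and the natural logarithm is assumed because the base is not stated above.</p>

```python
import math


def neutral_mismatch_frequency(seq_a, seq_b):
    """Step 1: mismatches divided by aligned positions (gap columns skipped)."""
    aligned = mismatches = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue
        aligned += 1
        if a != b:
            mismatches += 1
    return mismatches / aligned


def log_odds_scores(p_n, r=1.5):
    """Step 2: mismatch (S0), match (S1) and gap (SG) scores for ratio R."""
    p_c = p_n / r                          # mismatch frequency under constraint
    s0 = math.log(p_c / p_n)               # equals -log(R)
    s1 = math.log((1 - p_c) / (1 - p_n))
    sg = p_n * s0 + (1 - p_n) * s1         # weighted average of S0 and S1
    return s0, s1, sg


def p_value_from_e(e_value):
    """Step 5: P = 1 - exp(-E); P is close to E when E is much smaller than 1."""
    return 1 - math.exp(-e_value)
```

         <p>With this definition, SG is the expected score of a neutrally evolving column, which is negative for any <it>R </it>greater than 1; this is in line with the Karlin-Altschul requirement in step 3 that the average column score be negative.</p>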
         <p>Selecting intervals at a P-value threshold of 0.01 produced a set of 144,165 highly conserved sequences totaling 49 Mb in length. We eliminated all conserved regions that coincide with the coding evidence provided by the UCSC data sets of mRNAs, human spliced ESTs and human ESTs. We also excluded CNSs located within the interval (-1000, +1000) around the start and end of transcription.</p>
         <p>Non-CNSs were defined as regions that have a human/mouse alignment, are conserved below 50% in a 100 bp window, and contain neither repeats nor coding evidence. Random sequences were generated using the standard C library pseudo-random number generator. Overrepresentation of motifs in different random sequences was calculated using DME <abbrgrp><abbr bid="B14">14</abbr></abbrgrp> (see Additional File <supplr sid="S1">1</supplr>). DME identifies motifs, represented as position weight matrices, that are overrepresented in one set of sequences relative to another set. The ability to directly optimize relative overrepresentation is a unique feature of DME, which makes it well suited to comparing two sets. In all analyses we compared 8-mers (parameter w = 8), and the bits/column bound was set to 1.6 (parameter i = 1.6).</p>
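         <p>A composition-matched random background of the kind described above can be sketched as follows. This is an illustration only: the study used the C library generator, whereas the sketch uses Python's random module; drawing bases independently from the pooled composition of the input set is an assumption, and the function names are hypothetical.</p>

```python
import random
from collections import Counter


def base_composition(sequences):
    """Pooled frequencies of A, C, G and T over all input sequences."""
    counts = Counter()
    for seq in sequences:
        counts.update(b for b in seq.upper() if b in "ACGT")
    total = sum(counts.values())
    return {b: counts[b] / total for b in "ACGT"}


def random_background(sequences, seed=0):
    """One random sequence per input sequence, with the same length.

    Bases are drawn independently with the pooled composition of the
    input set, so only the expected composition is matched.
    """
    rng = random.Random(seed)
    comp = base_composition(sequences)
    bases = list(comp)
    weights = [comp[b] for b in bases]
    return ["".join(rng.choices(bases, weights=weights, k=len(s)))
            for s in sequences]
```

         <p>Fixing the seed makes the background reproducible, which helps when two independently generated random sets are compared with each other as a control.</p>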
         <suppl id="S1">
            <title>
               <p>Additional file 1</p>
            </title>
            <text>
               <p>Overrepresented motifs when two random sets are compared. The data provided represent comparison of two randomized sets of sequences.</p>
            </text>
            <file name="1471-2164-8-378-S1.doc">
               <p>Click here for file</p>
            </file>
         </suppl>
         <p>DME motifs were compared to the TRANSFAC<sup>&#174; </sup>database with the m2transfac program <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. The program retrieves all non-overlapping pairwise ungapped alignments of a query matrix and a TRANSFAC matrix satisfying a given threshold. The primary similarity measure is an alignment score which combines Kullback-Leibler divergence with a scoring system that was previously applied successfully to comparison of Hidden Markov Models <abbrgrp><abbr bid="B35">35</abbr></abbrgrp>:</p>
         <p>
            <display-formula id="M1"><it>S</it>(<it>p</it>, <it>q</it>) = <it>C</it>(<it>p</it>, <it>q</it>) - <it>D</it>(<it>p</it>, <it>q</it>)</display-formula>
         </p>
         <p>with</p>
         <p>
            <display-formula id="M2">
               <m:math name="1471-2164-8-378-i1" xmlns:m="http://www.w3.org/1998/Math/MathML">
                  <m:semantics>
                     <m:mrow>
                        <m:mi>C</m:mi>
                        <m:mo stretchy="false">(</m:mo>
                        <m:mi>p</m:mi>
                        <m:mo>,</m:mo>
                        <m:mi>q</m:mi>
                        <m:mo stretchy="false">)</m:mo>
                        <m:mo>=</m:mo>
                        <m:msub>
                           <m:mrow>
                              <m:mi>log</m:mi>
                              <m:mo>&#8289;</m:mo>
                           </m:mrow>
                           <m:mn>2</m:mn>
                        </m:msub>
                        <m:mstyle displaystyle="true">
                           <m:munderover>
                              <m:mo>&#8721;</m:mo>
                              <m:mrow>
                                 <m:mi>i</m:mi>
                                 <m:mo>=</m:mo>
                                 <m:mn>1</m:mn>
                              </m:mrow>
                              <m:mn>4</m:mn>
                           </m:munderover>
                           <m:mrow>
                              <m:mfrac>
                                 <m:mrow>
                                    <m:msub>
                                       <m:mi>p</m:mi>
                                       <m:mi>i</m:mi>
                                    </m:msub>
                                    <m:msub>
                                       <m:mi>q</m:mi>
                                       <m:mi>i</m:mi>
                                    </m:msub>
                                 </m:mrow>
                                 <m:mrow>
                                    <m:msub>
                                       <m:mi>r</m:mi>
                                       <m:mi>i</m:mi>
                                    </m:msub>
                                 </m:mrow>
                              </m:mfrac>
                           </m:mrow>
                        </m:mstyle>
                     </m:mrow>
                     <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGdbWqcqGGOaakcqWGWbaCcqGGSaalcqWGXbqCcqGGPaqkcqGH9aqpcyGGSbaBcqGGVbWBcqGGNbWzdaWgaaWcbaGaeGOmaidabeaakmaaqahabaWaaSaaaeaacqWGWbaCdaWgaaWcbaGaemyAaKgabeaakiabdghaXnaaBaaaleaacqWGPbqAaeqaaaGcbaGaemOCai3aaSbaaSqaaiabdMgaPbqabaaaaaqaaiabdMgaPjabg2da9iabigdaXaqaaiabisda0aqdcqGHris5aaaa@48E5@</m:annotation>
                  </m:semantics>
               </m:math>
            </display-formula>
         </p>
         <p>
            <display-formula id="M3">
               <m:math name="1471-2164-8-378-i2" xmlns:m="http://www.w3.org/1998/Math/MathML">
                  <m:semantics>
                     <m:mrow>
                        <m:mi>D</m:mi>
                        <m:mo stretchy="false">(</m:mo>
                        <m:mi>p</m:mi>
                        <m:mo>,</m:mo>
                        <m:mi>q</m:mi>
                        <m:mo stretchy="false">)</m:mo>
                        <m:mo>=</m:mo>
                        <m:mfrac>
                           <m:mn>1</m:mn>
                           <m:mn>2</m:mn>
                        </m:mfrac>
                        <m:mrow>
                           <m:mo>(</m:mo>
                           <m:mrow>
                              <m:mstyle displaystyle="true">
                                 <m:munderover>
                                    <m:mo>&#8721;</m:mo>
                                    <m:mrow>
                                       <m:mi>i</m:mi>
                                       <m:mo>=</m:mo>
                                       <m:mn>1</m:mn>
                                    </m:mrow>
                                    <m:mn>4</m:mn>
                                 </m:munderover>
                                 <m:mrow>
                                    <m:msub>
                                       <m:mi>p</m:mi>
                                       <m:mi>i</m:mi>
                                    </m:msub>
                                    <m:mi>log</m:mi>
                                    <m:mo>&#8289;</m:mo>
                                    <m:mfrac>
                                       <m:mrow>
                                          <m:msub>
                                             <m:mi>p</m:mi>
                                             <m:mi>i</m:mi>
                                          </m:msub>
                                       </m:mrow>
                                       <m:mrow>
                                          <m:msub>
                                             <m:mi>q</m:mi>
                                             <m:mi>i</m:mi>
                                          </m:msub>
                                       </m:mrow>
                                    </m:mfrac>
                                    <m:mo>+</m:mo>
                                    <m:mstyle displaystyle="true">
                                       <m:munderover>
                                          <m:mo>&#8721;</m:mo>
                                          <m:mrow>
                                             <m:mi>i</m:mi>
                                             <m:mo>=</m:mo>
                                             <m:mn>1</m:mn>
                                          </m:mrow>
                                          <m:mn>4</m:mn>
                                       </m:munderover>
                                       <m:mrow>
                                          <m:msub>
                                             <m:mi>q</m:mi>
                                             <m:mi>i</m:mi>
                                          </m:msub>
                                          <m:mi>log</m:mi>
                                          <m:mo>&#8289;</m:mo>
                                          <m:mfrac>
                                             <m:mrow>
                                                <m:msub>
                                                   <m:mi>q</m:mi>
                                                   <m:mi>i</m:mi>
                                                </m:msub>
                                             </m:mrow>
                                             <m:mrow>
                                                <m:msub>
                                                   <m:mi>p</m:mi>
                                                   <m:mi>i</m:mi>
                                                </m:msub>
                                             </m:mrow>
                                          </m:mfrac>
                                       </m:mrow>
                                    </m:mstyle>
                                 </m:mrow>
                              </m:mstyle>
                           </m:mrow>
                           <m:mo>)</m:mo>
                        </m:mrow>
                     </m:mrow>
                     <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGebarcqGGOaakcqWGWbaCcqGGSaalcqWGXbqCcqGGPaqkcqGH9aqpdaWcaaqaaiabigdaXaqaaiabikdaYaaadaqadaqaamaaqahabaGaemiCaa3aaSbaaSqaaiabdMgaPbqabaGccyGGSbaBcqGGVbWBcqGGNbWzdaWcaaqaaiabdchaWnaaBaaaleaacqWGPbqAaeqaaaGcbaGaemyCae3aaSbaaSqaaiabdMgaPbqabaaaaOGaey4kaSYaaabCaeaacqWGXbqCdaWgaaWcbaGaemyAaKgabeaakiGbcYgaSjabc+gaVjabcEgaNnaalaaabaGaemyCae3aaSbaaSqaaiabdMgaPbqabaaakeaacqWGWbaCdaWgaaWcbaGaemyAaKgabeaaaaaabaGaemyAaKMaeyypa0JaeGymaedabaGaeGinaqdaniabggHiLdaaleaacqWGPbqAcqGH9aqpcqaIXaqmaeaacqaI0aana0GaeyyeIuoaaOGaayjkaiaawMcaaaaa@5FCB@</m:annotation>
                  </m:semantics>
               </m:math>
            </display-formula>
         </p>
         <p>In equation (2), r is the background model, which is set to the uniform distribution. Equation (2) is based on the column score derived in <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>. This term assigns a positive score to similar distributions and tends towards zero for less conserved positions. Equation (3) is a symmetrized relative entropy or Kullback-Leibler (KL) divergence. Relative entropy was used previously in applications for classification of protein as well as nucleotide patterns <abbrgrp><abbr bid="B36">36</abbr><abbr bid="B37">37</abbr></abbrgrp>. The m2transfac scoring system combines the advantages of both measures. The KL divergence directly assesses the difference of two distributions and therefore increases specificity for similar distributions, but makes no distinction on the basis of their conservation, which is however a property of the column score.</p>
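         <p>Equations (1) to (3) can be evaluated for a single pair of matrix columns as in the sketch below. It is not the m2transfac code: the function names are hypothetical, the pseudocount used to avoid zero probabilities is an assumption, and base-2 logarithms are assumed in equation (3) for consistency with equation (2).</p>

```python
import math

UNIFORM = (0.25, 0.25, 0.25, 0.25)  # background model r


def smooth(column, pseudo=1e-3):
    """Add a small pseudocount so that no probability is zero (assumption)."""
    total = sum(column) + pseudo * len(column)
    return [(x + pseudo) / total for x in column]


def column_similarity(p, q, r=UNIFORM):
    """Equation (2): C(p, q) = log2 of the sum over i of p_i * q_i / r_i."""
    return math.log2(sum(pi * qi / ri for pi, qi, ri in zip(p, q, r)))


def symmetric_kl(p, q):
    """Equation (3): mean of the two relative entropies (base 2 assumed)."""
    forward = sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q))
    backward = sum(qi * math.log2(qi / pi) for pi, qi in zip(p, q))
    return 0.5 * (forward + backward)


def column_score(p, q):
    """Equation (1): S(p, q) = C(p, q) - D(p, q), computed after smoothing."""
    p, q = smooth(p), smooth(q)
    return column_similarity(p, q) - symmetric_kl(p, q)
```

         <p>In this sketch, two identical and fully conserved columns score close to 2, two identical uniform columns score 0, and two conserved columns that prefer different bases receive a strongly negative score, which matches the behaviour described for the two terms.</p>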
         <p>The m2transfac output provides E-values, the number of alignments with greater or equal score expected from searching a database with 1000 matrices. These are derived for each TRANSFAC PWM from score distribution estimates based on large-scale searches of a random matrix library. Furthermore, we apply the transcription factor classification that was developed in our group <abbrgrp><abbr bid="B38">38</abbr><abbr bid="B39">39</abbr></abbrgrp> to gather matrices according to DNA-binding domain classes of their binding factors and derive factor class-specific score thresholds. We define 57 matrix groups, 15 of which comprise matrices which cannot be associated with a particular factor class, e.g. the barbiturate-inducible element, or whose binding factors are so far not assigned to a protein-structural class. Some matrices occur in more than one class if TFs of different classes are annotated as binding factors, or binding factors possess multiple DNA-binding domain types. For each PWM, score thresholds are defined at three levels of stringency above the score of the first observed false positive in a search of the TRANSFAC database.</p>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>SM designed and carried out the computational experiments; PS developed the program and analyzed TRANSFAC PWMs, A Kel provided biological insight and actively participated in discussion of the project and writing the paper, A Kondrashov and ID designed and led the project. All authors have read and approved the final version of the manuscript.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>We are grateful to Andrew Smith for providing us with the DME software. Research was conducted at the E.O. Lawrence Berkeley National Laboratory, supported by grant HL066681 Berkeley-PGA (SM and ID), under the Programs for Genomic Applications, funded by National Heart, Lung, &amp; Blood Institute and by HG003988 (L.A.P.) and performed under Department of Energy Contract DE-AC02-05CH11231, University of California. MB was supported by the NSERC Discovery grant.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Pattern of selective constraint in C. elegans and C. briggsae genomes</p>
            </title>
            <aug>
               <au>
                  <snm>Shabalina</snm>
                  <fnm>SA</fnm>
               </au>
               <au>
                  <snm>Kondrashov</snm>
                  <fnm>AS</fnm>
               </au>
            </aug>
            <source>Genet Res</source>
            <pubdate>1999</pubdate>
            <volume>74</volume>
            <issue>1</issue>
            <fpage>23</fpage>
            <lpage>30</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid">10505405</pubid>
                  <pubid idtype="doi">10.1017/S0016672399003821</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Evolutionary discrimination of mammalian conserved non-genic sequences (CNGs)</p>
            </title>
            <aug>
               <au>
                  <snm>Dermitzakis</snm>
                  <fnm>ET</fnm>
               </au>
               <au>
                  <snm>Reymond</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Scamuffa</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Ucla</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Kirkness</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Rossier</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Antonarakis</snm>
                  <fnm>SE</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2003</pubdate>
            <volume>302</volume>
            <issue>5647</issue>
            <fpage>1033</fpage>
            <lpage>1035</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">14526086</pubid>
                  <pubid idtype="doi">10.1126/science.1087047</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Identification and characterization of multi-species conserved sequences</p>
            </title>
            <aug>
               <au>
                  <snm>Margulies</snm>
                  <fnm>EH</fnm>
               </au>
               <au>
                  <snm>Blanchette</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Haussler</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Green</snm>
                  <fnm>ED</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2003</pubdate>
            <volume>13</volume>
            <issue>12</issue>
            <fpage>2507</fpage>
            <lpage>2518</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">14656959</pubid>
                  <pubid idtype="doi">10.1101/gr.1602203</pubid>
                  <pubid idtype="pmcid">403793</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Conserved noncoding sequences are selectively constrained and not mutation cold spots</p>
            </title>
            <aug>
               <au>
                  <snm>Drake</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Bird</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Nemesh</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Thomas</snm>
                  <fnm>DJ</fnm>
               </au>
               <au>
                  <snm>Newton-Cheh</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Reymond</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Excoffier</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Attar</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Antonarakis</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Dermitzakis</snm>
                  <fnm>ET</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2006</pubdate>
            <volume>38</volume>
            <issue>2</issue>
            <fpage>223</fpage>
            <lpage>227</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">16380714</pubid>
                  <pubid idtype="doi">10.1038/ng1710</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Ubiquitous selective constraints in the Drosophila genome revealed by a genome-wide interspecies comparison</p>
            </title>
            <aug>
               <au>
                  <snm>Halligan</snm>
                  <fnm>DL</fnm>
               </au>
               <au>
                  <snm>Keightley</snm>
                  <fnm>PD</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2006</pubdate>
            <volume>16</volume>
            <issue>7</issue>
            <fpage>875</fpage>
            <lpage>884</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">16751341</pubid>
                  <pubid idtype="doi">10.1101/gr.5022906</pubid>
                  <pubid idtype="pmcid">1484454</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Adaptive evolution of non-coding DNA in Drosophila</p>
            </title>
            <aug>
               <au>
                  <snm>Andolfatto</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2005</pubdate>
            <volume>437</volume>
            <issue>7062</issue>
            <fpage>1149</fpage>
            <lpage>1152</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">16237443</pubid>
                  <pubid idtype="doi">10.1038/nature04107</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Initial sequencing and comparative analysis of the mouse genome</p>
            </title>
            <aug>
               <au>
                  <snm>Waterston</snm>
                  <fnm>RH</fnm>
               </au>
               <au>
                  <snm>Lindblad-Toh</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Birney</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Rogers</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Abril</snm>
                  <fnm>JF</fnm>
               </au>
               <au>
                  <snm>Agarwal</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Agarwala</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Ainscough</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Alexandersson</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>An</snm>
                  <fnm>P</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nature</source>
            <pubdate>2002</pubdate>
            <volume>420</volume>
            <issue>6915</issue>
            <fpage>520</fpage>
            <lpage>562</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">12466850</pubid>
                  <pubid idtype="doi">10.1038/nature01262</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Selective constraint in intergenic regions of human and mouse genomes</p>
            </title>
            <aug>
               <au>
                  <snm>Shabalina</snm>
                  <fnm>SA</fnm>
               </au>
               <au>
                  <snm>Ogurtsov</snm>
                  <fnm>AY</fnm>
               </au>
               <au>
                  <snm>Kondrashov</snm>
                  <fnm>VA</fnm>
               </au>
               <au>
                  <snm>Kondrashov</snm>
                  <fnm>AS</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>2001</pubdate>
            <volume>17</volume>
            <issue>7</issue>
            <fpage>373</fpage>
            <lpage>376</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">11418197</pubid>
                  <pubid idtype="doi">10.1016/S0168-9525(01)02344-7</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Conserved non-genic sequences &#8211; an unexpected feature of mammalian genomes</p>
            </title>
            <aug>
               <au>
                  <snm>Dermitzakis</snm>
                  <fnm>ET</fnm>
               </au>
               <au>
                  <snm>Reymond</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Antonarakis</snm>
                  <fnm>SE</fnm>
               </au>
            </aug>
            <source>Nat Rev Genet</source>
            <pubdate>2005</pubdate>
            <volume>6</volume>
            <issue>2</issue>
            <fpage>151</fpage>
            <lpage>157</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">15716910</pubid>
                  <pubid idtype="doi">10.1038/nrg1527</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Conserved noncoding sequences are reliable guides to regulatory elements</p>
            </title>
            <aug>
               <au>
                  <snm>Hardison</snm>
                  <fnm>RC</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>2000</pubdate>
            <volume>16</volume>
            <issue>9</issue>
            <fpage>369</fpage>
            <lpage>372</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">10973062</pubid>
                  <pubid idtype="doi">10.1016/S0168-9525(00)02081-3</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>DNA motifs in human and mouse proximal promoters predict tissue-specific expression</p>
            </title>
            <aug>
               <au>
                  <snm>Smith</snm>
                  <fnm>AD</fnm>
               </au>
               <au>
                  <snm>Sumazin</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Xuan</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>MQ</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2006</pubdate>
            <volume>103</volume>
            <issue>16</issue>
            <fpage>6275</fpage>
            <lpage>6280</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">16606849</pubid>
                  <pubid idtype="doi">10.1073/pnas.0508169103</pubid>
                  <pubid idtype="pmcid">1458868</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Distant conserved sequences flanking endothelial-specific promoters contain tissue-specific DNase-hypersensitive sites and over-represented motifs</p>
            </title>
            <aug>
               <au>
                  <snm>Bernat</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Crawford</snm>
                  <fnm>GE</fnm>
               </au>
               <au>
                  <snm>Ogurtsov</snm>
                  <fnm>AY</fnm>
               </au>
               <au>
                  <snm>Collins</snm>
                  <fnm>FS</fnm>
               </au>
               <au>
                  <snm>Ginsburg</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Kondrashov</snm>
                  <fnm>AS</fnm>
               </au>
            </aug>
            <source>Hum Mol Genet</source>
            <pubdate>2006</pubdate>
            <volume>15</volume>
            <issue>13</issue>
            <fpage>2098</fpage>
            <lpage>2105</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">16723375</pubid>
                  <pubid idtype="doi">10.1093/hmg/ddl133</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Classification of common conserved sequences in mammalian intergenic regions</p>
            </title>
            <aug>
               <au>
                  <snm>Kondrashov</snm>
                  <fnm>AS</fnm>
               </au>
               <au>
                  <snm>Shabalina</snm>
                  <fnm>SA</fnm>
               </au>
            </aug>
            <source>Hum Mol Genet</source>
            <pubdate>2002</pubdate>
            <volume>11</volume>
            <issue>6</issue>
            <fpage>669</fpage>
            <lpage>674</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">11912182</pubid>
                  <pubid idtype="doi">10.1093/hmg/11.6.669</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Identifying tissue-selective transcription factor binding sites in vertebrate promoters</p>
            </title>
            <aug>
               <au>
                  <snm>Smith</snm>
                  <fnm>AD</fnm>
               </au>
               <au>
                  <snm>Sumazin</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>MQ</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2005</pubdate>
            <volume>102</volume>
            <issue>5</issue>
            <fpage>1560</fpage>
            <lpage>1565</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">15668401</pubid>
                  <pubid idtype="doi">10.1073/pnas.0406123102</pubid>
                  <pubid idtype="pmcid">547828</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>A suite of web-based programs to search for transcriptional regulatory motifs</p>
            </title>
            <aug>
               <au>
                  <snm>Liu</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Wei</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Batzoglou</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Brutlag</snm>
                  <fnm>DL</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>JS</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>XS</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2004</pubdate>
            <issue>32 Web Server</issue>
            <fpage>W204</fpage>
            <lpage>207</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">15215381</pubid>
                  <pubid idtype="doi">10.1093/nar/gkh461</pubid>
                  <pubid idtype="pmcid">441599</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <aug>
               <au>
                  <snm>Stegmaier</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Kel</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Wingender</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <url>http://www.gene-regulation.com/pub/programs.html#m2transfac</url>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Striking nucleotide frequency pattern at the borders of highly conserved vertebrate non-coding sequences</p>
            </title>
            <aug>
               <au>
                  <snm>Walter</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Abnizova</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Elgar</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Gilks</snm>
                  <fnm>WR</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>2005</pubdate>
            <volume>21</volume>
            <issue>8</issue>
            <fpage>436</fpage>
            <lpage>440</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">15979195</pubid>
                  <pubid idtype="doi">10.1016/j.tig.2005.06.003</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Bayesian Markov chain Monte Carlo sequence analysis reveals varying neutral substitution patterns in mammalian evolution</p>
            </title>
            <aug>
               <au>
                  <snm>Hwang</snm>
                  <fnm>DG</fnm>
               </au>
               <au>
                  <snm>Green</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2004</pubdate>
            <volume>101</volume>
            <issue>39</issue>
            <fpage>13994</fpage>
            <lpage>14001</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">15292512</pubid>
                  <pubid idtype="doi">10.1073/pnas.0404142101</pubid>
                  <pubid idtype="pmcid">521089</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>Selection in favor of nucleotides G and C diversifies evolution rates and levels of polymorphism at mammalian synonymous sites</p>
            </title>
            <aug>
               <au>
                  <snm>Kondrashov</snm>
                  <fnm>FA</fnm>
               </au>
               <au>
                  <snm>Ogurtsov</snm>
                  <fnm>AY</fnm>
               </au>
               <au>
                  <snm>Kondrashov</snm>
                  <fnm>AS</fnm>
               </au>
            </aug>
            <source>J Theor Biol</source>
            <pubdate>2006</pubdate>
            <volume>240</volume>
            <issue>4</issue>
            <fpage>616</fpage>
            <lpage>626</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">16343547</pubid>
                  <pubid idtype="doi">10.1016/j.jtbi.2005.10.020</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>Initial sequencing and analysis of the human genome</p>
            </title>
            <aug>
               <au>
                  <snm>Lander</snm>
                  <fnm>ES</fnm>
               </au>
               <au>
                  <snm>Linton</snm>
                  <fnm>LM</fnm>
               </au>
               <au>
                  <snm>Birren</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Nusbaum</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Zody</snm>
                  <fnm>MC</fnm>
               </au>
               <au>
                  <snm>Baldwin</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Devon</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Dewar</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Doyle</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>FitzHugh</snm>
                  <fnm>W</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nature</source>
            <pubdate>2001</pubdate>
            <volume>409</volume>
            <issue>6822</issue>
            <fpage>860</fpage>
            <lpage>921</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">11237011</pubid>
                  <pubid idtype="doi">10.1038/35057062</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>TRANSFAC and its module TRANSCompel: transcriptional gene regulation in eukaryotes</p>
            </title>
            <aug>
               <au>
                  <snm>Matys</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Kel-Margoulis</snm>
                  <fnm>OV</fnm>
               </au>
               <au>
                  <snm>Fricke</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Liebich</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Land</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Barre-Dirrie</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Reuter</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Chekmenev</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Krull</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Hornischer</snm>
                  <fnm>K</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2006</pubdate>
            <issue>34 Database</issue>
            <fpage>D108</fpage>
            <lpage>110</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">16381825</pubid>
                  <pubid idtype="doi">10.1093/nar/gkj143</pubid>
                  <pubid idtype="pmcid">1347505</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>The winged helix transcriptional activator HFH-8 is expressed in the mesoderm of the primitive streak stage of mouse embryos and its cellular derivatives</p>
            </title>
            <aug>
               <au>
                  <snm>Peterson</snm>
                  <fnm>RS</fnm>
               </au>
               <au>
                  <snm>Lim</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Ye</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Zhou</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Overdier</snm>
                  <fnm>DG</fnm>
               </au>
               <au>
                  <snm>Costa</snm>
                  <fnm>RH</fnm>
               </au>
            </aug>
            <source>Mech Dev</source>
            <pubdate>1997</pubdate>
            <volume>69</volume>
            <issue>1&#8211;2</issue>
            <fpage>53</fpage>
            <lpage>69</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid">9486531</pubid>
                  <pubid idtype="doi">10.1016/S0925-4773(97)00153-6</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>Foxf1 and Foxf2 control murine gut development by limiting mesenchymal Wnt signaling and promoting extracellular matrix production</p>
            </title>
            <aug>
               <au>
                  <snm>Ormestad</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Astorga</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Landgren</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Johansson</snm>
                  <fnm>BR</fnm>
               </au>
               <au>
                  <snm>Miura</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Carlsson</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Development</source>
            <pubdate>2006</pubdate>
            <volume>133</volume>
            <issue>5</issue>
            <fpage>833</fpage>
            <lpage>843</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">16439479</pubid>
                  <pubid idtype="doi">10.1242/dev.02252</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>Essential role of stromal mesenchyme in kidney morphogenesis revealed by targeted disruption of Winged Helix transcription factor BF-2</p>
            </title>
            <aug>
               <au>
                  <snm>Hatini</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Huh</snm>
                  <fnm>SO</fnm>
               </au>
               <au>
                  <snm>Herzlinger</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Soares</snm>
                  <fnm>VC</fnm>
               </au>
               <au>
                  <snm>Lai</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>Genes Dev</source>
            <pubdate>1996</pubdate>
            <volume>10</volume>
            <issue>12</issue>
            <fpage>1467</fpage>
            <lpage>1478</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">8666231</pubid>
                  <pubid idtype="doi">10.1101/gad.10.12.1467</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Foxd1-dependent signals control cellularity in the renal capsule, a structure required for normal renal development</p>
            </title>
            <aug>
               <au>
                  <snm>Levinson</snm>
                  <fnm>RS</fnm>
               </au>
               <au>
                  <snm>Batourina</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Choi</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Vorontchikhina</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Kitajewski</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Mendelsohn</snm>
                  <fnm>CL</fnm>
               </au>
            </aug>
            <source>Development</source>
            <pubdate>2005</pubdate>
            <volume>132</volume>
            <issue>3</issue>
            <fpage>529</fpage>
            <lpage>539</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">15634693</pubid>
                  <pubid idtype="doi">10.1242/dev.01604</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>Foxo transcription factors blunt cardiac hypertrophy by inhibiting calcineurin signaling</p>
            </title>
            <aug>
               <au>
                  <snm>Ni</snm>
                  <fnm>YG</fnm>
               </au>
               <au>
                  <snm>Berenji</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Oh</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Sachan</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Dey</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Cheng</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Lu</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Morris</snm>
                  <fnm>DJ</fnm>
               </au>
               <au>
                  <snm>Castrillon</snm>
                  <fnm>DH</fnm>
               </au>
               <etal/>
            </aug>
            <source>Circulation</source>
            <pubdate>2006</pubdate>
            <volume>114</volume>
            <issue>11</issue>
            <fpage>1159</fpage>
            <lpage>1168</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">16952979</pubid>
                  <pubid idtype="doi">10.1161/CIRCULATIONAHA.106.637124</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>Identification of a novel lymphoid specific octamer binding protein (OTF-2B) by proteolytic clipping bandshift assay (PCBA)</p>
            </title>
            <aug>
               <au>
                  <snm>Schreiber</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Matthias</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>M&#252;ller</snm>
                  <fnm>MM</fnm>
               </au>
               <au>
                  <snm>Schaffner</snm>
                  <fnm>W</fnm>
               </au>
            </aug>
            <source>EMBO J</source>
            <pubdate>1988</pubdate>
            <volume>7</volume>
            <issue>13</issue>
            <fpage>4221</fpage>
            <lpage>4229</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">455135</pubid>
                  <pubid idtype="pmpid">3072196</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>Octamer binding proteins confer transcriptional activity in early mouse embryogenesis</p>
            </title>
            <aug>
               <au>
                  <snm>Sch&#246;ler</snm>
                  <fnm>HR</fnm>
               </au>
               <au>
                  <snm>Balling</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Hatzopoulos</snm>
                  <fnm>AK</fnm>
               </au>
               <au>
                  <snm>Suzuki</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Gruss</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>EMBO J</source>
            <pubdate>1989</pubdate>
            <volume>8</volume>
            <issue>9</issue>
            <fpage>2551</fpage>
            <lpage>2557</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">401254</pubid>
                  <pubid idtype="pmpid">2573524</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>VISTA: computational tools for comparative genomics</p>
            </title>
            <aug>
               <au>
                  <snm>Frazer</snm>
                  <fnm>KA</fnm>
               </au>
               <au>
                  <snm>Pachter</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Poliakov</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Rubin</snm>
                  <fnm>EM</fnm>
               </au>
               <au>
                  <snm>Dubchak</snm>
                  <fnm>I</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2004</pubdate>
            <issue>32 Web Server</issue>
            <fpage>W273</fpage>
            <lpage>W279</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">15215394</pubid>
                  <pubid idtype="doi">10.1093/nar/gkh458</pubid>
                  <pubid idtype="pmcid">441596</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>Glocal alignment: finding rearrangements during alignment</p>
            </title>
            <aug>
               <au>
                  <snm>Brudno</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Malde</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Poliakov</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Do</snm>
                  <fnm>CB</fnm>
               </au>
               <au>
                  <snm>Couronne</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Dubchak</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Batzoglou</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <issue>Suppl 1</issue>
            <fpage>i54</fpage>
            <lpage>i62</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">12855437</pubid>
                  <pubid idtype="doi">10.1093/bioinformatics/btg1005</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>BLAT&#8211;the BLAST-like alignment tool</p>
            </title>
            <aug>
               <au>
                  <snm>Kent</snm>
                  <fnm>WJ</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2002</pubdate>
            <volume>12</volume>
            <issue>4</issue>
            <fpage>656</fpage>
            <lpage>664</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">11932250</pubid>
                  <pubid idtype="doi">10.1101/gr.229202</pubid>
                  <pubid idtype="pmcid">187518</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>Close sequence comparisons are sufficient to identify human cis-regulatory elements</p>
            </title>
            <aug>
               <au>
                  <snm>Prabhakar</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Poulin</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Shoukry</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Afzal</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Rubin</snm>
                  <fnm>EM</fnm>
               </au>
               <au>
                  <snm>Couronne</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Pennacchio</snm>
                  <fnm>LA</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2006</pubdate>
            <volume>16</volume>
            <issue>7</issue>
            <fpage>855</fpage>
            <lpage>863</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">16769978</pubid>
                  <pubid idtype="doi">10.1101/gr.4717506</pubid>
                  <pubid idtype="pmcid">1484452</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B33">
            <title>
               <p>Methods for assessing the statistical significance of molecular sequence features by using general scoring schemes</p>
            </title>
            <aug>
               <au>
                  <snm>Karlin</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Altschul</snm>
                  <fnm>SF</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1990</pubdate>
            <volume>87</volume>
            <issue>6</issue>
            <fpage>2264</fpage>
            <lpage>2268</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">2315319</pubid>
                  <pubid idtype="doi">10.1073/pnas.87.6.2264</pubid>
                  <pubid idtype="pmcid">53667</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B34">
            <title>
               <p>Basic local alignment search tool</p>
            </title>
            <aug>
               <au>
                  <snm>Altschul</snm>
                  <fnm>SF</fnm>
               </au>
               <au>
                  <snm>Gish</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Myers</snm>
                  <fnm>EW</fnm>
               </au>
               <au>
                  <snm>Lipman</snm>
                  <fnm>DJ</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1990</pubdate>
            <volume>215</volume>
            <issue>3</issue>
            <fpage>403</fpage>
            <lpage>410</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">2231712</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B35">
            <title>
               <p>Protein homology detection by HMM-HMM comparison</p>
            </title>
            <aug>
               <au>
                  <snm>S&#246;ding</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2005</pubdate>
            <volume>21</volume>
            <issue>7</issue>
            <fpage>951</fpage>
            <lpage>960</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">15531603</pubid>
                  <pubid idtype="doi">10.1093/bioinformatics/bti125</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B36">
            <title>
               <p>T-Reg Comparator: an analysis tool for the comparison of position weight matrices</p>
            </title>
            <aug>
               <au>
                  <snm>Roepcke</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Grossmann</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Rahmann</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Vingron</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2005</pubdate>
            <issue>33 Web Server</issue>
            <fpage>W438</fpage>
            <lpage>W441</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">15980506</pubid>
                  <pubid idtype="doi">10.1093/nar/gki590</pubid>
                  <pubid idtype="pmcid">1160266</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B37">
            <title>
               <p>Phylogenetic inference in protein superfamilies: analysis of SH2 domains</p>
            </title>
            <aug>
               <au>
                  <snm>Sj&#246;lander</snm>
                  <fnm>K</fnm>
               </au>
            </aug>
            <source>Proc Int Conf Intell Syst Mol Biol</source>
            <pubdate>1998</pubdate>
            <volume>6</volume>
            <fpage>165</fpage>
            <lpage>174</lpage>
            <xrefbib>
               <pubid idtype="pmpid">9783222</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B38">
            <title>
               <p>Systematic DNA-binding domain classification of transcription factors</p>
            </title>
            <aug>
               <au>
                  <snm>Stegmaier</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Kel</snm>
                  <fnm>AE</fnm>
               </au>
               <au>
                  <snm>Wingender</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>Genome Inform</source>
            <pubdate>2004</pubdate>
            <volume>15</volume>
            <issue>2</issue>
            <fpage>276</fpage>
            <lpage>286</lpage>
            <xrefbib>
               <pubid idtype="pmpid">15706513</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B39">
            <title>
               <p>[Classification of eukaryotic transcription factors]</p>
            </title>
            <aug>
               <au>
                  <snm>Wingender</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>Mol Biol (Mosk)</source>
            <pubdate>1997</pubdate>
            <volume>31</volume>
            <issue>4</issue>
            <fpage>584</fpage>
            <lpage>600</lpage>
            <note>[Article in Russian]</note>
            <xrefbib>
               <pubid idtype="pmpid">9340487</pubid>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>
