<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2164-7-235</ui>
   <ji>1471-2164</ji>
   <fm>
      <dochead>Research article</dochead>
      <bibl>
         <title>
            <p><it>Cis</it>-regulatory variations: A study of SNPs around genes showing <it>cis</it>-linkage in segregating mouse populations</p>
         </title>
         <aug>
            <au id="A1" ca="yes">
               <snm>GuhaThakurta</snm>
               <fnm>Debraj</fnm>
               <insr iid="I1"/>
               <email>debraj_guhathakurta@merck.com</email>
            </au>
            <au id="A2">
               <snm>Xie</snm>
               <fnm>Tao</fnm>
               <insr iid="I1"/>
               <email>tao_xie@merck.com</email>
            </au>
            <au id="A3">
               <snm>Anand</snm>
               <fnm>Manish</fnm>
               <insr iid="I1"/>
               <insr iid="I4"/>
               <email>manish__anand@hotmail.com</email>
            </au>
            <au id="A4">
               <snm>Edwards</snm>
               <mi>W</mi>
               <fnm>Stephen</fnm>
               <insr iid="I1"/>
               <email>stephen_edwards@merck.com</email>
            </au>
            <au id="A5">
               <snm>Li</snm>
               <fnm>Guoya</fnm>
               <insr iid="I2"/>
               <email>guoya09@yahoo.com</email>
            </au>
            <au id="A6">
               <snm>Wang</snm>
               <mi>S</mi>
               <fnm>Susanna</fnm>
               <insr iid="I3"/>
               <email>sueming@ucla.edu</email>
            </au>
            <au ca="yes" id="A7">
               <snm>Schadt</snm>
               <mi>E</mi>
               <fnm>Eric</fnm>
               <insr iid="I1"/>
               <email>eric_schadt@merck.com</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Genetics, Rosetta Inpharmatics LLC, a wholly owned subsidiary of Merck &amp; Co., Inc. 401 Terry Avenue North, Seattle, WA 98109, USA</p>
            </ins>
            <ins id="I2">
               <p>Informatics, Rosetta Inpharmatics LLC, a wholly owned subsidiary of Merck &amp; Co., Inc. 401 Terry Avenue North, Seattle, WA 98109, USA</p>
            </ins>
            <ins id="I3">
               <p>Department of Medicine, David Geffen School of Medicine, University of California, Los Angeles, CA 90095-1679, USA</p>
            </ins>
            <ins id="I4">
               <p>Microsoft Corporation, One Microsoft Way, Redmond, WA 98052-6399, USA</p>
            </ins>
         </insg>
         <source>BMC Genomics</source>
         <issn>1471-2164</issn>
         <pubdate>2006</pubdate>
         <volume>7</volume>
         <issue>1</issue>
         <fpage>235</fpage>
         <url>http://www.biomedcentral.com/1471-2164/7/235</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">16978413</pubid>
               <pubid idtype="doi">10.1186/1471-2164-7-235</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>07</day>
               <month>6</month>
               <year>2006</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>15</day>
               <month>9</month>
               <year>2006</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>15</day>
               <month>9</month>
               <year>2006</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2006</year>
         <collab>GuhaThakurta et al; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Changes in gene expression are known to be responsible for phenotypic variation and susceptibility to diseases. Identification and annotation of the genomic sequence variants that cause gene expression changes is therefore likely to lead to a better understanding of the cause of disease at the molecular level. In this study we investigate the pattern of single nucleotide polymorphisms (SNPs) in genes for which the mRNA levels show <it>cis</it>-genetic linkage (gene <ul>e</ul>xpression <ul>q</ul>uantitative <ul>t</ul>rait <ul>l</ul>oci mapping in <it>cis</it>, or <it>cis</it>-eQTLs) in segregating mouse populations. Such genes are expected to have polymorphisms near their physical location (<it>cis</it>-variations) that affect their mRNA levels by altering one or more of the <it>cis</it>-regulatory elements. This led us to characterize the SNPs in promoter (5 Kb upstream) and non-coding gene regions (introns and 5 Kb downstream) (<it>cis</it>-SNPs) and the effects they may have on putative transcription factor binding sites.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>We demonstrate that the <it><ul>c</ul>is</it>-<ul>e</ul>QTL genes (CEGs) have a significantly higher frequency of <it>cis</it>-SNPs compared to non-CEGs (when both sets are taken from the non-IBD regions, i.e. regions not identical by descent). Most CEGs having <it>cis</it>-SNPs do not contain these SNPs in the phylogenetically conserved regions. In those CEGs that contain <it>cis</it>-SNPs in the phylogenetically conserved regions, enrichment of <it>cis</it>-SNPs occurs both within and outside of the conserved sequences. A higher fraction of CEGs are also seen to harbor <it>cis</it>-SNPs that affect predicted transcription factor binding sites, a likely consequence of the higher <it>cis</it>-SNP density in these genes.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>The present study provides the first genome-wide investigation of the putative <it>cis</it>-regulatory variations in a large set of genes whose levels of expression give rise to <it>cis</it>-linkage in segregating mammalian populations. Our results provide insights into the challenges that exist in identifying polymorphisms regulating gene expression using bioinformatic sequence analysis approaches. The data provided herein should benefit future investigations in this area.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="bmc" subtype="user_supplied_xml" id="endnote"/>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>Single nucleotide polymorphisms (SNPs) in the genomic sequence underlie susceptibility to or protection from diseases by affecting biological processes at the molecular level, such as protein structure, transcription, and alternative splicing <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. There are a number of examples in which polymorphisms in the promoter regions, and those causing expression changes in the corresponding genes, have been found to be associated with disease <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr><abbr bid="B5">5</abbr></abbrgrp>. In addition, genetic variation of gene expression has been utilized to identify causal genes for complex diseases <abbrgrp><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr></abbrgrp>. However, the pattern of polymorphisms that underlie heritable variation of gene expression in segregating mammalian populations, as well as bioinformatic sequence analysis methods for identifying these regulatory polymorphisms, have not yet been investigated in a systematic way. Here we characterize the pattern of <it>cis</it>-SNPs that could cause quantitative genetic variations in mRNA levels in two mouse intercross populations.</p>
         <p>We investigated the frequency and the potential role of the <it>cis</it>-SNPs for disrupting transcription factor binding sites (TFBS) around the genes whose expression levels in murine intercross populations gave rise to strong <it>cis</it>-acting eQTL. We focused on this set of genes for the following reasons: 1) a sizable fraction of genes whose expression varies in a segregating population show <it>cis</it>-linkage <abbrgrp><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr><abbr bid="B12">12</abbr></abbrgrp>, 2) evidence for the medical importance of <it>cis</it>-regulatory variation has been demonstrated by positional cloning studies in which SNPs in susceptibility genes that were not located in the protein coding or splice-site regions were nevertheless shown to be associated with complex human diseases such as stroke, type 2 diabetes etc. <abbrgrp><abbr bid="B4">4</abbr><abbr bid="B13">13</abbr><abbr bid="B14">14</abbr></abbrgrp>, 3) the polymorphisms that affect the expression levels of these genes are either in the genomic region of the gene or in the nearby upstream or downstream region (<it>cis</it>-regulatory variation <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>), which significantly restricts the search space for these causal variations.</p>
         <p>We found a significantly higher number of <it><ul>cis</ul></it>-acting <ul>e</ul>QTL genes (<it>CEGs</it>) were in regions that were not <ul>i</ul>dentical <ul>b</ul>y <ul>d</ul>escent (IBD) between the parental inbred mouse lines used to construct the mouse crosses. In considering the genes that fall outside of these IBD regions, we found that a significantly higher number of CEGs have <it>cis</it>-SNPs in their promoter (i.e. immediate 5' upstream sequence) and non-coding regions (i.e. introns and immediate 3' downstream sequences) compared to genes that do not give rise to <it>cis</it>-acting eQTLs (non-CEGs). The density of SNPs in these regions is also significantly higher in the CEGs compared to non-CEGs. In addition, the enrichment of <it>cis</it>-SNPs is not limited to the highly conserved sequences between mouse and human, and in fact in a majority of the CEGs the <it>cis</it>-SNPs do not overlap any conserved sequences in the promoter or non-coding regions, suggesting that the <it>cis</it>-SNPs in these genes do not perturb the highly conserved sequences in the immediate vicinity. A higher fraction of CEGs have <it>cis</it>-SNPs that perturb predicted transcription factor binding sites (TFBS) in non-coding regions, likely a consequence of the higher <it>cis</it>-SNP density in these regions resulting in an increased number of intersections between <it>cis</it>-SNPs and the TFBSs.</p>
         <p>The implications of the above findings on the challenges related to the identification and annotation of genomic regulatory polymorphisms through bioinformatic sequence analysis methods are discussed. Our results suggest that the approaches that are commonly employed in identification of putative regulatory variants, such as searches for polymorphisms in the immediate upstream regions and cross-species conserved sequences, are unlikely to elucidate a significant fraction of the <it>cis</it>-regulatory variations responsible for causing changes in gene expression in genetically segregating mammalian populations.</p>
      </sec>
      <sec>
         <st>
            <p>Results</p>
         </st>
         <sec>
            <st>
               <p>Mouse intercross populations and cis-acting eQTL genes</p>
            </st>
            <p>mRNA expression data for multiple tissues in F<sub>2 </sub>animals from two mouse intercrosses constructed from C57BL/6J and DBA/2J (referred to here as the BXD cross) <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B15">15</abbr></abbrgrp>, and from C57BL/6J and C3H/HeJ inbred lines (referred to here as the BXH cross) <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>, were available for analysis (for details see Methods). The BXD F<sub>2 </sub>population <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B15">15</abbr></abbrgrp> consisted of 111 female mice and comprehensive mRNA expression profiles were available for liver, while the BXH F<sub>2 </sub>population <abbrgrp><abbr bid="B16">16</abbr></abbrgrp> contained 334 mice (169 female, 165 male) and expression profiles were available for four tissues, namely liver, white adipose, whole brain, and skeletal muscle. All of the expression data from the two crosses used here for analysis were generated and described previously <abbrgrp><abbr bid="B6">6</abbr><abbr bid="B10">10</abbr><abbr bid="B16">16</abbr></abbrgrp>.</p>
            <p>In the same manner as classic phenotypic trait data, QTLs for gene expression levels can be computationally mapped using genetic linkage mapping strategies <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>. mRNA levels of genes were treated as continuous variables and mapped to the genome using a standard interval mapping procedure <abbrgrp><abbr bid="B17">17</abbr></abbrgrp> to identify <ul>e</ul>xpression QTLs (eQTLs). From the linkage results CEGs were defined as follows: 1) eQTL LOD score &#8805; 4.3 (the threshold in an F<sub>2 </sub>mouse intercross for achieving a genome-wide p-value of 0.05, and a point-wise significance of 0.00005), 2) eQTL is near the physical location of the gene itself (within 10 Mb, equivalent to roughly 5 cM), 3) the eQTL explains more than 10% of the genetic variation of expression for the gene in the respective F<sub>2 </sub>populations.</p>
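            <p>The three criteria above can be sketched as a simple filter. This is a minimal illustration rather than the pipeline used in the study: the record layout, the field names, and the use of a single representative coordinate for each gene and each eQTL peak are assumptions made for the example; only the three thresholds are taken from the text.</p>
            <p><![CDATA[
```python
from dataclasses import dataclass

# Thresholds quoted in the text.
MIN_LOD = 4.3                     # genome-wide p = 0.05 in an F2 intercross
MAX_DISTANCE_BP = 10_000_000      # eQTL within 10 Mb of the gene
MIN_VARIANCE_EXPLAINED = 0.10     # eQTL explains more than 10% of the variation


@dataclass
class EqtlRecord:
    """Hypothetical record for one gene/eQTL pair (field names are assumptions)."""
    gene: str
    gene_chrom: str
    gene_pos: int               # representative gene coordinate, in bp
    qtl_chrom: str
    qtl_pos: int                # eQTL peak coordinate, in bp
    lod: float
    variance_explained: float   # fraction between 0 and 1


def is_ceg(rec: EqtlRecord) -> bool:
    """Return True when the record meets all three CEG criteria."""
    in_cis = (rec.gene_chrom == rec.qtl_chrom
              and abs(rec.gene_pos - rec.qtl_pos) <= MAX_DISTANCE_BP)
    return (rec.lod >= MIN_LOD
            and in_cis
            and rec.variance_explained > MIN_VARIANCE_EXPLAINED)


# Invented example records, for illustration only.
records = [
    EqtlRecord("geneA", "chr1", 50_000_000, "chr1", 52_000_000, 12.1, 0.35),
    EqtlRecord("geneB", "chr1", 50_000_000, "chr7", 52_000_000, 9.0, 0.30),  # trans
    EqtlRecord("geneC", "chr2", 10_000_000, "chr2", 11_000_000, 3.9, 0.20),  # low LOD
]
cegs = [r.gene for r in records if is_ceg(r)]
print(cegs)
```
]]></p>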
            <p>Using the specific conditions described above, a total of 3,769 distinct CEGs were identified (roughly 20% of the genes represented on the array) over all four tissues in the BXH cross, and 338 CEGs were identified in the BXD cross. Reasons for identification of significantly fewer CEGs in the BXD cross relative to the BXH cross include: 1) availability of mRNA profiles from only one (liver) tissue compared to four tissues in the BXH cross, and 2) a lower number of animals (111 in BXD compared to 334 in BXH) resulting in a reduced power to detect QTLs. The number of CEGs for the BXD cross given here is less than previously reported for this same cross <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr></abbrgrp>, given we employed a more conservative definition of <it>cis</it>-eQTLs in the present study for the purpose of minimizing the false positive calls and working with the highest confidence CEGs. The CEGs from all tissues in both crosses are provided in the supplementary materials (Additional files <supplr sid="S1">1</supplr> and <supplr sid="S2">2</supplr>). We describe later how we prepared a common set of CEGs and non-CEGs for analysis by combining the data from the two crosses.</p>
            <suppl id="S1">
               <title>
                  <p>Additional file 1</p>
               </title>
               <text>
                  <p>All genes showing <it>cis</it>-expression linkage (CEGs) in the two mouse crosses. The mouse cross, the tissue in which the <it>cis</it>-linkage was observed, LocusLink identifier, official gene symbol, physical locations of the genes (UCSC mm4 or NCBI build 32 assembly), and the <it>cis</it>-acting LOD scores are given.</p>
               </text>
               <file name="1471-2164-7-235-S1.xls">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <suppl id="S2">
               <title>
                  <p>Additional file 2</p>
               </title>
               <text>
                  <p>The complete integrated list of CEGs from BxD and BxH crosses. Gene coordinates are with respect to UCSC mm4 assembly (NCBI build 32). The <it>cis</it>-acting eQTL information of these genes may be obtained from the <supplr sid="S1">Additional file 1</supplr>.</p>
               </text>
               <file name="1471-2164-7-235-S2.xls">
                  <p>Click here for file</p>
               </file>
            </suppl>
         </sec>
         <sec>
            <st>
               <p>IBD regions between parental strains and the distribution of CEGs with respect to the IBD status</p>
            </st>
            <p>Genomic segments in different mouse strains that are inherited from a common ancestor are referred to as <ul>i</ul>dentical <ul>b</ul>y <ul>d</ul>escent or IBD. The IBD regions can be considered to be largely homologous sequence blocks between two strains, while the non-IBD (nIBD) regions can be considered as polymorphic blocks. Most of the polymorphisms between mouse strains exist in sequence regions that are not in IBD <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>, and reported variations that are in the IBD regions either represent sequencing errors or mutations that occurred in the strains after sub-speciation.</p>
            <p>So that the readers can focus on the key findings of the manuscript, we present the details of the IBD map we have used here <abbrgrp><abbr bid="B19">19</abbr></abbrgrp> and the reason for its selection in the Data and Methods section. However, it is worth mentioning here briefly that a very significant enrichment of CEGs was observed in the regions that were <ul>n</ul>ot in <ul>IBD</ul> (nIBD) (with the Fisher exact test p = 8.87 &#215; 10<sup>-28 </sup>for CEGs from the BxD cross, and 4.08 &#215; 10<sup>-296 </sup>for CEGs from the BxH cross). The analyses described in subsequent sections were performed with the set of genes that are <it>not </it>in IBD regions. This is because genes in IBD regions would be expected to have significantly fewer SNPs if any, in the surrounding regions, and therefore the comparison of patterns of polymorphisms in those genes with the CEGs, most of which contain <it>cis</it>-SNPs and are in nIBD, would not be appropriate.</p>
            <p>Since the C3H strain has not yet been sequenced, a complete set of SNPs between the B6 and C3H parental strains used to construct the BXH cross was not available from public sources or the Celera mouse SNP database <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>. Only a small number of SNPs (25,064) that mapped uniquely to the mouse genome were available between these two strains (from the public dbSNP database <abbrgrp><abbr bid="B21">21</abbr></abbrgrp> (build 120)). Therefore, we used the set of SNPs that were polymorphic between B6 and DBA for analysis of the <it>cis</it>-SNPs around CEGs in the BXH cross as described below, imputing the regions of shared haplotypes between strains using the IBD map. Genomic sequence blocks that were IBD between C3H and DBA, but nIBD between B6 and C3H and nIBD between B6 and DBA (see Figure <figr fid="F1">1</figr>), were identified. These regions are called nIBD-BXH for reference. The nIBD-BXH regions identified in this way are expected to be homologous between C3H and DBA, but polymorphic between B6 and DBA as well as between B6 and C3H. In the nIBD-BXH regions, SNPs occurring between B6 and DBA should thus be the same as those occurring between B6 and C3H. Of the 492,250 SNPs polymorphic between B6 and DBA identified as falling in non-repeat regions, 274,908 (55.8%) were nIBD-BXH. Genes and SNPs contained in the nIBD-BXH regions were used for analysis of data from the BXH cross.</p>
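            <p>The rule used to label a block as nIBD-BXH can be written as a small predicate. The sketch below assumes that each genomic block has already been annotated with the strain pairs that are IBD within it; the block names and the data layout are hypothetical and serve only to illustrate the logic described in the text.</p>
            <p><![CDATA[
```python
from typing import FrozenSet, Set

Pair = FrozenSet[str]


def pair(a: str, b: str) -> Pair:
    """Unordered strain pair, so that (B6, DBA) equals (DBA, B6)."""
    return frozenset((a, b))


def is_nibd_bxh(ibd_pairs: Set[Pair]) -> bool:
    """True when C3H and DBA are IBD but B6 is IBD with neither of them."""
    return (pair("C3H", "DBA") in ibd_pairs
            and pair("B6", "C3H") not in ibd_pairs
            and pair("B6", "DBA") not in ibd_pairs)


# Hypothetical blocks: name -> strain pairs that are IBD within the block.
blocks = {
    "block1": {pair("C3H", "DBA")},   # C3H and DBA shared, B6 differs: qualifies
    "block2": {pair("B6", "DBA")},    # B6 and DBA identical: excluded
    "block3": set(),                  # all three strains differ: excluded
    "block4": {pair("C3H", "DBA"), pair("B6", "C3H"), pair("B6", "DBA")},  # all IBD
}
nibd_bxh = sorted(name for name, pairs in blocks.items() if is_nibd_bxh(pairs))
print(nibd_bxh)
```
]]></p>
            <p>Only blocks of the first kind allow SNPs observed between B6 and DBA to stand in for the unobserved SNPs between B6 and C3H.</p>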
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>Determining nIBD-BXH regions from IBD blocks between mouse strains</p>
               </caption>
               <text>
                  <p><b>Determining nIBD-BXH regions from IBD blocks between mouse strains</b>. B6 refers to C57BL/6J, DBA refers to DBA/2J, and C3H refers to C3H/HeJ. Horizontal bars represent genomic sequence. Regions that are in the same color between two or more strains represent the IBD blocks between those strains. nIBD-BXH (indicated with a box) are regions that are IBD between C3H/HeJ and DBA/2J, but nIBD between C3H/HeJ and C57BL/6J, and nIBD between C57BL/6J and DBA/2J, as explained in the text.</p>
               </text>
               <graphic file="1471-2164-7-235-1"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>Creating a common set of CEGs and non-CEGs from the BXD and BXH crosses</p>
            </st>
            <p>To characterize the frequency and location of <it>cis</it>-SNPs in genes, we constructed a common set of CEGs and non-CEGs from the BXD and BXH crosses. Given the number of CEGs identified in the BXD cross was small (~10% of the total available from both crosses), these data on their own would not be as highly powered to identify <it>cis</it>-SNP patterns of interest in the CEGs. Therefore, we combined the CEGs from the BXD and BXH crosses to carry out all subsequent analyses. Combining the 338 CEGs from the BXD cross with the 3,769 CEGs from the BXH cross, and only considering the CEGs within the nIBD-BXH regions, resulted in a set of 2,047 distinct CEGs (see <supplr sid="S2">Additional file 2</supplr>). The inclusion of CEGs from the BXD cross added 75 distinct CEGs to the BXH data set. For the purpose of comparison with the CEGs, we created a set of non-CEGs by considering genes that did not give rise to any <it>cis</it>-eQTL in either cross. To be consistent with the CEG set, only non-CEGs falling in the nIBD-BXH regions were considered. Thus, a combined set of 2,705 distinct non-CEGs was created by taking the intersection of the genes that did not show <it>cis</it>-acting eQTLs in either of the two crosses. It is of note that some of the non-CEGs defined here may show up as CEGs in other segregating mouse populations, in other tissues, or in other F<sub>2 </sub>populations constructed from B6, DBA, and C3H mice (a larger number of mice in an F<sub>2 </sub>population would give higher power to detect eQTLs). As additional comprehensive sets of CEGs become available, the sets of CEGs and non-CEGs can be refined to produce more accurate positive as well as negative sets.</p>
         </sec>
         <sec>
            <st>
               <p>Fraction of CEGs containing cis-SNPs is significantly higher compared to non-CEGs</p>
            </st>
            <p>CEGs by definition are expected to contain genetic variations near their physical location on the genome which give rise to variations of their mRNA levels in a segregating population. We have therefore studied the frequency and density of SNPs in the promoters and non-coding regions (for definitions see below) of the CEGs and compared them to non-CEGs. These studies are described below.</p>
            <p>In defining the promoters and non-coding regions, the gene boundaries and exons were first determined based on clustering of all mRNAs and cDNAs (including ESTs) aligning to a common genomic locus as described in detail earlier <abbrgrp><abbr bid="B22">22</abbr><abbr bid="B23">23</abbr></abbrgrp> (see Methods). The promoter regions were then defined to be the 5 Kb or 2 Kb sequence upstream of the gene start coordinates. The non-coding regions comprised the introns and the 5 Kb sequence downstream of the genes. SNPs in the promoter and non-coding regions of genes are referred to here as <it>cis</it>-SNPs.</p>
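            <p>The region definitions above can be illustrated with a short sketch. The coordinate convention (0-based, half-open intervals), the strand handling, and the function names are assumptions made for the example and are not taken from the study; the 5 Kb flank lengths follow the text, and the gene and SNP positions are invented.</p>
            <p><![CDATA[
```python
from typing import List, Tuple

Interval = Tuple[int, int]  # assumed 0-based, half-open (start, end)


def promoter_region(start: int, end: int, strand: str, length: int = 5000) -> Interval:
    """Sequence immediately upstream of the gene start (strand-aware)."""
    if strand == "+":
        return (max(0, start - length), start)
    return (end, end + length)


def noncoding_regions(start: int, end: int, strand: str,
                      exons: List[Interval], flank: int = 5000) -> List[Interval]:
    """Introns (gaps between consecutive exons) plus the downstream flank."""
    ordered = sorted(exons)
    introns = [(prev_end, next_start)
               for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
               if next_start > prev_end]
    if strand == "+":
        downstream = (end, end + flank)
    else:
        downstream = (max(0, start - flank), start)
    return introns + [downstream]


def count_snps(positions: List[int], regions: List[Interval]) -> int:
    """Number of SNP positions falling inside any of the regions."""
    return sum(any(lo <= pos < hi for lo, hi in regions) for pos in positions)


# Hypothetical gene on the forward strand with three exons.
exons = [(10_000, 11_000), (15_000, 16_000), (19_000, 20_000)]
prom = promoter_region(10_000, 20_000, "+")
ncr = noncoding_regions(10_000, 20_000, "+", exons)
snps = [6_200, 10_500, 12_000, 24_999, 25_000]
print(prom, ncr)
print(count_snps(snps, [prom]), count_snps(snps, ncr))
```
]]></p>
            <p>Passing <it>length=2000</it> to the promoter function gives the 2 Kb variant of the promoter definition.</p>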
            <p>Although transcriptional regulatory elements are often found to be concentrated in the immediate promoter region, they are also located in the introns and downstream regions <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>. On the one hand, examining only the promoter sequence would clearly be insufficient; on the other hand, including the introns and downstream sequences could dilute the density of regulatory elements (if in fact they were enriched in the immediate promoter regions of most genes under consideration in our study), thereby making it difficult to identify any relationship between SNPs and these elements. Therefore we analyzed the promoter and non-coding regions (NCR) separately. In addition to the immediate vicinity of the genes, regulatory elements such as enhancers or silencers can also be present at distances that are far away from the genes themselves <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>; we have not addressed these in our present study.</p>
            <p>We analyzed <it>cis</it>-SNPs in regions that were most conserved between the mouse and human genomes. Functional non-coding sequences are often assumed to be under evolutionary selection pressure, and thereby conserved relative to the surrounding non-functional sequence. Consequently, phylogenetic footprinting has been widely used for the analyses of non-coding regulatory sequences <abbrgrp><abbr bid="B26">26</abbr><abbr bid="B27">27</abbr><abbr bid="B28">28</abbr><abbr bid="B29">29</abbr></abbrgrp>. Although phylogenetic footprinting methods have limitations (sequences from organisms that are too distant or too close can be uninformative), the alignments of rodent-human sequences have been demonstrated in many studies to be successful in identifying regulatory elements, and significant enrichment of known regulatory elements has been found in these regions <abbrgrp><abbr bid="B26">26</abbr><abbr bid="B27">27</abbr><abbr bid="B28">28</abbr><abbr bid="B29">29</abbr><abbr bid="B30">30</abbr></abbrgrp>. We therefore investigated the presence of SNPs in the mouse-human aligned regions in the promoters and non-coding regions to see if a higher fraction of CEGs contain <it>cis</it>-SNPs in these conserved sequences. For this purpose the mouse-human genome alignments were taken directly from the UCSC genome annotation project <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>, where the two genomes were aligned using the BLASTZ software <abbrgrp><abbr bid="B32">32</abbr></abbrgrp> and post-processed to obtain the best alignments for each region (see Methods for details). These alignments represent the most conserved sequences between the mouse and human genomes and cover ~6% of the mouse genome, which is roughly the percentage of the mammalian genome that is estimated to be under purifying selection <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>.</p>
            <p>A significantly higher fraction of the CEGs contained <it>cis</it>-SNPs (at p &lt; 0.01 with Fisher exact test, Table <tblr tid="T1">1</tblr>) compared to non-CEGs. When we considered <it>cis</it>-SNPs contained only within regions that are conserved between mouse and human, the fraction of CEGs containing <it>cis</it>-SNPs was still observed to be higher than non-CEGs (p &lt; 0.01, Table <tblr tid="T1">1</tblr>), but the significance was lower for the conserved promoter regions (p ~10<sup>-3</sup>) compared to all promoter regions (p ~10<sup>-12</sup>). A ratio of over-representation (ROR) for CEGs containing <it>cis</it>-SNPs may be defined as the ratio of the fraction of CEGs containing <it>cis</it>-SNPs to the fraction of all genes (CEGs+non-CEGs) containing <it>cis</it>-SNPs (Table <tblr tid="T1">1</tblr>, last column). The ROR values were lower when considering <it>cis</it>-SNPs in the conserved promoter regions relative to all promoter regions. Therefore the decreased significance of CEGs containing SNPs in the conserved regions of the promoters could be explained by the decreased ROR value. Another reason contributing to the decreased significance could be the smaller sample size, given many fewer genes contained <it>cis</it>-SNPs in conserved regions.</p>
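            <p>The ROR defined above, and a Fisher exact test on a 2 x 2 table, can be sketched as follows. The function names are hypothetical, and the one-sided (enrichment) form of the test is an assumption, since the text does not state which alternative was used. The ROR example uses the counts from the NCR row of Table 1; the small 2 x 2 table is invented purely to exercise the test function.</p>
            <p><![CDATA[
```python
from fractions import Fraction
from math import comb


def ratio_of_overrepresentation(n_genes, n_cegs, genes_with_snp, cegs_with_snp):
    """Fraction of CEGs with cis-SNPs divided by the fraction of all genes with cis-SNPs."""
    return (cegs_with_snp / n_cegs) / (genes_with_snp / n_genes)


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins whose top-left cell is at least as large as the observed a.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    tail = sum(comb(col1, k) * comb(n - col1, row1 - k)
               for k in range(a, min(row1, col1) + 1))
    return float(Fraction(tail, comb(n, row1)))


# Counts from the NCR row of Table 1.
n_genes, n_cegs, genes_with_snp, cegs_with_snp = 4752, 2047, 3514, 1769
ror = ratio_of_overrepresentation(n_genes, n_cegs, genes_with_snp, cegs_with_snp)
print(round(ror, 3))

# Small invented table (not from the study); its tail probability is 17/70.
p_small = fisher_exact_greater(3, 1, 1, 3)
print(round(p_small, 4))
```
]]></p>
            <p>For a row of Table 1, the corresponding 2 x 2 table would hold CEGs with and without <it>cis</it>-SNPs in the first row and non-CEGs with and without <it>cis</it>-SNPs in the second.</p>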
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>Genes containing SNPs in promoters (Prom) and non-coding regions (NCR)</p>
               </caption>
               <tblbdy cols="7">
                  <r>
                     <c ca="center">
                        <p>
                           <b>Region</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Number of Total Genes (CEGs+Non-CEGs)</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Number of CEGs</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Total Genes Containing SNPs</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>CEGs Containing SNPs</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>P-value (FET)</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Ratio of Over-representation</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>NCR</p>
                     </c>
                     <c ca="center">
                        <p>4752</p>
                     </c>
                     <c ca="center">
                        <p>2047</p>
                     </c>
                     <c ca="center">
                        <p>3514</p>
                     </c>
                     <c ca="center">
                        <p>1769</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>6.03E-12</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1.169</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Prom 2 Kb</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>1569</p>
                     </c>
                     <c ca="center">
                        <p>863</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>2.48E-12</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1.277</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Prom 5 Kb</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>2260</p>
                     </c>
                     <c ca="center">
                        <p>1220</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>3.50E-12</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1.253</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Cons NCR</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>1476</p>
                     </c>
                     <c ca="center">
                        <p>782</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>4.20E-12</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1.230</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Cons Prom 2 Kb</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>236</p>
                     </c>
                     <c ca="center">
                        <p>122</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>2.60E-03</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1.200</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Cons Prom 5 Kb</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>388</p>
                     </c>
                     <c ca="center">
                        <p>196</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>8.79E-04</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1.173</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>CEG and non-CEG sets are defined in the text. Data are given for genes containing SNPs in the non-coding region (NCR) and in the 2 Kb (Prom 2 Kb) and 5 Kb (Prom 5 Kb) upstream regions. Data on genes containing SNPs in regions conserved between mouse and human are indicated by 'Cons'. p-values less than 0.01 are in bold. All p-values are based on the Fisher exact test (FET). The ratio of over-representation (ROR) is defined as the ratio of the fraction of CEGs containing <it>cis</it>-SNPs to the fraction of all genes (CEGs+non-CEGs) containing <it>cis</it>-SNPs.</p>
               </tblfn>
            </tbl>
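            <p>The ROR defined in the footnote of Table 1 can be illustrated with a minimal sketch. The function name and the counts used below are hypothetical and are not taken from Table 1; only the formula follows the definition given above.</p>

```python
def ratio_of_over_representation(ceg_with_snp, ceg_total, all_with_snp, all_total):
    """ROR = (fraction of CEGs with cis-SNPs) / (fraction of all genes with cis-SNPs).

    "All genes" means CEGs plus non-CEGs, as in the Table 1 footnote.
    """
    if ceg_total == 0 or all_total == 0 or all_with_snp == 0:
        raise ValueError("totals and the number of SNP-containing genes must be positive")
    ceg_fraction = ceg_with_snp / ceg_total
    all_fraction = all_with_snp / all_total
    return ceg_fraction / all_fraction


# Hypothetical counts for illustration only (not the counts in Table 1):
# 60 of 100 CEGs and 200 of 500 genes overall carry a cis-SNP in the region.
example_ror = ratio_of_over_representation(60, 100, 200, 500)
```

            <p>An ROR above 1 indicates that CEGs carry <it>cis</it>-SNPs in the region more often than genes in general; the significance of the enrichment is assessed separately with the FET.</p>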
         </sec>
         <sec>
            <st>
               <p>Higher density of SNPs in promoters and non-coding regions of CEGs</p>
            </st>
            <p>Next, we compared the <it>cis</it>-SNP density in the promoters and non-coding regions of CEGs with that of non-CEGs. Genes with no <it>cis</it>-SNPs in their promoters or non-coding regions were excluded from this analysis, since the numbers of genes containing <it>cis</it>-SNPs had already been compared (Table <tblr tid="T1">1</tblr>); including genes with no <it>cis</it>-SNPs would only increase the significance of the p-values in Table <tblr tid="T1">1</tblr>, since a higher fraction of the CEGs contain <it>cis</it>-SNPs compared to non-CEGs. <it>Cis</it>-SNP densities in the two sets were compared using the non-parametric Wilcoxon rank sum test (Table <tblr tid="T2">2</tblr>). A non-parametric method was used because the distributions under study were non-normal. A significantly higher density of <it>cis</it>-SNPs (number of SNPs per Kb of total non-coding or promoter sequence) was observed in CEGs compared to non-CEGs (p &lt; 0.01).</p>
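            <p>As an illustration of the comparison described above, the following sketch computes a one-sided rank sum p-value using the large-sample normal approximation. It is not the software used in this study: the function names are hypothetical, no tie or continuity correction is applied, and the ranking helper is quadratic in the number of genes.</p>

```python
from statistics import NormalDist


def average_ranks(values):
    """1-based ranks; tied values receive the mean of the positions they span."""
    return [
        sum(1 for w in values if v > w) + (sum(1 for w in values if w == v) + 1) / 2
        for v in values
    ]


def rank_sum_one_sided(ceg, non_ceg):
    """Approximate p-value for H_A: CEG densities tend to exceed non-CEG densities."""
    n1, n2 = len(ceg), len(non_ceg)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    ranks = average_ranks(list(ceg) + list(non_ceg))
    # Mann-Whitney U statistic for the CEG sample, derived from its rank sum.
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5  # assumption: no tie correction
    z = (u - mean_u) / sd_u
    return 1.0 - NormalDist().cdf(z)
```

            <p>Because the test uses only ranks, it does not require the SNP densities to be normally distributed, which is the reason given above for choosing it.</p>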
            <tbl id="T2">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p>SNP density in the promoter or non-coding regions of CEGs and non-CEGs</p>
               </caption>
               <tblbdy cols="10">
                  <r>
                     <c ca="center">
                        <p>
                           <b>Gene Set</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Region</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Number of CEGs</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Number of Non-CEGs</b>
                        </p>
                     </c>
                     <c cspan="3" ca="center">
                        <p>
                           <b>SNP Density</b>
                        </p>
                        <p>
                           <b>(Normalized by total non-coding or promoter length)</b>
                        </p>
                     </c>
                     <c cspan="3" ca="center">
                        <p>
                           <b>SNP Density</b>
                        </p>
                        <p>
                           <b>(Normalized by Conserved or Non-Cons region length)</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c cspan="3">
                        <hr/>
                     </c>
                     <c cspan="3">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Mean SNP Density CEGs</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Mean SNP Density Non-CEGs</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>WRST</b>
                        </p>
                        <p>
                           <b>p-value</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Mean SNP Density CEGs</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Mean SNP Density Non-CEGs</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>WRST</b>
                        </p>
                        <p>
                           <b>p-value</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="10">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Full Combined Set</p>
                     </c>
                     <c ca="center">
                        <p>NCR</p>
                     </c>
                     <c ca="center">
                        <p>1769</p>
                     </c>
                     <c ca="center">
                        <p>1745</p>
                     </c>
                     <c ca="center">
                        <p>0.630</p>
                     </c>
                     <c ca="center">
                        <p>0.463</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>&lt; E-12</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>Prom 2 Kb</p>
                     </c>
                     <c ca="center">
                        <p>863</p>
                     </c>
                     <c ca="center">
                        <p>706</p>
                     </c>
                     <c ca="center">
                        <p>1.291</p>
                     </c>
                     <c ca="center">
                        <p>1.172</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>0.008</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>Prom 5 Kb</p>
                     </c>
                     <c ca="center">
                        <p>1220</p>
                     </c>
                     <c ca="center">
                        <p>1040</p>
                     </c>
                     <c ca="center">
                        <p>0.789</p>
                     </c>
                     <c ca="center">
                        <p>0.685</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>2.70E-06</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="10">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Genes with No</p>
                     </c>
                     <c ca="center">
                        <p>NCR</p>
                     </c>
                     <c ca="center">
                        <p>987</p>
                     </c>
                     <c ca="center">
                        <p>1051</p>
                     </c>
                     <c ca="center">
                        <p>0.546</p>
                     </c>
                     <c ca="center">
                        <p>0.352</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>&lt; E-12</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>SNPs in Conserved </p>
                     </c>
                     <c ca="center">
                        <p>Prom 2 Kb</p>
                     </c>
                     <c ca="center">
                        <p>741</p>
                     </c>
                     <c ca="center">
                        <p>592</p>
                     </c>
                     <c ca="center">
                        <p>1.243</p>
                     </c>
                     <c ca="center">
                        <p>1.106</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>0.009</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Regions (subset 1)</p>
                     </c>
                     <c ca="center">
                        <p>Prom 5 Kb</p>
                     </c>
                     <c ca="center">
                        <p>1024</p>
                     </c>
                     <c ca="center">
                        <p>848</p>
                     </c>
                     <c ca="center">
                        <p>0.731</p>
                     </c>
                     <c ca="center">
                        <p>0.620</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>1.50E-06</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="10">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Genes having SNPs </p>
                     </c>
                     <c ca="center">
                        <p>All NCR</p>
                     </c>
                     <c ca="center">
                        <p/>
                     </c>
                     <c ca="center">
                        <p/>
                     </c>
                     <c ca="center">
                        <p>0.736</p>
                     </c>
                     <c ca="center">
                        <p>0.630</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>5.00E-10</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>in Conserved </p>
                     </c>
                     <c ca="center">
                        <p>Non Cons NCR</p>
                     </c>
                     <c ca="center">
                        <p>782</p>
                     </c>
                     <c ca="center">
                        <p>694</p>
                     </c>
                     <c ca="center">
                        <p>0.657</p>
                     </c>
                     <c ca="center">
                        <p>0.554</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>9.00E-09</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>0.732</p>
                     </c>
                     <c ca="center">
                        <p>0.620</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>1.00E-08</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Regions (subset 2)</p>
                     </c>
                     <c ca="center">
                        <p>Cons NCR</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>0.091</p>
                     </c>
                     <c ca="center">
                        <p>0.101</p>
                     </c>
                     <c ca="center">
                        <p>0.050</p>
                     </c>
                     <c ca="center">
                        <p>1.359</p>
                     </c>
                     <c ca="center">
                        <p>1.193</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>2.71E-06</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>All Prom 2 Kb</p>
                     </c>
                     <c ca="center">
                        <p/>
                     </c>
                     <c ca="center">
                        <p/>
                     </c>
                     <c ca="center">
                        <p>1.581</p>
                     </c>
                     <c ca="center">
                        <p>1.513</p>
                     </c>
                     <c ca="center">
                        <p>0.172</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>Non Cons Prom 2 Kb</p>
                     </c>
                     <c ca="center">
                        <p>122</p>
                     </c>
                     <c ca="center">
                        <p>114</p>
                     </c>
                     <c ca="center">
                        <p>1.376</p>
                     </c>
                     <c ca="center">
                        <p>1.415</p>
                     </c>
                     <c ca="center">
                        <p>0.498</p>
                     </c>
                     <c ca="center">
                        <p>1.886</p>
                     </c>
                     <c ca="center">
                        <p>2.169</p>
                     </c>
                     <c ca="center">
                        <p>0.482</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>Cons Prom 2 Kb</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>0.758</p>
                     </c>
                     <c ca="center">
                        <p>0.706</p>
                     </c>
                     <c ca="center">
                        <p>0.278</p>
                     </c>
                     <c ca="center">
                        <p>5.227</p>
                     </c>
                     <c ca="center">
                        <p>5.721</p>
                     </c>
                     <c ca="center">
                        <p>0.026</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>All Prom 5 Kb</p>
                     </c>
                     <c ca="center">
                        <p/>
                     </c>
                     <c ca="center">
                        <p/>
                     </c>
                     <c ca="center">
                        <p>1.091</p>
                     </c>
                     <c ca="center">
                        <p>0.967</p>
                     </c>
                     <c ca="center">
                        <p>0.036</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>Non Cons Prom 5 Kb</p>
                     </c>
                     <c ca="center">
                        <p>196</p>
                     </c>
                     <c ca="center">
                        <p>192</p>
                     </c>
                     <c ca="center">
                        <p>0.910</p>
                     </c>
                     <c ca="center">
                        <p>0.822</p>
                     </c>
                     <c ca="center">
                        <p>0.119</p>
                     </c>
                     <c ca="center">
                        <p>1.139</p>
                     </c>
                     <c ca="center">
                        <p>1.025</p>
                     </c>
                     <c ca="center">
                        <p>0.174</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>Cons Prom 5 Kb</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>0.357</p>
                     </c>
                     <c ca="center">
                        <p>0.317</p>
                     </c>
                     <c ca="center">
                        <p>0.036</p>
                     </c>
                     <c ca="center">
                        <p>3.763</p>
                     </c>
                     <c ca="center">
                        <p>3.276</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>0.001</b>
                        </p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>"SNP density (Normalized by total non-coding or promoter length)" = 1000*(total number of SNPs in non-coding or promoter sequence)/(total non-coding or promoter length). "SNP density (Normalized by Conserved or Non-Cons region length)" = 1000*(the number of SNPs in conserved or non-conserved regions)/(total length of the conserved or non-conserved sequence in promoters or non-coding regions). "Mean SNP density" gives the average SNP density over all the genes in a particular set. The means are given only for reference and were not used to calculate the p-values, which were obtained with the non-parametric Wilcoxon rank sum test (WRST). H<sub>0</sub>: CEGs and non-CEGs have equal SNP density; H<sub>A</sub>: CEGs have higher SNP density than non-CEGs. p-values less than 0.01 are in bold.</p>
               </tblfn>
            </tbl>
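            <p>The two normalizations defined in the footnote of Table 2 can be sketched as follows. The function name and the example gene (promoter length, conserved length and SNP counts) are hypothetical and serve only to show how the two densities differ.</p>

```python
def snp_density_per_kb(n_snps, region_length_bp):
    """SNPs per Kb: 1000 * (number of SNPs) / (region length in bp)."""
    if region_length_bp == 0:
        raise ValueError("region length must be positive")
    return 1000 * n_snps / region_length_bp


# Hypothetical gene: a 5000 bp promoter containing 4 SNPs, of which 1 falls
# inside the 600 bp of promoter sequence conserved between mouse and human.
density_total_length = snp_density_per_kb(4, 5000)     # normalized by total promoter length
density_conserved_length = snp_density_per_kb(1, 600)  # conserved SNPs over conserved length
```

            <p>Because conserved sequence usually makes up a small part of a promoter or non-coding region, normalizing by the conserved length can give a much larger density than normalizing by the total length, which is why the two sets of columns in Table 2 are reported separately.</p>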
            <p>To compare the density of <it>cis</it>-SNPs in the conserved and non-conserved regions, genes were partitioned into two subsets: those with no <it>cis</it>-SNPs in mouse-human conserved regions (subset 1), and those containing <it>cis</it>-SNPs in the conserved regions (subset 2) (Table <tblr tid="T2">2</tblr>). In subset 1, which contains the majority of the CEGs, a higher <it>cis</it>-SNP density was observed in both the promoter and non-coding regions (p &lt; 0.01). In subset 2, a higher <it>cis</it>-SNP density was observed only in the non-coding region (p &lt; 0.01). When the number of SNPs was normalized by the length of the conserved or non-conserved sequence (instead of the total promoter or non-coding sequence length), a significantly higher density was observed in both the conserved and the non-conserved non-coding regions (p &lt; 0.01, Table <tblr tid="T2">2</tblr>, subset 2). In the 5 Kb upstream promoter regions of genes in subset 2, a significantly higher SNP density was observed only when the number of <it>cis</it>-SNPs in mouse-human aligned sequences was normalized by the length of these conserved regions.</p>
         </sec>
         <sec>
            <st>
               <p>A higher fraction of CEGs has <it>cis</it>-SNPs that alter predicted transcription factor binding sites</p>
            </st>
            <p>To study the effect of the <it>cis</it>-SNPs in CEGs on the transcription regulatory machinery, we investigated the perturbation of transcription factor binding sites (TFBSs) by <it>cis</it>-SNPs. All known mouse, rat and human TFBSs (a total of 2,528 sites) from the TRANSFAC<sup>&#174;</sup> database <abbrgrp><abbr bid="B34">34</abbr></abbrgrp> were first mapped to the mouse genome using BLASTN (for mapping details see Methods). However, none of the mapped sites overlapped with <it>cis</it>-SNPs of any of the CEGs. Consequently, we investigated the overlap of <it>predicted</it> TFBSs with <it>cis</it>-SNPs.</p>
            <p>The rationale and caveats for using predicted TFBSs are discussed below. Experiments have shown that the score of a transcription factor (TF) binding site, as computed from a position weight matrix (PWM) built from a collection of its known sites, can give a fairly accurate estimate of the <it>in vitro</it> DNA binding affinity of the transcription factor to that site (e.g. <abbrgrp><abbr bid="B35">35</abbr><abbr bid="B36">36</abbr><abbr bid="B37">37</abbr></abbrgrp>). This observation and the thermodynamic principles behind it form the basis of most of the generic bioinformatic methods in use today to predict TFBSs in genomic DNA (reviewed in <abbrgrp><abbr bid="B36">36</abbr><abbr bid="B38">38</abbr></abbrgrp>). TF-DNA binding events are definitely more complicated <it>in vivo</it> than <it>in vitro</it>, since TF binding to DNA in eukaryotes is context dependent (e.g. dependent on other TFs that bind nearby DNA sites and on local DNA structure) and is influenced by factors such as chromatin remodeling and the concentration of the TF. However, such contextual and other relevant information is available only in rare cases and cannot generally be leveraged in the prediction of TFBSs <abbrgrp><abbr bid="B38">38</abbr></abbrgrp>. Therefore, although the change in TFBS score may not accurately predict the binding of a transcription factor to its target DNA site (or the change in the target gene's expression) <it>in vivo</it>, in the absence of other specific information (such as chromosomal regions that are open for regulatory proteins to bind, the DNA binding partners of a given TF, or the concentration of the TF), the approach we have taken here, i.e. looking for base changes that perturb binding sites predicted with models built from previously known sites for TFs, is a reasonable strategy (and the only <it>generic</it> strategy at this time) for examining how SNPs may affect TF binding to putative TFBSs. This is a common strategy that others have also used to predict TFBSs as well as putative regulatory SNPs that could perturb TF-DNA binding and change the expression of the target gene <abbrgrp><abbr bid="B39">39</abbr><abbr bid="B40">40</abbr><abbr bid="B41">41</abbr><abbr bid="B42">42</abbr></abbrgrp>.</p>
            <p>In our study, TFBS predictions were made with PWMs representing the transcription factor DNA binding sites available from the TRANSFAC<sup>&#174;</sup> (v. 6.3) database <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>, using the MATCH&#8482; software <abbrgrp><abbr bid="B43">43</abbr></abbrgrp>. Only PWMs generated from collections of vertebrate DNA binding sites were used. Regions of 30 bp were taken on either side of each <it>cis</it>-SNP (a total of 61 bp including the SNP nucleotide) and scored with the PWMs (for details see Methods). Both the B6 and DBA alleles were scored since, as explained earlier, these were the variants used in the analysis of data from both crosses. If a <it>cis</it>-SNP location overlapped with a predicted binding site, and the predicted binding site score differed between the two alleles, the change in score was noted and the predicted TFBS was considered to be perturbed by the SNP.</p>
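            <p>The allele-specific scoring described above can be sketched as follows. This is a simplified illustration and not the MATCH scoring scheme: the PWM is a toy three-column matrix of additive weights, only the forward strand is scanned, and the function names, the score threshold and the example sequence are all hypothetical.</p>

```python
def score_site(site, pwm):
    """Sum the per-position weights of `site` under a PWM given as a list of dicts."""
    return sum(column[base] for base, column in zip(site, pwm))


def perturbed_sites(window, snp_index, allele_a, allele_b, pwm, threshold):
    """Scan every PWM-width site that overlaps the SNP on the forward strand.

    A site is reported when its score differs between the two alleles and
    reaches `threshold` for at least one of them.
    """
    width = len(pwm)
    seq_a = window[:snp_index] + allele_a + window[snp_index + 1:]
    seq_b = window[:snp_index] + allele_b + window[snp_index + 1:]
    first = max(0, snp_index - width + 1)
    last = min(snp_index, len(window) - width)
    hits = []
    for start in range(first, last + 1):
        score_a = score_site(seq_a[start:start + width], pwm)
        score_b = score_site(seq_b[start:start + width], pwm)
        if score_a != score_b and max(score_a, score_b) >= threshold:
            hits.append((start, score_a, score_b))
    return hits


# Toy PWM favouring the motif TGA (hypothetical integer weights).
toy_pwm = [
    {"A": -1, "C": -1, "G": -1, "T": 2},
    {"A": -1, "C": -1, "G": 2, "T": -1},
    {"A": 2, "C": -1, "G": -1, "T": -1},
]
# 61 bp window: 30 bp on either side of the SNP at index 30.
window = "C" * 29 + "TGA" + "C" * 29
hits = perturbed_sites(window, 30, "G", "C", toy_pwm, threshold=5)
# hits == [(29, 6, 3)]: the site starting at index 29 scores 6 with G and 3 with C.
```

            <p>In this toy case the G allele completes the motif while the C allele lowers its score, so the predicted site would be recorded as perturbed together with the two scores.</p>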
            <p>Since the TFBSs are typically short and degenerate, predictions using PWMs are known to contain a large percentage of false positives <abbrgrp><abbr bid="B28">28</abbr><abbr bid="B36">36</abbr><abbr bid="B44">44</abbr></abbrgrp>. Therefore, orthogonal data such as co-regulation of the target genes with the transcription factors or phylogenetic footprinting, are commonly used to increase the specificity of these predictions <abbrgrp><abbr bid="B28">28</abbr><abbr bid="B44">44</abbr><abbr bid="B45">45</abbr></abbrgrp>. Although the transcription factors and their target genes may not co-regulate at the mRNA level, it is generally assumed that genes that <it>do </it>co-regulate across a diverse set of conditions may belong to the same regulatory pathway <abbrgrp><abbr bid="B46">46</abbr><abbr bid="B47">47</abbr><abbr bid="B48">48</abbr></abbrgrp>. To reduce the number of false positive predictions for TFBSs we employed a similar strategy, requiring that the transcript levels of the TFs and their putative target genes (based on the TFBS predictions) be significantly correlated across a diverse set of mRNA profiling experiments (see Additional files <supplr sid="S3">3</supplr> and <supplr sid="S4">4</supplr>). Using expression profiles available from a set of 145 diverse mouse tissues and cell-lines <abbrgrp><abbr bid="B49">49</abbr><abbr bid="B50">50</abbr></abbrgrp> (referred to here as the 'body-atlas' data set), we determined the Spearman rank-order correlation (with p &lt; 0.01) between all genes and the 282 distinct vertebrate transcription factors in the TRANSFAC<sup>&#174; </sup>(v. 6.3) database which have known gene symbols as well as PWM models for their DNA binding sites. 
It is of note that the body-atlas expression data set was used instead of the BXD and BXH F<sub>2 </sub>populations given others have shown that significant correlation between any two genes in a segregating population can result from closely linked eQTLs as opposed to biologically relevant co-regulation <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>. These effects can be amplified in cases where genes give rise to strong <it>cis</it>-eQTLs.</p>
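            <p>The correlation filter can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in the study: the Spearman coefficient is computed from average ranks, and significance is estimated with a permutation test, which is a stand-in for whichever p-value calculation was actually applied. The function names and the toy expression vectors are hypothetical.</p>

```python
import random


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while len(order) > i:
        j = i
        while len(order) > j + 1 and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman_rho(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the Spearman correlation (assumed method)."""
    rng = random.Random(seed)
    observed = abs(spearman_rho(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(spearman_rho(x, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical expression levels of a TF and a putative target across 10 samples.
tf_expr = [0.1, 0.4, 0.35, 0.8, 1.2, 1.1, 1.9, 2.3, 2.2, 3.0]
target_expr = [2.9, 2.5, 2.6, 2.0, 1.4, 1.6, 1.0, 0.7, 0.9, 0.2]
rho = spearman_rho(tf_expr, target_expr)
p = permutation_p_value(tf_expr, target_expr)
keep_prediction = 0.01 > p  # retain the TFBS prediction only if p is below 0.01
print(round(rho, 3), keep_prediction)
```

            <p>A TFBS prediction would be retained only when the TF and the putative target gene pass this filter; negative correlations count as well, since the test is two-sided.</p>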
            <suppl id="S3">
               <title>
                  <p>Additional file 3</p>
               </title>
               <text>
                  <p>Comprehensive set of promoter <it>cis</it>-SNPs in CEGs. All <it>cis</it>-SNPs in the promoter (up to 5 Kb upstream of the gene) of the CEGs are given, along with their location in conserved mouse-human regions (indicated by 'M-H CONS' in the CONSERVATION column), the predicted transcription factor binding sites perturbed by the SNP, and, where a <it>cis</it>-SNP perturbs a predicted binding site, the correlation of the gene to the transcription factor (if p &lt; 0.01). All Celera SNPs <abbrgrp><abbr bid="B20">20</abbr></abbrgrp> are now also available through the latest release of the public mouse dbSNP database <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>. The public dbSNP identifiers corresponding to the Celera SNPs are therefore provided for reference.</p>
               </text>
               <file name="1471-2164-7-235-S3.xls">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <suppl id="S4">
               <title>
                  <p>Additional file 4</p>
               </title>
               <text>
                  <p>Comprehensive set of <it>cis</it>-SNPs in non-coding regions of CEGs. All SNPs in the non-coding region (introns and 5 Kb downstream) of the CEGs are given, along with their location in conserved mouse-human regions (indicated by 'M-H CONS' in the CONSERVATION column), the predicted transcription factor binding sites perturbed by the SNP, and, where a <it>cis</it>-SNP perturbs a predicted binding site, the correlation of the gene to the transcription factor (if p &lt; 0.01). dbSNP identifiers corresponding to the Celera SNPs are provided.</p>
               </text>
               <file name="1471-2164-7-235-S4.xls">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <p>We found that a higher fraction of CEGs have <it>cis</it>-SNPs affecting predicted TFBS scores in non-coding regions (p = 4.23 &#215; 10<sup>-4</sup>, as determined by the Fisher exact test) (Table <tblr tid="T3">3</tblr>) when correlations were required between TFs and their target genes (based on TFBS predictions, see Figure <figr fid="F2">2</figr> for an example). Significance was not observed when the TFBS predictions were <it>not </it>filtered by correlations (data not shown), which may be due to the large false positive rate in the predicted TFBS set. Interestingly, the p-value for the hypothesis that more CEGs harbor <it>cis</it>-SNPs that disrupt predicted TFBSs (p = 4.23 &#215; 10<sup>-4</sup>) is much larger than the p-value for the hypothesis that more CEGs contain <it>cis</it>-SNPs in the promoter and non-coding region (p ~10<sup>-12</sup>, Table <tblr tid="T1">1</tblr>). Possible reasons for this observation include: 1) DNA binding sites for most vertebrate TFs cannot be predicted since PWM models for their binding sites are not available, 2) a large fraction of the <it>cis</it>-SNPs are neutral with respect to their effects on the transcriptional levels <abbrgrp><abbr bid="B51">51</abbr></abbrgrp>, and 3) <it>cis</it>-SNPs could perturb regulatory elements other than TFBSs (see Discussion for more details).</p>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>An example of a putative candidate <it>cis</it>-regulatory SNP affecting transcription in mouse F<sub>2 </sub>population</p>
               </caption>
               <text>
                  <p><b>An example of a putative candidate <it>cis</it>-regulatory SNP affecting transcription in mouse F<sub>2 </sub>population</b>. <b>A</b>. <it>cis</it>-acting LOD scores for Casc4 on chromosome 2 in multiple tissues and sample sets (male, female or combined/all) in the BXH cross. x-axis &#8211; genomic location in Mb, y-axis &#8211; LOD score from interval mapping. Physical location of the gene is indicated with a red arrow-head. Only LOD scores >10 are shown. <b>B</b>. Association of expression levels of Casc4 with genotypes of the promoter SNP, mCV23866990. The distribution of the expression levels in brain (left) and adipose (right) is shown according to the genotypes of this SNP in the F<sub>2 </sub>animals. A_A represents the DBA and C3H allele, and C_C the B6 allele. <b>C</b>. A binding site for transcription factor Hand1 is affected by SNP mCV23866990. The polymorphism changes a highly conserved base in the binding site (T&#8594;C change on the reverse strand, boxed and shaded). The frequency matrix and a sequence logo of the profile representing the binding site are shown. <b>D</b>. Scatter plot of Casc4 (x-axis) versus Hand1 expression levels in the body atlas data set [49, 50]. Hand1 expression is correlated to that of Casc4 with a p-value of &lt; 10<sup>-6 </sup>(Spearman rank order correlation -0.58).</p>
               </text>
               <graphic file="1471-2164-7-235-2"/>
            </fig>
            <tbl id="T3">
               <title>
                  <p>Table 3</p>
               </title>
               <caption>
                  <p>Perturbation of TFBSs by <it>cis</it>-SNPs and summary gene counts where TFBS predictions are affected by <it>cis</it>-SNPs</p>
               </caption>
               <tblbdy cols="6">
                  <r>
                     <c ca="center">
                        <p>
                           <b>Region</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Total Genes Containing SNPs</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>CEGs Containing SNPs</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Total Genes with TFBS Score Changed by SNPs</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>CEGs with TFBS Score Changed by SNPs</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>P-value (FET)</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>NCR</p>
                     </c>
                     <c ca="center">
                        <p>3514</p>
                     </c>
                     <c ca="center">
                        <p>1769</p>
                     </c>
                     <c ca="center">
                        <p>610</p>
                     </c>
                     <c ca="center">
                        <p>344</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>4.23E-04</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Prom 2 Kb</p>
                     </c>
                     <c ca="center">
                        <p>1569</p>
                     </c>
                     <c ca="center">
                        <p>863</p>
                     </c>
                     <c ca="center">
                        <p>94</p>
                     </c>
                     <c ca="center">
                        <p>61</p>
                     </c>
                     <c ca="center">
                        <p>0.0174</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Prom 5 Kb</p>
                     </c>
                     <c ca="center">
                        <p>2260</p>
                     </c>
                     <c ca="center">
                        <p>1220</p>
                     </c>
                     <c ca="center">
                        <p>193</p>
                     </c>
                     <c ca="center">
                        <p>111</p>
                     </c>
                     <c ca="center">
                        <p>0.1346</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Cons NCR</p>
                     </c>
                     <c ca="center">
                        <p>1476</p>
                     </c>
                     <c ca="center">
                        <p>782</p>
                     </c>
                     <c ca="center">
                        <p>129</p>
                     </c>
                     <c ca="center">
                        <p>66</p>
                     </c>
                     <c ca="center">
                        <p>0.6338</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Cons Prom 2 Kb</p>
                     </c>
                     <c ca="center">
                        <p>236</p>
                     </c>
                     <c ca="center">
                        <p>122</p>
                     </c>
                     <c ca="center">
                        <p>12</p>
                     </c>
                     <c ca="center">
                        <p>8</p>
                     </c>
                     <c ca="center">
                        <p>0.0852</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Cons Prom 5 Kb</p>
                     </c>
                     <c ca="center">
                        <p>388</p>
                     </c>
                     <c ca="center">
                        <p>196</p>
                     </c>
                     <c ca="center">
                        <p>27</p>
                     </c>
                     <c ca="center">
                        <p>15</p>
                     </c>
                     <c ca="center">
                        <p>0.2293</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>The total gene-set consisted of the combined set of CEGs and non-CEGs as described in the text. All p-values are based on the Fisher exact test (FET); p-values less than 0.01 are in bold.</p>
               </tblfn>
            </tbl>
            <p>In order to determine whether the <it>cis</it>-SNPs in CEGs perturb predicted TFBSs with an increased frequency relative to the non-CEGs, we compared the fraction of <it>cis</it>-SNPs affecting TFBSs in CEGs versus non-CEGs using the Fisher exact test. The fraction of <it>cis</it>-SNPs in CEGs affecting predicted TFBSs was not observed to be higher (at the 0.01 significance level), suggesting that a higher rate of TFBS perturbation by SNPs in CEGs is likely due to the increased density of <it>cis</it>-SNPs in these genes relative to non-CEGs.</p>
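            <p>As an illustration of the gene-level comparison summarized in Table 3, the sketch below computes a one-sided Fisher exact test from the hypergeometric distribution. The way the 2 &#215; 2 table is assembled from the NCR row (CEGs versus non-CEGs, TFBS score changed versus not changed) and the choice of a one-sided test are assumptions made for this example; the function name <it>fisher_exact_greater</it> is hypothetical.</p>

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Returns the hypergeometric probability of seeing a or more counts in the
    top-left cell, given fixed row and column totals.
    """
    row1 = a + b
    col1 = a + c
    total = a + b + c + d
    upper = min(row1, col1)
    numerator = sum(
        comb(row1, k) * comb(total - row1, col1 - k) for k in range(a, upper + 1)
    )
    return numerator / comb(total, col1)


# NCR row of Table 3 (the table layout below is an assumption for illustration).
total_genes, cegs = 3514, 1769
genes_changed, cegs_changed = 610, 344

a = cegs_changed                      # CEGs with a TFBS score changed by SNPs
b = cegs - cegs_changed               # CEGs without such a change
c = genes_changed - cegs_changed      # non-CEGs with a TFBS score changed
d = (total_genes - cegs) - c          # non-CEGs without such a change

print(fisher_exact_greater(a, b, c, d))
```

            <p>Python integers have arbitrary precision, so the binomial coefficients are exact and only the final division is rounded to a float.</p>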
         </sec>
         <sec>
            <st>
               <p>An example cis-eQTL and putative regulatory cis-SNP</p>
            </st>
            <p>To illustrate how high-density SNP data may be intersected with eQTL data to identify putative candidate <ul>q</ul>uantitative <ul>t</ul>rait <ul>n</ul>ucleotides (QTNs) underlying the eQTLs (and also to illustrate the different types of data we have used in our analyses), we highlight one example of a CEG with a <it>cis</it>-SNP in its promoter region perturbing a predicted TFBS (Figure <figr fid="F2">2</figr>). The gene Casc4 (cancer susceptibility candidate 4) gives rise to strong <it>cis</it>-acting eQTLs (LOD score &#8805; 10) in a number of tissues in the BXH cross (Figure <figr fid="F2">2a</figr>). As Casc4 is in a nIBD-BXH region, the polymorphisms between B6 and C3H in this region should be identical to those between B6 and DBA. There are five SNPs in the promoter (upstream 5 Kb) of this gene; only one SNP (mCV23866990), which is located close to the 5' end of Casc4 (-701 bp), perturbs the predicted binding site for a transcription factor, Hand1 <abbrgrp><abbr bid="B52">52</abbr></abbrgrp>, whose mRNA level is correlated with that of the gene. The mCV23866990 SNP genotype shows a significant association with the expression level of Casc4 (p &lt; 0.0001 using a standard one-way ANOVA) (Figure <figr fid="F2">2b</figr>). A conserved position in the binding site for Hand1 is perturbed by mCV23866990 (Figure <figr fid="F2">2c</figr>). By affecting the Hand1 binding site in the promoter, this <it>cis</it>-SNP could be responsible for differential expression of Casc4 in the BXH F<sub>2 </sub>population.</p>
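            <p>The genotype-expression association test mentioned above can be sketched as a one-way ANOVA in which animals are grouped by SNP genotype. The expression values and the heterozygous label A_C below are invented for illustration and are not the Casc4 data; the sketch returns only the F statistic and its degrees of freedom, and the p-value would then be read from the F distribution.</p>

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of measurements."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual values around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical expression values grouped by genotype at one SNP.
expression_by_genotype = {
    "A_A": [1.0, 2.0, 3.0],
    "A_C": [2.0, 3.0, 4.0],  # heterozygote label assumed for this example
    "C_C": [5.0, 6.0, 7.0],
}
f_stat, df_b, df_w = one_way_anova_f(list(expression_by_genotype.values()))
print(f_stat, df_b, df_w)
```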
            <p>Isolating the specific causative regulatory mutations underlying eQTLs that are responsible for variation of gene expression in a segregating mouse population is difficult. This is especially true in an F<sub>2 </sub>population, where regions of linkage disequilibrium are very large (given that an F<sub>2 </sub>population is constructed from intercrossing a single F<sub>1 </sub>founder). Determination of the actual functional role of the causative polymorphisms is even more challenging, since there are several different molecular mechanisms through which mRNA levels in cells can be regulated. Although such challenges exist, <it>putative </it>candidate polymorphisms that affect transcription of a given gene may be prioritized for experimental validation, and hypotheses can be generated for the possible biological roles of candidate regulatory <it>cis</it>-SNPs based on examination of the data, as illustrated by the example above. For such candidates the gold standard is to introduce the polymorphism in question onto the background of a wild-type mouse and then compare changes in the <it>in vivo </it>activity of the gene and phenotypes to the wild-type mouse.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Discussion</p>
         </st>
         <sec>
            <st>
               <p>Cis-SNPs in genes showing cis-acting linkage in segregating mouse populations</p>
            </st>
            <p>SNPs are often used as markers for disease, and as noted earlier, there are now several examples where <it>cis</it>-regulatory variants are associated with disease <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B13">13</abbr><abbr bid="B14">14</abbr><abbr bid="B42">42</abbr><abbr bid="B53">53</abbr><abbr bid="B54">54</abbr></abbrgrp>. Computational approaches for identifying the <it>cis</it>-regulatory polymorphisms would therefore be useful in prioritizing the candidate polymorphisms that play a causative role in disease, reducing the laborious experimental process of testing multiple candidate variants <it>in vivo</it>, selecting biologically meaningful SNPs for association studies, and ultimately in generating testable hypotheses for elucidating the molecular basis of a given disease. However, little bioinformatics research has been done in a systematic way to build predictors of variations that are likely to affect gene-expression in segregating mammalian populations <abbrgrp><abbr bid="B39">39</abbr><abbr bid="B40">40</abbr><abbr bid="B41">41</abbr><abbr bid="B42">42</abbr><abbr bid="B55">55</abbr></abbrgrp>. In this study we have investigated the frequency and potential biological role of the polymorphisms underlying genes whose expression gives rise to strong <it>cis</it>-linkage in segregating mouse populations. The study provides the first investigation of putative regulatory SNPs around genes showing <it>cis</it>-linkage, and insight into the challenges associated with identifying the causative regulatory variants in such populations through bioinformatic sequence analysis methods. In addition, the data we provide here (CEGs from four different tissues in two crosses, <it>cis</it>-SNPs, and predicted TFBSs affected by those <it>cis</it>-SNPs) should benefit further investigations in this area.</p>
            <p>There have been a few previous studies surveying the role of <it>cis</it>-polymorphisms and haplotypes in promoter regions of sets of human genes, and identifying those that change expression <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B40">40</abbr><abbr bid="B51">51</abbr><abbr bid="B53">53</abbr><abbr bid="B56">56</abbr><abbr bid="B57">57</abbr><abbr bid="B58">58</abbr></abbrgrp>. These studies considered a relatively small sampling of genes (&lt;300) and assessed the promoter SNPs that affected the expression in a limited number of cell lines with reporter gene assays <abbrgrp><abbr bid="B51">51</abbr><abbr bid="B57">57</abbr><abbr bid="B58">58</abbr><abbr bid="B59">59</abbr></abbrgrp>. Since the changes in expression due to the polymorphisms were tested in cell-lines, it is not known if the SNPs that caused expression changes in the <it>in vitro </it>assays are responsible for varying levels of expression <it>in vivo </it>in genetically segregating populations. We started with large-scale genetic linkage data of gene expression and investigated the frequency of polymorphisms in the genes showing <it>cis</it>-linkage. Therefore the work we present here (namely, investigation of SNPs in the vicinity of genes with <it>cis</it>-expression linkage in segregating populations in as many as four different tissues), is complementary to the previous work, and provides a different approach to investigating <it>cis</it>-regulatory SNPs with murine populations.</p>
            <p>One recent study reports the mapping of <it>cis</it>-regulatory variants in a small set of genes to haplotype blocks in human samples <abbrgrp><abbr bid="B60">60</abbr></abbrgrp>, and another recent study reports the investigation of <it>cis</it>-regulatory variations in 3' UTRs of a set of genes showing <it>cis</it>-acting regulation in a panel of mouse recombinant congenic strains <abbrgrp><abbr bid="B61">61</abbr></abbrgrp>. However, to the best of our knowledge, the present study represents the first large-scale genome-wide survey of <it>cis</it>-SNPs in genes that give rise to strong <it>cis</it>-eQTLs in a mammalian population, and an investigation of their potential role in disrupting putative <it>cis</it>-regulatory elements. We observe that a significantly higher fraction of CEGs than of non-CEGs contain <it>cis</it>-SNPs, and that the density of these SNPs is significantly higher in the CEGs. We have not conclusively proven the functional role of any polymorphism in regulating the expression of CEGs through <it>in vitro </it>or <it>in vivo </it>experimental validation, and many of the SNPs in the vicinity of the CEGs could be neutral (i.e. have no consequence on the expression levels). However, CEGs by definition should have variations near the genes themselves affecting their expression in a segregating population, and based on earlier work on human genes with promoter polymorphisms, it has been estimated that a sizable fraction of <it>cis</it>-variants (about one third of the SNPs in the promoters) may alter gene-expression <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B51">51</abbr></abbrgrp>. 
Therefore, it is reasonable to infer that the higher density of <it>cis</it>-SNPs in CEGs is associated with changes in expression of those genes, and one or more of the <it>cis</it>-SNPs would be responsible for causing variation of expression in a large fraction of the CEGs in the mouse F<sub>2 </sub>populations, although we do not exactly know how many causal regulatory <it>cis</it>-SNPs are in this set of CEGs.</p>
            <p>We investigated the effect of a few other relevant biological factors which could give rise to <it>cis</it>-eQTLs in our data-set instead of (or in addition to) <it>cis</it>-SNPs in the promoters and non-coding regions, such as nonsense-mediated decay (NMD), polymorphisms in exons, and genomic segmental duplications. We estimated that a very small fraction of our set of CEGs may arise due to NMD and segmental duplications (see Methods section for details). A higher fraction of CEGs was seen to contain SNPs in their exons (1081 out of 2047, p = 2.5 &#215; 10<sup>-12 </sup>with Fisher exact test) relative to the non-CEGs. But an increased fraction of the CEGs that contained exonic SNPs also contained promoter and non-coding <it>cis</it>-SNPs relative to non-CEGs (p &lt; 10<sup>-4 </sup>with Fisher exact test). Therefore, we did not exclude these genes from the analyses presented here (the purpose of which was to investigate potential regulatory SNPs in the non-coding and promoter regions). It is worth noting, however, that our analyses with the set of CEGs and non-CEGs which did not contain any exonic SNPs yielded results that were very similar to those obtained from the full set of CEGs and non-CEGs.</p>
            <p>One of our objectives was to investigate the challenges involved in identification of the causative <it>cis</it>-regulatory SNPs through bioinformatic sequence analysis approaches. To that end, we have examined the propensities of the SNPs in potentially functional non-coding sequences (namely, mouse-human conserved sequences) and predicted transcription factor binding sites. A few recent studies have suggested the use of mouse-human alignments to identify putative candidate regulatory SNPs <abbrgrp><abbr bid="B40">40</abbr><abbr bid="B41">41</abbr></abbrgrp>, and although SNPs falling in these conserved regions have been shown to affect transcription <abbrgrp><abbr bid="B40">40</abbr></abbrgrp>, it is unclear whether a majority of the regulatory variants lie in these regions. In our analyses we find that the sequence variations around the CEGs are not specifically enriched in evolutionarily conserved non-coding and promoter sequences, and in fact in the majority of the CEGs all of the SNPs are outside of these highly conserved regions. It is possible that the causative regulatory SNPs lie further away in these CEGs (sequences that are > &#177; 5 Kb away from the genes) where they alter conserved regulatory elements (such as silencers or enhancers). However, the higher <it>cis</it>-SNP density in the immediate vicinity of these CEGs (relative to non-CEGs) suggests that a significant fraction of <it>cis</it>-regulatory SNPs could lie outside of regions that are most conserved in mammalian evolution. 
As comparative genomics and phylogenetic footprinting approaches are frequently utilized in the searches for functional regulatory elements in mammalian genomes, and computational prediction of transcriptional regulatory elements is a difficult problem <abbrgrp><abbr bid="B28">28</abbr><abbr bid="B36">36</abbr><abbr bid="B44">44</abbr></abbrgrp>, the above observations imply that the identification of <it>cis</it>-regulatory variations in genetically segregating populations is likely to be difficult using sequence-driven bioinformatic approaches alone. Since the information for transcriptional regulatory networks is in part hard-wired in the genomic DNA itself through the array of regulatory elements <abbrgrp><abbr bid="B62">62</abbr></abbrgrp>, one can hypothesize that for CEGs where the <it>cis</it>-regulatory SNPs are not present in the highly conserved promoter or non-coding sequences, variations of expression may not cause a significant perturbation of the transcriptional networks that have been conserved in the mammals. Experimental validation of a set of SNPs affecting putative regulatory sites in conserved and non-conserved regions (both in the immediate vicinity of the genes, and also some distance away) will ultimately be required to fully understand how the <it>cis</it>-SNPs affect gene expression in segregating populations. This would consequently lead to a better understanding of the bioinformatic approaches that would be effective in identifying <it>cis</it>-regulatory variants.</p>
            <p>Below we describe the examination of a set of characterized human <it>cis</it>-regulatory SNPs that are associated with inherited diseases. A collection of these rare examples is available from the rSNP_Guide database <abbrgrp><abbr bid="B42">42</abbr><abbr bid="B63">63</abbr></abbrgrp>. From the examples presented in this database (see <abbrgrp><abbr bid="B63">63</abbr></abbrgrp>) we collected a set of 33 <it>cis</it>-regulatory SNPs in 5 distinct genes (PROC, TNF, HGB2, GP1BB, and F7) that are: a) known to be underlying or associated with inherited human diseases, b) known or predicted to disrupt transcription factor binding sites, and c) have flanking sequences available from the Human Gene Mutation Database <abbrgrp><abbr bid="B64">64</abbr></abbrgrp> so that they could be used for mapping reliably to the human genome assembly. An examination of the locations of these regulatory SNPs indicates that 10 SNPs in 2 genes are in the human-mouse conserved regions, while the remaining 23 SNPs in 3 genes are in non-conserved regions. In addition to the above cases that were taken directly from rSNP_Guide database, from two recent publications we examined the non-coding SNPs (three in total) within or around two genes, namely INSIG2 <abbrgrp><abbr bid="B65">65</abbr></abbrgrp> and TCF7L2 <abbrgrp><abbr bid="B66">66</abbr></abbrgrp>, that represent extremely rare examples of variants associated with complex disease (obesity and diabetes respectively in this case) that have been validated across diverse and multiple human cohorts. These causative SNPs were also outside of human-mouse conserved regions. 
This is a small sample of genes to draw concrete conclusions from; however this observation with the well characterized human SNPs supports our similar finding from the mouse data and suggests that a large fraction of the causative <it>cis</it>-regulatory SNPs, including those that are associated with inherited disease, could be outside of the sequences that are highly conserved in mammalian evolution.</p>
         </sec>
         <sec>
            <st>
               <p>Perturbation of putative transcription factor binding sites by cis-SNPs</p>
            </st>
            <p>A higher fraction of CEGs had TFBS predictions perturbed by SNPs in the non-coding region. No significant difference was observed in the fraction of total SNPs affecting binding sites between CEGs and non-CEGs, suggesting that the higher fraction of perturbations of TFBSs in the CEGs was a consequence of increased <it>cis</it>-SNP density.</p>
            <p>Several factors are likely to have confounded our study with predicted TF binding sites: 1) it is possible that the false positive rate of TFBS prediction is high, even after requiring correlations of TFs to putative target genes; 2) at the time of our analysis binding site models for only ~300 vertebrate TFs were available from the TRANSFAC database, whereas the number of distinct TFs in mammals is estimated to be around 2,000 <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>; therefore prediction of DNA binding sites for the majority of TFs was not possible; 3) although often enriched in the immediate promoter region, transcription regulatory elements in mammals can be spread over large distances (sometimes more than 100 Kb <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>) whereas in the present study we considered only the immediate vicinity of the genes (&#177; 5 Kb); 4) transcriptional regulatory elements frequently act in conjunction with others forming regulatory modules where multiple TFs bind DNA (involving both protein-DNA as well as protein-protein interactions); consequently the change in score of one individual binding site by an overlapping <it>cis</it>-SNP may often fail to reflect the extent to which transcription is affected by that mutation. It would clearly be useful to re-analyze the data when significantly more experimentally verified murine transcription factor binding sites are available in order to obtain a better understanding of how <it>cis</it>-variations specifically affect the transcriptional machinery, and whether a larger fraction of SNPs in the CEGs perturb known binding sites.</p>
            <p>It is of note that in a recent study of <it>Saccharomyces cerevisiae </it>segregants it was shown that genes having <it>cis</it>-linkage contained a higher frequency of SNPs in promoters and 3' UTR sequences <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>. The study also found moderate evidence for enrichment of SNPs in TFBS sequences that were mapped using the ChIP-chip technology <abbrgrp><abbr bid="B67">67</abbr></abbrgrp>. The yeast study with TF binding sites was not confounded by some of the factors mentioned above, since most of the regulatory sequences were experimentally determined, and a comprehensive set of DNA binding sequences for almost all of the yeast TFs was available <abbrgrp><abbr bid="B67">67</abbr></abbrgrp>.</p>
            <p>Obviously, in addition to TFBSs, other classes of regulatory sequences, e.g. those affecting transport from the nucleus, mRNA stability or decay, RNA mediated regulation, and those potentially involved in epigenetic regulation of gene expression, could be affected by <it>cis</it>-SNPs, which we have not studied here. Coding SNPs that cause changes in the protein structure can act in <it>trans </it>to influence expression through a feedback loop as shown recently for the AMN1 gene in <it>S. cerevisiae </it><abbrgrp><abbr bid="B12">12</abbr></abbrgrp>. Such cases were also not studied here.</p>
         </sec>
         <sec>
            <st>
               <p>Undetermined factors in our study</p>
            </st>
            <p>The specificity of the identification of <it>cis</it>-acting eQTLs is unknown. Doss et al. <abbrgrp><abbr bid="B11">11</abbr></abbrgrp> recently gave a lower-bound estimate of the true positive rate for the BXD cross (64%); because we used more stringent thresholds for identifying putative <it>cis</it>-eQTLs, we anticipate that the true positive rate in our study is higher than 64%, but the exact number is not known. The specificity of TFBS predictions is also unknown; moreover, binding sites for the majority of the TFs could not be predicted because TRANSFAC<sup>&#174; </sup>PWMs were unavailable. It is likely that additional SNPs exist between the strains we studied that had not been identified in the databases we used in our study <abbrgrp><abbr bid="B68">68</abbr></abbrgrp>. Even with these unknown factors in our current study, several observations have been made that shed light on the nature of variation in genes showing <it>cis</it>-linkage in segregating populations, as well as the bioinformatic challenges that are involved in characterizing the non-coding <it>cis</it>-regulatory polymorphisms using computational sequence analysis strategies.</p>
         </sec>
         <sec>
            <st>
               <p>Future work</p>
            </st>
            <p>The problem of identifying and annotating the functional <it>cis</it>-regulatory polymorphisms is a difficult one that will require various experimental as well as computational approaches to address. Our understanding of <it>cis</it>-regulatory variations and their biological role would benefit from <it>in-vivo </it>experimental evaluation of the contribution of polymorphisms around CEGs towards changes in gene expression, characterization of more regulatory elements in the genome (which is severely limited at this time), examination of the multi-species genome alignments, and more accurate prediction of the (transcriptional and other) regulatory elements. The data on CEGs and <it>cis</it>-SNPs that we supply here (supplementary information) will provide a valuable resource for further exploration in this area.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>The analyses of <it>cis</it>-SNPs in the promoters and non-coding regions around <it>cis</it>-acting eQTL genes (CEGs) in mouse F<sub>2 </sub>populations indicate that a significantly higher fraction of CEGs contain <it>cis</it>-SNPs compared to non-CEGs. CEGs also contain higher SNP density in the promoters and non-coding sequences relative to the non-CEGs. Since non-coding sequences that are conserved in mammalian evolution are often biologically functional, the propensity of <it>cis</it>-SNPs in the promoter and non-coding regions that are most conserved between mouse and human was examined. A majority of the CEGs having <it>cis</it>-SNPs did not contain any <it>cis</it>-SNP in these conserved regions, and in the CEGs that contained <it>cis</it>-SNPs in conserved regions, the enrichment of <it>cis</it>-SNPs occurred in both conserved and non-conserved regions. This suggests that many of the <it>cis</it>-regulatory SNPs underlying eQTLs and responsible for causing gene-expression changes in segregating populations could lie outside of the sequences that are most highly conserved in mammalian evolution. To investigate the possible biological role of the <it>cis</it>-SNPs in disrupting the transcriptional regulatory elements, we studied the perturbation of the predicted transcription factor binding sites (TFBSs) by the <it>cis</it>-SNPs. Relative to non-CEGs, a significantly higher fraction of CEGs harbor <it>cis</it>-SNPs that perturb the predicted TFBSs. However, the fraction of <it>cis</it>-SNPs in the CEGs affecting the binding sites is not higher, suggesting that the increased incidence of TFBS perturbation in the CEGs is due to the higher <it>cis</it>-SNP density. These observations imply that the identification and annotation of <it>cis</it>-regulatory variations in genetically segregating populations is likely to be difficult using sequence-driven bioinformatic approaches alone.</p>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <sec>
            <st>
               <p>Genomic data: Mouse genome assembly, gene sets, SNP locations, genomic regions that are IBD between strains, and mouse-human conserved regions</p>
            </st>
            <p>The UCSC mouse genome assembly mm4 <abbrgrp><abbr bid="B31">31</abbr></abbrgrp> (NCBI build 32) was downloaded and used for all mapping purposes. All mouse mRNAs, cDNAs and ESTs were aligned to the mm4 assembly and clustered to produce gene and exon coordinates as described in detail previously <abbrgrp><abbr bid="B22">22</abbr><abbr bid="B23">23</abbr></abbrgrp> (for gene and exon coordinates see <supplr sid="S5">Additional file 5</supplr>). Celera (release 3.4) <abbrgrp><abbr bid="B20">20</abbr></abbrgrp> and public reference (dbSNP build 120 <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>) mouse SNPs were mapped onto the mm4 assembly using BLASTN as described in <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>. Those SNPs between the strains C57BL/6J and DBA/2J that mapped uniquely to the autosomes, had allele count > 1, and allele frequency &#8805; 10%, were used for analysis. SNPs in repeat regions were removed (repeat coordinates in mm4 were downloaded from UCSC annotation server), which left a total of 484,727 Celera and 24,332 dbSNPs. 16,809 SNPs between the Celera and dbSNP databases were identical in genomic location, leaving a unique, non-redundant set of 492,250 SNPs. This data set was used for analyses of <it>cis</it>-SNPs.</p>
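            <p>The SNP bookkeeping in the preceding paragraph can be checked with a short sketch. The counts are those quoted in the text; the function name and the (chromosome, position) representation of a SNP are illustrative assumptions and not the pipeline actually used.</p>

```python
def merge_snp_sets(celera, dbsnp):
    """Return the non-redundant union of two SNP sets keyed by genomic location."""
    # Each SNP is a hypothetical (chromosome, position) tuple; a SNP found at
    # the same location in both sources is kept only once.
    return set(celera) | set(dbsnp)


# Counts reported in the text after filtering and repeat removal.
n_celera = 484_727
n_dbsnp = 24_332
n_shared = 16_809

# Inclusion-exclusion: size of the union = sum of sizes - size of the overlap.
n_unique = n_celera + n_dbsnp - n_shared
print(n_unique)  # 492250
```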
            <suppl id="S5">
               <title>
                  <p>Additional file 5</p>
               </title>
               <text>
                  <p>Gene structure of the CEGs. Exons of all 2,047 CEGs mapped to the UCSC mm4 assembly (NCBI build 32) are given. This data helps in finding where a non-coding <it>cis</it>-SNP is located in the gene.</p>
               </text>
               <file name="1471-2164-7-235-S5.xls">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <p>Genomic regions that are identical by descent (IBD) between mouse strains were taken directly from Cervino et al. <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>. Cervino et al. used a window of 50 Kb which was moved through the genome at 10 Kb intervals; regions in which fewer than five consecutive SNPs were observed between two strains were identified as blocks that were IBD. Genes were taken with 5 Kb flanking regions (&#177; 5 Kb), and if the gene or its flanking region overlapped with regions of IBD, the gene was considered to be in an IBD region.</p>
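            <p>A minimal sketch of the sliding-window idea described above follows. It assumes that "fewer than five SNPs" refers to the number of strain-discriminating SNPs inside each 50 Kb window, which is one reading of the published description; the function names, the half-open coordinate convention and the single-chromosome input are illustrative assumptions, not the implementation of Cervino et al.</p>

```python
from bisect import bisect_left


def ibd_windows(snp_positions, chrom_length, window=50_000, step=10_000, min_snps=5):
    """Return (start, end) windows that hold fewer than min_snps SNPs.

    snp_positions must be a sorted list of coordinates, on one chromosome,
    of SNPs that differ between the two strains being compared.
    """
    windows = []
    for start in range(0, max(chrom_length - window, 0) + 1, step):
        end = start + window
        # Count SNPs whose position lies in the half-open interval [start, end).
        count = bisect_left(snp_positions, end) - bisect_left(snp_positions, start)
        if min_snps > count:
            windows.append((start, end))
    return windows


def gene_in_ibd(gene_start, gene_end, windows, flank=5_000):
    """True if the gene, extended by the flank on both sides, overlaps any IBD window."""
    lo, hi = gene_start - flank, gene_end + flank
    return any(hi > w_start and w_end > lo for w_start, w_end in windows)
```

            <p>In this sketch overlapping windows are reported individually; merging adjacent windows into longer IBD blocks would be a separate step.</p>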
            <p>Mouse (UCSC assembly mm4) and human (UCSC assembly hg16) genome alignments (axtTight track) were downloaded from the UCSC mouse-human alignment download site <abbrgrp><abbr bid="B69">69</abbr></abbrgrp>. In the axtTight track, mouse and human genomes were aligned using BLASTZ <abbrgrp><abbr bid="B32">32</abbr></abbrgrp> and post-processed to obtain the best alignments for each region. The amount of mouse genome covered in the axtTight track is ~6%.</p>
         </sec>
         <sec>
            <st>
               <p>BXD and BXH crosses and mRNA profiling</p>
            </st>
            <p>Both crosses under study here have been described earlier <abbrgrp><abbr bid="B15">15</abbr><abbr bid="B16">16</abbr></abbrgrp>; we briefly summarize their key features here. In the BXD cross, an F<sub>2 </sub>population consisting of 111 mice was constructed from a cross of two inbred strains of mice, C57BL/6J and DBA/2J <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B15">15</abbr></abbrgrp>. Only female mice were maintained in this population. At 16 months of age the mice were euthanized and their livers extracted for gene expression profiling. The mice were genotyped at 139 microsatellite markers uniformly distributed over the mouse genome to allow for the genetic mapping of the gene expression and disease traits. The BXH F<sub>2 </sub>mouse population was constructed from C57BL/6J ApoE null (B6.ApoE-/-) and C3H/HeJ ApoE null (C3H.ApoE-/-) mice <abbrgrp><abbr bid="B16">16</abbr><abbr bid="B19">19</abbr></abbrgrp>. F<sub>1 </sub>mice were generated from reciprocal intercrossing between B6.ApoE-/- and C3H.ApoE-/-, and F<sub>2 </sub>mice were subsequently bred by intercrossing F<sub>1 </sub>mice. A total of 334 F<sub>2 </sub>mice (169 female, 165 male) were bred. Mice were sacrificed at 24 weeks and four tissues (liver, white adipose, whole brain, muscle) were extracted for mRNA profiling. Genomic DNA was isolated from kidney. A linkage map for all 19 autosomes was constructed using 1,032 SNP markers, giving rise to a genetic map with an average density of 1.5 cM. Genotyping was conducted by ParAllele using the molecular-inversion probe multiplex technique <abbrgrp><abbr bid="B70">70</abbr></abbrgrp>.</p>
            <p>All of the expression data from the two crosses we have used here for eQTL analyses were generated and described previously <abbrgrp><abbr bid="B6">6</abbr><abbr bid="B10">10</abbr><abbr bid="B16">16</abbr></abbrgrp>. For the BXD and BXH crosses expression measurements were available for 21,740 and 21,640 transcripts, respectively, representing 18,774 and 19,197 distinct coding genes which mapped uniquely to the 19 autosomal chromosomes. 12,597 genes were common between the microarrays used to profile each cross.</p>
            <p>Although the actual mRNA profiling experiments were described earlier <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B16">16</abbr></abbrgrp>, a summary of the method is given below for the reader's information. Total RNA from the BXD and BXH samples was purified from 25-mg portions using an RNeasy Mini Kit according to the manufacturer's instructions (Qiagen, Valencia, CA, USA), as previously described for the BXD set <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>. Fluorescently labeled cRNA (5 mg) from each F<sub>2 </sub>animal in each cross was hybridized against a pool of RNAs specific to each cross. The RNA pools for each cross were constructed from equal aliquots of RNA from all animals in the BXD cross and 150 randomly selected animals in the BXH cross. Array images were processed as previously described to obtain background noise, single-channel intensity, and associated measurement error estimates <abbrgrp><abbr bid="B71">71</abbr></abbrgrp>. Expression changes between two samples were quantified as log<sub>10</sub>(expression ratio), where the "expression ratio" was taken to be the ratio between normalized, background-corrected intensity values for the two channels (red and green) for each spot on the array. An error model for the log ratio was applied as previously described to quantify the significance of differential expression between two samples <abbrgrp><abbr bid="B71">71</abbr></abbrgrp>.</p>
         </sec>
         <sec>
            <st>
               <p>eQTL mapping and identification of CEGs and non-CEGs</p>
            </st>
            <p>The expression level for each gene was treated as a continuous variable and mapped to the genome using interval mapping. QTL mapping in the BXD cross was done as described <abbrgrp><abbr bid="B6">6</abbr><abbr bid="B10">10</abbr></abbrgrp>. For BXH, QTL mapping was done with the QTL-Cartographer suite of programs <abbrgrp><abbr bid="B72">72</abbr></abbrgrp> using the established interval mapping <abbrgrp><abbr bid="B17">17</abbr></abbrgrp> procedure. Since each experiment was hybridized in fluor-reverse pairs, log<sub>10 </sub>of the two expression ratios was taken and averaged to get the expression level (called ml-ratio for mean log ratio) for a gene. When males and females were treated as one combined group (in order to increase the power to detect linkages with increased number of animals), the gender effect on expression was accounted for by subtracting the gender-specific mean from each expression value. Specific thresholds for selecting CEGs are given in the results section.</p>
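            <p>The ml-ratio and the gender adjustment described above can be sketched as follows. The sketch assumes that both ratios of a fluor-reversed pair have already been oriented the same way (sample over reference pool); the function names are illustrative and do not correspond to the software used in the study.</p>

```python
import math
from statistics import mean


def ml_ratio(ratio_a, ratio_b):
    """Mean log10 ratio for one fluor-reversed pair of hybridizations."""
    # Assumption: both ratios are expressed as sample / reference pool.
    return (math.log10(ratio_a) + math.log10(ratio_b)) / 2.0


def center_by_sex(values, sexes):
    """Subtract the sex-specific mean from each expression value."""
    group_means = {
        s: mean(v for v, g in zip(values, sexes) if g == s) for s in set(sexes)
    }
    return [v - group_means[g] for v, g in zip(values, sexes)]
```

            <p>For example, a hypothetical pair of ratios 100 and 1 gives an ml-ratio of 1.0, and centering leaves each sex with a mean expression of zero for that gene.</p>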
            <p>For some genes, the probes on the microarray, when mapped to the mouse genome (UCSC mm4 <abbrgrp><abbr bid="B31">31</abbr></abbrgrp> or NCBI build 32), overlapped with SNPs. For these genes (which constituted roughly 4.6% of the total number of genes represented on the microarrays), <it>cis</it>-eQTLs could simply arise due to polymorphisms in the probe sequences influencing hybridization of the mRNA to the microarray, rather than non-coding <it>cis</it>-variations influencing their expression. Such genes were therefore removed from the list, in order to minimize the false positive calls on CEGs.</p>
            <p>Nonsense mediated decay (NMD) is a mechanism of mRNA surveillance that ensures rapid degradation of transcripts with premature stop codons [73]. Therefore some CEGs may not have non-coding <it>cis</it>-regulatory variation but instead contain nonsense mutations that result in NMD, which is detected as a <it>cis</it>-eQTL event. The Celera mouse SNP database <abbrgrp><abbr bid="B20">20</abbr></abbrgrp> (from which most of the SNPs for our analysis of <it>cis</it>-variants were taken) provided the annotation on SNPs that cause nonsense mutations; 63 distinct genes represented on the microarrays used for the crosses were annotated as having nonsense mutations. Only 4 CEGs from the BXH cross (~0.1% of all CEGs from that cross) and none of the CEGs from the BXD cross were annotated as having SNPs resulting in nonsense mutations. The enrichment of genes having SNPs annotated as causing NMD in our list of 2,047 CEGs (Table <tblr tid="T1">1</tblr>) was not significant (p = 0.33 with the Fisher exact test). Since this fraction was small and the p-value not significant, we are confident that NMD did not introduce any bias in our results. In addition, the genes containing nonsense SNPs could still have non-coding <it>cis</it>-variations affecting their expression. In our analysis we therefore did not exclude the genes with nonsense SNPs.</p>
            <p>In addition to the factors discussed above, variations in segmental duplications in the genome may affect expression. A rough map of genomic duplications is available for the B6 strain [74], however no map of the <it>variations </it>of genomic duplications between mouse strains used to construct the F<sub>2 </sub>crosses is available, and it is not known whether significant variations exist between mouse inbred strains in terms of genomic duplications. The analysis that one can do with SNPs (which gives variations between strains) is therefore not possible with the genomic duplications. Nevertheless, we looked to see if CEGs contained an increased number of genes that were in the duplicated regions using the available map of segmental duplications from B6 [74]. From the genomic coordinates of the segmental duplications [74, 75], we obtained the list of genes that were contained within these regions. A total of 123 distinct genes that were represented on our microarrays were within the duplicated regions. We then checked whether these genes are over-represented in the CEGs using the Fisher exact test. Only 11 of the 2,047 CEGs were in regions that underwent segmental duplications, and we found no evidence of enrichment of these genes in our list of CEGs (p = 0.45 with Fisher exact test). Since the maps of duplications in other mouse strains are not available, it is not possible to check whether the CEGs are enriched for genes contained in the duplicated regions of C3H or DBA. It is of note however, even if the duplicated regions in the C3H and DBA did not overlap those in the B6 strain (but the extent of duplications remained roughly the same between the different mouse strains), we would still observe a very small fraction of CEGs to be in these regions.</p>
            <p>In addition to checking for the enrichment of genes known to be located within segmental duplication regions in our list of CEGs, we employed a different strategy to check whether a significant fraction of CEGs could arise due to variations in duplications. This analysis was based on the hypothesis that if a certain region on the genome, containing multiple CEGs, was duplicated in one of the parental strains involved in a cross but not the other (i.e. variations of duplication between parental strains), and if this duplication was responsible for giving rise to <it>cis</it>-eQTLs in an F<sub>2 </sub>population, then the CEGs contained in the duplicated region would all show the same sign of the additive component of their eQTLs. This is because the F<sub>2 </sub>mice containing the duplicated region would always be expected to have higher levels of mRNA (having multiple copies) for all the genes contained within that region. In other regions of the genome, where duplication was not responsible for differential mRNA levels between the parental strains or the F<sub>2 </sub>animals, the signs of the eQTLs for tandem genes on the genome would be expected to be random, and follow a simple Binomial distribution. In the BXH cross, which contained the majority of CEGs, we examined the instances of 3 or 4 tandem CEGs having the same sign of their <it>cis</it>-eQTLs. If the signs of the eQTLs for CEGs were completely random, the probability of 3 tandem CEGs having the same sign of their eQTLs would be 0.25, and the probability of 4 tandem CEGs having the same sign of their eQTLs would be 0.125. Using a Binomial distribution we did not observe any deviation from these probabilities in our data (p > 0.1). This suggests that in our set of CEGs, variations in long duplication regions between strains (containing 3 or more CEGs) were not an important factor in giving rise to a significant fraction of the <it>cis</it>-eQTLs.</p>
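            <p>The null probabilities quoted above follow from counting the two all-same-sign outcomes among 2<sup>n</sup> equally likely sign patterns. The sketch below reproduces them and shows how a Binomial upper-tail probability could be computed; the counts in the final lines are hypothetical placeholders, since the observed numbers of tandem runs are not given in the text.</p>

```python
from math import comb


def prob_same_sign(n):
    """Probability that n independent, equally likely signs all agree."""
    # Two favourable outcomes (all positive or all negative) out of 2**n.
    return 2 * 0.5 ** n


def binomial_upper_tail(successes, trials, p):
    """P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p ** k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


print(prob_same_sign(3), prob_same_sign(4))  # 0.25 0.125

# Hypothetical example: 12 same-sign runs among 40 runs of three tandem CEGs.
p_value = binomial_upper_tail(12, 40, prob_same_sign(3))
```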
         </sec>
         <sec>
            <st>
               <p>Description of the IBD map used for analysis</p>
            </st>
            <p>The IBD map used in this study <abbrgrp><abbr bid="B19">19</abbr></abbrgrp> was constructed by looking at 50 Kb sequence windows in the different strains (moving the window through the genome at 10 Kb steps), and identifying as IBD regions the windows that had fewer than five consecutive SNPs <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>. Although this provides a comprehensive IBD map for multiple mouse strains for which complete genome sequences are not yet available, the IBD segments defined in this way are coarse as they have been derived using a ~10,300 SNP genotype map, whereas there are more than 2.5 million SNPs reported in the different mouse strains <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>. While extensive genome sequence coverage is available for the B6 and DBA strains <abbrgrp><abbr bid="B20">20</abbr><abbr bid="B33">33</abbr></abbrgrp>, allowing for a high-resolution IBD map to be constructed between these two strains, the BXD cross provides a small fraction (338 out of 4,107) of the total CEGs considered in this study. On the other hand, the BXH cross provides a far richer set of CEGs (3,769 out of the 4,107 considered in this study), but the complete C3H (C3H/HeJ) genomic sequence is not available. Therefore, we chose to leverage one of the previously published comprehensive, lower resolution maps based on a consistent set of SNPs genotyped in the B6, DBA, and C3H strains [19, 76]. The utility of the IBD map used here <abbrgrp><abbr bid="B19">19</abbr></abbrgrp> has been demonstrated by its successful application in the identification of a causal disease gene <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>, and we anticipate that our conclusions will remain the same (although some of the specific numbers presented here may change) with a finer IBD map that will become available at a later date.</p>
            <p>With the IBD map used here <abbrgrp><abbr bid="B19">19</abbr></abbrgrp> (built with a set of ~10,300 SNPs), we examined the amount of genomic sequence, and the numbers of SNPs and genes falling within and outside of the IBD blocks. Of the 2.5 Gb mouse genomic sequence <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>, 1.14 Gb (~45.6%) fell within IBD blocks between B6 and DBA, whereas 1.01 Gb (~40.4%) fell within IBD blocks between B6 and C3H. Of the 492,250 SNPs compiled largely from the Celera mouse sequence database <abbrgrp><abbr bid="B20">20</abbr></abbrgrp> that were outside repeat regions and polymorphic between B6 and DBA, 404,095 (~82.1%) fell in regions that were <ul>n</ul>ot in <ul>IBD</ul> (nIBD). Of the 18,774 autosomal genes represented on the BXD microarray, 10,259 (54.6%) were contained within regions that were nIBD between the two parental strains B6 and DBA (which is according to expectation, since 100 &#8211; 45.6, or 54.4%, of the total genomic sequence is in nIBD regions between these two strains, as noted above), whereas 279 of the 338 CEGs (82.5%) were in nIBD (p = 8.87 &#215; 10<sup>-28 </sup>as determined by the Fisher exact test). Of the 19,197 autosomal genes represented on the microarray for the BXH study, 11,628 (60.6%) genes were in nIBD regions between B6 and C3H (again according to expectation, since 100 &#8211; 40.4, or ~59.6%, of mouse genomic sequence is in nIBD regions between these two strains), whereas 3,219 out of the 3,769 CEGs (85.4%) were nIBD between these two strains (p = 4.08 &#215; 10<sup>-296</sup>, by the Fisher exact test). The significant enrichment of CEGs in nIBD regions is expected since there should be <it>cis</it>-variants near the CEGs that result in <it>cis</it>-linkage by altering one or more regulatory elements, and by definition these variants are likely to be largely biased towards the nIBD regions.</p>
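            <p>For illustration, an enrichment test of this kind can be sketched as a one-sided Fisher exact test, i.e. the upper tail of a hypergeometric distribution. The sketch assumes that the 338 BXD CEGs are a subset of the 18,774 genes on the array and that a one-sided test is appropriate; the text does not state which form of the test was used, so the value obtained here need not match the reported p-value exactly. Function names are illustrative.</p>

```python
from math import exp, lgamma


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_enrichment_p(total, total_hits, sample, sample_hits):
    """One-sided p-value P(X >= sample_hits) under the hypergeometric null."""
    log_denominator = log_comb(total, sample)
    # Feasible range for the number of hits that can fall in the sample.
    lower = max(sample_hits, sample - (total - total_hits))
    upper = min(sample, total_hits)
    p = 0.0
    for k in range(lower, upper + 1):
        # k hits in the sample leave sample - k draws from the non-hit genes.
        log_term = (
            log_comb(total_hits, k)
            + log_comb(total - total_hits, sample - k)
            - log_denominator
        )
        p += exp(log_term)
    return p


# BXD numbers from the text: 18,774 genes, 10,259 in nIBD; 338 CEGs, 279 in nIBD.
p_bxd = fisher_enrichment_p(18_774, 10_259, 338, 279)
print(f"{p_bxd:.2e}")
```

            <p>Working in log space avoids overflow in the very large binomial coefficients involved.</p>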
         </sec>
         <sec>
            <st>
               <p>SNPs perturbing transcription factor binding sites (TFBS)</p>
            </st>
            <p>For finding the overlap of known transcription factor binding sites with SNPs, 30 bp sequences around the SNPs (total 61 bp) were taken, and the experimentally determined human, rat and mouse binding sites from the TRANSFAC<sup>&#174; </sup>database <abbrgrp><abbr bid="B34">34</abbr></abbrgrp> were mapped to these sequences (both B6 and DBA alleles) using BLASTN. Only sites which mapped with a threshold of 95% identity were kept. 16 distinct binding sites mapped in this way overlapped with SNPs between B6 and DBA. However of these, none overlapped with <it>cis</it>-SNPs that were in the promoter or non-coding region of any of the genes.</p>
            <p>Consequently, transcription factor binding site predictions were made on the 61 bp sequences around the SNPs as described below. Although TFBSs are often short, we took a length of 30 bp on either side of the SNPs, since in the TRANSFAC<sup>&#174; </sup>database the longest vertebrate position weight matrix was 30 bp. Site predictions were made with the vertebrate position weight matrix models (PWMs) available from the TRANSFAC<sup>&#174; </sup>database (version 6.3) <abbrgrp><abbr bid="B34">34</abbr></abbrgrp> using the MATCH&#8482; software <abbrgrp><abbr bid="B43">43</abbr></abbrgrp>. The individual TFBS prediction cutoff scores were given by TRANSFAC<sup>&#174; </sup>(based on an algorithm that minimizes the sum of the false positive and false negative rates of predictions using known sites). With the application of the TRANSFAC thresholds, the scores of individual sites ranged from 0.751 to 1. For each SNP, two sequences were generated containing the B6 and DBA alleles. Both sequences were scored and the change in score of the binding site prediction due to a SNP was recorded. In the analyses we have reported, we did not choose a threshold for the score difference due to a SNP intersecting with a TFBS prediction. Once a binding site was predicted with the threshold given by TRANSFAC, any change to the score of that site was considered as a potential perturbation that could represent an alteration of binding of the TF to that site leading to a change in expression. We also used an increased threshold of 0.01 for score changes to a site by a SNP, and that provided very similar results and identical conclusions (data not shown).</p>
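            <p>The allele comparison described above can be illustrated with a simplified scorer. This is not the MATCH algorithm: it scans the forward strand only and rescales a plain sum of matrix values between the worst and best possible site, whereas MATCH uses its own weighted scoring. The matrix, sequence, cutoff and function names in the sketch are hypothetical.</p>

```python
def pwm_score(pwm, site):
    """Score a site against a PWM, rescaled to the range 0..1."""
    raw = sum(column[base] for column, base in zip(pwm, site))
    worst = sum(min(column.values()) for column in pwm)
    best = sum(max(column.values()) for column in pwm)
    return (raw - worst) / (best - worst)


def best_site_score(pwm, sequence, snp_index):
    """Highest score among windows of PWM width that cover the SNP position."""
    width = len(pwm)
    first = max(0, snp_index - width + 1)
    last = min(snp_index, len(sequence) - width)
    return max(pwm_score(pwm, sequence[i:i + width]) for i in range(first, last + 1))


def allele_score_change(pwm, sequence, snp_index, allele_a, allele_b, cutoff):
    """Score difference between alleles if either allele passes the cutoff, else None."""
    seq_a = sequence[:snp_index] + allele_a + sequence[snp_index + 1:]
    seq_b = sequence[:snp_index] + allele_b + sequence[snp_index + 1:]
    score_a = best_site_score(pwm, seq_a, snp_index)
    score_b = best_site_score(pwm, seq_b, snp_index)
    if max(score_a, score_b) >= cutoff:
        return score_a - score_b
    return None


# Hypothetical 3-column count matrix and a toy sequence with a C/T SNP at index 3.
toy_pwm = [
    {"A": 8, "C": 1, "G": 1, "T": 0},
    {"A": 0, "C": 9, "G": 1, "T": 0},
    {"A": 0, "C": 0, "G": 10, "T": 0},
]
delta = allele_score_change(toy_pwm, "TTACGTT", 3, "C", "T", cutoff=0.75)
```

            <p>In the toy example the C allele completes the best-scoring site ACG while the T allele weakens it, so a non-zero score change is returned; any such change would count as a potential perturbation under the criterion used in the text.</p>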
         </sec>
         <sec>
            <st>
               <p>Correlations between transcription factors and genes from mouse body atlas</p>
            </st>
            <p>Expression profiles of all known mouse genes were determined over 145 tissues and cell lines (called 'Body atlas' data set) as described previously <abbrgrp><abbr bid="B49">49</abbr><abbr bid="B50">50</abbr></abbrgrp>. Spearman (rank order) correlations were determined between the mRNA levels of each of the known transcription factors and all other genes and correlates with p-value &lt; 0.01 were stored for analysis.</p>
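            <p>As a minimal sketch of the rank-order correlation used here, Spearman's coefficient can be computed as the Pearson correlation of rank-transformed expression values, with tied values sharing an average rank. The function names are illustrative; the p-value calculation used to apply the 0.01 cutoff is not shown, and the sketch assumes neither profile is constant across samples.</p>

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while len(order) > i:
        j = i
        # Extend j to the end of the current run of tied values.
        while len(order) > j + 1 and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```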
         </sec>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>DG: Conceived the study, performed analyses and wrote the paper</p>
         <p>TX: Performed analyses and helped in writing the paper</p>
         <p>MA: Performed analyses</p>
         <p>SWE: Performed some analyses and provided advice</p>
         <p>GL: Provided technical support</p>
         <p>SSW: Performed experiments and generated data</p>
         <p>EES: Conceived the study, performed analyses, helped in writing of the paper, and secured funding</p>
         <p>All authors have read and approved the final manuscript.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>We thank John Lamb and Thomas Drake for helpful discussions. Aldons J. Lusis is thanked for helpful comments on the manuscript. Barmak Modrek, Archie Russell, Jin Ma and Jun Zhu are thanked for technical support. This work was supported in part by NIH grants HL30568 and DK 071673.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>To a future of genetic medicine</p>
            </title>
            <aug>
               <au>
                  <snm>Chakravarti</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2001</pubdate>
            <volume>409</volume>
            <fpage>822</fpage>
            <lpage>823</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/35057281</pubid>
                  <pubid idtype="pmpid" link="fulltext">11236997</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Regulatory polymorphisms underlying complex disease traits</p>
            </title>
            <aug>
               <au>
                  <snm>Knight</snm>
                  <fnm>JC</fnm>
               </au>
            </aug>
            <source>J Mol Med</source>
            <pubdate>2005</pubdate>
            <volume>83</volume>
            <fpage>97</fpage>
            <lpage>109</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1007/s00109-004-0603-7</pubid>
                  <pubid idtype="pmpid" link="fulltext">15592805</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Single nucleotide polymorphism in transcriptional regulatory regions and expression of environmentally responsive genes</p>
            </title>
            <aug>
               <au>
                  <snm>Wang</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Tomso</snm>
                  <fnm>DJ</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Bell</snm>
                  <fnm>DA</fnm>
               </au>
            </aug>
            <source>Toxicol Appl Pharmacol</source>
            <pubdate>2005</pubdate>
            <volume>207</volume>
            <fpage>84</fpage>
            <lpage>90</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.taap.2004.09.024</pubid>
                  <pubid idtype="pmpid" link="fulltext">16002116</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Cis-acting regulatory variation in the human genome</p>
            </title>
            <aug>
               <au>
                  <snm>Pastinen</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Hudson</snm>
                  <fnm>TJ</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2004</pubdate>
            <volume>306</volume>
            <fpage>647</fpage>
            <lpage>650</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1101659</pubid>
                  <pubid idtype="pmpid" link="fulltext">15499010</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Allele-specific gene expression differences in humans</p>
            </title>
            <aug>
               <au>
                  <snm>Buckland</snm>
                  <fnm>PR</fnm>
               </au>
            </aug>
            <source>Hum Mol Genet</source>
            <pubdate>2004</pubdate>
            <volume>13 Spec No 2</volume>
            <fpage>R255</fpage>
            <lpage>60</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/hmg/ddh227</pubid>
                  <pubid idtype="pmpid" link="fulltext">15358732</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>An integrative genomics approach to infer causal associations between gene expression and disease</p>
            </title>
            <aug>
               <au>
                  <snm>Schadt</snm>
                  <fnm>EE</fnm>
               </au>
               <au>
                  <snm>Lamb</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Yang</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Zhu</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Edwards</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Guhathakurta</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Sieberts</snm>
                  <fnm>SK</fnm>
               </au>
               <au>
                  <snm>Monks</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Reitman</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Lum</snm>
                  <fnm>PY</fnm>
               </au>
               <au>
                  <snm>Leonardson</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Thieringer</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Metzger</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Yang</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Castle</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Zhu</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Kash</snm>
                  <fnm>SF</fnm>
               </au>
               <au>
                  <snm>Drake</snm>
                  <fnm>TA</fnm>
               </au>
               <au>
                  <snm>Sachs</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Lusis</snm>
                  <fnm>AJ</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2005</pubdate>
            <volume>37</volume>
            <fpage>710</fpage>
            <lpage>717</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng1589</pubid>
                  <pubid idtype="pmpid" link="fulltext">15965475</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Integrated transcriptional profiling and linkage analysis for identification of genes underlying disease</p>
            </title>
            <aug>
               <au>
                  <snm>Hubner</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Wallace</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Zimdahl</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Petretto</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Schulz</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Maciver</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Mueller</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Hummel</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Monti</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Zidek</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Musilova</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Kren</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Causton</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Game</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Born</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Schmidt</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Muller</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Cook</snm>
                  <fnm>SA</fnm>
               </au>
               <au>
                  <snm>Kurtz</snm>
                  <fnm>TW</fnm>
               </au>
               <au>
                  <snm>Whittaker</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Pravenec</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Aitman</snm>
                  <fnm>TJ</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2005</pubdate>
            <volume>37</volume>
            <fpage>243</fpage>
            <lpage>253</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng1522</pubid>
                  <pubid idtype="pmpid" link="fulltext">15711544</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Uncovering regulatory pathways that affect hematopoietic stem cell function using 'genetical genomics'</p>
            </title>
            <aug>
               <au>
                  <snm>Bystrykh</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Weersing</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Dontje</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Sutton</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Pletcher</snm>
                  <fnm>MT</fnm>
               </au>
               <au>
                  <snm>Wiltshire</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Su</snm>
                  <fnm>AI</fnm>
               </au>
               <au>
                  <snm>Vellenga</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Manly</snm>
                  <fnm>KF</fnm>
               </au>
               <au>
                  <snm>Lu</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Chesler</snm>
                  <fnm>EJ</fnm>
               </au>
               <au>
                  <snm>Alberts</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Jansen</snm>
                  <fnm>RC</fnm>
               </au>
               <au>
                  <snm>Williams</snm>
                  <fnm>RW</fnm>
               </au>
               <au>
                  <snm>Cooke</snm>
                  <fnm>MP</fnm>
               </au>
               <au>
                  <snm>de Haan</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2005</pubdate>
            <volume>37</volume>
            <fpage>225</fpage>
            <lpage>232</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng1497</pubid>
                  <pubid idtype="pmpid" link="fulltext">15711547</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Detection of regulatory variation in mouse genes</p>
            </title>
            <aug>
               <au>
                  <snm>Cowles</snm>
                  <fnm>CR</fnm>
               </au>
               <au>
                  <snm>Hirschhorn</snm>
                  <fnm>JN</fnm>
               </au>
               <au>
                  <snm>Altshuler</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Lander</snm>
                  <fnm>ES</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2002</pubdate>
            <volume>32</volume>
            <fpage>432</fpage>
            <lpage>437</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng992</pubid>
                  <pubid idtype="pmpid" link="fulltext">12410233</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Genetics of gene expression surveyed in maize, mouse and man</p>
            </title>
            <aug>
               <au>
                  <snm>Schadt</snm>
                  <fnm>EE</fnm>
               </au>
               <au>
                  <snm>Monks</snm>
                  <fnm>SA</fnm>
               </au>
               <au>
                  <snm>Drake</snm>
                  <fnm>TA</fnm>
               </au>
               <au>
                  <snm>Lusis</snm>
                  <fnm>AJ</fnm>
               </au>
               <au>
                  <snm>Che</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Colinayo</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Ruff</snm>
                  <fnm>TG</fnm>
               </au>
               <au>
                  <snm>Milligan</snm>
                  <fnm>SB</fnm>
               </au>
               <au>
                  <snm>Lamb</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Cavet</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Linsley</snm>
                  <fnm>PS</fnm>
               </au>
               <au>
                  <snm>Mao</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Stoughton</snm>
                  <fnm>RB</fnm>
               </au>
               <au>
                  <snm>Friend</snm>
                  <fnm>SH</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2003</pubdate>
            <volume>422</volume>
            <fpage>297</fpage>
            <lpage>302</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature01434</pubid>
                  <pubid idtype="pmpid" link="fulltext">12646919</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p><it>Cis</it>-acting expression quantitative trait loci in mice</p>
            </title>
            <aug>
               <au>
                  <snm>Doss</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Schadt</snm>
                  <fnm>EE</fnm>
               </au>
               <au>
                  <snm>Drake</snm>
                  <fnm>TA</fnm>
               </au>
               <au>
                  <snm>Lusis</snm>
                  <fnm>AJ</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2005</pubdate>
            <volume>15</volume>
            <fpage>681</fpage>
            <lpage>691</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1088296</pubid>
                  <pubid idtype="pmpid" link="fulltext">15837804</pubid>
                  <pubid idtype="doi">10.1101/gr.3216905</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Local Regulatory Variation in <it>Saccharomyces cerevisiae</it></p>
            </title>
            <aug>
               <au>
                  <snm>Ronald</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Brem</snm>
                  <fnm>RB</fnm>
               </au>
               <au>
                  <snm>Whittle</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Kruglyak</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>PLoS Genet</source>
            <pubdate>2005</pubdate>
            <volume>1</volume>
            <fpage>e25</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1189075</pubid>
                  <pubid idtype="pmpid" link="fulltext">16121257</pubid>
                  <pubid idtype="doi">10.1371/journal.pgen.0010025</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Genetic variation in the gene encoding calpain-10 is associated with type 2 diabetes mellitus</p>
            </title>
            <aug>
               <au>
                  <snm>Horikawa</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Oda</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Cox</snm>
                  <fnm>NJ</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Orho-Melander</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Hara</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Hinokio</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Lindner</snm>
                  <fnm>TH</fnm>
               </au>
               <au>
                  <snm>Mashima</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Schwarz</snm>
                  <fnm>PE</fnm>
               </au>
               <au>
                  <snm>del Bosque-Plata</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Oda</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Yoshiuchi</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Colilla</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Polonsky</snm>
                  <fnm>KS</fnm>
               </au>
               <au>
                  <snm>Wei</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Concannon</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Iwasaki</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Schulze</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Baier</snm>
                  <fnm>LJ</fnm>
               </au>
               <au>
                  <snm>Bogardus</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Groop</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Boerwinkle</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Hanis</snm>
                  <fnm>CL</fnm>
               </au>
               <au>
                  <snm>Bell</snm>
                  <fnm>GI</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2000</pubdate>
            <volume>26</volume>
            <fpage>163</fpage>
            <lpage>175</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/79876</pubid>
                  <pubid idtype="pmpid" link="fulltext">11017071</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>The gene encoding phosphodiesterase 4D confers risk of ischemic stroke</p>
            </title>
            <aug>
               <au>
                  <snm>Gretarsdottir</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Thorleifsson</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Reynisdottir</snm>
                  <fnm>ST</fnm>
               </au>
               <au>
                  <snm>Manolescu</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Jonsdottir</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Jonsdottir</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Gudmundsdottir</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Bjarnadottir</snm>
                  <fnm>SM</fnm>
               </au>
               <au>
                  <snm>Einarsson</snm>
                  <fnm>OB</fnm>
               </au>
               <au>
                  <snm>Gudjonsdottir</snm>
                  <fnm>HM</fnm>
               </au>
               <au>
                  <snm>Hawkins</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Gudmundsson</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Gudmundsdottir</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Andrason</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Gudmundsdottir</snm>
                  <fnm>AS</fnm>
               </au>
               <au>
                  <snm>Sigurdardottir</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Chou</snm>
                  <fnm>TT</fnm>
               </au>
               <au>
                  <snm>Nahmias</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Goss</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Sveinbjornsdottir</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Valdimarsson</snm>
                  <fnm>EM</fnm>
               </au>
               <au>
                  <snm>Jakobsson</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Agnarsson</snm>
                  <fnm>U</fnm>
               </au>
               <au>
                  <snm>Gudnason</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Thorgeirsson</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Fingerle</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Gurney</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Gudbjartsson</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Frigge</snm>
                  <fnm>ML</fnm>
               </au>
               <au>
                  <snm>Kong</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Stefansson</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Gulcher</snm>
                  <fnm>JR</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2003</pubdate>
            <volume>35</volume>
            <fpage>131</fpage>
            <lpage>138</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng1245</pubid>
                  <pubid idtype="pmpid" link="fulltext">14517540</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Genetic loci determining bone density in mice with diet-induced atherosclerosis</p>
            </title>
            <aug>
               <au>
                  <snm>Drake</snm>
                  <fnm>TA</fnm>
               </au>
               <au>
                  <snm>Schadt</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Hannani</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Kabo</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Krass</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Colinayo</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Greaser</snm>
                  <fnm>LE</fnm>
               </au>
               <au>
                  <snm>Goldin</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Lusis</snm>
                  <fnm>AJ</fnm>
               </au>
            </aug>
            <source>Physiol Genomics</source>
            <pubdate>2001</pubdate>
            <volume>5</volume>
            <fpage>205</fpage>
            <lpage>215</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11328966</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Genetic and Genomic Analysis of a Fat Mass Trait with Complex Inheritance Reveals Marked Sex Specificity</p>
            </title>
            <aug>
               <au>
                  <snm>Wang</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Yehya</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Schadt</snm>
                  <fnm>EE</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Drake</snm>
                  <fnm>TA</fnm>
               </au>
               <au>
                  <snm>Lusis</snm>
                  <fnm>AJ</fnm>
               </au>
            </aug>
            <source>PLoS Genet</source>
            <pubdate>2006</pubdate>
            <volume>2</volume>
            <fpage>e15</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1359071</pubid>
                  <pubid idtype="pmpid" link="fulltext">16462940</pubid>
                  <pubid idtype="doi">10.1371/journal.pgen.0020015</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Mapping mendelian factors underlying quantitative traits using RFLP linkage maps</p>
            </title>
            <aug>
               <au>
                  <snm>Lander</snm>
                  <fnm>ES</fnm>
               </au>
               <au>
                  <snm>Botstein</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Genetics</source>
            <pubdate>1989</pubdate>
            <volume>121</volume>
            <fpage>185</fpage>
            <lpage>199</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1203601</pubid>
                  <pubid idtype="pmpid" link="fulltext">2563713</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Segmental phylogenetic relationships of inbred mouse strains revealed by fine-scale analysis of sequence variation across 4.6 mb of mouse genome</p>
            </title>
            <aug>
               <au>
                  <snm>Frazer</snm>
                  <fnm>KA</fnm>
               </au>
               <au>
                  <snm>Wade</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Hinds</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Patil</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Cox</snm>
                  <fnm>DR</fnm>
               </au>
               <au>
                  <snm>Daly</snm>
                  <fnm>MJ</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2004</pubdate>
            <volume>14</volume>
            <fpage>1493</fpage>
            <lpage>1500</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">509258</pubid>
                  <pubid idtype="pmpid" link="fulltext">15289472</pubid>
                  <pubid idtype="doi">10.1101/gr.2627804</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>Integrating QTL and high-density SNP analyses in mice to identify Insig2 as a susceptibility gene for plasma cholesterol levels</p>
            </title>
            <aug>
               <au>
                  <snm>Cervino</snm>
                  <fnm>AC</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Edwards</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Zhu</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Laurie</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Tokiwa</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Lum</snm>
                  <fnm>PY</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Castellini</snm>
                  <fnm>LW</fnm>
               </au>
               <au>
                  <snm>Lusis</snm>
                  <fnm>AJ</fnm>
               </au>
               <au>
                  <snm>Carlson</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Sachs</snm>
                  <fnm>AB</fnm>
               </au>
               <au>
                  <snm>Schadt</snm>
                  <fnm>EE</fnm>
               </au>
            </aug>
            <source>Genomics</source>
            <pubdate>2005</pubdate>
            <volume>86</volume>
            <fpage>505</fpage>
            <lpage>517</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.ygeno.2005.07.010</pubid>
                  <pubid idtype="pmpid" link="fulltext">16126366</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>A comparison of whole-genome shotgun-derived mouse chromosome 16 and the human genome</p>
            </title>
            <aug>
               <au>
                  <snm>Mural</snm>
                  <fnm>RJ</fnm>
               </au>
               <au>
                  <snm>Adams</snm>
                  <fnm>MD</fnm>
               </au>
               <au>
                  <snm>Myers</snm>
                  <fnm>EW</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>HO</fnm>
               </au>
               <au>
                  <snm>Miklos</snm>
                  <fnm>GL</fnm>
               </au>
               <au>
                  <snm>Wides</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Halpern</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>PW</fnm>
               </au>
               <au>
                  <snm>Sutton</snm>
                  <fnm>GG</fnm>
               </au>
               <au>
                  <snm>Nadeau</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Salzberg</snm>
                  <fnm>SL</fnm>
               </au>
               <au>
                  <snm>Holt</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Kodira</snm>
                  <fnm>CD</fnm>
               </au>
               <au>
                  <snm>Lu</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Deng</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Evangelista</snm>
                  <fnm>CC</fnm>
               </au>
               <au>
                  <snm>Gan</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Heiman</snm>
                  <fnm>TJ</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Merkulov</snm>
                  <fnm>GV</fnm>
               </au>
               <au>
                  <snm>Milshina</snm>
                  <fnm>NV</fnm>
               </au>
               <au>
                  <snm>Naik</snm>
                  <fnm>AK</fnm>
               </au>
               <au>
                  <snm>Qi</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Shue</snm>
                  <fnm>BC</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Yan</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Ye</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Yooseph</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Zhao</snm>
                  <fnm>Q</fnm>
               </au>
               <au>
                  <snm>Zheng</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Zhu</snm>
                  <fnm>SC</fnm>
               </au>
               <au>
                  <snm>Biddick</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Bolanos</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Delcher</snm>
                  <fnm>AL</fnm>
               </au>
               <au>
                  <snm>Dew</snm>
                  <fnm>IM</fnm>
               </au>
               <au>
                  <snm>Fasulo</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Flanigan</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Huson</snm>
                  <fnm>DH</fnm>
               </au>
               <au>
                  <snm>Kravitz</snm>
                  <fnm>SA</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Mobarry</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Reinert</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Remington</snm>
                  <fnm>KA</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>Q</fnm>
               </au>
               <au>
                  <snm>Zheng</snm>
                  <fnm>XH</fnm>
               </au>
               <au>
                  <snm>Nusskern</snm>
                  <fnm>DR</fnm>
               </au>
               <au>
                  <snm>Lai</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Lei</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Zhong</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Yao</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Guan</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Ji</snm>
                  <fnm>RR</fnm>
               </au>
               <au>
                  <snm>Gu</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>ZY</fnm>
               </au>
               <au>
                  <snm>Zhong</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Xiao</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Chiang</snm>
                  <fnm>CC</fnm>
               </au>
               <au>
                  <snm>Yandell</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Wortman</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Amanatides</snm>
                  <fnm>PG</fnm>
               </au>
               <au>
                  <snm>Hladun</snm>
                  <fnm>SL</fnm>
               </au>
               <au>
                  <snm>Pratts</snm>
                  <fnm>EC</fnm>
               </au>
               <au>
                  <snm>Johnson</snm>
                  <fnm>JE</fnm>
               </au>
               <au>
                  <snm>Dodson</snm>
                  <fnm>KL</fnm>
               </au>
               <au>
                  <snm>Woodford</snm>
                  <fnm>KJ</fnm>
               </au>
               <au>
                  <snm>Evans</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Gropman</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Rusch</snm>
                  <fnm>DB</fnm>
               </au>
               <au>
                  <snm>Venter</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>TJ</fnm>
               </au>
               <au>
                  <snm>Houck</snm>
                  <fnm>JT</fnm>
               </au>
               <au>
                  <snm>Tompkins</snm>
                  <fnm>DE</fnm>
               </au>
               <au>
                  <snm>Haynes</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Jacob</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Chin</snm>
                  <fnm>SH</fnm>
               </au>
               <au>
                  <snm>Allen</snm>
                  <fnm>DR</fnm>
               </au>
               <au>
                  <snm>Dahlke</snm>
                  <fnm>CE</fnm>
               </au>
               <au>
                  <snm>Sanders</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Levitsky</snm>
                  <fnm>AA</fnm>
               </au>
               <au>
                  <snm>Majoros</snm>
                  <fnm>WH</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>Q</fnm>
               </au>
               <au>
                  <snm>Xia</snm>
                  <fnm>AC</fnm>
               </au>
               <au>
                  <snm>Lopez</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Donnelly</snm>
                  <fnm>MT</fnm>
               </au>
               <au>
                  <snm>Newman</snm>
                  <fnm>MH</fnm>
               </au>
               <au>
                  <snm>Glodek</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Kraft</snm>
                  <fnm>CL</fnm>
               </au>
               <au>
                  <snm>Nodell</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Ali</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>An</snm>
                  <fnm>HJ</fnm>
               </au>
               <au>
                  <snm>Baldwin-Pitts</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Beeson</snm>
                  <fnm>KY</fnm>
               </au>
               <au>
                  <snm>Cai</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Carnes</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Carver</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Caulk</snm>
                  <fnm>PM</fnm>
               </au>
               <au>
                  <snm>Center</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>YH</fnm>
               </au>
               <au>
                  <snm>Cheng</snm>
                  <fnm>ML</fnm>
               </au>
               <au>
                  <snm>Coyne</snm>
                  <fnm>MD</fnm>
               </au>
               <au>
                  <snm>Crowder</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Danaher</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Davenport</snm>
                  <fnm>LB</fnm>
               </au>
               <au>
                  <snm>Desilets</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Dietz</snm>
                  <fnm>SM</fnm>
               </au>
               <au>
                  <snm>Doup</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Dullaghan</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Ferriera</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Fosler</snm>
                  <fnm>CR</fnm>
               </au>
               <au>
                  <snm>Gire</snm>
                  <fnm>HC</fnm>
               </au>
               <au>
                  <snm>Gluecksmann</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Gocayne</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Gray</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Hart</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Haynes</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Hoover</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Howland</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Ibegwam</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Jalali</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Johns</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Kline</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Ma</snm>
                  <fnm>DS</fnm>
               </au>
               <au>
                  <snm>MacCawley</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Magoon</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Mann</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>May</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>McIntosh</snm>
                  <fnm>TC</fnm>
               </au>
               <au>
                  <snm>Mehta</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Moy</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Moy</snm>
                  <fnm>MC</fnm>
               </au>
               <au>
                  <snm>Murphy</snm>
                  <fnm>BJ</fnm>
               </au>
               <au>
                  <snm>Murphy</snm>
                  <fnm>SD</fnm>
               </au>
               <au>
                  <snm>Nelson</snm>
                  <fnm>KA</fnm>
               </au>
               <au>
                  <snm>Nuri</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Parker</snm>
                  <fnm>KA</fnm>
               </au>
               <au>
                  <snm>Prudhomme</snm>
                  <fnm>AC</fnm>
               </au>
               <au>
                  <snm>Puri</snm>
                  <fnm>VN</fnm>
               </au>
               <au>
                  <snm>Qureshi</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Raley</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Reardon</snm>
                  <fnm>MS</fnm>
               </au>
               <au>
                  <snm>Regier</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Rogers</snm>
                  <fnm>YH</fnm>
               </au>
               <au>
                  <snm>Romblad</snm>
                  <fnm>DL</fnm>
               </au>
               <au>
                  <snm>Schutz</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Scott</snm>
                  <fnm>JL</fnm>
               </au>
               <au>
                  <snm>Scott</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Sitter</snm>
                  <fnm>CD</fnm>
               </au>
               <au>
                  <snm>Smallwood</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Sprague</snm>
                  <fnm>AC</fnm>
               </au>
               <au>
                  <snm>Stewart</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Strong</snm>
                  <fnm>RV</fnm>
               </au>
               <au>
                  <snm>Suh</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Sylvester</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Thomas</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Tint</snm>
                  <fnm>NN</fnm>
               </au>
               <au>
                  <snm>Tsonis</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Williams</snm>
                  <fnm>MS</fnm>
               </au>
               <au>
                  <snm>Williams</snm>
                  <fnm>SM</fnm>
               </au>
               <au>
                  <snm>Windsor</snm>
                  <fnm>SM</fnm>
               </au>
               <au>
                  <snm>Wolfe</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Wu</snm>
                  <fnm>MM</fnm>
               </au>
               <au>
                  <snm>Zaveri</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Chaturvedi</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Gabrielian</snm>
                  <fnm>AE</fnm>
               </au>
               <au>
                  <snm>Ke</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Sun</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Subramanian</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Venter</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Pfannkoch</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Barnstead</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Stephenson</snm>
                  <fnm>LD</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2002</pubdate>
            <volume>296</volume>
            <fpage>1661</fpage>
            <lpage>1671</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1069193</pubid>
                  <pubid idtype="pmpid" link="fulltext">12040188</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p/>
            </title>
            <aug>
               <au>
                  <cnm>http://www.ncbi.nlm.nih.gov/projects/SNP/</cnm>
               </au>
            </aug>
         </bibl>
         <bibl id="B22">
            <title>
               <p>d2_cluster: a validated method for clustering EST and full-length cDNA sequences</p>
            </title>
            <aug>
               <au>
                  <snm>Burke</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Davison</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Hide</snm>
                  <fnm>W</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>1999</pubdate>
            <volume>9</volume>
            <fpage>1135</fpage>
            <lpage>1142</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">310833</pubid>
                  <pubid idtype="pmpid" link="fulltext">10568753</pubid>
                  <pubid idtype="doi">10.1101/gr.9.11.1135</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>A comprehensive transcript index of the human genome generated using microarrays and computational approaches</p>
            </title>
            <aug>
               <au>
                  <snm>Schadt</snm>
                  <fnm>EE</fnm>
               </au>
               <au>
                  <snm>Edwards</snm>
                  <fnm>SW</fnm>
               </au>
               <au>
                  <snm>GuhaThakurta</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Holder</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Ying</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Svetnik</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Leonardson</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Hart</snm>
                  <fnm>KW</fnm>
               </au>
               <au>
                  <snm>Russell</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Cavet</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Castle</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>McDonagh</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Kan</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Kasarskis</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Margarint</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Caceres</snm>
                  <fnm>RM</fnm>
               </au>
               <au>
                  <snm>Johnson</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Armour</snm>
                  <fnm>CD</fnm>
               </au>
               <au>
                  <snm>Garrett-Engele</snm>
                  <fnm>PW</fnm>
               </au>
               <au>
                  <snm>Tsinoremas</snm>
                  <fnm>NF</fnm>
               </au>
               <au>
                  <snm>Shoemaker</snm>
                  <fnm>DD</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2004</pubdate>
            <volume>5</volume>
            <fpage>R73</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">545593</pubid>
                  <pubid idtype="pmpid" link="fulltext">15461792</pubid>
                  <pubid idtype="doi">10.1186/gb-2004-5-10-r73</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>Unbiased mapping of transcription factor binding sites along human chromosomes 21 and 22 points to widespread regulation of noncoding RNAs</p>
            </title>
            <aug>
               <au>
                  <snm>Cawley</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Bekiranov</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Ng</snm>
                  <fnm>HH</fnm>
               </au>
               <au>
                  <snm>Kapranov</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Sekinger</snm>
                  <fnm>EA</fnm>
               </au>
               <au>
                  <snm>Kampa</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Piccolboni</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Sementchenko</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Cheng</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Williams</snm>
                  <fnm>AJ</fnm>
               </au>
               <au>
                  <snm>Wheeler</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Wong</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Drenkow</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Yamanaka</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Patel</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Brubaker</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Tammana</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Helt</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Struhl</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Gingeras</snm>
                  <fnm>TR</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>2004</pubdate>
            <volume>116</volume>
            <fpage>499</fpage>
            <lpage>509</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0092-8674(04)00127-8</pubid>
                  <pubid idtype="pmpid" link="fulltext">14980218</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Identification of a coordinate regulator of interleukins 4, 13, and 5 by cross-species sequence comparisons</p>
            </title>
            <aug>
               <au>
                  <snm>Loots</snm>
                  <fnm>GG</fnm>
               </au>
               <au>
                  <snm>Locksley</snm>
                  <fnm>RM</fnm>
               </au>
               <au>
                  <snm>Blankespoor</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>ZE</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Rubin</snm>
                  <fnm>EM</fnm>
               </au>
               <au>
                  <snm>Frazer</snm>
                  <fnm>KA</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2000</pubdate>
            <volume>288</volume>
            <fpage>136</fpage>
            <lpage>140</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.288.5463.136</pubid>
                  <pubid idtype="pmpid" link="fulltext">10753117</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>Human-mouse genome comparisons to locate regulatory sites</p>
            </title>
            <aug>
               <au>
                  <snm>Wasserman</snm>
                  <fnm>WW</fnm>
               </au>
               <au>
                  <snm>Palumbo</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Thompson</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Fickett</snm>
                  <fnm>JW</fnm>
               </au>
               <au>
                  <snm>Lawrence</snm>
                  <fnm>CE</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2000</pubdate>
            <volume>26</volume>
            <fpage>225</fpage>
            <lpage>228</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/79965</pubid>
                  <pubid idtype="pmpid" link="fulltext">11017083</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>Distinguishing regulatory DNA from neutral sites</p>
            </title>
            <aug>
               <au>
                  <snm>Elnitski</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Hardison</snm>
                  <fnm>RC</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Yang</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kolbe</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Eswara</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>O'Connor</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Schwartz</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Chiaromonte</snm>
                  <fnm>F</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2003</pubdate>
            <volume>13</volume>
            <fpage>64</fpage>
            <lpage>72</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">430974</pubid>
                  <pubid idtype="pmpid" link="fulltext">12529307</pubid>
                  <pubid idtype="doi">10.1101/gr.817703</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>Identification of conserved regulatory elements by comparative genome analysis</p>
            </title>
            <aug>
               <au>
                  <snm>Lenhard</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Sandelin</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Mendoza</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Engstrom</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Jareborg</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Wasserman</snm>
                  <fnm>WW</fnm>
               </au>
            </aug>
            <source>J Biol</source>
            <pubdate>2003</pubdate>
            <volume>2</volume>
            <fpage>13</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">193685</pubid>
                  <pubid idtype="pmpid" link="fulltext">12760745</pubid>
                  <pubid idtype="doi">10.1186/1475-4924-2-13</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>Genomic regulatory regions: insights from comparative sequence analysis</p>
            </title>
            <aug>
               <au>
                  <snm>Cooper</snm>
                  <fnm>GM</fnm>
               </au>
               <au>
                  <snm>Sidow</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Curr Opin Genet Dev</source>
            <pubdate>2003</pubdate>
            <volume>13</volume>
            <fpage>604</fpage>
            <lpage>610</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.gde.2003.10.001</pubid>
                  <pubid idtype="pmpid" link="fulltext">14638322</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>Enrichment of regulatory signals in conserved non-coding genomic sequence</p>
            </title>
            <aug>
               <au>
                  <snm>Levy</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Hannenhalli</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Workman</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2001</pubdate>
            <volume>17</volume>
            <fpage>871</fpage>
            <lpage>877</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/17.10.871</pubid>
                  <pubid idtype="pmpid" link="fulltext">11673231</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>The UCSC Genome Browser Database</p>
            </title>
            <aug>
               <au>
                  <snm>Karolchik</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Baertsch</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Diekhans</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Furey</snm>
                  <fnm>TS</fnm>
               </au>
               <au>
                  <snm>Hinrichs</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Lu</snm>
                  <fnm>YT</fnm>
               </au>
               <au>
                  <snm>Roskin</snm>
                  <fnm>KM</fnm>
               </au>
               <au>
                  <snm>Schwartz</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Sugnet</snm>
                  <fnm>CW</fnm>
               </au>
               <au>
                  <snm>Thomas</snm>
                  <fnm>DJ</fnm>
               </au>
               <au>
                  <snm>Weber</snm>
                  <fnm>RJ</fnm>
               </au>
               <au>
                  <snm>Haussler</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Kent</snm>
                  <fnm>WJ</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2003</pubdate>
            <volume>31</volume>
            <fpage>51</fpage>
            <lpage>54</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">165576</pubid>
                  <pubid idtype="pmpid" link="fulltext">12519945</pubid>
                  <pubid idtype="doi">10.1093/nar/gkg129</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>Human-mouse alignments with BLASTZ</p>
            </title>
            <aug>
               <au>
                  <snm>Schwartz</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kent</snm>
                  <fnm>WJ</fnm>
               </au>
               <au>
                  <snm>Smit</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Baertsch</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Hardison</snm>
                  <fnm>RC</fnm>
               </au>
               <au>
                  <snm>Haussler</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>W</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2003</pubdate>
            <volume>13</volume>
            <fpage>103</fpage>
            <lpage>107</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">430961</pubid>
                  <pubid idtype="pmpid" link="fulltext">12529312</pubid>
                  <pubid idtype="doi">10.1101/gr.809403</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B33">
            <title>
               <p>Initial sequencing and comparative analysis of the mouse genome</p>
            </title>
            <aug>
               <au>
                  <snm>Waterston</snm>
                  <fnm>RH</fnm>
               </au>
               <au>
                  <snm>Lindblad-Toh</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Birney</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Rogers</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Abril</snm>
                  <fnm>JF</fnm>
               </au>
               <au>
                  <snm>Agarwal</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Agarwala</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Ainscough</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Alexandersson</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>An</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Antonarakis</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Attwood</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Baertsch</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Bailey</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Barlow</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Beck</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Berry</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Birren</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Bloom</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Bork</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Botcherby</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Bray</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Brent</snm>
                  <fnm>MR</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>DG</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>SD</fnm>
               </au>
               <au>
                  <snm>Bult</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Burton</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Butler</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Campbell</snm>
                  <fnm>RD</fnm>
               </au>
               <au>
                  <snm>Carninci</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Cawley</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Chiaromonte</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Chinwalla</snm>
                  <fnm>AT</fnm>
               </au>
               <au>
                  <snm>Church</snm>
                  <fnm>DM</fnm>
               </au>
               <au>
                  <snm>Clamp</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Clee</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Collins</snm>
                  <fnm>FS</fnm>
               </au>
               <au>
                  <snm>Cook</snm>
                  <fnm>LL</fnm>
               </au>
               <au>
                  <snm>Copley</snm>
                  <fnm>RR</fnm>
               </au>
               <au>
                  <snm>Coulson</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Couronne</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Cuff</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Curwen</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Cutts</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Daly</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>David</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Davies</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Delehaunty</snm>
                  <fnm>KD</fnm>
               </au>
               <au>
                  <snm>Deri</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Dermitzakis</snm>
                  <fnm>ET</fnm>
               </au>
               <au>
                  <snm>Dewey</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Dickens</snm>
                  <fnm>NJ</fnm>
               </au>
               <au>
                  <snm>Diekhans</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Dodge</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Dubchak</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Dunn</snm>
                  <fnm>DM</fnm>
               </au>
               <au>
                  <snm>Eddy</snm>
                  <fnm>SR</fnm>
               </au>
               <au>
                  <snm>Elnitski</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Emes</snm>
                  <fnm>RD</fnm>
               </au>
               <au>
                  <snm>Eswara</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Eyras</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Felsenfeld</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Fewell</snm>
                  <fnm>GA</fnm>
               </au>
               <au>
                  <snm>Flicek</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Foley</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Frankel</snm>
                  <fnm>WN</fnm>
               </au>
               <au>
                  <snm>Fulton</snm>
                  <fnm>LA</fnm>
               </au>
               <au>
                  <snm>Fulton</snm>
                  <fnm>RS</fnm>
               </au>
               <au>
                  <snm>Furey</snm>
                  <fnm>TS</fnm>
               </au>
               <au>
                  <snm>Gage</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Gibbs</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Glusman</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Gnerre</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Goldman</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Goodstadt</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Grafham</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Graves</snm>
                  <fnm>TA</fnm>
               </au>
               <au>
                  <snm>Green</snm>
                  <fnm>ED</fnm>
               </au>
               <au>
                  <snm>Gregory</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Guigo</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Guyer</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Hardison</snm>
                  <fnm>RC</fnm>
               </au>
               <au>
                  <snm>Haussler</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Hayashizaki</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Hillier</snm>
                  <fnm>LW</fnm>
               </au>
               <au>
                  <snm>Hinrichs</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Hlavina</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Holzer</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Hsu</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Hua</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Hubbard</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Hunt</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Jackson</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Jaffe</snm>
                  <fnm>DB</fnm>
               </au>
               <au>
                  <snm>Johnson</snm>
                  <fnm>LS</fnm>
               </au>
               <au>
                  <snm>Jones</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Jones</snm>
                  <fnm>TA</fnm>
               </au>
               <au>
                  <snm>Joy</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Kamal</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Karlsson</snm>
                  <fnm>EK</fnm>
               </au>
               <au>
                  <snm>Karolchik</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Kasprzyk</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Kawai</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Keibler</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Kells</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Kent</snm>
                  <fnm>WJ</fnm>
               </au>
               <au>
                  <snm>Kirby</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Kolbe</snm>
                  <fnm>DL</fnm>
               </au>
               <au>
                  <snm>Korf</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Kucherlapati</snm>
                  <fnm>RS</fnm>
               </au>
               <au>
                  <snm>Kulbokas</snm>
                  <fnm>EJ</fnm>
               </au>
               <au>
                  <snm>Kulp</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Landers</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Leger</snm>
                  <fnm>JP</fnm>
               </au>
               <au>
                  <snm>Leonard</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Letunic</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Levine</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Lloyd</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Lucas</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Ma</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Maglott</snm>
                  <fnm>DR</fnm>
               </au>
               <au>
                  <snm>Mardis</snm>
                  <fnm>ER</fnm>
               </au>
               <au>
                  <snm>Matthews</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Mauceli</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Mayer</snm>
                  <fnm>JH</fnm>
               </au>
               <au>
                  <snm>McCarthy</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>McCombie</snm>
                  <fnm>WR</fnm>
               </au>
               <au>
                  <snm>McLaren</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>McLay</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>McPherson</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Meldrim</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Meredith</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Mesirov</snm>
                  <fnm>JP</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Miner</snm>
                  <fnm>TL</fnm>
               </au>
               <au>
                  <snm>Mongin</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Montgomery</snm>
                  <fnm>KT</fnm>
               </au>
               <au>
                  <snm>Morgan</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Mott</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Mullikin</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Muzny</snm>
                  <fnm>DM</fnm>
               </au>
               <au>
                  <snm>Nash</snm>
                  <fnm>WE</fnm>
               </au>
               <au>
                  <snm>Nelson</snm>
                  <fnm>JO</fnm>
               </au>
               <au>
                  <snm>Nhan</snm>
                  <fnm>MN</fnm>
               </au>
               <au>
                  <snm>Nicol</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Ning</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Nusbaum</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>O'Connor</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Okazaki</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Oliver</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Overton-Larty</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Pachter</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Parra</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Pepin</snm>
                  <fnm>KH</fnm>
               </au>
               <au>
                  <snm>Peterson</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Pevzner</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Plumb</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Pohl</snm>
                  <fnm>CS</fnm>
               </au>
               <au>
                  <snm>Poliakov</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Ponce</snm>
                  <fnm>TC</fnm>
               </au>
               <au>
                  <snm>Ponting</snm>
                  <fnm>CP</fnm>
               </au>
               <au>
                  <snm>Potter</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Quail</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Reymond</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Roe</snm>
                  <fnm>BA</fnm>
               </au>
               <au>
                  <snm>Roskin</snm>
                  <fnm>KM</fnm>
               </au>
               <au>
                  <snm>Rubin</snm>
                  <fnm>EM</fnm>
               </au>
               <au>
                  <snm>Rust</snm>
                  <fnm>AG</fnm>
               </au>
               <au>
                  <snm>Santos</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Sapojnikov</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Schultz</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Schultz</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Schwartz</snm>
                  <fnm>MS</fnm>
               </au>
               <au>
                  <snm>Schwartz</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Scott</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Seaman</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Searle</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Sharpe</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Sheridan</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Shownkeen</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Sims</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Singer</snm>
                  <fnm>JB</fnm>
               </au>
               <au>
                  <snm>Slater</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Smit</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>DR</fnm>
               </au>
               <au>
                  <snm>Spencer</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Stabenau</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Stange-Thomann</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Sugnet</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Suyama</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Tesler</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Thompson</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Torrents</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Trevaskis</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Tromp</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Ucla</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Ureta-Vidal</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Vinson</snm>
                  <fnm>JP</fnm>
               </au>
               <au>
                  <snm>Von Niederhausern</snm>
                  <fnm>AC</fnm>
               </au>
               <au>
                  <snm>Wade</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Wall</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Weber</snm>
                  <fnm>RJ</fnm>
               </au>
               <au>
                  <snm>Weiss</snm>
                  <fnm>RB</fnm>
               </au>
               <au>
                  <snm>Wendl</snm>
                  <fnm>MC</fnm>
               </au>
               <au>
                  <snm>West</snm>
                  <fnm>AP</fnm>
               </au>
               <au>
                  <snm>Wetterstrand</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Wheeler</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Whelan</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Wierzbowski</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Willey</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Williams</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Wilson</snm>
                  <fnm>RK</fnm>
               </au>
               <au>
                  <snm>Winter</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Worley</snm>
                  <fnm>KC</fnm>
               </au>
               <au>
                  <snm>Wyman</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Yang</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Yang</snm>
                  <fnm>SP</fnm>
               </au>
               <au>
                  <snm>Zdobnov</snm>
                  <fnm>EM</fnm>
               </au>
               <au>
                  <snm>Zody</snm>
                  <fnm>MC</fnm>
               </au>
               <au>
                  <snm>Lander</snm>
                  <fnm>ES</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2002</pubdate>
            <volume>420</volume>
            <fpage>520</fpage>
            <lpage>562</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature01262</pubid>
                  <pubid idtype="pmpid" link="fulltext">12466850</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B34">
            <title>
               <p>TRANSFAC: transcriptional regulation, from patterns to profiles</p>
            </title>
            <aug>
               <au>
                  <snm>Matys</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Fricke</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Geffers</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Gossling</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Haubrock</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Hehl</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Hornischer</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Karas</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Kel</snm>
                  <fnm>AE</fnm>
               </au>
               <au>
                  <snm>Kel-Margoulis</snm>
                  <fnm>OV</fnm>
               </au>
               <au>
                  <snm>Kloos</snm>
                  <fnm>DU</fnm>
               </au>
               <au>
                  <snm>Land</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Lewicki-Potapov</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Michael</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Munch</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Reuter</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Rotert</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Saxel</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Scheer</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Thiele</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Wingender</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2003</pubdate>
            <volume>31</volume>
            <fpage>374</fpage>
            <lpage>378</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">165555</pubid>
                  <pubid idtype="pmpid" link="fulltext">12520026</pubid>
                  <pubid idtype="doi">10.1093/nar/gkg108</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B35">
            <title>
               <p>Specificity, free energy and information content in protein-DNA interactions</p>
            </title>
            <aug>
               <au>
                  <snm>Stormo</snm>
                  <fnm>GD</fnm>
               </au>
               <au>
                  <snm>Fields</snm>
                  <fnm>DS</fnm>
               </au>
            </aug>
            <source>Trends Biochem Sci</source>
            <pubdate>1998</pubdate>
            <volume>23</volume>
            <fpage>109</fpage>
            <lpage>113</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0968-0004(98)01187-6</pubid>
                  <pubid idtype="pmpid" link="fulltext">9581503</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B36">
            <title>
               <p>DNA binding sites: representation and discovery</p>
            </title>
            <aug>
               <au>
                  <snm>Stormo</snm>
                  <fnm>GD</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2000</pubdate>
            <volume>16</volume>
            <fpage>16</fpage>
            <lpage>23</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/16.1.16</pubid>
                  <pubid idtype="pmpid" link="fulltext">10812473</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B37">
            <title>
               <p>Additivity in protein-DNA interactions: how good an approximation is it?</p>
            </title>
            <aug>
               <au>
                  <snm>Benos</snm>
                  <fnm>PV</fnm>
               </au>
               <au>
                  <snm>Bulyk</snm>
                  <fnm>ML</fnm>
               </au>
               <au>
                  <snm>Stormo</snm>
                  <fnm>GD</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2002</pubdate>
            <volume>30</volume>
            <fpage>4442</fpage>
            <lpage>4451</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">137142</pubid>
                  <pubid idtype="pmpid" link="fulltext">12384591</pubid>
                  <pubid idtype="doi">10.1093/nar/gkf578</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B38">
            <title>
               <p>Computational identification of transcriptional regulatory elements in DNA sequence</p>
            </title>
            <aug>
               <au>
                  <snm>GuhaThakurta</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2006</pubdate>
            <volume>34</volume>
            <fpage>3585</fpage>
            <lpage>3598</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1524905</pubid>
                  <pubid idtype="pmpid" link="fulltext">16855295</pubid>
                  <pubid idtype="doi">10.1093/nar/gkl372</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B39">
            <title>
               <p>PupaSNP Finder: a web tool for finding SNPs with putative effect at transcriptional level</p>
            </title>
            <aug>
               <au>
                  <snm>Conde</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Vaquerizas</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Santoyo</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Al-Shahrour</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Ruiz-Llorente</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Robledo</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Dopazo</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2004</pubdate>
            <volume>32</volume>
            <fpage>W242</fpage>
            <lpage>W248</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">441576</pubid>
                  <pubid idtype="pmpid" link="fulltext">15215388</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B40">
            <title>
               <p>Identification of functional SNPs in the 5-prime flanking sequences of human genes</p>
            </title>
            <aug>
               <au>
                  <snm>Mottagui-Tabar</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Faghihi</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Mizuno</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Engstrom</snm>
                  <fnm>PG</fnm>
               </au>
               <au>
                  <snm>Lenhard</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Wasserman</snm>
                  <fnm>WW</fnm>
               </au>
               <au>
                  <snm>Wahlestedt</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>BMC Genomics</source>
            <pubdate>2005</pubdate>
            <volume>6</volume>
            <fpage>18</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">550646</pubid>
                  <pubid idtype="pmpid" link="fulltext">15717931</pubid>
                  <pubid idtype="doi">10.1186/1471-2164-6-18</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B41">
            <title>
               <p>PromoLign: a database for upstream region analysis and SNPs</p>
            </title>
            <aug>
               <au>
                  <snm>Zhao</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Chang</snm>
                  <fnm>LW</fnm>
               </au>
               <au>
                  <snm>McLeod</snm>
                  <fnm>HL</fnm>
               </au>
               <au>
                  <snm>Stormo</snm>
                  <fnm>GD</fnm>
               </au>
            </aug>
            <source>Hum Mutat</source>
            <pubdate>2004</pubdate>
            <volume>23</volume>
            <fpage>534</fpage>
            <lpage>539</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/humu.20049</pubid>
                  <pubid idtype="pmpid" link="fulltext">15146456</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B42">
            <title>
               <p>rSNP_Guide: an integrated database-tools system for studying SNPs and site-directed mutations in transcription factor binding sites</p>
            </title>
            <aug>
               <au>
                  <snm>Ponomarenko</snm>
                  <fnm>JV</fnm>
               </au>
               <au>
                  <snm>Orlova</snm>
                  <fnm>GV</fnm>
               </au>
               <au>
                  <snm>Merkulova</snm>
                  <fnm>TI</fnm>
               </au>
               <au>
                  <snm>Gorshkova</snm>
                  <fnm>EV</fnm>
               </au>
               <au>
                  <snm>Fokin</snm>
                  <fnm>ON</fnm>
               </au>
               <au>
                  <snm>Vasiliev</snm>
                  <fnm>GV</fnm>
               </au>
               <au>
                  <snm>Frolov</snm>
                  <fnm>AS</fnm>
               </au>
               <au>
                  <snm>Ponomarenko</snm>
                  <fnm>MP</fnm>
               </au>
            </aug>
            <source>Hum Mutat</source>
            <pubdate>2002</pubdate>
            <volume>20</volume>
            <fpage>239</fpage>
            <lpage>248</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/humu.10116</pubid>
                  <pubid idtype="pmpid" link="fulltext">12325018</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B43">
            <title>
               <p>MATCH: A tool for searching transcription factor binding sites in DNA sequences</p>
            </title>
            <aug>
               <au>
                  <snm>Kel</snm>
                  <fnm>AE</fnm>
               </au>
               <au>
                  <snm>Gossling</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Reuter</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Cheremushkin</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Kel-Margoulis</snm>
                  <fnm>OV</fnm>
               </au>
               <au>
                  <snm>Wingender</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2003</pubdate>
            <volume>31</volume>
            <fpage>3576</fpage>
            <lpage>3579</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">169193</pubid>
                  <pubid idtype="pmpid" link="fulltext">12824369</pubid>
                  <pubid idtype="doi">10.1093/nar/gkg585</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B44">
            <title>
               <p>Applied bioinformatics for the identification of regulatory elements</p>
            </title>
            <aug>
               <au>
                  <snm>Wasserman</snm>
                  <fnm>WW</fnm>
               </au>
               <au>
                  <snm>Sandelin</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Nat Rev Genet</source>
            <pubdate>2004</pubdate>
            <volume>5</volume>
            <fpage>276</fpage>
            <lpage>287</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nrg1315</pubid>
                  <pubid idtype="pmpid" link="fulltext">15131651</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B45">
            <title>
               <p>Computational prediction of transcription-factor binding site locations</p>
            </title>
            <aug>
               <au>
                  <snm>Bulyk</snm>
                  <fnm>ML</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2003</pubdate>
            <volume>5</volume>
            <fpage>201</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">395725</pubid>
                  <pubid idtype="pmpid" link="fulltext">14709165</pubid>
                  <pubid idtype="doi">10.1186/gb-2003-5-1-201</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B46">
            <title>
               <p>A gene expression map for <it>Caenorhabditis elegans</it></p>
            </title>
            <aug>
               <au>
                  <snm>Kim</snm>
                  <fnm>SK</fnm>
               </au>
               <au>
                  <snm>Lund</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Kiraly</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Duke</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Jiang</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Stuart</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Eizinger</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Wylie</snm>
                  <fnm>BN</fnm>
               </au>
               <au>
                  <snm>Davidson</snm>
                  <fnm>GS</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2001</pubdate>
            <volume>293</volume>
            <fpage>2087</fpage>
            <lpage>2092</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1061603</pubid>
                  <pubid idtype="pmpid" link="fulltext">11557892</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B47">
            <title>
               <p>Computational identification of transcription factor binding sites via a transcription-factor-centric clustering (TFCC) algorithm</p>
            </title>
            <aug>
               <au>
                  <snm>Zhu</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Pilpel</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Church</snm>
                  <fnm>GM</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>2002</pubdate>
            <volume>318</volume>
            <fpage>71</fpage>
            <lpage>81</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0022-2836(02)00026-8</pubid>
                  <pubid idtype="pmpid" link="fulltext">12054769</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B48">
            <title>
               <p>A gene-coexpression network for global discovery of conserved genetic modules</p>
            </title>
            <aug>
               <au>
                  <snm>Stuart</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Segal</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Koller</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Kim</snm>
                  <fnm>SK</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2003</pubdate>
            <volume>302</volume>
            <fpage>249</fpage>
            <lpage>255</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1087447</pubid>
                  <pubid idtype="pmpid" link="fulltext">12934013</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B49">
            <title>
               <p>Experimental annotation of the human genome using microarray technology</p>
            </title>
            <aug>
               <au>
                  <snm>Shoemaker</snm>
                  <fnm>DD</fnm>
               </au>
               <au>
                  <snm>Schadt</snm>
                  <fnm>EE</fnm>
               </au>
               <au>
                  <snm>Armour</snm>
                  <fnm>CD</fnm>
               </au>
               <au>
                  <snm>He</snm>
                  <fnm>YD</fnm>
               </au>
               <au>
                  <snm>Garrett-Engele</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>McDonagh</snm>
                  <fnm>PD</fnm>
               </au>
               <au>
                  <snm>Loerch</snm>
                  <fnm>PM</fnm>
               </au>
               <au>
                  <snm>Leonardson</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Lum</snm>
                  <fnm>PY</fnm>
               </au>
               <au>
                  <snm>Cavet</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Wu</snm>
                  <fnm>LF</fnm>
               </au>
               <au>
                  <snm>Altschuler</snm>
                  <fnm>SJ</fnm>
               </au>
               <au>
                  <snm>Edwards</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>King</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Tsang</snm>
                  <fnm>JS</fnm>
               </au>
               <au>
                  <snm>Schimmack</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Schelter</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Koch</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Ziman</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Marton</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Cundiff</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Ward</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Castle</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Krolewski</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Meyer</snm>
                  <fnm>MR</fnm>
               </au>
               <au>
                  <snm>Mao</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Burchard</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Kidd</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Dai</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Phillips</snm>
                  <fnm>JW</fnm>
               </au>
               <au>
                  <snm>Linsley</snm>
                  <fnm>PS</fnm>
               </au>
               <au>
                  <snm>Stoughton</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Scherer</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Boguski</snm>
                  <fnm>MS</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2001</pubdate>
            <volume>409</volume>
            <fpage>922</fpage>
            <lpage>927</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/35057141</pubid>
                  <pubid idtype="pmpid" link="fulltext">11237012</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B50">
            <title>
               <p>Optimization of oligonucleotide arrays and RNA amplification protocols for analysis of transcript structure and alternative splicing</p>
            </title>
            <aug>
               <au>
                  <snm>Castle</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Garrett-Engele</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Armour</snm>
                  <fnm>CD</fnm>
               </au>
               <au>
                  <snm>Duenwald</snm>
                  <fnm>SJ</fnm>
               </au>
               <au>
                  <snm>Loerch</snm>
                  <fnm>PM</fnm>
               </au>
               <au>
                  <snm>Meyer</snm>
                  <fnm>MR</fnm>
               </au>
               <au>
                  <snm>Schadt</snm>
                  <fnm>EE</fnm>
               </au>
               <au>
                  <snm>Stoughton</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Parrish</snm>
                  <fnm>ML</fnm>
               </au>
               <au>
                  <snm>Shoemaker</snm>
                  <fnm>DD</fnm>
               </au>
               <au>
                  <snm>Johnson</snm>
                  <fnm>JM</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2003</pubdate>
            <volume>4</volume>
            <fpage>R66</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">328455</pubid>
                  <pubid idtype="pmpid" link="fulltext">14519201</pubid>
                  <pubid idtype="doi">10.1186/gb-2003-4-10-r66</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B51">
            <title>
               <p>Functional analysis of human promoter polymorphisms</p>
            </title>
            <aug>
               <au>
                  <snm>Hoogendoorn</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Coleman</snm>
                  <fnm>SL</fnm>
               </au>
               <au>
                  <snm>Guy</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Bowen</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Buckland</snm>
                  <fnm>PR</fnm>
               </au>
               <au>
                  <snm>O'Donovan</snm>
                  <fnm>MC</fnm>
               </au>
            </aug>
            <source>Hum Mol Genet</source>
            <pubdate>2003</pubdate>
            <volume>12</volume>
            <fpage>2249</fpage>
            <lpage>2254</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/hmg/ddg246</pubid>
                  <pubid idtype="pmpid" link="fulltext">12915441</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B52">
            <title>
               <p>A HANDful of questions: the molecular biology of the heart and neural crest derivatives (HAND)-subclass of basic helix-loop-helix transcription factors</p>
            </title>
            <aug>
               <au>
                  <snm>Firulli</snm>
                  <fnm>AB</fnm>
               </au>
            </aug>
            <source>Gene</source>
            <pubdate>2003</pubdate>
            <volume>312</volume>
            <fpage>27</fpage>
            <lpage>40</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0378-1119(03)00669-3</pubid>
                  <pubid idtype="pmpid" link="fulltext">12909338</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B53">
            <title>
               <p>A survey of genetic and epigenetic variation affecting human gene expression</p>
            </title>
            <aug>
               <au>
                  <snm>Pastinen</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Sladek</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Gurd</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Sammak</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Ge</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Lepage</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Lavergne</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Villeneuve</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Gaudin</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Brandstrom</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Beck</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Verner</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Kingsley</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Harmsen</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Labuda</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Morgan</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Vohl</snm>
                  <fnm>MC</fnm>
               </au>
               <au>
                  <snm>Naumova</snm>
                  <fnm>AK</fnm>
               </au>
               <au>
                  <snm>Sinnett</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Hudson</snm>
                  <fnm>TJ</fnm>
               </au>
            </aug>
            <source>Physiol Genomics</source>
            <pubdate>2004</pubdate>
            <volume>16</volume>
            <fpage>184</fpage>
            <lpage>193</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">14583597</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B54">
            <title>
               <p>In silico genetics: identification of a functional element regulating H2-Ealpha gene expression</p>
            </title>
            <aug>
               <au>
                  <snm>Liao</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Guo</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Allard</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Cheng</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Ng</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Shafer</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Puech</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>McPherson</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Foernzler</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Peltz</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Usuka</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2004</pubdate>
            <volume>306</volume>
            <fpage>690</fpage>
            <lpage>695</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1100636</pubid>
                  <pubid idtype="pmpid" link="fulltext">15499019</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B55">
            <title>
               <p>Bioinformatics approaches and resources for single nucleotide polymorphism functional analysis</p>
            </title>
            <aug>
               <au>
                  <snm>Mooney</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Brief Bioinform</source>
            <pubdate>2005</pubdate>
            <volume>6</volume>
            <fpage>44</fpage>
            <lpage>56</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bib/6.1.44</pubid>
                  <pubid idtype="pmpid" link="fulltext">15826356</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B56">
            <title>
               <p>Abundant raw material for <it>cis</it>-regulatory evolution in humans</p>
            </title>
            <aug>
               <au>
                  <snm>Rockman</snm>
                  <fnm>MV</fnm>
               </au>
               <au>
                  <snm>Wray</snm>
                  <fnm>GA</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>2002</pubdate>
            <volume>19</volume>
            <fpage>1991</fpage>
            <lpage>2004</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12411608</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B57">
            <title>
               <p>A high proportion of polymorphisms in the promoters of brain expressed genes influences transcriptional activity</p>
            </title>
            <aug>
               <au>
                  <snm>Buckland</snm>
                  <fnm>PR</fnm>
               </au>
               <au>
                  <snm>Hoogendoorn</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Guy</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Coleman</snm>
                  <fnm>SL</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>SK</fnm>
               </au>
               <au>
                  <snm>Buxbaum</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Haroutunian</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>O'Donovan</snm>
                  <fnm>MC</fnm>
               </au>
            </aug>
            <source>Biochim Biophys Acta</source>
            <pubdate>2004</pubdate>
            <volume>1690</volume>
            <fpage>238</fpage>
            <lpage>249</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">15511631</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B58">
            <title>
               <p>Strong bias in the location of functional promoter polymorphisms</p>
            </title>
            <aug>
               <au>
                  <snm>Buckland</snm>
                  <fnm>PR</fnm>
               </au>
               <au>
                  <snm>Hoogendoorn</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Coleman</snm>
                  <fnm>SL</fnm>
               </au>
               <au>
                  <snm>Guy</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>SK</fnm>
               </au>
               <au>
                  <snm>O'Donovan</snm>
                  <fnm>MC</fnm>
               </au>
            </aug>
            <source>Hum Mutat</source>
            <pubdate>2005</pubdate>
            <volume>26</volume>
            <fpage>214</fpage>
            <lpage>223</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/humu.20207</pubid>
                  <pubid idtype="pmpid" link="fulltext">16086313</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B59">
            <title>
               <p>A high proportion of chromosome 21 promoter polymorphisms influence transcriptional activity</p>
            </title>
            <aug>
               <au>
                  <snm>Buckland</snm>
                  <fnm>PR</fnm>
               </au>
               <au>
                  <snm>Coleman</snm>
                  <fnm>SL</fnm>
               </au>
               <au>
                  <snm>Hoogendoorn</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Guy</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>SK</fnm>
               </au>
               <au>
                  <snm>O'Donovan</snm>
                  <fnm>MC</fnm>
               </au>
            </aug>
            <source>Gene Expr</source>
            <pubdate>2004</pubdate>
            <volume>11</volume>
            <fpage>233</fpage>
            <lpage>239</lpage>
            <xrefbib>
               <pubid idtype="pmpid">15200235</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B60">
            <title>
               <p>Mapping common regulatory variants to human haplotypes</p>
            </title>
            <aug>
               <au>
                  <snm>Pastinen</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Ge</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Gurd</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Gaudin</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Dore</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Lemire</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Lepage</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Harmsen</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Hudson</snm>
                  <fnm>TJ</fnm>
               </au>
            </aug>
            <source>Hum Mol Genet</source>
            <pubdate>2005</pubdate>
            <volume>14</volume>
            <fpage>3963</fpage>
            <lpage>3971</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/hmg/ddi420</pubid>
                  <pubid idtype="pmpid" link="fulltext">16301213</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B61">
            <title>
               <p>Mapping <it>cis</it>-acting regulatory variation in recombinant congenic strains</p>
            </title>
            <aug>
               <au>
                  <snm>Lee</snm>
                  <fnm>PD</fnm>
               </au>
               <au>
                  <snm>Ge</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Greenwood</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Sinnett</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Fortin</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Brunet</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Fortin</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Takane</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Skamene</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Pastinen</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Hallett</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Hudson</snm>
                  <fnm>TJ</fnm>
               </au>
               <au>
                  <snm>Sladek</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Physiol Genomics</source>
            <pubdate>2006</pubdate>
            <volume>25</volume>
            <fpage>294</fpage>
            <lpage>302</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1152/physiolgenomics.00168.2005</pubid>
                  <pubid idtype="pmpid" link="fulltext">16449383</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B62">
            <title>
               <p>Genomic <it>cis</it>-regulatory logic: experimental and computational analysis of a sea urchin gene</p>
            </title>
            <aug>
               <au>
                  <snm>Yuh</snm>
                  <fnm>CH</fnm>
               </au>
               <au>
                  <snm>Bolouri</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Davidson</snm>
                  <fnm>EH</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1998</pubdate>
            <volume>279</volume>
            <fpage>1896</fpage>
            <lpage>1902</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.279.5358.1896</pubid>
                  <pubid idtype="pmpid" link="fulltext">9506933</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B63">
            <title>
               <p>r_SNP guide examples</p>
            </title>
            <url>http://wwwmgs.bionet.nsc.ru/mgs/programs/rsnp/images/</url>
         </bibl>
         <bibl id="B64">
            <title>
               <p>Human Gene Mutation Database (HGMD): 2003 update</p>
            </title>
            <aug>
               <au>
                  <snm>Stenson</snm>
                  <fnm>PD</fnm>
               </au>
               <au>
                  <snm>Ball</snm>
                  <fnm>EV</fnm>
               </au>
               <au>
                  <snm>Mort</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Phillips</snm>
                  <fnm>AD</fnm>
               </au>
               <au>
                  <snm>Shiel</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Thomas</snm>
                  <fnm>NS</fnm>
               </au>
               <au>
                  <snm>Abeysinghe</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Krawczak</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Cooper</snm>
                  <fnm>DN</fnm>
               </au>
            </aug>
            <source>Hum Mutat</source>
            <pubdate>2003</pubdate>
            <volume>21</volume>
            <fpage>577</fpage>
            <lpage>581</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/humu.10212</pubid>
                  <pubid idtype="pmpid" link="fulltext">12754702</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B65">
            <title>
               <p>A common genetic variant is associated with adult and childhood obesity</p>
            </title>
            <aug>
               <au>
                  <snm>Herbert</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Gerry</snm>
                  <fnm>NP</fnm>
               </au>
               <au>
                  <snm>McQueen</snm>
                  <fnm>MB</fnm>
               </au>
               <au>
                  <snm>Heid</snm>
                  <fnm>IM</fnm>
               </au>
               <au>
                  <snm>Pfeufer</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Illig</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Wichmann</snm>
                  <fnm>HE</fnm>
               </au>
               <au>
                  <snm>Meitinger</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Hunter</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Hu</snm>
                  <fnm>FB</fnm>
               </au>
               <au>
                  <snm>Colditz</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Hinney</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Hebebrand</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Koberwitz</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Zhu</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Cooper</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Ardlie</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Lyon</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Hirschhorn</snm>
                  <fnm>JN</fnm>
               </au>
               <au>
                  <snm>Laird</snm>
                  <fnm>NM</fnm>
               </au>
               <au>
                  <snm>Lenburg</snm>
                  <fnm>ME</fnm>
               </au>
               <au>
                  <snm>Lange</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Christman</snm>
                  <fnm>MF</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2006</pubdate>
            <volume>312</volume>
            <fpage>279</fpage>
            <lpage>283</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1124779</pubid>
                  <pubid idtype="pmpid" link="fulltext">16614226</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B66">
            <title>
               <p>TCF7L2 polymorphisms and progression to diabetes in the Diabetes Prevention Program</p>
            </title>
            <aug>
               <au>
                  <snm>Florez</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Jablonski</snm>
                  <fnm>KA</fnm>
               </au>
               <au>
                  <snm>Bayley</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Pollin</snm>
                  <fnm>TI</fnm>
               </au>
               <au>
                  <snm>de Bakker</snm>
                  <fnm>PI</fnm>
               </au>
               <au>
                  <snm>Shuldiner</snm>
                  <fnm>AR</fnm>
               </au>
               <au>
                  <snm>Knowler</snm>
                  <fnm>WC</fnm>
               </au>
               <au>
                  <snm>Nathan</snm>
                  <fnm>DM</fnm>
               </au>
               <au>
                  <snm>Altshuler</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>N Engl J Med</source>
            <pubdate>2006</pubdate>
            <volume>355</volume>
            <fpage>241</fpage>
            <lpage>250</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1056/NEJMoa062418</pubid>
                  <pubid idtype="pmpid" link="fulltext">16855264</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B67">
            <title>
               <p>Transcriptional regulatory code of a eukaryotic genome</p>
            </title>
            <aug>
               <au>
                  <snm>Harbison</snm>
                  <fnm>CT</fnm>
               </au>
               <au>
                  <snm>Gordon</snm>
                  <fnm>DB</fnm>
               </au>
               <au>
                  <snm>Lee</snm>
                  <fnm>TI</fnm>
               </au>
               <au>
                  <snm>Rinaldi</snm>
                  <fnm>NJ</fnm>
               </au>
               <au>
                  <snm>Macisaac</snm>
                  <fnm>KD</fnm>
               </au>
               <au>
                  <snm>Danford</snm>
                  <fnm>TW</fnm>
               </au>
               <au>
                  <snm>Hannett</snm>
                  <fnm>NM</fnm>
               </au>
               <au>
                  <snm>Tagne</snm>
                  <fnm>JB</fnm>
               </au>
               <au>
                  <snm>Reynolds</snm>
                  <fnm>DB</fnm>
               </au>
               <au>
                  <snm>Yoo</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Jennings</snm>
                  <fnm>EG</fnm>
               </au>
               <au>
                  <snm>Zeitlinger</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Pokholok</snm>
                  <fnm>DK</fnm>
               </au>
               <au>
                  <snm>Kellis</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Rolfe</snm>
                  <fnm>PA</fnm>
               </au>
               <au>
                  <snm>Takusagawa</snm>
                  <fnm>KT</fnm>
               </au>
               <au>
                  <snm>Lander</snm>
                  <fnm>ES</fnm>
               </au>
               <au>
                  <snm>Gifford</snm>
                  <fnm>DK</fnm>
               </au>
               <au>
                  <snm>Fraenkel</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Young</snm>
                  <fnm>RA</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2004</pubdate>
            <volume>431</volume>
            <fpage>99</fpage>
            <lpage>104</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature02800</pubid>
                  <pubid idtype="pmpid" link="fulltext">15343339</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B68">
            <title>
               <p>Integrating genotypic and expression data in a segregating mouse population to identify 5-lipoxygenase as a susceptibility gene for obesity and bone traits</p>
            </title>
            <aug>
               <au>
                  <snm>Mehrabian</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Allayee</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Stockton</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Lum</snm>
                  <fnm>PY</fnm>
               </au>
               <au>
                  <snm>Drake</snm>
                  <fnm>TA</fnm>
               </au>
               <au>
                  <snm>Castellani</snm>
                  <fnm>LW</fnm>
               </au>
               <au>
                  <snm>Suh</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Armour</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Edwards</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Lamb</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Lusis</snm>
                  <fnm>AJ</fnm>
               </au>
               <au>
                  <snm>Schadt</snm>
                  <fnm>EE</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2005</pubdate>
            <volume>37</volume>
            <fpage>1224</fpage>
            <lpage>1233</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng1619</pubid>
                  <pubid idtype="pmpid" link="fulltext">16200066</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B69">
            <title>
               <p>UCSC mouse-human alignments</p>
            </title>
            <url>http://genome-archive.cse.ucsc.edu/goldenPath/mm4/vsHg16/</url>
         </bibl>
         <bibl id="B70">
            <title>
               <p>Highly multiplexed molecular inversion probe genotyping: over 10,000 targeted SNPs genotyped in a single tube assay</p>
            </title>
            <aug>
               <au>
                  <snm>Hardenbol</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Yu</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Belmont</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Mackenzie</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Bruckner</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Brundage</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Boudreau</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Chow</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Eberle</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Erbilgin</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Falkowski</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Fitzgerald</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Ghose</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Iartchouk</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Jain</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Karlin-Neumann</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Lu</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Miao</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Moore</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Moorhead</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Namsaraev</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Pasternak</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Prakash</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Tran</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Jones</snm>
                  <fnm>HB</fnm>
               </au>
               <au>
                  <snm>Davis</snm>
                  <fnm>RW</fnm>
               </au>
               <au>
                  <snm>Willis</snm>
                  <fnm>TD</fnm>
               </au>
               <au>
                  <snm>Gibbs</snm>
                  <fnm>RA</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2005</pubdate>
            <volume>15</volume>
            <fpage>269</fpage>
            <lpage>275</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">546528</pubid>
                  <pubid idtype="pmpid" link="fulltext">15687290</pubid>
                  <pubid idtype="doi">10.1101/gr.3185605</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B71">
            <title>
               <p>Microarray standard data set and figures of merit for comparing data processing methods and experiment designs</p>
            </title>
            <aug>
               <au>
                  <snm>He</snm>
                  <fnm>YD</fnm>
               </au>
               <au>
                  <snm>Dai</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Schadt</snm>
                  <fnm>EE</fnm>
               </au>
               <au>
                  <snm>Cavet</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Edwards</snm>
                  <fnm>SW</fnm>
               </au>
               <au>
                  <snm>Stepaniants</snm>
                  <fnm>SB</fnm>
               </au>
               <au>
                  <snm>Duenwald</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kleinhanz</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Jones</snm>
                  <fnm>AR</fnm>
               </au>
               <au>
                  <snm>Shoemaker</snm>
                  <fnm>DD</fnm>
               </au>
               <au>
                  <snm>Stoughton</snm>
                  <fnm>RB</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <fpage>956</fpage>
            <lpage>965</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btg126</pubid>
                  <pubid idtype="pmpid" link="fulltext">12761058</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B72">
            <title>
               <p>QTL Cartographer</p>
            </title>
            <url>http://statgen.ncsu.edu/qtlcart/</url>
         </bibl>
         <bibl id="B73">
            <title>
               <p>Nonsense-mediated mRNA decay: molecular insights and mechanistic variations across species</p>
            </title>
            <aug>
               <au>
                  <snm>Conti</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Izaurralde</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>Curr Opin Cell Biol</source>
            <pubdate>2005</pubdate>
            <volume>17</volume>
            <fpage>316</fpage>
            <lpage>325</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.ceb.2005.04.005</pubid>
                  <pubid idtype="pmpid" link="fulltext">15901503</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B74">
            <title>
               <p>Analysis of segmental duplications and genome assembly in the mouse</p>
            </title>
            <aug>
               <au>
                  <snm>Bailey</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Church</snm>
                  <fnm>DM</fnm>
               </au>
               <au>
                  <snm>Ventura</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Rocchi</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Eichler</snm>
                  <fnm>EE</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2004</pubdate>
            <volume>14</volume>
            <fpage>789</fpage>
            <lpage>801</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">479105</pubid>
                  <pubid idtype="pmpid" link="fulltext">15123579</pubid>
                  <pubid idtype="doi">10.1101/gr.2238404</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B75">
            <title>
               <p>Map of genomic duplications in mouse</p>
            </title>
            <url>http://mouseparalogy.gs.washington.edu/</url>
         </bibl>
         <bibl id="B76">
            <title>
               <p>Use of a dense single nucleotide polymorphism map for in silico mapping in the mouse</p>
            </title>
            <aug>
               <au>
                  <snm>Pletcher</snm>
                  <fnm>MT</fnm>
               </au>
               <au>
                  <snm>McClurg</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Batalov</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Su</snm>
                  <fnm>AI</fnm>
               </au>
               <au>
                  <snm>Barnes</snm>
                  <fnm>SW</fnm>
               </au>
               <au>
                  <snm>Lagler</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Korstanje</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Nusskern</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Bogue</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Mural</snm>
                  <fnm>RJ</fnm>
               </au>
               <au>
                  <snm>Paigen</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Wiltshire</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>PLoS Biol</source>
            <pubdate>2004</pubdate>
            <volume>2</volume>
            <fpage>e393</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">526179</pubid>
                  <pubid idtype="pmpid" link="fulltext">15534693</pubid>
                  <pubid idtype="doi">10.1371/journal.pbio.0020393</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>
