<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>gb-2003-4-12-122</ui>
   <ji>GBJ</ji>
   <fm>
      <dochead>Opinion</dochead>
      <bibl>
         <title>
            <p>Multi-species sequence comparison: the next frontier in genome annotation</p>
         </title>
         <aug>
            <au id="A1" ca="yes">
               <snm>Dubchak</snm>
               <fnm>Inna</fnm>
               <insr iid="I1"/>
               <email>ildubchak@lbl.gov</email>
            </au>
            <au id="A2">
               <snm>Frazer</snm>
               <fnm>Kelly</fnm>
               <insr iid="I2"/>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA</p>
            </ins>
            <ins id="I2">
               <p>Perlegen Sciences, 2021 Stierlin Ct., Mountain View, CA 94043, USA</p>
            </ins>
         </insg>
         <source>Genome Biology</source>
         <issn>1465-6906</issn>
         <pubdate>2003</pubdate>
         <volume>4</volume>
         <issue>12</issue>
         <fpage>122</fpage>
         <url>http://genomebiology.com/2003/4/12/122</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="doi">10.1186/gb-2003-4-12-122</pubid>
               <pubid idtype="pmpid">14659006</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <pub>
            <date>
               <day>27</day>
               <month>11</month>
               <year>2003</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2003</year>
         <collab>BioMed Central Ltd</collab>
      </cpyrt>
      <shorttitle>
         <p>Multi-species sequence comparison: the next frontier in genome annotation</p>
      </shorttitle>
      <shortabs>
         <p>Most current computational tools have been designed for pairwise comparisons of DNA sequences, and efficient extension of these tools to multiple species will require knowledge of the ideal evolutionary distance to choose and the development of new algorithms for alignment, analysis of conservation, and visualization of results.</p>
      </shortabs>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <p>Multi-species comparisons of DNA sequences are more powerful for discovering functional sequences than pairwise DNA sequence comparisons. Most current computational tools have been designed for pairwise comparisons, and efficient extension of these tools to multiple species will require knowledge of the ideal evolutionary distance to choose and the development of new algorithms for alignment, analysis of conservation, and visualization of results.</p>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="BMC" subtype="man_spc_id" id="30010008">Evolution</classification>
         <classification type="BMC" subtype="man_spc_id" id="30010002">Bioinformatics</classification>
         <classification type="BMC" subtype="man_spc_id" id="30010010">Genome studies</classification>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p/>
         </st>
         <p>Comparison of DNA sequences from different species is an extremely efficient way to identify functional DNA elements - both coding regions and transcriptional control regions that lie beyond the coding sequences of genes. Several recent reviews of comparative sequence analysis <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr><abbr bid="B5">5</abbr></abbrgrp> describe this fast-growing field and the computational resources that are currently available for a wide range of biological investigations. Most of the large-scale comparative studies completed to date have been based on pairwise comparison of sequences; such studies have resulted in the identification of new genes <abbrgrp><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr></abbrgrp>, and have proved efficient at discovering functional elements in non-coding genomic intervals <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr><abbr bid="B12">12</abbr></abbrgrp>. Several groups have aligned the entire human and mouse genomes <abbrgrp><abbr bid="B13">13</abbr><abbr bid="B14">14</abbr><abbr bid="B15">15</abbr></abbrgrp> and have presented comprehensive statistical data on the patterns of DNA conservation between the two species.</p>
         <p>Recent comparative studies demonstrate that adding further species to the analysis provides an even more powerful approach for detecting functionally important elements, because characteristic signatures - such as open reading frames and splice-site consensus sequences within genes, and motifs within regulatory elements - are easier to detect when they are conserved in multiple species <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. For example, a recent large-scale study of over 12 megabases (Mb) of sequences from 12 species, derived from genomic regions orthologous to a 1.8 Mb region on human chromosome 7 that contains ten genes <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>, demonstrated that some highly conserved elements revealed by multiple sequence alignments could not be reliably identified with any set of parameters in a pairwise human-mouse alignment.</p>
         <p>As the number of available complete genome sequences increases, there is a clear need to understand what we can learn from multiple-species sequence comparisons. Studies of this type will require the development of new comparative algorithms and computational tools, such as multi-genome alignment techniques, analysis of conservation and visualization of comparative results. Developing easy-to-use and efficient techniques is not trivial, however, given that the algorithms should be capable of handling a whole range of evolutionary distances between multiple species and of providing new insights into biology.</p>
      </sec>
      <sec>
         <st>
            <p>Selecting multiple species for comparative analysis</p>
         </st>
         <p>The comparison of DNA sequences between evolutionarily distantly related species, such as humans and pufferfish, which diverged approximately 450 million years ago, primarily identifies the coding sequences as conserved <abbrgrp><abbr bid="B18">18</abbr></abbrgrp> - because transcribed protein coding sequences are highly functionally constrained, and thus change very slowly during evolution. The comparison of DNA sequences between species that diverged from a common ancestor around 40-80 million years ago - such as humans and mice, or two species of fruitflies (<it>Drosophila melanogaster </it>and <it>Drosophila pseudoobscura</it>), or two species of nematodes (<it>Caenorhabditis elegans </it>and <it>Caenorhabditis briggsae</it>) <abbrgrp><abbr bid="B19">19</abbr></abbrgrp> - identifies both coding sequences and a significant number of noncoding sequences as evolutionarily conserved. Only a limited number of conserved noncoding sequences that have been identified through sequence comparisons between species at this evolutionary distance have been characterized functionally, however. Among those that have had their functions assigned are transcriptional regulatory elements of genes in close proximity <abbrgrp><abbr bid="B11">11</abbr></abbrgrp> or genes as far away from each other as 200 kb <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>. Comparative analysis of genomic DNA from closely related species, such as humans and chimpanzees, on the other hand, identifies those sequences that have changed in recent evolutionary history <abbrgrp><abbr bid="B20">20</abbr><abbr bid="B21">21</abbr></abbrgrp>. Some of these sequence changes may have been partly responsible for the speciation of ancestral primates.
Thus, comparison of a segment of DNA with the sequences of multiple species at different evolutionary distances allows one to identify coding sequences, conserved noncoding elements with regulatory functions, and those sequences that are unique to a given species. A recent report by Cooper <it>et al</it>. <abbrgrp><abbr bid="B22">22</abbr></abbrgrp> proposed a method for quantitatively assessing the effectiveness of a comparative sequence analysis to identify new information in a genome: it uses the 'phylogenetic scope', representing the range of organisms that share a last common ancestor whose sequence can be inferred by adding each genome to the analysis. The comparative studies described below demonstrate that the evolutionary distance of the species in a sequence comparison analysis is critical for discovering potentially functional sequence elements.</p>
         <sec>
            <st>
               <p>The stem cell leukemia genomic interval</p>
            </st>
            <p>For many years the mouse and human genomes, which diverged from a common ancestor about 65-75 million years ago, have been extensively used for comparative studies <abbrgrp><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr><abbr bid="B12">12</abbr></abbrgrp>; but it is still an open question as to which species should be added to this comparative analysis to derive the most information content. Among several recent studies providing guidance for selecting additional species is the investigation of the stem cell leukemia (<it>SCL</it>) genomic interval, originally based on a human-mouse sequence comparison <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>, and later expanded to include three additional species: chicken, pufferfish and zebrafish <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>. This analysis demonstrates that mouse-human alignments show high levels of sequence similarity for all coding exons and for all eight known murine regulatory regions of the <it>SCL </it>locus. Human-mouse-chicken alignments identified the similarity of all coding exons and also discovered protein-binding motifs in five of the known regulatory regions. Thus, inclusion of the chicken DNA sequences allowed for superior functional annotation of a subset of the regulatory regions that had already been identified by the human-mouse comparison.</p>
            <p>Pairwise mouse-pufferfish and mouse-zebrafish sequence alignments identified only some of the coding exons, and found similarity for only two of the eight known regulatory regions in the pufferfish comparison; and no significant similarity was found for any known regulatory region in the zebrafish comparison <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>. This analysis suggests that comparative analysis of zebrafish and mammalian genomic sequences might be of limited value for the identification of functionally significant noncoding sequences in the <it>SCL </it>region; and these results are consistent with what is expected on the basis of the evolutionary distance of the species analyzed.</p>
         </sec>
         <sec>
            <st>
               <p><it>Drosophila melanogaster </it>compared with other species</p>
            </st>
            <p>The analysis of conservation between <it>Drosophila melanogaster </it>and four other <it>Drosophila </it>species (<it>D. erecta</it>, <it>D. pseudoobscura</it>, <it>D. willistoni</it>, and <it>D. littoralis</it>) that have different divergence times (6-15, 46, 53 and 61-65 million years, respectively) <abbrgrp><abbr bid="B24">24</abbr></abbrgrp> has generated several important conclusions to guide further functional studies of these species <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>. One conclusion is that the addition of a third species could reveal functional constraints in otherwise nonsignificant pairwise exon comparisons. All <it>D. melanogaster </it>genes identified in divergent species show evidence of functional constraint; and including more distantly related species defines the exact position of short regulatory elements that are hard to find in the long regions of non-coding sequence conservation observed in closely related species. Non-coding conserved sequences have also been found to be spatially clustered, and these clusters can be used to predict enhancer sequences <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>. This work provided a solid basis for choosing species whose genome sequences would be most useful in aiding the functional annotation of coding and <it>cis</it>-regulatory sequences in <it>D. melanogaster</it>: <it>D. pseudoobscura</it>, which has recently been sequenced, was recognized as the best for discovery of functional genomic features, and adding <it>D. willistoni </it>to the comparison allows the dissection of regions of the <it>Drosophila </it>genome under different levels of functional constraint.</p>
         </sec>
         <sec>
            <st>
               <p>
                  <it>Saccharomyces cerevisiae</it>
               </p>
            </st>
            <p>Using multiple alignments in the <it>S. cerevisiae </it>and related fungal genome annotation projects <abbrgrp><abbr bid="B26">26</abbr></abbrgrp> provided a powerful demonstration of functional analysis, yielding results that would be difficult to obtain by other computational and experimental methods. A comprehensive comparison of the genome of the yeast <it>S. cerevisiae </it>with those of three related <it>Saccharomyces </it>species (<it>S. paradoxus</it>, <it>S. mikatae </it>and <it>S. bayanus</it>) <abbrgrp><abbr bid="B26">26</abbr></abbrgrp> yielded a major revision to the yeast gene catalog, reducing the total count by about 500 genes. In addition, motif analysis automatically identified a number of genome-wide elements, including most known regulatory motifs and numerous new motifs suitable for biological study.</p>
         </sec>
         <sec>
            <st>
               <p>Multiple primate analysis</p>
            </st>
            <p>Another approach to multiple species sequence analysis, 'phylogenetic shadowing' <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>, is used for comparison of evolutionarily closely related species. It demonstrates the utility of sequence comparisons within the primate group for discovering common mammalian, as well as primate-specific, functional elements in the human genome, which could not be achieved by comparison of more evolutionarily distant species. Rubin and colleagues <abbrgrp><abbr bid="B21">21</abbr></abbrgrp> showed that the high information content of comprehensive primate sequence comparisons could be captured with a small subset of phylogenetically close primates, such that sequence from as few as four or six primate species compared with humans might be sufficient for the identification of a large fraction of functional elements in the human genome, many of which are likely to be missed by human-mouse comparisons. While the number of multi-species comparative studies grows, it is becoming clear that reasonable selection of species for comparison of a particular genomic interval is still to a large extent an intuitive process, with some guidance from previous successful comparative studies.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Multi-species sequence alignment and analysis of conservation</p>
         </st>
         <p>In addition to selecting a set of species that provides maximum functional content, one must ensure that the sequence alignment is of sufficient quality for the task at hand. Single pairwise comparisons of sequences do not allow for the detection of conserved sequence strings with high precision, given that functional elements - such as transcriptional-regulator binding sites - are quite short compared to the surrounding nonfunctional sequence. Thus, functional signals can sometimes be indistinguishable from the 'noise' that results from aligning divergent nonfunctional sequences. The hope is that multiple sequence alignment provides a way to increase the sensitivity of the search for regulatory signals.</p>
         <p>The area of sequence alignment is well developed, but many of its problems are far from being completely resolved, especially for multiple species <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>. Alignment methods can be roughly divided into local alignments, which produce optimal similarity scores between subregions of the two sequences, and global alignments, which generate optimal similarity scores over the entire length of the two sequences. Global alignments attempt to find a monotonically increasing map between the letters of each sequence, in the process rejecting alignments that overlap or cross over. A recently published review on comparative genomics gives more details of the various kinds of alignment <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. Unfortunately, a comprehensive study of the strengths and weaknesses of different alignment algorithms applied to different biological problems has yet to appear.</p>
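         <p>The distinction between the two families can be illustrated with a minimal Python sketch. It is not taken from any of the programs discussed in this article: the function name alignment_score, the linear gap penalty and the scoring values are illustrative assumptions, and only the score (not the alignment itself) is computed. Global mode follows the textbook Needleman-Wunsch recurrence; local mode follows the Smith-Waterman variant, in which scores are never allowed to fall below zero.</p>
         <p><![CDATA[
```python
def alignment_score(a, b, local=False, match=1, mismatch=-1, gap=-2):
    """Score of the best global (default) or local alignment of a and b.

    Scoring values are illustrative assumptions, not tuned parameters.
    """
    # First row: leading gaps are charged in global mode, free in local mode.
    prev = [0 if local else j * gap for j in range(len(b) + 1)]
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0 if local else i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cell = max(diag, prev[j] + gap, curr[j - 1] + gap)
            if local:
                # A local alignment may restart anywhere, so never go below 0.
                cell = max(cell, 0)
                best = max(best, cell)
            curr.append(cell)
        prev = curr
    # Global: score of aligning both sequences end to end.
    # Local: best score of any pair of subregions.
    return best if local else prev[-1]


a = "GGGGACGTACGT"
b = "ACGTACGTCCCC"
global_score = alignment_score(a, b)
local_score = alignment_score(a, b, local=True)
print("global:", global_score, "local:", local_score)
```
]]></p>
         <p>In this toy example the two sequences share the eight-base segment ACGTACGT, which determines the local score, whereas the global score is lower because the unrelated flanking bases must also be aligned.</p>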
         <p>The local and global alignment methods that generate pairwise comparisons can also be used for multiple species, but multiple alignments are considerably more difficult to compute because of statistical complexity and the difficulties of scoring the results. Progressive multiple alignment is a heuristic technique that uses successive applications of a pairwise alignment algorithm. The best-known progressive alignment program, CLUSTALW <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>, is very efficient in aligning proteins and short nucleotide sequences, but it is not suitable for long genomic regions <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>. Below we describe new alignment techniques that can handle long DNA sequences efficiently.</p>
         <sec>
            <st>
               <p>Global alignments</p>
            </st>
            <p>Two recently developed algorithms, MLAGAN <abbrgrp><abbr bid="B16">16</abbr><abbr bid="B30">30</abbr></abbrgrp> and MAVID <abbrgrp><abbr bid="B31">31</abbr><abbr bid="B32">32</abbr></abbrgrp>, are designed for global alignment of both evolutionarily close and distant megabase-length genomic sequences. The MLAGAN <abbrgrp><abbr bid="B16">16</abbr><abbr bid="B30">30</abbr></abbrgrp> algorithm assumes that the phylogenetic tree is known, as is usually the case for large vertebrate genomes. The program is based on progressive alignment: a multiple alignment of <it>K </it>sequences is constructed in <it>K-1 </it>pairwise alignment steps, such that in each step two sequences, or intermediate multiple alignments, are aligned.</p>
            <p>MLAGAN uses LAGAN as the global pairwise-alignment subroutine, and introduces new methods for scoring and refining a multiple alignment. It also aligns the sequences in the order of the given phylogenetic tree. For example, MLAGAN aligns sequences from human, chimpanzee, mouse, rat, and chicken, in the following order: first, human-chimpanzee; second, mouse-rat; third, human-chimpanzee to mouse-rat; fourth, human-chimpanzee-mouse-rat to chicken. Each alignment step merges two sequences or alignments into a larger alignment, effectively building a profile of all the sequences. The results obtained with MLAGAN on the cystic fibrosis (<it>CFTR</it>) genomic region <abbrgrp><abbr bid="B16">16</abbr></abbrgrp> suggest that multiple alignments are better than pairwise alignments at aligning conserved exons between distant species: the multiple alignment was precise enough to refine mis-annotated splice sites.</p>
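            <p>The order of alignment steps described above can be derived from the guide tree by a post-order traversal. The following Python sketch is only an illustration of that ordering and is not MLAGAN code: the nested-tuple tree representation and the function name merge_order are assumptions made for the example, and no actual alignment is performed.</p>
            <p><![CDATA[
```python
def merge_order(tree):
    """List the pairwise merge steps implied by a binary guide tree.

    The tree is a nested tuple whose leaves are species names (an assumed,
    illustrative representation). Each step pairs two groups of species.
    """
    steps = []

    def visit(node):
        if isinstance(node, str):
            return (node,)              # a leaf is a group of one sequence
        left, right = node
        left_group = visit(left)        # finish both subtrees first ...
        right_group = visit(right)
        steps.append((left_group, right_group))  # ... then merge them
        return left_group + right_group

    visit(tree)
    return steps


guide_tree = ((("human", "chimpanzee"), ("mouse", "rat")), "chicken")
steps = merge_order(guide_tree)
for number, (group_a, group_b) in enumerate(steps, 1):
    print(number, "-".join(group_a), "to", "-".join(group_b))
```
]]></p>
            <p>For the five species in the example this produces four merge steps, in the same order as listed in the text, which matches the general rule that <it>K </it>sequences require <it>K-1 </it>pairwise steps.</p>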
            <p>MAVID <abbrgrp><abbr bid="B31">31</abbr><abbr bid="B32">32</abbr></abbrgrp> is a progressive global alignment program that works by recursively aligning the 'alignments' at ancestral nodes of the guide phylogenetic tree. At each internal node, ancestral sequences are inferred from the existing alignments using maximum likelihood, and these alignments are then aligned using the global aligner AVID <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>. The multiple alignment is used to build a phylogenetic tree for the sequences, which is subsequently used as a basis for identifying conserved regions in the alignment.</p>
         </sec>
         <sec>
            <st>
               <p>Local alignments</p>
            </st>
            <p>MultiPipMaker <abbrgrp><abbr bid="B34">34</abbr><abbr bid="B35">35</abbr></abbrgrp> uses multiple pairwise local alignments of secondary sequences against a reference sequence to create a crude multiple alignment that is subsequently refined to generate a true multiple alignment. Analysis of multiple alignments generated by MultiPipMaker <abbrgrp><abbr bid="B35">35</abbr></abbrgrp> allowed for discovery of regulatory elements in the mammalian <it>WNT2 </it>genomic region, and confirmed the phylogenetic inference that horses are evolutionarily more closely related to cats than to cows <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>. Alignments between the human sequence and the sequence of each of the other 12 species used in the analysis <abbrgrp><abbr bid="B17">17</abbr></abbrgrp> showed, as expected, that the fraction of sequence that can be aligned generally decreases with increasing evolutionary distance from humans (except for mouse and rat).</p>
            <p>Another program, Multiz, developed for large-scale comparison of multiple sequences, takes BLASTZ/axtBest <abbrgrp><abbr bid="B35">35</abbr></abbrgrp> as the pairwise input. This program has been used for the alignment of the mouse and the rat draft assemblies to the human genome <abbrgrp><abbr bid="B36">36</abbr></abbrgrp>.</p>
         </sec>
         <sec>
            <st>
               <p>Motif finding</p>
            </st>
            <p>'Phylogenetic footprinting' <abbrgrp><abbr bid="B37">37</abbr></abbrgrp> aims to discover specific protein-binding sites within regulatory regions of multiple sequences on the basis of phylogenetic relationships. It is a method that is mostly applied to promoter regions of orthologous genes. Sumiyama and coauthors <abbrgrp><abbr bid="B38">38</abbr></abbrgrp> attained good results by using multiple sequence comparison combined with a small window size (where the window is the region analyzed in each sub-comparison). This high-resolution procedure can predict the binding sites of transcription factors and reveal polymorphisms in control elements between phylogenetic clades. Phylogenetic footprinting was applied to the <it>Hoxc8 </it>early enhancer region, where it successfully identified a known protein-binding <it>cis</it>-regulatory motif that had previously been analyzed in depth by functional methods <abbrgrp><abbr bid="B38">38</abbr></abbrgrp>. The authors demonstrated that an eight-species analysis is clearly superior to the conventional two-species methodology for this type of study.</p>
            <p>Another group of specialized phylogenetic footprinting algorithms finds the most conserved motifs among the input sequences, as measured by a parsimony score of the underlying phylogenetic tree <abbrgrp><abbr bid="B39">39</abbr><abbr bid="B40">40</abbr></abbrgrp>. These algorithms have been used successfully to identify a variety of regulatory elements, some known and some novel, in sets of diverse vertebrate DNA sequences. Although phylogenetic footprinting methods show a lot of promising results, their use requires prior information about the location of orthologous regions in genomic intervals of interest. Multiple sequence alignments can help in defining these regions by finding longer conserved regions that can serve as guides to functionally important elements <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>.</p>
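            <p>To make the notion of a parsimony score concrete, the following Python sketch counts the minimum number of substitutions needed to explain one candidate motif instance per species on a fixed tree, using the textbook Fitch procedure column by column. It is a simplified illustration rather than the published footprinting algorithms, which additionally search for the best motif instances; the tree, the motif strings and the function names are hypothetical.</p>
            <p><![CDATA[
```python
def fitch_column(tree, states):
    """Fitch parsimony for one alignment column.

    Returns (set of possible ancestral bases, minimum substitution count).
    The tree is a nested tuple with species names as leaves (assumed format).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch_column(tree[0], states)
    right_set, right_cost = fitch_column(tree[1], states)
    shared = left_set.intersection(right_set)
    if shared:
        # Children agree on at least one base: no extra substitution needed.
        return shared, left_cost + right_cost
    # Children disagree: one substitution is required on this branch pair.
    return left_set.union(right_set), left_cost + right_cost + 1


def parsimony_score(tree, motifs):
    """Sum the per-column Fitch costs over equal-length motif instances."""
    length = len(next(iter(motifs.values())))
    total = 0
    for i in range(length):
        column = {species: seq[i] for species, seq in motifs.items()}
        total += fitch_column(tree, column)[1]
    return total


tree = (("human", "mouse"), "chicken")
motifs = {"human": "ACGT", "mouse": "ACGT", "chicken": "ACTT"}
score = parsimony_score(tree, motifs)
print("parsimony score:", score)
```
]]></p>
            <p>A lower score indicates a better conserved motif; in this toy example a single substitution on the chicken lineage explains the data.</p>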
         </sec>
         <sec>
            <st>
               <p>Analysis of conservation</p>
            </st>
            <p>The most obvious but difficult question to ask in comparative studies is how to define a functionally significant level of sequence conservation between species. Although two-way comparison is effective for discovery of evolutionarily constrained elements, distinguishing them from conserved sequences that are present due to lack of sufficient divergence time is not straightforward and requires knowledge of the neutral substitution rate <abbrgrp><abbr bid="B13">13</abbr><abbr bid="B41">41</abbr></abbrgrp>. In the majority of comparative genomics studies the definition of a significant level of conservation between two species has been intuitive, or based on biological experience. For example, aligning sequences in divergent noncoding regions proved useful in analyzing the enhancer in the &#946;-globin locus-control region <abbrgrp><abbr bid="B42">42</abbr></abbrgrp>. The conventional cutoff of 70-75% conservation over 100 base-pairs for the human-mouse comparison has produced discoveries of several important biologically functional elements <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B43">43</abbr></abbrgrp>. One of the major obstacles to applying a single universal conservation criterion for potential regulatory regions is the substantial variation in the underlying mutation rates from region to region <abbrgrp><abbr bid="B13">13</abbr><abbr bid="B41">41</abbr></abbrgrp>. Conservation scores that incorporate the local neutral substitution rate are now available for the human and mouse genomes <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>, and they can help to determine if a particular sequence is likely to be functional.</p>
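            <p>A fixed cutoff of the kind mentioned above (for example, at least 70% identity over 100 base-pairs) can be applied to a gapped pairwise alignment with a simple sliding window. The Python sketch below is an illustration only: the function names, the treatment of gap columns as non-matching, and the synthetic sequences are assumptions, and it does not account for the regional variation in neutral substitution rate discussed in this section.</p>
            <p><![CDATA[
```python
def window_identity(ref, other, window=100):
    """Percent identity in every full window of a gapped pairwise alignment.

    ref and other are the two aligned rows (equal length, '-' for gaps).
    Columns containing a gap are counted as non-matching (an assumption).
    """
    if len(ref) != len(other):
        raise ValueError("aligned sequences must have equal length")
    matches = [int(x == y and x != "-") for x, y in zip(ref, other)]
    scores = []
    for start in range(len(matches) - window + 1):
        scores.append(100.0 * sum(matches[start:start + window]) / window)
    return scores


def conserved_windows(scores, cutoff=70.0):
    """Start positions of windows whose identity meets the cutoff."""
    return [start for start, score in enumerate(scores) if score >= cutoff]


# Synthetic example: identical first half, poorly matching second half.
ref = "ACGT" * 50
other = ref[:100] + "T" * 100
scores = window_identity(ref, other)
hits = conserved_windows(scores)
print("windows passing the cutoff:", len(hits))
```
]]></p>
            <p>Changing the window size or the cutoff changes which regions are reported, which is one reason why a single universal criterion is difficult to justify.</p>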
            <p>A more detailed analysis of interspecies pairwise genomic sequence alignments, aiming to distinguish regulatory regions from neutrally evolving DNA, has appeared recently <abbrgrp><abbr bid="B44">44</abbr></abbrgrp>. This study proposed scoring procedures that evaluate alignments for properties other than overall percentage identity, although highly conserved noncoding sequences have proven to be good indicators of regulatory elements; among these procedures are discrimination on the basis of frequencies of nucleotide pairs or gaps, in combination with scoring procedures that include the alignment context, using frequencies of short runs of alignment columns. This study <abbrgrp><abbr bid="B44">44</abbr></abbrgrp> thus gave a good start for extensive testing of these measures.</p>
            <p>Adding genomic sequences from multiple vertebrates to the analysis makes the problem of estimating conservation even less trivial. Expanding pairwise analyses of conservation to multiple sequence alignments would require calculation of the neutral substitution rate between all pairs of sequences. That would give a weighted contribution of each sequence in the multiple alignment, but would also require much more detailed evolutionary information than is available now. A three-way comparison makes it possible to enrich a pairwise alignment, and a simplified method for calculating a level of active non-coding conservation in such a comparison <abbrgrp><abbr bid="B45">45</abbr></abbrgrp> is based on the supposition that actively conserved human-mouse noncoding sequences are likely to be present in additional mammals, whereas noncoding regions that are similar because of an insufficient accumulation of random mutations will not be present in other mammals.</p>
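            <p>The supposition behind the three-way comparison can be sketched as a simple interval filter: human-mouse conserved regions are retained only if a conserved region from a third mammal covers the same human coordinates. The Python code below is a hypothetical illustration of that idea and not the published procedure; the coordinates, the choice of dog as the third species, the minimum-overlap threshold and the function names are all assumptions.</p>
            <p><![CDATA[
```python
def overlap(a, b):
    """Number of bases shared by two half-open intervals (start, end)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def actively_conserved(human_mouse, human_third, min_overlap=50):
    """Keep human-mouse conserved intervals supported by a third species.

    All intervals are in human coordinates. min_overlap is an arbitrary,
    illustrative threshold.
    """
    return [
        region
        for region in human_mouse
        if any(overlap(region, other) >= min_overlap for other in human_third)
    ]


# Hypothetical conserved noncoding intervals on the human sequence.
human_mouse = [(100, 250), (400, 520), (900, 1000)]
human_dog = [(120, 260), (880, 930)]
kept = actively_conserved(human_mouse, human_dog)
print(kept)
```
]]></p>
            <p>In this example only the first interval is retained: the second has no counterpart in the third species, and the third overlaps by fewer bases than the chosen threshold.</p>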
         </sec>
      </sec>
      <sec>
         <st>
            <p>Visualization of results</p>
         </st>
         <p>Visualization of results is a critical aspect of comparative sequence analysis, since manual examination of alignments on the scale of long genomic regions presents a significant challenge and is not efficient. Alignment-browsing systems should identify regions that exhibit properties suggestive of a particular biological function, for example well-conserved segments within an alignment, or segments matching the consensus sequence for a specific transcription-factor-binding site <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>.</p>
         <p>There are several publicly available visualization tools for long pairwise DNA alignments. PipMaker <abbrgrp><abbr bid="B15">15</abbr><abbr bid="B34">34</abbr></abbrgrp> represents the level of conservation in ungapped regions of a BLASTZ local alignment as horizontal dashes. VISTA <abbrgrp><abbr bid="B45">45</abbr><abbr bid="B46">46</abbr><abbr bid="B47">47</abbr></abbrgrp> displays comparative data in the form of a curve, where conservation is calculated in a sliding window of a gapped global alignment. SynPlot <abbrgrp><abbr bid="B23">23</abbr></abbrgrp> also generates a curve plot calculated from a global alignment, but displays it slightly differently. All three tools can also be used to visualize multiple pairwise alignments <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B12">12</abbr><abbr bid="B23">23</abbr><abbr bid="B34">34</abbr></abbrgrp>, but one of the sequences needs to be selected as a reference, and the level of conservation is displayed on its scale. The same principle of selecting a reference sequence is utilized for whole genomes in the UCSC genome browser <abbrgrp><abbr bid="B36">36</abbr><abbr bid="B48">48</abbr></abbrgrp> and the VISTA browser <abbrgrp><abbr bid="B14">14</abbr><abbr bid="B49">49</abbr></abbrgrp>.</p>
         <p>Figure <figr fid="F1">1</figr> shows a multiple pairwise VISTA display of a 5 kilobase fragment of the <it>CFTR </it>region aligned by MLAGAN <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. This view is based on the coordinates of the human sequence and displays the level of conservation between human and all other sequences in the multiple alignment. The first exon of the <it>CAV1 </it>gene is clearly well conserved across all 11 species, including the pufferfish <it>Fugu</it>. The upstream region of the <it>CAV1 </it>gene (at 183 kb) has a distinct area of non-coding conservation across most of the pairwise comparisons, ranging from human/mouse to human/chicken. On the other hand, there are some peaks of non-coding conservation (at 181 kb) that are found in some mammalian species, but not others.</p>
         <fig id="F1">
            <title>
               <p>Figure 1</p>
            </title>
            <caption>
               <p>Multi-VISTA display of a 5 kilobase fragment of the MLAGAN alignment of the <it>CFTR </it>region <abbrgrp><abbr bid="B16">16</abbr></abbrgrp></p>
            </caption>
            <text>
               <p>Multi-VISTA display of a 5 kilobase fragment of the MLAGAN alignment of the <it>CFTR </it>region <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. This view is based on the coordinates of the human sequence and displays the level of conservation between human and all other sequences in the multiple alignment. A fragment of the <it>CAV1 </it>gene is shown as an arrow above the plots. The following cutoffs were used to show the conserved regions: above 80% over 100 bp for chimpanzee, baboon, cat, dog, cow, pig, mouse and rat; above 65% over 100 bp for chicken; above 50% over 100 bp for <it>Fugu </it>and zebrafish.</p>
            </text>
            <graphic file="gb-2003-4-12-122-1"/>
         </fig>
         <p>Knowing the phylogenetic relationship among species is important for building and analyzing multiple alignments, so visualizing sequence alignment data while taking phylogenetic trees into account represents a significant advance. A recently developed program from the VISTA family, Phylo-VISTA (short for Phylogenetic VISTA) <abbrgrp><abbr bid="B50">50</abbr></abbrgrp>, uses phylogenetic relationships as a guide to display and analyze the level of conservation across internal nodes of the phylogenetic tree. Using the entire multiple alignment, not a reference sequence, as a base in the <it>x </it>axis of the visualization allows for additional options, such as presentation of comparative data together with available annotations for all sequences, and computation of a measure of similarity for any node of the phylogenetic tree.</p>
         <p>In conclusion, pairwise sequence comparisons of the complete genomes of human and mouse have brought the revolutionary discovery that more than half of the functionally conserved sequences in the human genome are not protein-encoding <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>. Unfortunately, pairwise studies also make it clear that functional non-coding sequences are not easily distinguished from non-functional segments that happen to have accumulated very few mutations since the last common ancestor. Initial reports suggest that multi-species DNA comparisons have greater potential for filtering out evolutionarily neutral regions, and should therefore provide a more reliable basis for decoding and annotating genomic sequences at high resolution. This would improve our ability to discover non-protein-coding functional elements, which are currently poorly understood in comparison to their coding counterparts. Thus, we face the exciting prospect of discovering which species are the most informative in comparative studies, developing sophisticated algorithms for multi-sequence alignment and analysis of conservation, and building effective new visualization techniques for comparative data.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>The authors are grateful to Nameeta Shah, Michael Brudno, Olivier Couronne, Shyam Prabhakar, Len Pennacchio and Alexander Poliakov for help and discussion. ID was partially supported by the Programs for Genomic Applications grant from the NHLBI/NIH.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Cross-species sequence comparisons: a review of methods and available resources.</p>
            </title>
            <aug>
               <au>
                  <snm>Frazer</snm>
                  <fnm>KA</fnm>
               </au>
               <au>
                  <snm>Elnitski</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Church</snm>
                  <fnm>DM</fnm>
               </au>
               <au>
                  <snm>Dubchak</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Hardison</snm>
                  <fnm>RC</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2003</pubdate>
            <volume>13</volume>
            <fpage>1</fpage>
            <lpage>12</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gr.222003</pubid>
                  <pubid idtype="pmpid" link="fulltext">12529301</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Genomic strategies to identify mammalian regulatory sequences.</p>
            </title>
            <aug>
               <au>
                  <snm>Pennacchio</snm>
                  <fnm>LA</fnm>
               </au>
               <au>
                  <snm>Rubin</snm>
                  <fnm>EM</fnm>
               </au>
            </aug>
            <source>Nat Rev Genet</source>
            <pubdate>2001</pubdate>
            <volume>2</volume>
            <fpage>100</fpage>
            <lpage>109</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">11253049</pubid>
                  <pubid idtype="doi">10.1038/35052548</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Sequence first. Ask questions later.</p>
            </title>
            <aug>
               <au>
                  <snm>Sidow</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>2002</pubdate>
            <volume>111</volume>
            <fpage>13</fpage>
            <lpage>16</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12372296</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Comparative genomics: genome-wide analysis in metazoan eukaryotes.</p>
            </title>
            <aug>
               <au>
                  <snm>Ureta-Vidal</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Ettwiller</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Birney</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>Nat Rev Genet</source>
            <pubdate>2003</pubdate>
            <volume>4</volume>
            <fpage>251</fpage>
            <lpage>262</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nrg1043</pubid>
                  <pubid idtype="pmpid" link="fulltext">12671656</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Comparative genomics approaches to study organism similarities and differences.</p>
            </title>
            <aug>
               <au>
                  <snm>Wei</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Dubchak</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Shon</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Park</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>J Biomed Inform</source>
            <pubdate>2002</pubdate>
            <volume>35</volume>
            <fpage>142</fpage>
            <lpage>150</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S1532-0464(02)00506-3</pubid>
                  <pubid idtype="pmpid">12474427</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Human and mouse gene structure: comparative analysis and application to exon prediction.</p>
            </title>
            <aug>
               <au>
                  <snm>Batzoglou</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Pachter</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Mesirov</snm>
                  <fnm>JP</fnm>
               </au>
               <au>
                  <snm>Berger</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Lander</snm>
                  <fnm>ES</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2000</pubdate>
            <volume>10</volume>
            <fpage>950</fpage>
            <lpage>958</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gr.10.7.950</pubid>
                  <pubid idtype="pmpid" link="fulltext">10899144</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Integrating genomic homology into gene structure prediction.</p>
            </title>
            <aug>
               <au>
                  <snm>Korf</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Flicek</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Duan</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Brent</snm>
                  <fnm>MR</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2001</pubdate>
            <volume>17 Suppl 1</volume>
            <fpage>S140</fpage>
            <lpage>S148</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11473003</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Comparison of mouse and human genomes followed by experimental verification yields an estimated 1,019 additional genes.</p>
            </title>
            <aug>
               <au>
                  <snm>Guigo</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Dermitzakis</snm>
                  <fnm>ET</fnm>
               </au>
               <au>
                  <snm>Agarwal</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Ponting</snm>
                  <fnm>CP</fnm>
               </au>
               <au>
                  <snm>Parra</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Reymond</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Abril</snm>
                  <fnm>JF</fnm>
               </au>
               <au>
                  <snm>Keibler</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Lyle</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Ucla</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Antonarakis</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Brent</snm>
                  <fnm>MR</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2003</pubdate>
            <volume>100</volume>
            <fpage>1140</fpage>
            <lpage>1145</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1073/pnas.0337561100</pubid>
                  <pubid idtype="pmpid" link="fulltext">12552088</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>An apolipoprotein influencing triglycerides in humans and mice revealed by comparative sequencing.</p>
            </title>
            <aug>
               <au>
                  <snm>Pennacchio</snm>
                  <fnm>LA</fnm>
               </au>
               <au>
                  <snm>Olivier</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Hubacek</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Cohen</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Cox</snm>
                  <fnm>DR</fnm>
               </au>
               <au>
                  <snm>Fruchart</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Krauss</snm>
                  <fnm>RM</fnm>
               </au>
               <au>
                  <snm>Rubin</snm>
                  <fnm>EM</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2001</pubdate>
            <volume>294</volume>
            <fpage>169</fpage>
            <lpage>173</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1064852</pubid>
                  <pubid idtype="pmpid" link="fulltext">11588264</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Identification of a coordinate regulator of interleukins 4, 13, and 5 by cross-species sequence comparisons.</p>
            </title>
            <aug>
               <au>
                  <snm>Loots</snm>
                  <fnm>GG</fnm>
               </au>
               <au>
                  <snm>Locksley</snm>
                  <fnm>RM</fnm>
               </au>
               <au>
                  <snm>Blankespoor</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>ZE</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Rubin</snm>
                  <fnm>EM</fnm>
               </au>
               <au>
                  <snm>Frazer</snm>
                  <fnm>KA</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2000</pubdate>
            <volume>288</volume>
            <fpage>136</fpage>
            <lpage>140</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.288.5463.136</pubid>
                  <pubid idtype="pmpid" link="fulltext">10753117</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Long human-mouse sequence alignments reveal novel regulatory elements: a reason to sequence the mouse genome.</p>
            </title>
            <aug>
               <au>
                  <snm>Hardison</snm>
                  <fnm>RC</fnm>
               </au>
               <au>
                  <snm>Oeltjen</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>W</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>1997</pubdate>
            <volume>7</volume>
            <fpage>959</fpage>
            <lpage>966</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">9331366</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Transcriptional regulation of the stem cell leukemia gene (SCL) - comparative analysis of five vertebrate SCL loci.</p>
            </title>
            <aug>
               <au>
                  <snm>Gottgens</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Barton</snm>
                  <fnm>LM</fnm>
               </au>
               <au>
                  <snm>Chapman</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Sinclair</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Knudsen</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Grafham</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Gilbert</snm>
                  <fnm>JG</fnm>
               </au>
               <au>
                  <snm>Rogers</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Bentley</snm>
                  <fnm>DR</fnm>
               </au>
               <au>
                  <snm>Green</snm>
                  <fnm>AR</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2002</pubdate>
            <volume>12</volume>
            <fpage>749</fpage>
            <lpage>759</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">186570</pubid>
                  <pubid idtype="pmpid" link="fulltext">11997341</pubid>
                  <pubid idtype="doi">10.1101/gr.45502</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Initial sequencing and comparative analysis of the mouse genome.</p>
            </title>
            <aug>
               <au>
                  <snm>Waterston</snm>
                  <fnm>RH</fnm>
               </au>
               <au>
                  <snm>Lindblad-Toh</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Birney</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Rogers</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Abril</snm>
                  <fnm>JF</fnm>
               </au>
               <au>
                  <snm>Agarwal</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Agarwala</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Ainscough</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Alexandersson</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>An</snm>
                  <fnm>P</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nature</source>
            <pubdate>2002</pubdate>
            <volume>420</volume>
            <fpage>520</fpage>
            <lpage>562</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature01262</pubid>
                  <pubid idtype="pmpid" link="fulltext">12466850</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Strategies and tools for whole genome alignments.</p>
            </title>
            <aug>
               <au>
                  <snm>Couronne</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Poliakov</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Bray</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Ishkhanov</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Ryaboy</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Rubin</snm>
                  <fnm>EM</fnm>
               </au>
               <au>
                  <snm>Pachter</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Dubchak</snm>
                  <fnm>I</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2003</pubdate>
            <volume>13</volume>
            <fpage>73</fpage>
            <lpage>80</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">12529308</pubid>
                  <pubid idtype="doi">10.1101/gr.762503</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Human-mouse alignments with BLASTZ.</p>
            </title>
            <aug>
               <au>
                  <snm>Schwartz</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kent</snm>
                  <fnm>WJ</fnm>
               </au>
               <au>
                  <snm>Smit</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Baertsch</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Hardison</snm>
                  <fnm>RC</fnm>
               </au>
               <au>
                  <snm>Haussler</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>W</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2003</pubdate>
            <volume>13</volume>
            <fpage>103</fpage>
            <lpage>107</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gr.809403</pubid>
                  <pubid idtype="pmpid" link="fulltext">12529312</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>LAGAN and Multi-LAGAN: efficient tools for large-scale multiple alignment of genomic DNA.</p>
            </title>
            <aug>
               <au>
                  <snm>Brudno</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Do</snm>
                  <fnm>CB</fnm>
               </au>
               <au>
                  <snm>Cooper</snm>
                  <fnm>GM</fnm>
               </au>
               <au>
                  <snm>Kim</snm>
                  <fnm>MF</fnm>
               </au>
               <au>
                  <snm>Davydov</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Green</snm>
                  <fnm>ED</fnm>
               </au>
               <au>
                  <snm>Sidow</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Batzoglou</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <cnm>NISC Comparative Sequencing Program</cnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2003</pubdate>
            <volume>13</volume>
            <fpage>721</fpage>
            <lpage>731</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gr.926603</pubid>
                  <pubid idtype="pmpid" link="fulltext">12654723</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Comparative analyses of multi-species sequences from targeted genomic regions.</p>
            </title>
            <aug>
               <au>
                  <snm>Thomas</snm>
                  <fnm>JW</fnm>
               </au>
               <au>
                  <snm>Touchman</snm>
                  <fnm>JW</fnm>
               </au>
               <au>
                  <snm>Blakesley</snm>
                  <fnm>RW</fnm>
               </au>
               <au>
                  <snm>Bouffard</snm>
                  <fnm>GG</fnm>
               </au>
               <au>
                  <snm>Beckstrom-Sternberg</snm>
                  <fnm>SM</fnm>
               </au>
               <au>
                  <snm>Margulies</snm>
                  <fnm>EH</fnm>
               </au>
               <au>
                  <snm>Blanchette</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Siepel</snm>
                  <fnm>AC</fnm>
               </au>
               <au>
                  <snm>Thomas</snm>
                  <fnm>PJ</fnm>
               </au>
               <au>
                  <snm>McDowell</snm>
                  <fnm>JC</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nature</source>
            <pubdate>2003</pubdate>
            <volume>424</volume>
            <fpage>788</fpage>
            <lpage>793</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature01858</pubid>
                  <pubid idtype="pmpid" link="fulltext">12917688</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Whole-genome shotgun assembly and analysis of the genome of <it>Fugu rubripes</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Aparicio</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Chapman</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Stupka</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Putnam</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Chia</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Dehal</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Christoffels</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Rash</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Hoon</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Smit</snm>
                  <fnm>A</fnm>
               </au>
               <etal/>
            </aug>
            <source>Science</source>
            <pubdate>2002</pubdate>
            <volume>297</volume>
            <fpage>1301</fpage>
            <lpage>1310</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1072104</pubid>
                  <pubid idtype="pmpid" link="fulltext">12142439</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>Conservation, regulation, synteny, and introns in a large-scale <it>C. briggsae </it>- <it>C. elegans </it>genomic alignment.</p>
            </title>
            <aug>
               <au>
                  <snm>Kent</snm>
                  <fnm>WJ</fnm>
               </au>
               <au>
                  <snm>Zahler</snm>
                  <fnm>AM</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2000</pubdate>
            <volume>10</volume>
            <fpage>1115</fpage>
            <lpage>1125</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gr.10.8.1115</pubid>
                  <pubid idtype="pmpid" link="fulltext">10958630</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>Genomic DNA insertions and deletions occur frequently between humans and nonhuman primates.</p>
            </title>
            <aug>
               <au>
                  <snm>Frazer</snm>
                  <fnm>KA</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Hinds</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Pant</snm>
                  <fnm>PV</fnm>
               </au>
               <au>
                  <snm>Patil</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Cox</snm>
                  <fnm>DR</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2003</pubdate>
            <volume>13</volume>
            <fpage>341</fpage>
            <lpage>346</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gr.554603</pubid>
                  <pubid idtype="pmpid" link="fulltext">12618364</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Phylogenetic shadowing of primate sequences to find functional regions of the human genome.</p>
            </title>
            <aug>
               <au>
                  <snm>Boffelli</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>McAuliffe</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Ovcharenko</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Lewis</snm>
                  <fnm>KD</fnm>
               </au>
               <au>
                  <snm>Ovcharenko</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Pachter</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Rubin</snm>
                  <fnm>EM</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2003</pubdate>
            <volume>299</volume>
            <fpage>1391</fpage>
            <lpage>1394</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1081331</pubid>
                  <pubid idtype="pmpid" link="fulltext">12610304</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>Quantitative estimates of sequence divergence for comparative analyses of mammalian genomes.</p>
            </title>
            <aug>
               <au>
                  <snm>Cooper</snm>
                  <fnm>GM</fnm>
               </au>
               <au>
                  <snm>Brudno</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Green</snm>
                  <fnm>ED</fnm>
               </au>
               <au>
                  <snm>Batzoglou</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Sidow</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <cnm>NISC Comparative Sequencing Program</cnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2003</pubdate>
            <volume>13</volume>
            <fpage>813</fpage>
            <lpage>820</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gr.1064503</pubid>
                  <pubid idtype="pmpid" link="fulltext">12727901</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>Long-range comparison of human and mouse <it>SCL </it>loci: localized regions of sensitivity to restriction endonucleases correspond precisely with peaks of conserved noncoding sequences.</p>
            </title>
            <aug>
               <au>
                  <snm>G&#246;ttgens</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Gilbert</snm>
                  <fnm>JG</fnm>
               </au>
               <au>
                  <snm>Barton</snm>
                  <fnm>LM</fnm>
               </au>
               <au>
                  <snm>Grafham</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Rogers</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Bentley</snm>
                  <fnm>DR</fnm>
               </au>
               <au>
                  <snm>Green</snm>
                  <fnm>AR</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2001</pubdate>
            <volume>11</volume>
            <fpage>87</fpage>
            <lpage>97</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gr.153001</pubid>
                  <pubid idtype="pmpid" link="fulltext">11156618</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <aug>
               <au>
                  <snm>Powell</snm>
                  <fnm>JR</fnm>
               </au>
            </aug>
            <source>Progress and Prospects in Evolutionary Biology: The Drosophila Model</source>
            <publisher>Oxford: Oxford University Press</publisher>
            <pubdate>1997</pubdate>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Assessing the impact of comparative genomic sequence data on the functional annotation of the <it>Drosophila</it> genome.</p>
            </title>
            <aug>
               <au>
                  <snm>Bergman</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Pfeiffer</snm>
                  <fnm>BD</fnm>
               </au>
               <au>
                  <snm>Rincon-Limas</snm>
                  <fnm>DE</fnm>
               </au>
               <au>
                  <snm>Hoskins</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Gnirke</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Mungall</snm>
                  <fnm>CJ</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Kronmiller</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Pacleb</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Park</snm>
                  <fnm>S</fnm>
               </au>
               <etal/>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2002</pubdate>
            <volume>3</volume>
            <fpage>research0086.1</fpage>
            <lpage>0086.20</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">12537575</pubid>
                  <pubid idtype="doi">10.1186/gb-2002-3-12-research0086</pubid>
                  <pubid idtype="pmcid">151188</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>Sequencing and comparison of yeast species to identify genes and regulatory elements.</p>
            </title>
            <aug>
               <au>
                  <snm>Kellis</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Patterson</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Endrizzi</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Birren</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Lander</snm>
                  <fnm>ES</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2003</pubdate>
            <volume>423</volume>
            <fpage>241</fpage>
            <lpage>254</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature01644</pubid>
                  <pubid idtype="pmpid" link="fulltext">12748633</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>Comparison of genomic DNA sequences: solved and unsolved problems.</p>
            </title>
            <aug>
               <au>
                  <snm>Miller</snm>
                  <fnm>W</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2001</pubdate>
            <volume>17</volume>
            <fpage>391</fpage>
            <lpage>397</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/17.5.391</pubid>
                  <pubid idtype="pmpid" link="fulltext">11331233</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice.</p>
            </title>
            <aug>
               <au>
                  <snm>Thompson</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Higgins</snm>
                  <fnm>DG</fnm>
               </au>
               <au>
                  <snm>Gibson</snm>
                  <fnm>TJ</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1994</pubdate>
            <volume>22</volume>
            <fpage>4673</fpage>
            <lpage>4680</lpage>
            <xrefbib>
               <pubid idtype="pmpid">7984417</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>A comprehensive comparison of multiple sequence alignment programs.</p>
            </title>
            <aug>
               <au>
                  <snm>Thompson</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Plewniak</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Poch</snm>
                  <fnm>O</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1999</pubdate>
            <volume>27</volume>
            <fpage>2682</fpage>
            <lpage>2690</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">148477</pubid>
                  <pubid idtype="pmpid" link="fulltext">10373585</pubid>
                  <pubid idtype="doi">10.1093/nar/27.13.2682</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>MLAGAN</p>
            </title>
            <url>http://lagan.stanford.edu</url>
         </bibl>
         <bibl id="B31">
            <title>
               <p>The MAVID multiple alignment server.</p>
            </title>
            <aug>
               <au>
                  <snm>Bray</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Pachter</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2003</pubdate>
            <volume>31</volume>
            <fpage>3525</fpage>
            <lpage>3526</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">169029</pubid>
                  <pubid idtype="pmpid" link="fulltext">12824358</pubid>
                  <pubid idtype="doi">10.1093/nar/gkg623</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>MAVID</p>
            </title>
            <url>http://baboon.math.berkeley.edu/mavid</url>
         </bibl>
         <bibl id="B33">
            <title>
               <p>AVID: a global alignment program.</p>
            </title>
            <aug>
               <au>
                  <snm>Bray</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Dubchak</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Pachter</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2003</pubdate>
            <volume>13</volume>
            <fpage>97</fpage>
            <lpage>102</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gr.789803</pubid>
                  <pubid idtype="pmpid" link="fulltext">12529311</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B34">
            <title>
               <p>PipMaker and MultiPipMaker</p>
            </title>
            <url>http://bio.cse.psu.edu/pipmaker/</url>
         </bibl>
         <bibl id="B35">
            <title>
               <p>MultiPipMaker and supporting tools: alignments and analysis of multiple genomic DNA sequences.</p>
            </title>
            <aug>
               <au>
                  <snm>Schwartz</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Elnitski</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Weirauch</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Riemer</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Smit</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Green</snm>
                  <fnm>ED</fnm>
               </au>
               <au>
                  <snm>Hardison</snm>
                  <fnm>RC</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <cnm>NISC Comparative Sequencing Program</cnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2003</pubdate>
            <volume>31</volume>
            <fpage>3518</fpage>
            <lpage>3524</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">168985</pubid>
                  <pubid idtype="pmpid" link="fulltext">12824357</pubid>
                  <pubid idtype="doi">10.1093/nar/gkg579</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B36">
            <title>
               <p>Genome browser at UCSC</p>
            </title>
            <url>http://genome.ucsc.edu</url>
         </bibl>
         <bibl id="B37">
            <title>
               <p>Embryonic epsilon and gamma globin genes of a prosimian primate (<it>Galago crassicaudatus</it>). Nucleotide and amino acid sequences, developmental regulation and phylogenetic footprints.</p>
            </title>
            <aug>
               <au>
                  <snm>Tagle</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Koop</snm>
                  <fnm>BF</fnm>
               </au>
               <au>
                  <snm>Goodman</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Slightom</snm>
                  <fnm>JL</fnm>
               </au>
               <au>
                  <snm>Hess</snm>
                  <fnm>DL</fnm>
               </au>
               <au>
                  <snm>Jones</snm>
                  <fnm>RT</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1988</pubdate>
            <volume>203</volume>
            <fpage>439</fpage>
            <lpage>455</lpage>
            <xrefbib>
               <pubid idtype="pmpid">3199442</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B38">
            <title>
               <p>An efficient <it>cis</it>-element discovery method using multiple sequence comparisons based on evolutionary relationships.</p>
            </title>
            <aug>
               <au>
                  <snm>Sumiyama</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Kim</snm>
                  <fnm>CB</fnm>
               </au>
               <au>
                  <snm>Ruddle</snm>
                  <fnm>FH</fnm>
               </au>
            </aug>
            <source>Genomics</source>
            <pubdate>2001</pubdate>
            <volume>71</volume>
            <fpage>260</fpage>
            <lpage>262</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/geno.2000.6422</pubid>
                  <pubid idtype="pmpid" link="fulltext">11161821</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B39">
            <title>
               <p>Identifying functional elements by comparative DNA sequence analysis.</p>
            </title>
            <aug>
               <au>
                  <snm>Tompa</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2001</pubdate>
            <volume>11</volume>
            <fpage>1143</fpage>
            <lpage>1144</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gr.197101</pubid>
                  <pubid idtype="pmpid" link="fulltext">11435394</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B40">
            <title>
               <p>Discovery of regulatory elements by a computational method for phylogenetic footprinting.</p>
            </title>
            <aug>
               <au>
                  <snm>Blanchette</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Tompa</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2002</pubdate>
            <volume>12</volume>
            <fpage>739</fpage>
            <lpage>748</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">186562</pubid>
                  <pubid idtype="pmpid" link="fulltext">11997340</pubid>
                  <pubid idtype="doi">10.1101/gr.6902</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B41">
            <title>
               <p>Conserved noncoding sequences are reliable guides to regulatory elements.</p>
            </title>
            <aug>
               <au>
                  <snm>Hardison</snm>
                  <fnm>RC</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>2000</pubdate>
            <volume>16</volume>
            <fpage>369</fpage>
            <lpage>372</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0168-9525(00)02081-3</pubid>
                  <pubid idtype="pmpid" link="fulltext">10973062</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B42">
            <title>
               <p>Covariation in frequencies of substitution, deletion, transposition, and recombination during eutherian evolution.</p>
            </title>
            <aug>
               <au>
                  <snm>Hardison</snm>
                  <fnm>RC</fnm>
               </au>
               <au>
                  <snm>Roskin</snm>
                  <fnm>KM</fnm>
               </au>
               <au>
                  <snm>Yang</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Diekhans</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Kent</snm>
                  <fnm>WJ</fnm>
               </au>
               <au>
                  <snm>Weber</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Elnitski</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>O'Connor</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Kolbe</snm>
                  <fnm>D</fnm>
               </au>
               <etal/>
            </aug>
            <source>Genome Res</source>
            <pubdate>2003</pubdate>
            <volume>13</volume>
            <fpage>13</fpage>
            <lpage>26</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gr.844103</pubid>
                  <pubid idtype="pmpid" link="fulltext">12529302</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B43">
            <title>
               <p>Conserved E boxes function as part of the enhancer in hypersensitive site 2 of the &#946;-globin locus control region. Role of basic helix-loop-helix proteins.</p>
            </title>
            <aug>
               <au>
                  <snm>Elnitski</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Hardison</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>1997</pubdate>
            <volume>272</volume>
            <fpage>369</fpage>
            <lpage>378</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1074/jbc.272.1.369</pubid>
                  <pubid idtype="pmpid" link="fulltext">8995271</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B44">
            <title>
               <p>Distinguishing regulatory DNA from neutral sites.</p>
            </title>
            <aug>
               <au>
                  <snm>Elnitski</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Hardison</snm>
                  <fnm>RC</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Yang</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kolbe</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Eswara</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>O'Connor</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Schwartz</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Chiaromonte</snm>
                  <fnm>F</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2003</pubdate>
            <volume>13</volume>
            <fpage>64</fpage>
            <lpage>72</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gr.817703</pubid>
                  <pubid idtype="pmpid" link="fulltext">12529307</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B45">
            <title>
               <p>Active conservation of noncoding sequences revealed by three-way species comparisons.</p>
            </title>
            <aug>
               <au>
                  <snm>Dubchak</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Brudno</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Pachter</snm>
                  <fnm>LS</fnm>
               </au>
               <au>
                  <snm>Loots</snm>
                  <fnm>GG</fnm>
               </au>
               <au>
                  <snm>Mayor</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Rubin</snm>
                  <fnm>EM</fnm>
               </au>
               <au>
                  <snm>Frazer</snm>
                  <fnm>KA</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2000</pubdate>
            <volume>10</volume>
            <fpage>1304</fpage>
            <lpage>1306</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gr.142200</pubid>
                  <pubid idtype="pmpid" link="fulltext">10984448</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B46">
            <title>
               <p>VISTA: visualizing global DNA sequence alignments of arbitrary length.</p>
            </title>
            <aug>
               <au>
                  <snm>Mayor</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Brudno</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Schwartz</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Poliakov</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Rubin</snm>
                  <fnm>EM</fnm>
               </au>
               <au>
                  <snm>Frazer</snm>
                  <fnm>KA</fnm>
               </au>
               <au>
                  <snm>Pachter</snm>
                  <fnm>LS</fnm>
               </au>
               <au>
                  <snm>Dubchak</snm>
                  <fnm>I</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2000</pubdate>
            <volume>16</volume>
            <fpage>1046</fpage>
            <lpage>1047</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/16.11.1046</pubid>
                  <pubid idtype="pmpid" link="fulltext">11159318</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B47">
            <title>
               <p>VISTA</p>
            </title>
            <url>http://www-gsd.lbl.gov/vista</url>
         </bibl>
         <bibl id="B48">
            <title>
               <p>The human genome browser at UCSC.</p>
            </title>
            <aug>
               <au>
                  <snm>Kent</snm>
                  <fnm>WJ</fnm>
               </au>
               <au>
                  <snm>Sugnet</snm>
                  <fnm>CW</fnm>
               </au>
               <au>
                  <snm>Furey</snm>
                  <fnm>TS</fnm>
               </au>
               <au>
                  <snm>Roskin</snm>
                  <fnm>KM</fnm>
               </au>
               <au>
                  <snm>Pringle</snm>
                  <fnm>TH</fnm>
               </au>
               <au>
                  <snm>Zahler</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Haussler</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2002</pubdate>
            <volume>12</volume>
            <fpage>996</fpage>
            <lpage>1006</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gr.229102</pubid>
                  <pubid idtype="pmpid" link="fulltext">12045153</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B49">
            <title>
               <p>VISTA Genome Browser</p>
            </title>
            <url>http://pipeline.lbl.gov</url>
         </bibl>
         <bibl id="B50">
            <title>
               <p>Phylo-VISTA: an interactive visualization tool for multiple DNA sequence alignments.</p>
            </title>
            <aug>
               <au>
                  <snm>Shah</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Couronne</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Pennacchio</snm>
                  <fnm>LA</fnm>
               </au>
               <au>
                  <snm>Brudno</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Batzoglou</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Bethel</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Rubin</snm>
                  <fnm>EM</fnm>
               </au>
               <au>
                  <snm>Hamann</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Dubchak</snm>
                  <fnm>I</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <inpress/>
         </bibl>
      </refgrp>
   </bm>
</art>
