<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>gb-2003-4-8-r47</ui>
   <ji>GBJ</ji>
   <fm>
      <dochead>Research</dochead>
      <bibl>
         <title>
            <p>Recent segmental and gene duplications in the mouse genome</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Cheung</snm>
               <fnm>Joseph</fnm>
               <insr iid="I1"/>
            </au>
            <au id="A2">
               <snm>Wilson</snm>
               <mi>D</mi>
               <fnm>Michael</fnm>
               <insr iid="I2"/>
            </au>
            <au id="A3">
               <snm>Zhang</snm>
               <fnm>Junjun</fnm>
               <insr iid="I1"/>
            </au>
            <au id="A4">
               <snm>Khaja</snm>
               <fnm>Razi</fnm>
               <insr iid="I1"/>
            </au>
            <au id="A5">
               <snm>MacDonald</snm>
               <mi>R</mi>
               <fnm>Jeffrey</fnm>
               <insr iid="I1"/>
            </au>
            <au id="A6">
               <snm>Heng</snm>
               <mi>HQ</mi>
               <fnm>Henry</fnm>
               <insr iid="I3"/>
            </au>
            <au id="A7">
               <snm>Koop</snm>
               <mi>F</mi>
               <fnm>Ben</fnm>
               <insr iid="I2"/>
            </au>
            <au id="A8" ca="yes">
               <snm>Scherer</snm>
               <mi>W</mi>
               <fnm>Stephen</fnm>
               <insr iid="I1"/>
               <email>steve@genet.sickkids.on.ca</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Program in Genetics and Genomic Biology, Research Institute, The Hospital for Sick Children, and Department of Molecular and Medical Genetics, University of Toronto, 555 University Avenue, Toronto, ON M5G 1X8, Canada</p>
            </ins>
            <ins id="I2">
               <p>Department of Biology, Centre for Biomedical Research, University of Victoria, Victoria, British Columbia, V8W 3N5, Canada</p>
            </ins>
            <ins id="I3">
               <p>Wayne State University School of Medicine, Detroit, MI 48202, USA</p>
            </ins>
         </insg>
         <source>Genome Biology</source>
         <issn>1465-6906</issn>
         <pubdate>2003</pubdate>
         <volume>4</volume>
         <issue>8</issue>
         <fpage>R47</fpage>
         <url>http://genomebiology.com/2003/4/8/R47</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="doi">10.1186/gb-2003-4-8-r47</pubid>
               <pubid idtype="pmpid">12914656</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>28</day>
               <month>2</month>
               <year>2003</year>
            </date>
         </rec>
         <revrec>
            <date>
               <day>22</day>
               <month>5</month>
               <year>2003</year>
            </date>
         </revrec>
         <acc>
            <date>
               <day>17</day>
               <month>6</month>
               <year>2003</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>9</day>
               <month>7</month>
               <year>2003</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2003</year>
         <collab>Cheung et al.; licensee BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.</collab>
      </cpyrt>
      <shorttitle>
         <p>Recent segmental and gene duplications in the mouse genome</p>
      </shorttitle>
      <shortabs>
         <p>BLAST-based computational heuristics were used to identify large and recent segmental duplications in the mouse genome sequence. Here a database of recently duplicated regions of the mouse genome is presented.</p>
      </shortabs>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>The high-quality mouse genome draft sequence and its associated annotations are an invaluable biological resource. Identifying recent duplications in the mouse genome, especially in regions containing genes, may highlight important events in recent murine evolution. In addition, detecting recent sequence duplications can reveal potentially problematic regions of the genome assembly. We use BLAST-based computational heuristics to identify large (&#8805; 5 kb) and recent (&#8805; 90% sequence identity) segmental duplications in the mouse genome sequence. Here we present a database of recently duplicated regions of the mouse genome found in the mouse genome sequencing consortium (MGSC) February 2002 and February 2003 assemblies.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>We determined that 33.6 Mb of 2,695 Mb (1.2%) of sequence from the February 2003 mouse genome sequence assembly is involved in recent segmental duplications, which is less than that observed in the human genome (around 3.5-5%). From this dataset, 8.9 Mb (26%) of the duplication content consisted of 'unmapped' chromosome sequence. Moreover, we suspect that an additional 18.5 Mb of sequence is involved in duplication artifacts arising from sequence misassignment errors in this genome assembly. By searching for genes that are located within these regions, we identified 675 genes that mapped to duplicated regions of the mouse genome. Sixteen of these genes appear to have been duplicated independently in the human genome. From our dataset we further characterized a 42 kb recent segmental duplication of <it>Mater</it>, a maternal-effect gene essential for embryogenesis in mice.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>Our results provide an initial analysis of the recently duplicated sequence and gene content of the mouse genome. Many of these duplicated loci, as well as regions identified as potential sequence misassignment errors, will require further mapping and sequencing to be resolved accurately. A Genome Browser database was set up to display the duplication content identified in this work. These data will also be relevant to the growing number of investigators who use the draft genome sequence for experimental design and analysis.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="BMC" subtype="man_spc_id" id="30010010">Genome studies</classification>
         <classification type="BMC" subtype="man_spc_id" id="30010015">Model organisms</classification>
         <classification type="BMC" subtype="man_spc_id" id="30010008">Evolution</classification>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>The evolutionary trajectory of duplicated genes has been an active area of investigation since gene duplication was first recognized as an important force in species evolution <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. New sequence data and analyses have challenged the hypothesis that most duplicated genes are destined to lose their function and become pseudogenes, with a few exceptions establishing new biological roles (reviewed in <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>). Gene duplication is believed to relax selection on redundant copies, permitting genes to evolve specialized sub-functions <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr></abbrgrp>. Moreover, nearly identical genomic regions provide important substrates for chromosomal rearrangements that permit rapid evolutionary changes to occur in a short period of time <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>.</p>
         <p>An estimated 3.5-5% of the human genome has undergone recent duplication <abbrgrp><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr></abbrgrp>, and these segmental duplications (also termed duplicons or low-copy repeats) are found to be hot spots, or predisposition sites, for the occurrence of nonallelic homologous recombination. This recombination can lead to genomic mutations such as deletion, duplication, inversion, or translocation, resulting in human disease <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>. Many mouse strains with chromosomal aberrations are known <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>, and it remains to be seen whether segmental duplications have a role in any of these genomic mutations.</p>
         <p>A high-quality draft genome sequence not only makes it possible to expand the known set of duplicate genes, but also reveals their genomic context. This genomic context contains the regulatory and structural elements responsible for gene expression that need to be interrogated for a better understanding of the mechanisms and consequences of gene duplication. Furthermore, an accurate and well-annotated mouse genome is an essential resource for many in the biomedical research community, especially those who use the sequence to design and interpret transgenic, mutagenesis, microarray, and proteomic studies.</p>
         <p>Several lines of evidence show that the whole-genome shotgun (WGS) approach yielded a high-quality draft sequence that covers roughly 96% of the euchromatic genome excluding chromosome Y (a female C57BL/6J mouse was used in the sequencing project) <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>. The WGS sequence reads were assembled into sequence contigs using sequence-assembly programs to produce the February 2002 MGSCv3 working draft <abbrgrp><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr></abbrgrp>. The newly released February 2003 assembly was a hybrid assembly comprising 705 megabases (Mb) of finished bacterial artificial chromosome (BAC) sequences incorporated into the MGSCv3 assembly.</p>
         <p>We previously analyzed several versions of the human genome draft assemblies (NCBI Builds 28, 29, and 30) <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>, and found a substantial number of potential genome assembly errors in all builds, including approximately 40 Mb of sequence in Build 30. These assembly errors probably arose from difficulties in merging finished sequence or from incorrectly assigning sequence contigs into the genome assembly. In such cases, completely identical or nearly identical sequences (due to allelic differences or sequencing errors) would be present at distinct regions in the genome sequence. These sequence misassignment errors would yield near-perfect duplication artifacts in genome assemblies, detected as having extremely high sequence identities (exceeding 99.5%) over more than 5 kilobases (kb) in length. However, a small subset of such results could represent duplications that arose from very recent evolutionary events and will require further experimental analysis.</p>
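         <p>The artifact criteria described above (sequence identity exceeding 99.5% over more than 5 kb) can be illustrated with a minimal Python sketch. The function names and the tuple layout are hypothetical and introduced only for illustration; this is not the pipeline used in the study, and pairs flagged this way would still need experimental follow-up because some may be genuine, very recent duplications.</p>
         <p>
```python
ARTIFACT_MIN_IDENTITY_PCT = 99.5  # identity cutoff stated in the text
ARTIFACT_MIN_LENGTH_BP = 5_000    # length cutoff stated in the text (5 kb)


def is_suspected_artifact(identity_pct: float, length_bp: int) -> bool:
    """Return True when a duplicated pair matches the near-perfect artifact profile."""
    return (identity_pct > ARTIFACT_MIN_IDENTITY_PCT
            and length_bp > ARTIFACT_MIN_LENGTH_BP)


def split_pairs(pairs):
    """Split (identity_pct, length_bp) tuples into suspected artifacts and the rest.

    The tuple layout is an assumption made for this sketch.
    """
    suspected, remaining = [], []
    for identity_pct, length_bp in pairs:
        if is_suspected_artifact(identity_pct, length_bp):
            suspected.append((identity_pct, length_bp))
        else:
            remaining.append((identity_pct, length_bp))
    return suspected, remaining
```
         </p>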
         <p>A number of web-based resources, specifically those provided by the National Center for Biotechnology Information (NCBI), Ensembl (at the European Bioinformatics Institute and Sanger Centre), and the University of California Santa Cruz (UCSC), make the genome sequence and associated annotations readily accessible. Because of the success of the mouse genome sequencing consortium (MGSC), investigators worldwide are utilizing the draft 'as is' in both medical and evolutionary studies. In this paper we show that even though the genome assembly is still in draft form, an initial analysis of the sequence can reveal novel genomic duplications and demarcate regions of the genome that require additional examination.</p>
      </sec>
      <sec>
         <st>
            <p>Results and discussion</p>
         </st>
         <p>We searched for all recent segmental duplications that were larger than 5 kb in size and showed greater than 90% sequence identity in both the February 2002 assembly (numerical results for this assembly are presented at our web site <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>) and the February 2003 mouse genome sequence assembly <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>. Our method was based on pairwise (mega-) BLAST2 <abbrgrp><abbr bid="B16">16</abbr></abbrgrp> sequence comparisons between entire chromosome sequences. From our analysis of the February 2003 assembly, a total of 33.6 Mb (1.2%) of the genome sequence (2,695 Mb) was found to be involved in recent segmental duplications (Table <tblr tid="T1">1</tblr>), and 8.9 Mb of this sequence was found in the unmapped chromosome sequence. On the basis of the 20 mapped chromosomes, more than 712 distinct intrachromosomal segmental duplications, comprising 19.9 Mb of sequence (Figure <figr fid="F1">1</figr>), and 475 distinct interchromosomal duplications, comprising 7.1 Mb of sequence, were identified. We also found that 57% of the duplications were in tandem, which we defined as two related intrachromosomal duplicons located within 200 kb of one another.</p>
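         <p>The selection and classification rules described above (5 kb size cutoff, 90% identity cutoff, and the 200 kb tandem definition) can be sketched in Python as follows. This is an illustration under stated assumptions rather than the method actually implemented: the <it>Hit</it> record and the function names are hypothetical, the cutoffs are applied inclusively, the length is taken from the first copy, and the distance between duplicons is measured as the gap between the end of one copy and the start of the other.</p>
         <p>
```python
from dataclasses import dataclass

MIN_LENGTH_BP = 5_000        # size cutoff stated in the text (5 kb)
MIN_IDENTITY_PCT = 90.0      # identity cutoff stated in the text (90%)
TANDEM_MAX_GAP_BP = 200_000  # tandem definition stated in the text (200 kb)


@dataclass(frozen=True)
class Hit:
    """One pairwise alignment between two genomic intervals (hypothetical record)."""
    chrom_a: str
    start_a: int
    end_a: int
    chrom_b: str
    start_b: int
    end_b: int
    identity_pct: float

    @property
    def length(self) -> int:
        # Length of the first copy; using this copy is an assumption of the sketch.
        return self.end_a - self.start_a


def is_recent_duplication(hit: Hit) -> bool:
    """Apply the size and identity cutoffs (inclusive here, by assumption)."""
    return hit.length >= MIN_LENGTH_BP and hit.identity_pct >= MIN_IDENTITY_PCT


def classify(hit: Hit) -> str:
    """Label a duplicon pair as interchromosomal, intrachromosomal, or tandem."""
    if hit.chrom_a != hit.chrom_b:
        return "interchromosomal"
    # Gap between the two copies; how distance is measured is an assumption.
    gap = max(hit.start_a, hit.start_b) - min(hit.end_a, hit.end_b)
    if gap > TANDEM_MAX_GAP_BP:
        return "intrachromosomal"
    return "intrachromosomal-tandem"
```
         </p>
         <p>For example, a hypothetical 42 kb pair at 95% identity whose copies lie 57 kb apart on the same chromosome would pass both cutoffs and be labeled as tandem under these assumptions.</p>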
         <fig id="F1">
            <title>
               <p>Figure 1</p>
            </title>
            <caption>
               <p>Intrachromosomal segmental duplications identified in the mouse genome (chromosomes 1-X; results are based on the February 2003 assembly)</p>
            </caption>
            <text>
               <p>Intrachromosomal segmental duplications identified in the mouse genome (chromosomes 1-X; results are based on the February 2003 assembly). Each line represents a duplicated module and connects a paralogous duplicon pair. Red, 99-100% sequence identity; purple, 96-98%; green, 93-95%; and blue, 90-92%. Correspondences to chromosome ideograms (obtained from Ensembl) are only crude. Graphics were produced using GenomePixelizer <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>.</p>
            </text>
            <graphic file="gb-2003-4-8-r47-1"/>
         </fig>
         <tbl id="T1" hint_layout="double">
            <title>
               <p>Table 1</p>
            </title>
            <caption>
               <p>Recent segmental duplication in the mouse genome</p>
            </caption>
            <tblbdy cols="8">
               <r>
                  <c ca="left">
                     <p>Chromosome</p>
                  </c>
                  <c ca="right">
                     <p>Chromosome length</p>
                  </c>
                  <c ca="right">
                     <p>Intrachromosomal duplication</p>
                  </c>
                  <c ca="center">
                     <p>%</p>
                  </c>
                  <c ca="right">
                     <p>Interchromosomal duplication</p>
                  </c>
                  <c ca="center">
                     <p>%</p>
                  </c>
                  <c ca="right">
                     <p>Total</p>
                  </c>
                  <c ca="center">
                     <p>%</p>
                  </c>
               </r>
               <r>
                  <c cspan="8">
                     <hr/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>1</p>
                  </c>
                  <c ca="right">
                     <p>195,869,683</p>
                  </c>
                  <c ca="right">
                     <p>1,392,568</p>
                  </c>
                  <c ca="center">
                     <p>0.7</p>
                  </c>
                  <c ca="right">
                     <p>238,739</p>
                  </c>
                  <c ca="center">
                     <p>0.1</p>
                  </c>
                  <c ca="right">
                     <p>1,552,908</p>
                  </c>
                  <c ca="center">
                     <p>0.8</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>2</p>
                  </c>
                  <c ca="right">
                     <p>181,423,755</p>
                  </c>
                  <c ca="right">
                     <p>1,106,879</p>
                  </c>
                  <c ca="center">
                     <p>0.6</p>
                  </c>
                  <c ca="right">
                     <p>173,602</p>
                  </c>
                  <c ca="center">
                     <p>0.1</p>
                  </c>
                  <c ca="right">
                     <p>1,184,304</p>
                  </c>
                  <c ca="center">
                     <p>0.7</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>3</p>
                  </c>
                  <c ca="right">
                     <p>160,674,399</p>
                  </c>
                  <c ca="right">
                     <p>790,500</p>
                  </c>
                  <c ca="center">
                     <p>0.5</p>
                  </c>
                  <c ca="right">
                     <p>158,011</p>
                  </c>
                  <c ca="center">
                     <p>0.1</p>
                  </c>
                  <c ca="right">
                     <p>948,511</p>
                  </c>
                  <c ca="center">
                     <p>0.6</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>4</p>
                  </c>
                  <c ca="right">
                     <p>152,921,959</p>
                  </c>
                  <c ca="right">
                     <p>1,743,027</p>
                  </c>
                  <c ca="center">
                     <p>1.1</p>
                  </c>
                  <c ca="right">
                     <p>647,795</p>
                  </c>
                  <c ca="center">
                     <p>0.4</p>
                  </c>
                  <c ca="right">
                     <p>1,921,970</p>
                  </c>
                  <c ca="center">
                     <p>1.3</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>5</p>
                  </c>
                  <c ca="right">
                     <p>149,719,773</p>
                  </c>
                  <c ca="right">
                     <p>1,102,772</p>
                  </c>
                  <c ca="center">
                     <p>0.7</p>
                  </c>
                  <c ca="right">
                     <p>761,950</p>
                  </c>
                  <c ca="center">
                     <p>0.5</p>
                  </c>
                  <c ca="right">
                     <p>1,560,683</p>
                  </c>
                  <c ca="center">
                     <p>1.0</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>6</p>
                  </c>
                  <c ca="right">
                     <p>149,950,539</p>
                  </c>
                  <c ca="right">
                     <p>2,042,585</p>
                  </c>
                  <c ca="center">
                     <p>1.4</p>
                  </c>
                  <c ca="right">
                     <p>562,415</p>
                  </c>
                  <c ca="center">
                     <p>0.4</p>
                  </c>
                  <c ca="right">
                     <p>2,339,839</p>
                  </c>
                  <c ca="center">
                     <p>1.6</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>7</p>
                  </c>
                  <c ca="right">
                     <p>134,401,573</p>
                  </c>
                  <c ca="right">
                     <p>1,655,438</p>
                  </c>
                  <c ca="center">
                     <p>1.2</p>
                  </c>
                  <c ca="right">
                     <p>713,287</p>
                  </c>
                  <c ca="center">
                     <p>0.5</p>
                  </c>
                  <c ca="right">
                     <p>2,038,845</p>
                  </c>
                  <c ca="center">
                     <p>1.5</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>8</p>
                  </c>
                  <c ca="right">
                     <p>128,923,138</p>
                  </c>
                  <c ca="right">
                     <p>738,203</p>
                  </c>
                  <c ca="center">
                     <p>0.6</p>
                  </c>
                  <c ca="right">
                     <p>331,970</p>
                  </c>
                  <c ca="center">
                     <p>0.3</p>
                  </c>
                  <c ca="right">
                     <p>1,005,575</p>
                  </c>
                  <c ca="center">
                     <p>0.8</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>9</p>
                  </c>
                  <c ca="right">
                     <p>124,467,299</p>
                  </c>
                  <c ca="right">
                     <p>437,352</p>
                  </c>
                  <c ca="center">
                     <p>0.4</p>
                  </c>
                  <c ca="right">
                     <p>188,427</p>
                  </c>
                  <c ca="center">
                     <p>0.2</p>
                  </c>
                  <c ca="right">
                     <p>623,089</p>
                  </c>
                  <c ca="center">
                     <p>0.5</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>10</p>
                  </c>
                  <c ca="right">
                     <p>130,738,012</p>
                  </c>
                  <c ca="right">
                     <p>345,768</p>
                  </c>
                  <c ca="center">
                     <p>0.3</p>
                  </c>
                  <c ca="right">
                     <p>258,429</p>
                  </c>
                  <c ca="center">
                     <p>0.2</p>
                  </c>
                  <c ca="right">
                     <p>604,197</p>
                  </c>
                  <c ca="center">
                     <p>0.5</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>11</p>
                  </c>
                  <c ca="right">
                     <p>122,862,689</p>
                  </c>
                  <c ca="right">
                     <p>900,355</p>
                  </c>
                  <c ca="center">
                     <p>0.7</p>
                  </c>
                  <c ca="right">
                     <p>127,774</p>
                  </c>
                  <c ca="center">
                     <p>0.1</p>
                  </c>
                  <c ca="right">
                     <p>1,012,479</p>
                  </c>
                  <c ca="center">
                     <p>0.8</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>12</p>
                  </c>
                  <c ca="right">
                     <p>114,462,600</p>
                  </c>
                  <c ca="right">
                     <p>1,139,786</p>
                  </c>
                  <c ca="center">
                     <p>1.0</p>
                  </c>
                  <c ca="right">
                     <p>374,365</p>
                  </c>
                  <c ca="center">
                     <p>0.3</p>
                  </c>
                  <c ca="right">
                     <p>1,404,279</p>
                  </c>
                  <c ca="center">
                     <p>1.2</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>13</p>
                  </c>
                  <c ca="right">
                     <p>116,242,670</p>
                  </c>
                  <c ca="right">
                     <p>855,835</p>
                  </c>
                  <c ca="center">
                     <p>0.7</p>
                  </c>
                  <c ca="right">
                     <p>547,462</p>
                  </c>
                  <c ca="center">
                     <p>0.5</p>
                  </c>
                  <c ca="right">
                     <p>1,349,974</p>
                  </c>
                  <c ca="center">
                     <p>1.2</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>14</p>
                  </c>
                  <c ca="right">
                     <p>115,844,145</p>
                  </c>
                  <c ca="right">
                     <p>450,161</p>
                  </c>
                  <c ca="center">
                     <p>0.4</p>
                  </c>
                  <c ca="right">
                     <p>451,782</p>
                  </c>
                  <c ca="center">
                     <p>0.4</p>
                  </c>
                  <c ca="right">
                     <p>748,465</p>
                  </c>
                  <c ca="center">
                     <p>0.6</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>15</p>
                  </c>
                  <c ca="right">
                     <p>104,111,694</p>
                  </c>
                  <c ca="right">
                     <p>443,805</p>
                  </c>
                  <c ca="center">
                     <p>0.4</p>
                  </c>
                  <c ca="right">
                     <p>43,937</p>
                  </c>
                  <c ca="center">
                     <p>0.0</p>
                  </c>
                  <c ca="right">
                     <p>487,742</p>
                  </c>
                  <c ca="center">
                     <p>0.5</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>16</p>
                  </c>
                  <c ca="right">
                     <p>98,986,639</p>
                  </c>
                  <c ca="right">
                     <p>389,255</p>
                  </c>
                  <c ca="center">
                     <p>0.4</p>
                  </c>
                  <c ca="right">
                     <p>67,290</p>
                  </c>
                  <c ca="center">
                     <p>0.1</p>
                  </c>
                  <c ca="right">
                     <p>456,545</p>
                  </c>
                  <c ca="center">
                     <p>0.5</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>17</p>
                  </c>
                  <c ca="right">
                     <p>93,529,596</p>
                  </c>
                  <c ca="right">
                     <p>1,329,664</p>
                  </c>
                  <c ca="center">
                     <p>1.4</p>
                  </c>
                  <c ca="right">
                     <p>660,440</p>
                  </c>
                  <c ca="center">
                     <p>0.7</p>
                  </c>
                  <c ca="right">
                     <p>1,760,982</p>
                  </c>
                  <c ca="center">
                     <p>1.9</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>18</p>
                  </c>
                  <c ca="right">
                     <p>91,041,441</p>
                  </c>
                  <c ca="right">
                     <p>162,916</p>
                  </c>
                  <c ca="center">
                     <p>0.2</p>
                  </c>
                  <c ca="right">
                     <p>58,996</p>
                  </c>
                  <c ca="center">
                     <p>0.1</p>
                  </c>
                  <c ca="right">
                     <p>210,422</p>
                  </c>
                  <c ca="center">
                     <p>0.2</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>19</p>
                  </c>
                  <c ca="right">
                     <p>61,093,376</p>
                  </c>
                  <c ca="right">
                     <p>328,909</p>
                  </c>
                  <c ca="center">
                     <p>0.5</p>
                  </c>
                  <c ca="right">
                     <p>193,387</p>
                  </c>
                  <c ca="center">
                     <p>0.3</p>
                  </c>
                  <c ca="right">
                     <p>479,687</p>
                  </c>
                  <c ca="center">
                     <p>0.8</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>X</p>
                  </c>
                  <c ca="right">
                     <p>149,996,094</p>
                  </c>
                  <c ca="right">
                     <p>2,592,361</p>
                  </c>
                  <c ca="center">
                     <p>1.7</p>
                  </c>
                  <c ca="right">
                     <p>574,950</p>
                  </c>
                  <c ca="center">
                     <p>0.4</p>
                  </c>
                  <c ca="right">
                     <p>3,018,682</p>
                  </c>
                  <c ca="center">
                     <p>2.0</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>chrUn*</p>
                  </c>
                  <c ca="right">
                     <p>117,911,829</p>
                  </c>
                  <c ca="right">
                     <p>6,049,538</p>
                  </c>
                  <c ca="center">
                     <p>5.1</p>
                  </c>
                  <c ca="right">
                     <p>5,710,057</p>
                  </c>
                  <c ca="center">
                     <p>4.8</p>
                  </c>
                  <c ca="right">
                     <p>8,885,604</p>
                  </c>
                  <c ca="center">
                     <p>7.5</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Total</p>
                  </c>
                  <c ca="right">
                     <p>2,695,172,903</p>
                  </c>
                  <c ca="right">
                     <p>25,997,677</p>
                  </c>
                  <c ca="center">
                     <p>1.0</p>
                  </c>
                  <c ca="right">
                     <p>12,845,065</p>
                  </c>
                  <c ca="center">
                     <p>0.5</p>
                  </c>
                  <c ca="right">
                     <p>33,594,782</p>
                  </c>
                  <c ca="center">
                     <p>1.2</p>
                  </c>
               </r>
            </tblbdy>
            <tblfn>
               <p>The analysis is based on the February 2003 mouse genome assembly. *chrUn, unmapped chromosome sequence.</p>
            </tblfn>
         </tbl>
         <p>Duplications are found on all chromosomes analyzed, with chromosomes 6, 7, 17, and X having the highest, and chromosome 18 the lowest, duplicated content (Table <tblr tid="T1">1</tblr>, Figure <figr fid="F1">1</figr>). A substantial amount (8.9 Mb) of the duplicated content lies in the unmapped chromosome (chrUn) sequence, suggesting that the correct chromosomal assignment of these segments remains a major assembly challenge. It is possible that a small subset of these duplications results from chimeric reads and other sequencing artifacts and thus should not be part of the finished genome sequence. On the other hand, some of these unmapped duplicated sequences are likely to represent true duplications that have been excluded from the assembly. One example involves a member of the mouse Bcl2 family of apoptosis regulators, <it>Bcl2a1</it>. The <it>Bcl2a1</it> locus contains four highly similar genes (> 97% identical at the nucleotide level) that have been mapped together on chromosome 9 of the C57BL/6 and 129SV genomes <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr></abbrgrp>. Currently, the <it>Bcl2a1</it> genes are not assembled on the mapped chromosome and are instead found in three distinct unmapped contigs. In the human genome only one copy of <it>BCL2A1 </it>is found, although a recent, independent 8.5 kb tandem duplication containing the last exon of <it>BCL2A1 </it>has occurred, forming a novel <it>BCL2A1</it>-related transcript (AF249277). An example of a region that has changed between assemblies is the <it>Amy2 </it>locus. <it>Amy2 </it>is known to vary in copy number between inbred strains of mice <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>. In the February 2002 assembly, only one copy of the <it>Amy2 </it>gene resided on chromosome 3, with a second copy found on a large (10 kb) unmapped contig. In addition, partial high-identity matches (> 95%) to four distinct unmapped contigs were found; these partial copies were not detected in our analysis because they are less than 5 kb long. In the February 2003 assembly, six <it>Amy2 </it>genes exist, which is close to the five <it>Amy2</it>-like genes that were detected in the genome of strain A/J mice using quantitative densitometry of Southern blots <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>. It is important to note, however, that a gap not bridged by a clone still exists between the <it>Amy2 </it>locus and the <it>Amy1 </it>gene, so the copy number in the C57BL/6J genome assembly may still change.</p>
         <p>We analyzed the distribution of segmental duplication content by sorting the duplications into six sequence-similarity categories (90-92%, 92-94%, 94-96%, 96-98%, 98-99.5%, and 99.5-100%) for both the February 2002 and February 2003 assembly builds (Table <tblr tid="T2">2</tblr>). The duplication content is unevenly distributed across these categories, with a distinct rise in the 94-96% category. This might suggest that recent duplicative events in the mouse genome have not occurred at a steady rate. However, it is unclear at this point how these results are affected by the draft status of the genome assemblies. Between the 2002 and 2003 assembly builds, the amount of duplication content is similar within each percent-identity category except for the 99.5-100% category, which contained 4.8 Mb of sequence in 2002 and 18.5 Mb in 2003 (Table <tblr tid="T2">2</tblr>). Furthermore, we determined that the majority (88%) of the duplicated sequence in the 99.5-100% category occurred intrachromosomally, with the paired copies lying within 200 kb of each other. Using the assembly component tables (provided by UCSC <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>), which describe the underlying makeup of the February 2003 genome assembly (shotgun-assembled scaffolds and BAC sequences), we found that 215/216 (99.5%) of these duplications involved a BAC sequence. Hence, we suspect that the large increase in near-identical duplications could be the result of sequence misassignment errors arising from the inherent difficulty of merging finished BAC sequence with shotgun sequence contigs.</p>
         <tbl id="T2" hint_layout="single">
            <title>
               <p>Table 2</p>
            </title>
            <caption>
               <p>Comparison between genome assemblies</p>
            </caption>
            <tblbdy cols="3">
               <r>
                  <c ca="left">
                     <p>Sequence identity level</p>
                  </c>
                  <c ca="right">
                     <p>February 2002 assembly*</p>
                  </c>
                  <c ca="right">
                     <p>February 2003 assembly</p>
                  </c>
               </r>
               <r>
                  <c cspan="3">
                     <hr/>
                  </c>
               </r>
               <r>
                  <c cspan="3" ca="left">
                     <p>Duplication content (bp)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>90-92%</p>
                  </c>
                  <c ca="right">
                     <p>4,966,470</p>
                  </c>
                  <c ca="right">
                     <p>3,543,429</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>92-94%</p>
                  </c>
                  <c ca="right">
                     <p>15,685,840</p>
                  </c>
                  <c ca="right">
                     <p>13,981,642</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>94-96%</p>
                  </c>
                  <c ca="right">
                     <p>17,533,730</p>
                  </c>
                  <c ca="right">
                     <p>17,970,287</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>96-98%</p>
                  </c>
                  <c ca="right">
                     <p>11,539,392</p>
                  </c>
                  <c ca="right">
                     <p>11,731,958</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>98-99.5%</p>
                  </c>
                  <c ca="right">
                     <p>5,865,024</p>
                  </c>
                  <c ca="right">
                     <p>5,487,899</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
               </r>
               <r>
                  <c cspan="3" ca="left">
                     <p><sup>&#8224;</sup>Potential sequence misassignment errors detected (bp)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>99.5-100%</p>
                  </c>
                  <c ca="right">
                     <p>4,832,594</p>
                  </c>
                  <c ca="right">
                     <p>18,456,096</p>
                  </c>
               </r>
            </tblbdy>
            <tblfn>
               <p>Duplication content by sequence identity, and potential sequence misassignment errors, compared between the February 2002 (MGSCv3) and February 2003 (a hybrid assembly of MGSCv3 with 705 Mb finished BAC sequence) genome assemblies. *Analysis of the duplication content for the February 2002 assembly can be found at <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>. <sup>&#8224;</sup>Duplications with extremely high percent identity (99.5-100%) are likely to be genome assembly artifacts and were not included in the duplication content shown in Table <tblr tid="T1">1</tblr>.</p>
            </tblfn>
         </tbl>
         <p>We previously observed that the human genome sequence assembled by Celera's WGS method <abbrgrp><abbr bid="B22">22</abbr></abbrgrp> showed poor quality in regions with near-identical segmental duplications <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>. To assess the finishing status of duplicated regions in the WGS mouse genome assembly (February 2002 MGSCv3 assembly), we calculated the amount of unfinished sequence (regions with gaps or Ns) within the immediate neighborhood (20 kb) of each duplicon; the unmapped chromosome sequence was excluded from this analysis. We observed substantially higher amounts of unfinished sequence (number of Ns) in these regions. Whereas 8.0% of the assembly consists of Ns, regions harboring duplications contain an average of 12.2% Ns. This average rises to 16.6% for duplications with more than 98% sequence identity (statistics can be obtained from our website <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>). This suggests that the WGS assembler had difficulty assembling regions containing recent sequence duplications and that these regions are good candidates for finishing using clone resources.</p>
         <p>Using the NCBI RefSeq and Ensembl mouse gene annotations, we identified 675 genes that mapped to duplicated regions of the mouse genome (a full list of genes can be obtained from our website <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>); 414 of these genes were fully contained within a segmental duplication and thus represent the best candidates for whole-gene duplication. While it is likely that some of these duplicate copies have become pseudogenes, others may have evolved specialized functions <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>. Moreover, we sought to use the identified gene sequences, which were expressed sequence tags (ESTs) and/or cDNAs, as experimentally derived resources to help validate the genomic duplication content presented in this study. We aligned duplicated gene sequences to each genomic region using UCSC BLAT <abbrgrp><abbr bid="B17">17</abbr></abbrgrp> and determined their percent identity matches. Unambiguous gene-to-genomic identity matches were established for all 128 gene pairs we examined. Each gene sequence was mapped to its respective genomic region with at least 99% identity (examples are shown in Table <tblr tid="T3">3</tblr>; a full table is available at <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>). We also examined the identified duplicated genes using the InterPro protein-domain classifications present in 608 Ensembl genes to see whether specific kinds of genes or protein domains have been preferentially duplicated. We found that genes containing protein domains related to signal transduction (rhodopsin-like G-protein-coupled receptor superfamily), olfaction (olfactory receptors, vomeronasal receptors), immunity (immunoglobulin/MHC, serine protease), and drug metabolism (cytochrome P450) are significantly enriched (by at least threefold) (Table <tblr tid="T4">4</tblr>).</p>
         <tbl id="T3" hint_layout="double">
            <title>
               <p>Table 3</p>
            </title>
            <caption>
               <p>Examples of recent mouse gene duplications</p>
            </caption>
            <tblbdy cols="9">
               <r>
                  <c ca="left">
                     <p>Locus1*</p>
                  </c>
                  <c ca="left">
                     <p>Gene</p>
                  </c>
                  <c ca="left">
                     <p>Percent identity<sup>&#8224;</sup></p>
                  </c>
                  <c ca="left">
                     <p>Annotation</p>
                  </c>
                  <c ca="left">
                     <p>Locus2*</p>
                  </c>
                  <c ca="left">
                     <p>Gene</p>
                  </c>
                  <c ca="left">
                     <p>Percent identity<sup>&#8224;</sup></p>
                  </c>
                  <c ca="left">
                     <p>Annotation</p>
                  </c>
                  <c ca="left">
                     <p>Duplication % identity<sup>&#8225;</sup></p>
                  </c>
               </r>
               <r>
                  <c cspan="9">
                     <hr/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>1 F</p>
                  </c>
                  <c ca="left">
                     <p>NM_009888</p>
                  </c>
                  <c ca="left">
                     <p>99.6</p>
                  </c>
                  <c ca="left">
                     <p>Cfh (Complement component factor h)</p>
                  </c>
                  <c ca="left">
                     <p>1 F</p>
                  </c>
                  <c ca="left">
                     <p>M29010</p>
                  </c>
                  <c ca="left">
                     <p>99.0</p>
                  </c>
                  <c ca="left">
                     <p>Complement factor H-related protein mRNA</p>
                  </c>
                  <c ca="left">
                     <p>97.1</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>3 G1</p>
                  </c>
                  <c ca="left">
                     <p>NM_009669</p>
                  </c>
                  <c ca="left">
                     <p>100</p>
                  </c>
                  <c ca="left">
                     <p>Amy2 (Amylase 2, pancreatic)</p>
                  </c>
                  <c ca="left">
                     <p>3 G1</p>
                  </c>
                  <c ca="left">
                     <p>M11896</p>
                  </c>
                  <c ca="left">
                     <p>99.6</p>
                  </c>
                  <c ca="left">
                     <p>Pancreatic amylase B-1</p>
                  </c>
                  <c ca="left">
                     <p>97.6</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>5 E2</p>
                  </c>
                  <c ca="left">
                     <p>NM_053184</p>
                  </c>
                  <c ca="left">
                     <p>99.9</p>
                  </c>
                  <c ca="left">
                     <p>Ugt2a1 (UDP glycosyltransferase 2 A1)</p>
                  </c>
                  <c ca="left">
                     <p>5 E2</p>
                  </c>
                  <c ca="left">
                     <p>BF144793</p>
                  </c>
                  <c ca="left">
                     <p>99.6</p>
                  </c>
                  <c ca="left">
                     <p>cDNA clone IMAGE:4021939</p>
                  </c>
                  <c ca="left">
                     <p>95.7</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>5 E2</p>
                  </c>
                  <c ca="left">
                     <p>NM_009467</p>
                  </c>
                  <c ca="left">
                     <p>100</p>
                  </c>
                  <c ca="left">
                     <p>Ugt2b5 (UDP-glucuronosyl-transferase 2b5)</p>
                  </c>
                  <c ca="left">
                     <p>5 E2</p>
                  </c>
                  <c ca="left">
                     <p>NM_053215</p>
                  </c>
                  <c ca="left">
                     <p>100</p>
                  </c>
                  <c ca="left">
                     <p>RIKEN cDNA 0610033E06 gene</p>
                  </c>
                  <c ca="left">
                     <p>93.3</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>5 E4</p>
                  </c>
                  <c ca="left">
                     <p>NM_008620</p>
                  </c>
                  <c ca="left">
                     <p>99.9</p>
                  </c>
                  <c ca="left">
                     <p>Mpa2 (macrophage activation 2)</p>
                  </c>
                  <c ca="left">
                     <p>5 E4</p>
                  </c>
                  <c ca="left">
                     <p>BC007143</p>
                  </c>
                  <c ca="left">
                     <p>99.5</p>
                  </c>
                  <c ca="left">
                     <p>Similar to macrophage activation 2</p>
                  </c>
                  <c ca="left">
                     <p>90.6</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>5 G1</p>
                  </c>
                  <c ca="left">
                     <p>NM_029693</p>
                  </c>
                  <c ca="left">
                     <p>100</p>
                  </c>
                  <c ca="left">
                     <p>RIKEN cDNA 1700123K08</p>
                  </c>
                  <c ca="left">
                     <p>7 B2</p>
                  </c>
                  <c ca="left">
                     <p>NM_027702</p>
                  </c>
                  <c ca="left">
                     <p>100</p>
                  </c>
                  <c ca="left">
                     <p>RIKEN cDNA 4933421I07 gene</p>
                  </c>
                  <c ca="left">
                     <p>91.1</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>6 C1</p>
                  </c>
                  <c ca="left">
                     <p>NM_053238</p>
                  </c>
                  <c ca="left">
                     <p>100</p>
                  </c>
                  <c ca="left">
                     <p>V1rc8 (Vomeronasal 1 receptor, C8)</p>
                  </c>
                  <c ca="left">
                     <p>6 C1</p>
                  </c>
                  <c ca="left">
                     <p>NM_053239</p>
                  </c>
                  <c ca="left">
                     <p>99.7</p>
                  </c>
                  <c ca="left">
                     <p>V1rc9 (Vomeronasal 1 receptor, C9)</p>
                  </c>
                  <c ca="left">
                     <p>95.1</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>6 D1</p>
                  </c>
                  <c ca="left">
                     <p>NM_011467</p>
                  </c>
                  <c ca="left">
                     <p>99.9</p>
                  </c>
                  <c ca="left">
                     <p>Spr (sepiapterin reductase)</p>
                  </c>
                  <c ca="left">
                     <p>6 D1</p>
                  </c>
                  <c ca="left">
                     <p>BE862957</p>
                  </c>
                  <c ca="left">
                     <p>99.5</p>
                  </c>
                  <c ca="left">
                     <p>EST sequence</p>
                  </c>
                  <c ca="left">
                     <p>95.8</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>6 F1</p>
                  </c>
                  <c ca="left">
                     <p>AI505330</p>
                  </c>
                  <c ca="left">
                     <p>100</p>
                  </c>
                  <c ca="left">
                     <p>Similar to initiation factor eIF-4AI</p>
                  </c>
                  <c ca="left">
                     <p>6 F1</p>
                  </c>
                  <c ca="left">
                     <p>AI503670</p>
                  </c>
                  <c ca="left">
                     <p>99.8</p>
                  </c>
                  <c ca="left">
                     <p>Similar to initiation factor eIF-4AI</p>
                  </c>
                  <c ca="left">
                     <p>98.9</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>6 F2</p>
                  </c>
                  <c ca="left">
                     <p>NM_008646</p>
                  </c>
                  <c ca="left">
                     <p>99.9</p>
                  </c>
                  <c ca="left">
                     <p>Mug2 (Murinoglobulin 2)</p>
                  </c>
                  <c ca="left">
                     <p>6 F2</p>
                  </c>
                  <c ca="left">
                     <p>NM_008645</p>
                  </c>
                  <c ca="left">
                     <p>99.9</p>
                  </c>
                  <c ca="left">
                     <p>Mug1 (Murinoglobulin 1)</p>
                  </c>
                  <c ca="left">
                     <p>94.6</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>6 F3</p>
                  </c>
                  <c ca="left">
                     <p>NM_020257</p>
                  </c>
                  <c ca="left">
                     <p>99.8</p>
                  </c>
                  <c ca="left">
                     <p>Dcl1 (c-type lectin 1)</p>
                  </c>
                  <c ca="left">
                     <p>6 F3</p>
                  </c>
                  <c ca="left">
                     <p>NM_027562</p>
                  </c>
                  <c ca="left">
                     <p>99.9</p>
                  </c>
                  <c ca="left">
                     <p>4632413B12Rik (C-lectin related protein)</p>
                  </c>
                  <c ca="left">
                     <p>90.8</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>6 F3</p>
                  </c>
                  <c ca="left">
                     <p>NM_008463</p>
                  </c>
                  <c ca="left">
                     <p>99.7</p>
                  </c>
                  <c ca="left">
                     <p>Klra5 (Killer cell lectin-like receptor, A5)</p>
                  </c>
                  <c ca="left">
                     <p>6 F3</p>
                  </c>
                  <c ca="left">
                     <p>NM_008464</p>
                  </c>
                  <c ca="left">
                     <p>99.5</p>
                  </c>
                  <c ca="left">
                     <p>Klra6 (Killer cell lectin-like receptor, A6)</p>
                  </c>
                  <c ca="left">
                     <p>90.0</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>6 F3</p>
                  </c>
                  <c ca="left">
                     <p>NM_010649</p>
                  </c>
                  <c ca="left">
                     <p>99.8</p>
                  </c>
                  <c ca="left">
                     <p>Klra4 (Killer cell lectin-like receptor A4)</p>
                  </c>
                  <c ca="left">
                     <p>6 F3</p>
                  </c>
                  <c ca="left">
                     <p>NM_016659</p>
                  </c>
                  <c ca="left">
                     <p>99.8</p>
                  </c>
                  <c ca="left">
                     <p>Klra1 (Killer cell lectin-like receptor A1)</p>
                  </c>
                  <c ca="left">
                     <p>91.4</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>6 F3</p>
                  </c>
                  <c ca="left">
                     <p>NM_010737</p>
                  </c>
                  <c ca="left">
                     <p>99.8</p>
                  </c>
                  <c ca="left">
                     <p>Klrb1b (Killer cell lectin-like receptor 1b)</p>
                  </c>
                  <c ca="left">
                     <p>6 F3</p>
                  </c>
                  <c ca="left">
                     <p>NM_008527</p>
                  </c>
                  <c ca="left">
                     <p>99.9</p>
                  </c>
                  <c ca="left">
                     <p>Klrb1c (Killer cell lectin-like receptor 1c)</p>
                  </c>
                  <c ca="left">
                     <p>90.8</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>7 A2</p>
                  </c>
                  <c ca="left">
                     <p>NM_011860</p>
                  </c>
                  <c ca="left">
                     <p>100</p>
                  </c>
                  <c ca="left">
                     <p>Mater (Maternal effect gene)</p>
                  </c>
                  <c ca="left">
                     <p>7 A1</p>
                  </c>
                  <c ca="left">
                     <p>AK016782</p>
                  </c>
                  <c ca="left">
                     <p>100</p>
                  </c>
                  <c ca="left">
                     <p>Similar to Mater protein</p>
                  </c>
                  <c ca="left">
                     <p>96.6</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>7 B1</p>
                  </c>
                  <c ca="left">
                     <p>NM_032541</p>
                  </c>
                  <c ca="left">
                     <p>100</p>
                  </c>
                  <c ca="left">
                     <p>Hamp (hepcidin antimicrobial peptide)</p>
                  </c>
                  <c ca="left">
                     <p>7 B1</p>
                  </c>
                  <c ca="left">
                     <p>AK007975</p>
                  </c>
                  <c ca="left">
                     <p>99.8</p>
                  </c>
                  <c ca="left">
                     <p>Prohepcidin homolog</p>
                  </c>
                  <c ca="left">
                     <p>92.8</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>7 B2</p>
                  </c>
                  <c ca="left">
                     <p>NM_010115</p>
                  </c>
                  <c ca="left">
                     <p>99</p>
                  </c>
                  <c ca="left">
                     <p>Klk13 (Kallikrein 13)</p>
                  </c>
                  <c ca="left">
                     <p>7 B2</p>
                  </c>
                  <c ca="left">
                     <p>NM_008454</p>
                  </c>
                  <c ca="left">
                     <p>99.9</p>
                  </c>
                  <c ca="left">
                     <p>Klk16 (Kallikrein 16)</p>
                  </c>
                  <c ca="left">
                     <p>92.2</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>8 D1</p>
                  </c>
                  <c ca="left">
                     <p>L11333</p>
                  </c>
                  <c ca="left">
                     <p>99.9</p>
                  </c>
                  <c ca="left">
                     <p>Carboxylesterase</p>
                  </c>
                  <c ca="left">
                     <p>8 D1</p>
                  </c>
                  <c ca="left">
                     <p>NM_144511</p>
                  </c>
                  <c ca="left">
                     <p>100</p>
                  </c>
                  <c ca="left">
                     <p>Es31</p>
                  </c>
                  <c ca="left">
                     <p>95.2</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>9 F4</p>
                  </c>
                  <c ca="left">
                     <p>NM_130864</p>
                  </c>
                  <c ca="left">
                     <p>99.6</p>
                  </c>
                  <c ca="left">
                     <p>Acaa (acetyl-Coenzyme A acyltransferase)</p>
                  </c>
                  <c ca="left">
                     <p>9 F4</p>
                  </c>
                  <c ca="left">
                     <p>BC019882</p>
                  </c>
                  <c ca="left">
                     <p>100</p>
                  </c>
                  <c ca="left">
                     <p>Similar to acetyl-CoA acyltransferase</p>
                  </c>
                  <c ca="left">
                     <p>96.6</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>10 B3</p>
                  </c>
                  <c ca="left">
                     <p>NM_013532</p>
                  </c>
                  <c ca="left">
                     <p>99.9</p>
                  </c>
                  <c ca="left">
                     <p>Gp49a (Glycoprotein 49A)</p>
                  </c>
                  <c ca="left">
                     <p>10 B3</p>
                  </c>
                  <c ca="left">
                     <p>NM_008147</p>
                  </c>
                  <c ca="left">
                     <p>100</p>
                  </c>
                  <c ca="left">
                     <p>Gp49b (glycoprotein 49B)</p>
                  </c>
                  <c ca="left">
                     <p>96.7</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>10 D2</p>
                  </c>
                  <c ca="left">
                     <p>NM_017372</p>
                  </c>
                  <c ca="left">
                     <p>100</p>
                  </c>
                  <c ca="left">
                     <p>Lyzs (Lysozyme)</p>
                  </c>
                  <c ca="left">
                     <p>10 D2</p>
                  </c>
                  <c ca="left">
                     <p>NM_013590</p>
                  </c>
                  <c ca="left">
                     <p>99.8</p>
                  </c>
                  <c ca="left">
                     <p>Lzp-s (P lysozyme structural)</p>
                  </c>
                  <c ca="left">
                     <p>95.3</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>11 A3.2</p>
                  </c>
                  <c ca="left">
                     <p>NM_172792</p>
                  </c>
                  <c ca="left">
                     <p>100</p>
                  </c>
                  <c ca="left">
                     <p>hypothetical protein 4932414J04</p>
                  </c>
                  <c ca="left">
                     <p>17 D</p>
                  </c>
                  <c ca="left">
                     <p>AK03001</p>
                  </c>
                  <c ca="left">
                     <p>100</p>
                  </c>
                  <c ca="left">
                     <p>Tyrosine protein kinase/cysteine-rich region</p>
                  </c>
                  <c ca="left">
                     <p>94.0</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>11 B1.3</p>
                  </c>
                  <c ca="left">
                     <p>NM_011396</p>
                  </c>
                  <c ca="left">
                     <p>99.9</p>
                  </c>
                  <c ca="left">
                     <p>Slc22a5 (Solute carrier family 22)</p>
                  </c>
                  <c ca="left">
                     <p>11 B1.3</p>
                  </c>
                  <c ca="left">
                     <p>NM_019723</p>
                  </c>
                  <c ca="left">
                     <p>100</p>
                  </c>
                  <c ca="left">
                     <p>Slc22a9 (solute carrier family 22)</p>
                  </c>
                  <c ca="left">
                     <p>91.2</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>11 D</p>
                  </c>
                  <c ca="left">
                     <p>NM_021347</p>
                  </c>
                  <c ca="left">
                     <p>100</p>
                  </c>
                  <c ca="left">
                     <p>Gsdm (Gasdermin)</p>
                  </c>
                  <c ca="left">
                     <p>11 D</p>
                  </c>
                  <c ca="left">
                     <p>NM_029727</p>
                  </c>
                  <c ca="left">
                     <p>99.9</p>
                  </c>
                  <c ca="left">
                     <p>2200001G21Rik</p>
                  </c>
                  <c ca="left">
                     <p>94.2</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>12 F1</p>
                  </c>
                  <c ca="left">
                     <p>BC002065</p>
                  </c>
                  <c ca="left">
                     <p>99.6</p>
                  </c>
                  <c ca="left">
                     <p>Serine protease inhibitor 2-1</p>
                  </c>
                  <c ca="left">
                     <p>12 F1</p>
                  </c>
                  <c ca="left">
                     <p>BY761363</p>
                  </c>
                  <c ca="left">
                     <p>99.9</p>
                  </c>
                  <c ca="left">
                     <p>EST sequence</p>
                  </c>
                  <c ca="left">
                     <p>92.2</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>12 F1</p>
                  </c>
                  <c ca="left">
                     <p>NM_013772</p>
                  </c>
                  <c ca="left">
                     <p>100</p>
                  </c>
                  <c ca="left">
                     <p>Tcl1b3 (T-cell leukemia/lymphoma 1B, 3)</p>
                  </c>
                  <c ca="left">
                     <p>12 F1</p>
                  </c>
                  <c ca="left">
                     <p>NM_013776</p>
                  </c>
                  <c ca="left">
                     <p>100</p>
                  </c>
                  <c ca="left">
                     <p>Tcl1b5 (T-cell leukemia/lymphoma 1B, 5)</p>
                  </c>
                  <c ca="left">
                     <p>95.3</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>13 A1</p>
                  </c>
                  <c ca="left">
                     <p>NM_013778</p>
                  </c>
                  <c ca="left">
                     <p>99.5</p>
                  </c>
                  <c ca="left">
                     <p>Akr1c13 (Aldo-keto reductase 1, C13)</p>
                  </c>
                  <c ca="left">
                     <p>13 A1</p>
                  </c>
                  <c ca="left">
                     <p>NM_013777</p>
                  </c>
                  <c ca="left">
                     <p>99.5</p>
                  </c>
                  <c ca="left">
                     <p>Akr1c12 (Aldo-keto reductase 1, C12)</p>
                  </c>
                  <c ca="left">
                     <p>96.1</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>13 A3</p>
                  </c>
                  <c ca="left">
                     <p>NM_008864</p>
                  </c>
                  <c ca="left">
                     <p>99.2</p>
                  </c>
                  <c ca="left">
                     <p>Csh1 (chorionic somatomammotrophin 1)</p>
                  </c>
                  <c ca="left">
                     <p>13 A3.3</p>
                  </c>
                  <c ca="left">
                     <p>AK082929</p>
                  </c>
                  <c ca="left">
                     <p>100</p>
                  </c>
                  <c ca="left">
                     <p>Similar to placental lactogen 1</p>
                  </c>
                  <c ca="left">
                     <p>98.8</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>13 A4</p>
                  </c>
                  <c ca="left">
                     <p>NM_011456</p>
                  </c>
                  <c ca="left">
                     <p>100</p>
                  </c>
                  <c ca="left">
                     <p>Spi14 (Serine Protease Inhibitor 14)</p>
                  </c>
                  <c ca="left">
                     <p>13 A4</p>
                  </c>
                  <c ca="left">
                     <p>NM_011455</p>
                  </c>
                  <c ca="left">
                     <p>100</p>
                  </c>
                  <c ca="left">
                     <p>Spi13 (serine protease inhibitor 13)</p>
                  </c>
                  <c ca="left">
                     <p>95.2</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>13 D1</p>
                  </c>
                  <c ca="left">
                     <p>NM_010872</p>
                  </c>
                  <c ca="left">
                     <p>100</p>
                  </c>
                  <c ca="left">
                     <p>Birc1b (Neuronal apoptosis inhibitory 2)</p>
                  </c>
                  <c ca="left">
                     <p>13 D1</p>
                  </c>
                  <c ca="left">
                     <p>NM_008670</p>
                  </c>
                  <c ca="left">
                     <p>99.9</p>
                  </c>
                  <c ca="left">
                     <p>Birc1a (Neuronal apoptosis inhibitory 1)</p>
                  </c>
                  <c ca="left">
                     <p>90.0</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>14 C1</p>
                  </c>
                  <c ca="left">
                     <p>NM_010373</p>
                  </c>
                  <c ca="left">
                     <p>99.7</p>
                  </c>
                  <c ca="left">
                     <p>Gzme (Granzyme E)</p>
                  </c>
                  <c ca="left">
                     <p>14 C1</p>
                  </c>
                  <c ca="left">
                     <p>NM_010372</p>
                  </c>
                  <c ca="left">
                     <p>99.9</p>
                  </c>
                  <c ca="left">
                     <p>Gzmd (Granzyme D)</p>
                  </c>
                  <c ca="left">
                     <p>94.7</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>14 C2</p>
                  </c>
                  <c ca="left">
                     <p>NM_172603</p>
                  </c>
                  <c ca="left">
                     <p>100</p>
                  </c>
                  <c ca="left">
                     <p>4933417L10Rik</p>
                  </c>
                  <c ca="left">
                     <p>14 C3</p>
                  </c>
                  <c ca="left">
                     <p>BE381578</p>
                  </c>
                  <c ca="left">
                     <p>100</p>
                  </c>
                  <c ca="left">
                     <p>EST sequence</p>
                  </c>
                  <c ca="left">
                     <p>94.0</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>15 E2</p>
                  </c>
                  <c ca="left">
                     <p>NM_007781</p>
                  </c>
                  <c ca="left">
                     <p>100</p>
                  </c>
                  <c ca="left">
                     <p>Csf2rb2 (Colony stimulating factor 2, &#946;-2)</p>
                  </c>
                  <c ca="left">
                     <p>15 E2</p>
                  </c>
                  <c ca="left">
                     <p>NM_007780</p>
                  </c>
                  <c ca="left">
                     <p>99.8</p>
                  </c>
                  <c ca="left">
                     <p>Csf2rb1 (Colony stimulating factor 2, &#946;-1)</p>
                  </c>
                  <c ca="left">
                     <p>95.0</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>15 E2</p>
                  </c>
                  <c ca="left">
                     <p>NM_010005</p>
                  </c>
                  <c ca="left">
                     <p>100</p>
                  </c>
                  <c ca="left">
                     <p>Cyp2d10 (Cytochrome P450, 2d10)</p>
                  </c>
                  <c ca="left">
                     <p>15 E2</p>
                  </c>
                  <c ca="left">
                     <p>NM_010006</p>
                  </c>
                  <c ca="left">
                     <p>100</p>
                  </c>
                  <c ca="left">
                     <p>Cyp2d9 (cytochrome P450, 2d9)</p>
                  </c>
                  <c ca="left">
                     <p>92.1</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>16 B1</p>
                  </c>
                  <c ca="left">
                     <p>NM_023125</p>
                  </c>
                  <c ca="left">
                     <p>100</p>
                  </c>
                  <c ca="left">
                     <p>Kng (Kininogen)</p>
                  </c>
                  <c ca="left">
                     <p>16 B1</p>
                  </c>
                  <c ca="left">
                     <p>BI330914</p>
                  </c>
                  <c ca="left">
                     <p>99.1</p>
                  </c>
                  <c ca="left">
                     <p>EST sequence</p>
                  </c>
                  <c ca="left">
                     <p>90.0</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>16 B3</p>
                  </c>
                  <c ca="left">
                     <p>M92418</p>
                  </c>
                  <c ca="left">
                     <p>99.8</p>
                  </c>
                  <c ca="left">
                     <p>MS2 (Cysteine proteinase inhibitor)</p>
                  </c>
                  <c ca="left">
                     <p>16 B3</p>
                  </c>
                  <c ca="left">
                     <p>BB654253</p>
                  </c>
                  <c ca="left">
                     <p>100</p>
                  </c>
                  <c ca="left">
                     <p>EST sequence</p>
                  </c>
                  <c ca="left">
                     <p>95.2</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>17 B2</p>
                  </c>
                  <c ca="left">
                     <p>NM_009780</p>
                  </c>
                  <c ca="left">
                     <p>99.9</p>
                  </c>
                  <c ca="left">
                     <p>C4 (Complement component 4)</p>
                  </c>
                  <c ca="left">
                     <p>17 B2</p>
                  </c>
                  <c ca="left">
                     <p>M21576</p>
                  </c>
                  <c ca="left">
                     <p>99.5</p>
                  </c>
                  <c ca="left">
                     <p>Slp (MHC sex-limited protein)</p>
                  </c>
                  <c ca="left">
                     <p>96.3</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>X A2</p>
                  </c>
                  <c ca="left">
                     <p>NM_008955</p>
                  </c>
                  <c ca="left">
                     <p>100</p>
                  </c>
                  <c ca="left">
                     <p>Psx1 (Placenta specific homeobox 1)</p>
                  </c>
                  <c ca="left">
                     <p>X A2</p>
                  </c>
                  <c ca="left">
                     <p>NM_023894</p>
                  </c>
                  <c ca="left">
                     <p>100</p>
                  </c>
                  <c ca="left">
                     <p>Homeobox protein GPBOX</p>
                  </c>
                  <c ca="left">
                     <p>91.6</p>
                  </c>
               </r>
            </tblbdy>
            <tblfn>
               <p>*Locations of duplicons by mouse chromosome band; loci 1 and 2 represent a duplication pair. <sup>&#8224;</sup>Percent identity of the alignment between gene and genomic sequences, showing correct matches. <sup>&#8225; </sup>% similarity: average DNA percent identity between the paralogous gene/transcript sequences in loci 1 and 2 (the duplicated pair).</p>
            </tblfn>
         </tbl>
         <tbl id="T4" hint_layout="double">
            <title>
               <p>Table 4</p>
            </title>
            <caption>
               <p>Protein domain enrichment found in recently duplicated mouse genes*</p>
            </caption>
            <tblbdy cols="5">
               <r>
                  <c ca="left">
                     <p>InterPro entry ID</p>
                  </c>
                  <c ca="left">
                     <p>Protein domain description</p>
                  </c>
                  <c ca="center">
                     <p>Number found in 608 duplicated genes</p>
                  </c>
                  <c ca="center">
                     <p>Number found in all 16,515 annotated genes in genome</p>
                  </c>
                  <c ca="center">
                     <p>Enrichment<sup>&#8224;</sup></p>
                  </c>
               </r>
               <r>
                  <c cspan="5">
                     <hr/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>IPR000276</p>
                  </c>
                  <c ca="left">
                     <p>Rhodopsin-like GPCR superfamily</p>
                  </c>
                  <c ca="center">
                     <p>135</p>
                  </c>
                  <c ca="center">
                     <p>1229</p>
                  </c>
                  <c ca="center">
                     <p>3.0</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>IPR000725</p>
                  </c>
                  <c ca="left">
                     <p>Olfactory receptor</p>
                  </c>
                  <c ca="center">
                     <p>103</p>
                  </c>
                  <c ca="center">
                     <p>861</p>
                  </c>
                  <c ca="center">
                     <p>3.3</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>IPR003006</p>
                  </c>
                  <c ca="left">
                     <p>Immunoglobulin/major histocompatibility complex</p>
                  </c>
                  <c ca="center">
                     <p>46</p>
                  </c>
                  <c ca="center">
                     <p>372</p>
                  </c>
                  <c ca="center">
                     <p>3.4</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>IPR004072</p>
                  </c>
                  <c ca="left">
                     <p>Vomeronasal receptor, type 1</p>
                  </c>
                  <c ca="center">
                     <p>31</p>
                  </c>
                  <c ca="center">
                     <p>108</p>
                  </c>
                  <c ca="center">
                     <p>7.8</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>IPR001909</p>
                  </c>
                  <c ca="left">
                     <p>KRAB box</p>
                  </c>
                  <c ca="center">
                     <p>23</p>
                  </c>
                  <c ca="center">
                     <p>103</p>
                  </c>
                  <c ca="center">
                     <p>6.1</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>IPR001254</p>
                  </c>
                  <c ca="left">
                     <p>Serine protease, trypsin family</p>
                  </c>
                  <c ca="center">
                     <p>21</p>
                  </c>
                  <c ca="center">
                     <p>117</p>
                  </c>
                  <c ca="center">
                     <p>4.9</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>IPR002401</p>
                  </c>
                  <c ca="left">
                     <p>E-class P450, group I</p>
                  </c>
                  <c ca="center">
                     <p>20</p>
                  </c>
                  <c ca="center">
                     <p>61</p>
                  </c>
                  <c ca="center">
                     <p>8.9</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>IPR001128</p>
                  </c>
                  <c ca="left">
                     <p>Cytochrome P450</p>
                  </c>
                  <c ca="center">
                     <p>20</p>
                  </c>
                  <c ca="center">
                     <p>68</p>
                  </c>
                  <c ca="center">
                     <p>8.0</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>IPR007086</p>
                  </c>
                  <c ca="left">
                     <p>Zn-finger, C2H2 subtype</p>
                  </c>
                  <c ca="center">
                     <p>20</p>
                  </c>
                  <c ca="center">
                     <p>139</p>
                  </c>
                  <c ca="center">
                     <p>3.9</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>IPR001314</p>
                  </c>
                  <c ca="left">
                     <p>Chymotrypsin serine protease, family S1</p>
                  </c>
                  <c ca="center">
                     <p>19</p>
                  </c>
                  <c ca="center">
                     <p>108</p>
                  </c>
                  <c ca="center">
                     <p>4.8</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>IPR002403</p>
                  </c>
                  <c ca="left">
                     <p>E-class P450, group IV</p>
                  </c>
                  <c ca="center">
                     <p>17</p>
                  </c>
                  <c ca="center">
                     <p>56</p>
                  </c>
                  <c ca="center">
                     <p>8.2</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>IPR002397</p>
                  </c>
                  <c ca="left">
                     <p>B-class P450</p>
                  </c>
                  <c ca="center">
                     <p>13</p>
                  </c>
                  <c ca="center">
                     <p>29</p>
                  </c>
                  <c ca="center">
                     <p>11.9</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>IPR001304</p>
                  </c>
                  <c ca="left">
                     <p>C-type lectin</p>
                  </c>
                  <c ca="center">
                     <p>13</p>
                  </c>
                  <c ca="center">
                     <p>96</p>
                  </c>
                  <c ca="center">
                     <p>3.7</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>IPR000215</p>
                  </c>
                  <c ca="left">
                     <p>Serpin</p>
                  </c>
                  <c ca="center">
                     <p>12</p>
                  </c>
                  <c ca="center">
                     <p>48</p>
                  </c>
                  <c ca="center">
                     <p>6.8</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>IPR002402</p>
                  </c>
                  <c ca="left">
                     <p>E-class P450, group II</p>
                  </c>
                  <c ca="center">
                     <p>9</p>
                  </c>
                  <c ca="center">
                     <p>14</p>
                  </c>
                  <c ca="center">
                     <p>18.5</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>IPR006046</p>
                  </c>
                  <c ca="left">
                     <p>Glycoside hydrolase family 13</p>
                  </c>
                  <c ca="center">
                     <p>7</p>
                  </c>
                  <c ca="center">
                     <p>8</p>
                  </c>
                  <c ca="center">
                     <p>23.0</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>IPR006047</p>
                  </c>
                  <c ca="left">
                     <p>Alpha amylase, catalytic domain</p>
                  </c>
                  <c ca="center">
                     <p>7</p>
                  </c>
                  <c ca="center">
                     <p>10</p>
                  </c>
                  <c ca="center">
                     <p>19.2</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>IPR001400</p>
                  </c>
                  <c ca="left">
                     <p>Somatotropin hormone</p>
                  </c>
                  <c ca="center">
                     <p>7</p>
                  </c>
                  <c ca="center">
                     <p>32</p>
                  </c>
                  <c ca="center">
                     <p>6.1</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>IPR006048</p>
                  </c>
                  <c ca="left">
                     <p>Alpha amylase, C-terminal all-beta domain</p>
                  </c>
                  <c ca="center">
                     <p>6</p>
                  </c>
                  <c ca="center">
                     <p>7</p>
                  </c>
                  <c ca="center">
                     <p>24.7</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>IPR002018</p>
                  </c>
                  <c ca="left">
                     <p>Carboxylesterase, type B</p>
                  </c>
                  <c ca="center">
                     <p>6</p>
                  </c>
                  <c ca="center">
                     <p>13</p>
                  </c>
                  <c ca="center">
                     <p>12.3</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>IPR004073</p>
                  </c>
                  <c ca="left">
                     <p>Vomeronasal receptor, type 2</p>
                  </c>
                  <c ca="center">
                     <p>6</p>
                  </c>
                  <c ca="center">
                     <p>13</p>
                  </c>
                  <c ca="center">
                     <p>12.3</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>IPR001039</p>
                  </c>
                  <c ca="left">
                     <p>Major histocompatibility complex protein, class I</p>
                  </c>
                  <c ca="center">
                     <p>6</p>
                  </c>
                  <c ca="center">
                     <p>17</p>
                  </c>
                  <c ca="center">
                     <p>9.9</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>IPR001828</p>
                  </c>
                  <c ca="left">
                     <p>Extracellular ligand-binding receptor</p>
                  </c>
                  <c ca="center">
                     <p>6</p>
                  </c>
                  <c ca="center">
                     <p>29</p>
                  </c>
                  <c ca="center">
                     <p>5.5</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>IPR002213</p>
                  </c>
                  <c ca="left">
                     <p>UDP-glucuronosyl/UDP-glucosyl transferase</p>
                  </c>
                  <c ca="center">
                     <p>5</p>
                  </c>
                  <c ca="center">
                     <p>12</p>
                  </c>
                  <c ca="center">
                     <p>11.8</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>IPR002448</p>
                  </c>
                  <c ca="left">
                     <p>Odour-binding protein</p>
                  </c>
                  <c ca="center">
                     <p>4</p>
                  </c>
                  <c ca="center">
                     <p>9</p>
                  </c>
                  <c ca="center">
                     <p>13.2</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>IPR000068</p>
                  </c>
                  <c ca="left">
                     <p>Extracellular calcium-sensing receptor</p>
                  </c>
                  <c ca="center">
                     <p>4</p>
                  </c>
                  <c ca="center">
                     <p>10</p>
                  </c>
                  <c ca="center">
                     <p>11.0</p>
                  </c>
               </r>
            </tblbdy>
            <tblfn>
               <p>*Only Ensembl gene annotation (608 genes) was used in this analysis. <sup>&#8224;</sup>All results shown are statistically significant with <it>p</it>-values &lt; 10<sup>-5</sup> (&#967;<sup>2</sup> test).</p>
            </tblfn>
         </tbl>
         <p>From this list of genes, we performed a detailed analysis of <it>Mater</it>, a maternal-effect gene of potential medical importance. <it>Mater </it>encodes an autoantigen in a mouse model for human autoimmune premature ovarian failure <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>. Knockout studies have shown that it is essential for early embryonic development in mice <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>. <it>Mater </it>encodes a protein of 1,111 amino acids from a 3.5 kb transcript that spans 57 kb on mouse chromosome 7. A 42 kb segmental duplication comprises two duplicons (DUP1, where <it>Mater </it>is located; DUP2, where a novel <it>Mater2 </it>is located), which are situated about 5 Mb apart and in an inverted orientation (Figure <figr fid="F2">2</figr>). DUP1 and DUP2 are on average 91.1% identical over the entire 42 kb genomic region, with a 96.6% average in the exonic regions. Furthermore, we identified an intron-less <it>Mater </it>pseudogene (<it>MaterP</it>), which shares 87% DNA sequence identity with <it>Mater</it>, at a location 10 Mb proximal to <it>Mater </it>(Figure <figr fid="F2">2</figr>; see Additional data files for a detailed comparative genomic analysis of the <it>Mater </it>locus). The mapping locations of these duplications have been confirmed by fluorescence <it>in situ </it>hybridization (FISH) (Figure <figr fid="F3">3</figr>). Thus, <it>Mater </it>serves as one example of a gene that has been knocked out in mice but for which there is a second, highly similar transcript whose biological role is not yet known.</p>
         <fig id="F2">
            <title>
               <p>Figure 2</p>
            </title>
            <caption>
               <p>The genomic organization of the <it>Mater </it>duplication</p>
            </caption>
            <text>
               <p>The genomic organization of the <it>Mater </it>duplication. <b>(a) </b>Location of the <it>Mater </it>duplication. A snapshot view of the GMOD browser (details can be found at <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>). <b>(b) </b>Chromosomal view (mouse chromosome 7) of the three <it>Mater </it>duplication locations (DUP1, DUP2, <it>MaterP</it>). <b>(c) </b>Graphical view of the sequence similarity between DUP1 and DUP2 shown by GenomePixelizer. DUP2 is situated in an inverted orientation with respect to DUP1. Red, 99-100% sequence identity; purple, 96-98%; green, 93-95%; blue, 90-92%; black, 85-89%. <b>(d) </b>Graphical view of the sequence similarity between DUP1 and the <it>MaterP </it>region. As shown, <it>MaterP </it>is an intron-less, retrotransposed pseudogene. Blue, 90-92% sequence identity; black, 85-89%.</p>
            </text>
            <graphic file="gb-2003-4-8-r47-2"/>
         </fig>
         <fig id="F3">
            <title>
               <p>Figure 3</p>
            </title>
            <caption>
               <p>FISH detection of <it>Mater </it>duplication</p>
            </caption>
            <text>
               <p>FISH detection of <it>Mater </it>duplication. <b>(a) </b>Metaphase FISH showing three pairs of signals (yellow) detected on mouse chromosome 7 using BAC clone RP23-225F5 (detection frequency of 70%) mapping to duplicated <it>Mater </it>regions. <b>(b) </b>DAPI banding of the same partial mitotic figures for the identification of mouse chromosome 7. A control probe RP23-464L20 was mapped to a single location in the F2 region (data not shown).</p>
            </text>
            <graphic file="gb-2003-4-8-r47-3"/>
         </fig>
         <p>In addition, we were interested in determining whether any of the 675 genes have undergone recent (&#8805; 90% sequence identity over &#8805; 5 kb) and independent duplication in the human genome. Some of these genes could be recently evolving via the 'birth and death model of evolution', which has been used to describe the evolution of the major histocompatibility complex (MHC) and immunoglobulin multigene families <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>. This model describes genes that are repeatedly created through duplication, with some genes becoming fixed while others are rendered nonfunctional by deleterious mutations <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>.</p>
         <p>We examined the 675 duplicated mouse genes using best reciprocal BLAST hits to identify their putative human orthologs. We subsequently analyzed regions containing these putative orthologs for recent sequence duplication in the human genome. Sixteen of the 675 genes were found to be involved in recent, independent gene duplication in mouse and human (see Table <tblr tid="T5">5</tblr>). Some of these regions containing whole-gene duplications are part of multigene families known to be evolving via duplication and are found in tandem duplicated arrays in both species (that is, the <it>Amy2</it>, <it>H2-Q1</it>, <it>Gsta1</it>, and <it>Olfr54 </it>genes). An interesting example of a recent and apparently independent whole-gene duplication that occurred in mouse and human involves <it>Bmp8a </it>and a second intronic transcript <it>Oxct2</it>. Of the partial gene duplications, the recent duplication within the <it>Tnxb </it>gene and its human ortholog <it>TNXB </it>(found at the MHC III locus of mouse chromosome 17 and human 6p21) is particularly intriguing. In humans, this locus consists of a tandem array of genes (<it>RP</it>, <it>C4</it>, <it>CYP21</it>, and <it>TNXB </it>(RCCX)), which, through gene duplication, can exist as mono-, di- and tri-modular forms in the Caucasian population <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>. Recent studies have also shown the presence of a deletion haplotype in one individual, leading to a fusion of the <it>TNXA</it>/<it>TNXB </it>gene on one chromosome and a duplication of <it>CYP21 </it>on the other chromosome <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>. Furthermore, complex haplotypes of the complement genes (<it>C4A </it>and <it>C4B</it>) residing in the RCCX module have been characterized and postulated to have a role in individual susceptibility to infection and autoimmune disease <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>. 
A closer inspection of the genomic region surrounding this recent duplication in the mouse reveals that the C57BL/6J duplication encompasses homologous genes (<it>Tnxb</it>, <it>Slp </it>(a <it>C4 </it>paralog), <it>Cyp21a1</it>, and <it>C4</it>). As in humans, this orthologous region of the mouse genome has been shown to undergo multiple recombination events, giving rise to a variety of haplotypes <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>. Overall, many of the genes that have recently experienced duplications in the mouse and human genomes are of biomedical and evolutionary interest. The complexity and polymorphic nature of these recent duplications underscore the need for, and the difficulty of, performing the detailed structural and functional analyses that will help discern their true genomic organization, evolutionary history, and biological implications.</p>
         <tbl id="T5" hint_layout="single">
            <title>
               <p>Table 5</p>
            </title>
            <caption>
               <p>Genes that have undergone recent duplication in both the mouse and human genomes*</p>
            </caption>
            <tblbdy cols="2">
               <r>
                  <c ca="left">
                     <p>RefSeq<sup>&#8224;</sup></p>
                  </c>
                  <c ca="left">
                     <p>Gene description</p>
                  </c>
               </r>
               <r>
                  <c cspan="2">
                     <hr/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <it>NM_007534</it>
                     </p>
                  </c>
                  <c ca="left">
                     <p>
                        <it>B-cell leukemia/lymphoma 2 related protein A1b (Bcl2a1b)</it>
                     </p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>NM_007558</p>
                  </c>
                  <c ca="left">
                     <p>Bone morphogenetic protein 8a (Bmp8a)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>NM_007812</p>
                  </c>
                  <c ca="left">
                     <p>Cytochrome P450, 2a5 (Cyp2a5)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <it>NM_008181</it>
                     </p>
                  </c>
                  <c ca="left">
                     <p>
                        <it>Glutathione S-transferase, alpha 1 (Ya) (Gsta1)</it>
                     </p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <it>NM_010390</it>
                     </p>
                  </c>
                  <c ca="left">
                     <p>
                        <it>Histocompatibility 2, Q region locus 1 (H2-Q1)</it>
                     </p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>NM_009467</p>
                  </c>
                  <c ca="left">
                     <p>UDP-glucuronosyltransferase 2 family, member 5 (Ugt2b5)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <it>NM_009669</it>
                     </p>
                  </c>
                  <c ca="left">
                     <p>
                        <it>Amylase 2, pancreatic (Amy2)</it>
                     </p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>NM_009888</p>
                  </c>
                  <c ca="left">
                     <p>Complement component factor h (Cfh)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>NM_010856</p>
                  </c>
                  <c ca="left">
                     <p>Myosin heavy chain, cardiac muscle, adult (Myhca)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>NM_013778</p>
                  </c>
                  <c ca="left">
                     <p>Aldo-keto reductase family 1, member C13 (Akr1c13)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <it>NM_022033</it>
                     </p>
                  </c>
                  <c ca="left">
                     <p>
                        <it>3-Oxoacid CoA transferase 2 (Oxct2) (embedded in the Bmp8a gene)</it>
                     </p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <it>NM_026419</it>
                     </p>
                  </c>
                  <c ca="left">
                     <p>
                        <it>Elastase 3B, pancreatic (Ela3b)</it>
                     </p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>NM_031176</p>
                  </c>
                  <c ca="left">
                     <p>Tenascin XB (Tnxb)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>NM_130864</p>
                  </c>
                  <c ca="left">
                     <p>Acetyl-coenzyme A acyltransferase (peroxisomal 3-oxoacyl-coenzyme A thiolase) (Acaa)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <it>NM_010997</it>
                     </p>
                  </c>
                  <c ca="left">
                     <p>
                        <it>Olfactory receptor 54 (Olfr54)</it>
                     </p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>NM_031170</p>
                  </c>
                  <c ca="left">
                     <p>Keratin complex 2, basic, gene 8 (Krt2-8)</p>
                  </c>
               </r>
            </tblbdy>
            <tblfn>
               <p>*Six hundred and seventy-five duplicated mouse gene sequences were aligned to the June 2002 human genome assembly by BLAST (with an initial expected value cutoff of &lt;10<sup>-10</sup>). The best aligned human genes were subsequently used for reciprocal BLAST alignments (against the mouse genome sequence) to establish a putative orthologous relationship between the mouse and human gene pairs. Using results from our human genome duplication analysis <abbrgrp><abbr bid="B9">9</abbr><abbr bid="B37">37</abbr></abbrgrp>, we examined regions of the human genome where the human genes are involved in recent segmental duplication. <sup>&#8224;</sup>Italics indicate genes that are entirely within a duplication in the mouse genome.</p>
            </tblfn>
         </tbl>
      </sec>
      <sec>
         <st>
            <p>Conclusions</p>
         </st>
         <p>Our current analysis of the presence and organization of recent segmental duplications in the mouse genome has identified recent gene-duplication events and potentially problematic regions of the mouse genome assembly. At a practical level, identifying regions with segmental duplication will be useful in highlighting the most dynamic regions of any mammalian genome assembly. For the genome-sequencing community, these potential misassemblies/putative duplications can become initial targets for clone-based finishing; and for the biologist, they can serve as sentinels for regions of the genome most likely to change in subsequent assemblies. Additional hierarchical shotgun sequencing effort <abbrgrp><abbr bid="B12">12</abbr></abbrgrp> will undoubtedly be critical to finish the mouse genome sequence and reveal additional duplicated regions that are incomplete at the moment.</p>
         <p>Many of the duplicated genes are of evolutionary and medical importance (that is, genes involved in immune defense, olfaction, and drug metabolism). Knowledge of these duplicated regions could be important for accurately mapping mutants derived from ethylnitrosourea (ENU) mutagenesis, designing targeting vectors for embryonic stem cell alterations, and validating putative single-nucleotide polymorphisms (SNPs) that may have arisen from recently duplicated sequences rather than allelic variants <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>. The ability to create large, sophisticated targeting vectors, by engineering BACs using homologous recombination in <it>Escherichia coli </it><abbrgrp><abbr bid="B32">32</abbr></abbrgrp>, should prove very useful for designing <it>in vivo </it>experiments aimed at dissecting the function of recently duplicated genes. Knowledge of all recent duplications in mouse may also highlight regions subject to chromosomal rearrangement and polymorphism within and between species, and provide an opportunity to model the stability of such genomic architecture in a mammalian genome.</p>
      </sec>
      <sec>
         <st>
            <p>Materials and methods</p>
         </st>
         <sec>
            <st>
               <p>Genome sequence and chromosome-wide BLAST</p>
            </st>
            <p>We obtained the February 2002 (MGSCv3) and February 2003 mouse genome assemblies (lower-case repeat-masked sequences), as well as the assembly component tables, through the UCSC Human Genome Browser website <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>. For each assembly, detection of intrachromosomal segmental duplications involved comparing each of the 20 masked chromosome sequences (excluding the Y chromosome, which was not targeted by the MGSC) and the masked unmapped chromosome sequence against itself by BLAST2 <abbrgrp><abbr bid="B16">16</abbr></abbrgrp> (21 comparisons made). Interchromosomal analysis of segmental duplications involved pairwise comparisons between each of the 21 chromosome sequences (420 comparisons made). Analyses were repeated with the exclusion of the unmapped chromosome sequence to examine its contribution to the overall duplication content (results posted at <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>). All BLAST results were subsequently parsed to eliminate low-quality and fragmented alignments, retaining only those with &#8805; 90% sequence identity, a length of &#8805; 80 bp, and an expected value &#8804; 10<sup>-30</sup>.</p>
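            <p>The comparison scheme and alignment filter described above can be illustrated with a minimal Python sketch. It is not the code used in this study: the <it>Hit</it> record, the function names and the sequence label <it>chrUn</it> for the unmapped sequence are hypothetical, and reading the 420 interchromosomal comparisons as ordered pairs (21 &#215; 20) is our interpretation of the count.</p>

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Hit:
    """One BLAST alignment (hypothetical record layout, for illustration only)."""
    query: str
    subject: str
    identity: float  # percent identity
    length: int      # alignment length in bp
    evalue: float    # expected value


# Thresholds quoted in the text above.
MIN_IDENTITY = 90.0
MIN_LENGTH = 80
MAX_EVALUE = 1e-30


def passes_filter(hit):
    """Keep alignments meeting all three criteria."""
    return (hit.identity >= MIN_IDENTITY
            and hit.length >= MIN_LENGTH
            and MAX_EVALUE >= hit.evalue)


def comparison_plan(sequences):
    """Return (self comparisons, ordered pairwise comparisons)."""
    intra = [(s, s) for s in sequences]
    inter = list(permutations(sequences, 2))
    return intra, inter


# 19 autosomes, X, and the unmapped sequence ("chrUn" is an assumed label).
sequences = [f"chr{i}" for i in range(1, 20)] + ["chrX", "chrUn"]
intra, inter = comparison_plan(sequences)
print(len(intra), len(inter))
```

            <p>With 21 input sequences, the plan contains 21 self comparisons and 420 ordered pairs, matching the counts given above.</p>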
         </sec>
         <sec>
            <st>
               <p>Parsing of BLAST results and duplication detection</p>
            </st>
            <p>Each BLAST report was sorted by chromosomal coordinates. All identical hits (same coordinate alignments), including suboptimal BLAST alignments recognized by multiple, overlapping alignments, as well as mirror hits (reverse coordinate alignments) from the BLAST results of the intrachromosomal set, were removed. Contiguous alignments separated by a distance of less than 3 kb, then 5 kb, and subsequently 9 kb, were joined stepwise into modules in order to traverse masked repetitive sequences and to overcome breaks in the BLAST alignments caused by insertions/deletions and sequence gaps. Such contiguous sequence-alignment modules represent sequence similarity between the subject and query chromosome sequences in question (at their respective positional coordinates) <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>. Results with > 99.5% sequence identity to another region were flagged as potential sequence misassignment errors.</p>
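            <p>The stepwise joining of alignments into modules can be sketched as follows. This is a simplified illustration rather than the implementation used here: it assumes alignments in the same orientation, represented as hypothetical tuples of query and subject start and end coordinates, and joins neighbours only when the gap on both sequences is below the current threshold.</p>

```python
def join_pass(modules, max_gap):
    """Join neighbouring modules whose gap on both sequences is under max_gap."""
    joined = []
    for q_start, q_end, s_start, s_end in sorted(modules):
        if joined:
            pq_start, pq_end, ps_start, ps_end = joined[-1]
            q_gap = q_start - pq_end
            s_gap = s_start - ps_end
            # Require collinear order on the subject as well as small gaps.
            if max_gap > q_gap and max_gap > s_gap and s_start >= ps_start:
                joined[-1] = (pq_start, max(pq_end, q_end),
                              ps_start, max(ps_end, s_end))
                continue
        joined.append((q_start, q_end, s_start, s_end))
    return joined


def build_modules(alignments, gaps=(3000, 5000, 9000)):
    """Apply the 3 kb, 5 kb and 9 kb joining passes in turn."""
    modules = list(alignments)
    for max_gap in gaps:
        modules = join_pass(modules, max_gap)
    return modules


def is_potential_misassignment(identity):
    """Flag modules above the 99.5% identity threshold given in the text."""
    return identity > 99.5


# Toy coordinates (invented for illustration).
alignments = [
    (1000, 2000, 501000, 502000),
    (4000, 5000, 504000, 505000),
    (9500, 10000, 509500, 510000),
    (30000, 31000, 530000, 531000),
]
modules = build_modules(alignments)
print(modules)
```

            <p>In this toy input, the first two alignments are joined in the 3 kb pass, the third is added in the 5 kb pass, and the fourth stays separate because its 20 kb gap exceeds every threshold.</p>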
         </sec>
         <sec>
            <st>
               <p>Online database for recent segmental duplications</p>
            </st>
            <p>We overlaid all duplication content and regions containing potential sequence misassignment errors onto the mouse genome sequence, which can be viewed using the interactive Generic Genome Browser <abbrgrp><abbr bid="B33">33</abbr></abbrgrp> hosted at our website <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>. Results and analyses are presented for both the February 2002 and February 2003 assemblies, each as a separate database. Results are also summarized in tables that include information on chromosomal coordinates, band locations, size of duplications, level of identity between duplicated copies, as well as genes mapped to these regions <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>. Graphical representation for intrachromosomal duplications was generated using the visualization tool GenomePixelizer (Figure <figr fid="F1">1</figr>) <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>.</p>
         </sec>
         <sec>
            <st>
               <p>Identification of recent gene duplications</p>
            </st>
            <p>We obtained the NCBI RefSeq gene annotation file (refGene.txt.gz) from the UCSC Downloads website <abbrgrp><abbr bid="B21">21</abbr></abbrgrp> and the Ensembl gene annotation (Mus_musculus.cdna.fa.gz; Ensembl Known genes only) from the Ensembl website <abbrgrp><abbr bid="B35">35</abbr></abbrgrp>. Genes that mapped to duplicated regions of the mouse genome were identified using their chromosome sequence coordinates. In total, 439 RefSeq and 608 Ensembl annotated genes were found to be involved in duplicated regions; together these two datasets made up 675 unique gene annotations (372 annotations overlapped between the two). To establish gene-pair relationships between duplicated gene sequences, each of the 238 NCBI RefSeq genes that were found to be fully contained within a segmental duplication was searched against the UCSC and NCBI GenBank databases for spliced ESTs, full-length cDNAs and additional annotated genes. A total of 128 gene pairs were established, most of which are likely to be novel gene paralogs previously unknown in the literature. In the analysis of protein-domain enrichment in duplicated genes, InterPro annotation for duplicated genes (608 Ensembl genes) as well as for the entire gene set (16,515) was obtained from Ensembl EnsMart <abbrgrp><abbr bid="B36">36</abbr></abbrgrp>. We counted the number of times a protein-domain class was found in each gene set and tabulated our results (see Table <tblr tid="T4">4</tblr>). To examine the subset of genes that had undergone recent duplication in the human genome, each of the duplicated gene sequences was aligned to the June 2002 human genome assembly by BLAST (with an initial expected value cutoff of &lt;10<sup>-10</sup>). The best-aligned human genes were subsequently used for reciprocal BLAST alignments (against the mouse genome sequence) to establish a putative orthologous relationship between the mouse and human gene pairs. 
Using results from our human genome duplication analysis <abbrgrp><abbr bid="B37">37</abbr></abbrgrp>, we examined regions of the human genome where the human genes were involved in recent segmental duplication.</p>
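            <p>The reciprocal best-hit step used to assign putative orthologs can be illustrated with a short Python sketch. It is an assumption-laden outline, not the pipeline used in this study: hits are represented as hypothetical (query, subject, expected value) tuples, the identifiers and expected values are invented, and applying the same 10<sup>-10</sup> cutoff to the reciprocal search is our assumption.</p>

```python
def best_hits(hits, max_evalue=1e-10):
    """Map each query to its lowest expected-value subject under the cutoff."""
    best = {}
    for query, subject, evalue in hits:
        if evalue >= max_evalue:
            continue  # fails the initial expected-value cutoff
        if query not in best or best[query][1] > evalue:
            best[query] = (subject, evalue)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(mouse_vs_human, human_vs_mouse):
    """Keep mouse-human pairs that are each other's best hit."""
    forward = best_hits(mouse_vs_human)
    backward = best_hits(human_vs_mouse)
    return {m: h for m, h in forward.items() if backward.get(h) == m}


# Invented identifiers and expected values, for illustration only.
mouse_vs_human = [
    ("mA", "hA", 1e-100),
    ("mA", "hB", 1e-60),
    ("mB", "hB", 1e-40),
    ("mC", "hC", 1e-5),
]
human_vs_mouse = [
    ("hA", "mA", 1e-90),
    ("hB", "mA", 1e-70),
    ("hB", "mB", 1e-30),
    ("hC", "mC", 1e-5),
]
pairs = reciprocal_best_hits(mouse_vs_human, human_vs_mouse)
print(pairs)
```

            <p>Here only the pair mA and hA is retained: mB is excluded because the best mouse hit of hB is mA, and mC is excluded because its only alignment fails the expected-value cutoff.</p>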
         </sec>
         <sec>
            <st>
               <p>Fluorescence <it>in situ</it> hybridization (FISH)</p>
            </st>
            <p>Mouse lymphocytes were isolated from the spleen and cultured at 37&#176;C in RPMI 1640 medium supplemented with fetal calf serum, concanavalin A and lipopolysaccharide. After 44 hours, the cultured lymphocytes were treated with bromodeoxyuridine for an additional 14 hours. The synchronized cells were washed and recultured at 37&#176;C for 4 hours in &#945;-minimal Eagle's medium with thymidine. Chromosome slides were made by conventional methods including hypotonic treatment, fixation and air-drying <abbrgrp><abbr bid="B38">38</abbr></abbrgrp>. BAC probes RP23-225F5 (mapped to the <it>Mater </it>locus (DUP1) by BAC-end sequences) and RP23-464L20 (a control probe) were each biotinylated. Hybridization and detection were carried out according to <abbrgrp><abbr bid="B39">39</abbr></abbrgrp>. FISH signals were observed under a fluorescence microscope using FITC and DAPI filters. Images were captured with a CCD camera.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Additional data files</p>
         </st>
         <p>Further analysis of the <it>Mater </it>duplication, including a figure showing a multiple percent identity plot of <it>Mater </it>versus <it>Mater2</it>, <it>MATER </it>(human), and <it>MaterP</it>, is available (additional data file <supplr sid="s1">1</supplr>).</p>
         <suppl id="s1">
            <title>
               <p>Additional data file 1</p>
            </title>
            <caption>
               <p>Further analysis of Mater duplication</p>
            </caption>
            <text>
               <p>Further analysis of Mater duplication</p>
            </text>
            <file name="gb-2003-4-8-r47-s1.pdf">
               <p>Click here for additional data file</p>
            </file>
         </suppl>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>We thank John Taylor and Duane Martindale for critical comments on the manuscript. This work was supported by the Canadian Institutes of Health Research (CIHR) and Genome Canada to S.W.S. B.F.K. is supported by the CIHR and M.D.W. is supported by the Michael Smith Foundation for Health Research (MSFHR). S.W.S. is an Investigator of CIHR and International Scholar of the Howard Hughes Medical Institute.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <aug>
               <au>
                  <snm>Ohno</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Evolution by Gene Duplication</source>
            <publisher>New York: Springer</publisher>
            <pubdate>1970</pubdate>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Splitting pairs: the diverging fates of duplicated genes.</p>
            </title>
            <aug>
               <au>
                  <snm>Prince</snm>
                  <fnm>VE</fnm>
               </au>
               <au>
                  <snm>Pickett</snm>
                  <fnm>FB</fnm>
               </au>
            </aug>
            <source>Nat Rev Genet</source>
            <pubdate>2002</pubdate>
            <volume>3</volume>
            <fpage>827</fpage>
            <lpage>837</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nrg928</pubid>
                  <pubid idtype="pmpid" link="fulltext">12415313</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Preservation of duplicate genes by complementary, degenerative mutations.</p>
            </title>
            <aug>
               <au>
                  <snm>Force</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Lynch</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Pickett</snm>
                  <fnm>FB</fnm>
               </au>
               <au>
                  <snm>Amores</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Yan</snm>
                  <fnm>YL</fnm>
               </au>
               <au>
                  <snm>Postlethwait</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Genetics</source>
            <pubdate>1999</pubdate>
            <volume>151</volume>
            <fpage>1531</fpage>
            <lpage>1545</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10101175</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Selection in the evolution of gene duplications.</p>
            </title>
            <aug>
               <au>
                  <snm>Kondrashov</snm>
                  <fnm>FA</fnm>
               </au>
               <au>
                  <snm>Rogozin</snm>
                  <fnm>IB</fnm>
               </au>
               <au>
                  <snm>Wolf</snm>
                  <fnm>YI</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2002</pubdate>
            <volume>3</volume>
            <fpage>research0008.1</fpage>
            <lpage>0008.9</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1186/gb-2002-3-2-research0008</pubid>
                  <pubid idtype="pmpid" link="fulltext">11864370</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Gene content and function of the ancestral chromosome fusion site in human chromosome 2q13-2q14.1 and paralogous regions.</p>
            </title>
            <aug>
               <au>
                  <snm>Fan</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Newman</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Linardopoulou</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Trask</snm>
                  <fnm>BJ</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2002</pubdate>
            <volume>12</volume>
            <fpage>1663</fpage>
            <lpage>1672</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gr.338402</pubid>
                  <pubid idtype="pmpid" link="fulltext">12421752</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Initial sequencing and analysis of the human genome.</p>
            </title>
            <aug>
               <au>
                  <cnm>International Human Genome Sequencing Consortium</cnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2001</pubdate>
            <volume>409</volume>
            <fpage>860</fpage>
            <lpage>921</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/35057062</pubid>
                  <pubid idtype="pmpid" link="fulltext">11237011</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Segmental duplications: organization and impact within the current human genome project assembly.</p>
            </title>
            <aug>
               <au>
                  <snm>Bailey</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Yavor</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Massa</snm>
                  <fnm>HF</fnm>
               </au>
               <au>
                  <snm>Trask</snm>
                  <fnm>BJ</fnm>
               </au>
               <au>
                  <snm>Eichler</snm>
                  <fnm>EE</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2001</pubdate>
            <volume>11</volume>
            <fpage>1005</fpage>
            <lpage>1017</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gr.GR-1871R</pubid>
                  <pubid idtype="pmpid" link="fulltext">11381028</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Recent segmental duplications in the human genome.</p>
            </title>
            <aug>
               <au>
                  <snm>Bailey</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Gu</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Clark</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Reinert</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Samonte</snm>
                  <fnm>RV</fnm>
               </au>
               <au>
                  <snm>Schwartz</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Adams</snm>
                  <fnm>MD</fnm>
               </au>
               <au>
                  <snm>Myers</snm>
                  <fnm>EW</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>PW</fnm>
               </au>
               <au>
                  <snm>Eichler</snm>
                  <fnm>EE</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2002</pubdate>
            <volume>297</volume>
            <fpage>1003</fpage>
            <lpage>1007</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1072047</pubid>
                  <pubid idtype="pmpid" link="fulltext">12169732</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Genome-wide detection of segmental duplications and assembly errors in the human genome sequence.</p>
            </title>
            <aug>
               <au>
                  <snm>Cheung</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Estivill</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Khaja</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>MacDonald</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Lau</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Tsui</snm>
                  <fnm>LC</fnm>
               </au>
               <au>
                  <snm>Scherer</snm>
                  <fnm>SW</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2003</pubdate>
            <volume>4</volume>
            <fpage>R25</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1186/gb-2003-4-4-r25</pubid>
                  <pubid idtype="pmpid" link="fulltext">12702206</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Segmental duplications: an 'expanding' role in genomic instability and disease.</p>
            </title>
            <aug>
               <au>
                  <snm>Emanuel</snm>
                  <fnm>BS</fnm>
               </au>
               <au>
                  <snm>Shaikh</snm>
                  <fnm>TH</fnm>
               </au>
            </aug>
            <source>Nat Rev Genet</source>
            <pubdate>2001</pubdate>
            <volume>2</volume>
            <fpage>791</fpage>
            <lpage>800</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/35093500</pubid>
                  <pubid idtype="pmpid" link="fulltext">11584295</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>The Jackson Laboratory JAX Strain Information</p>
            </title>
            <url>http://jaxmice.jax.org/info/chromosomal_abberati.html</url>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Initial sequencing and comparative analysis of the mouse genome.</p>
            </title>
            <aug>
               <au>
                  <cnm>Mouse Genome Sequencing Consortium</cnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2002</pubdate>
            <volume>420</volume>
            <fpage>520</fpage>
            <lpage>562</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature01262</pubid>
                  <pubid idtype="pmpid" link="fulltext">12466850</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Whole-genome sequence assembly for mammalian genomes: Arachne 2.</p>
            </title>
            <aug>
               <au>
                  <snm>Jaffe</snm>
                  <fnm>DB</fnm>
               </au>
               <au>
                  <snm>Butler</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Gnerre</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Mauceli</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Lindblad-Toh</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Mesirov</snm>
                  <fnm>JP</fnm>
               </au>
               <au>
                  <snm>Zody</snm>
                  <fnm>MC</fnm>
               </au>
               <au>
                  <snm>Lander</snm>
                  <fnm>ES</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2003</pubdate>
            <volume>13</volume>
            <fpage>91</fpage>
            <lpage>96</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gr.828403</pubid>
                  <pubid idtype="pmpid" link="fulltext">12529310</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>TCAG: mouse recent segmental duplication homepage</p>
            </title>
            <url>http://chr7.ocgc.ca/mousedup</url>
         </bibl>
         <bibl id="B15">
            <title>
               <p>NCBI Mouse Genome Resources</p>
            </title>
            <url>http://www.ncbi.nih.gov/genome/guide/mouse</url>
         </bibl>
         <bibl id="B16">
            <title>
               <p>BLAST 2 Sequences, a new tool for comparing protein and nucleotide sequences.</p>
            </title>
            <aug>
               <au>
                  <snm>Tatusova</snm>
                  <fnm>TA</fnm>
               </au>
               <au>
                  <snm>Madden</snm>
                  <fnm>TL</fnm>
               </au>
            </aug>
            <source>FEMS Microbiol Lett</source>
            <pubdate>1999</pubdate>
            <volume>174</volume>
            <fpage>247</fpage>
            <lpage>250</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0378-1097(99)00149-4</pubid>
                  <pubid idtype="pmpid" link="fulltext">10339815</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>The bcl-2 family member, Bcl2a1, maps to mouse chromosome 9 and human chromosome 15.</p>
            </title>
            <aug>
               <au>
                  <snm>Lin</snm>
                  <fnm>EY</fnm>
               </au>
               <au>
                  <snm>Kozak</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Orlofsky</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Prystowsky</snm>
                  <fnm>MB</fnm>
               </au>
            </aug>
            <source>Mamm Genome</source>
            <pubdate>1997</pubdate>
            <volume>8</volume>
            <fpage>293</fpage>
            <lpage>294</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1007/s003359900418</pubid>
                  <pubid idtype="pmpid" link="fulltext">9096119</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Multiple gene duplication and expression of mouse bcl-2-related genes, A1.</p>
            </title>
            <aug>
               <au>
                  <snm>Hatakeyama</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Hamasaki</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Negishi</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Loh</snm>
                  <fnm>DY</fnm>
               </au>
               <au>
                  <snm>Sendo</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Nakayama</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Nakayama</snm>
                  <fnm>K</fnm>
               </au>
            </aug>
            <source>Int Immunol</source>
            <pubdate>1998</pubdate>
            <volume>10</volume>
            <fpage>631</fpage>
            <lpage>637</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/intimm/10.5.631</pubid>
                  <pubid idtype="pmpid">9645611</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>Evolution of the amylase multigene family. YBR/Ki mice express a pancreatic amylase gene which is silent in other strains.</p>
            </title>
            <aug>
               <au>
                  <snm>Gumucio</snm>
                  <fnm>DL</fnm>
               </au>
               <au>
                  <snm>Wiebauer</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Dranginis</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Samuelson</snm>
                  <fnm>LC</fnm>
               </au>
               <au>
                  <snm>Treisman</snm>
                  <fnm>LO</fnm>
               </au>
               <au>
                  <snm>Caldwell</snm>
                  <fnm>RM</fnm>
               </au>
               <au>
                  <snm>Antonucci</snm>
                  <fnm>TK</fnm>
               </au>
               <au>
                  <snm>Meisler</snm>
                  <fnm>MH</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>1985</pubdate>
            <volume>260</volume>
            <fpage>13483</fpage>
            <lpage>13489</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">2414282</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>Termination of transcription in the mouse alpha-amylase gene Amy-2a occurs at multiple sites downstream of the polyadenylation site.</p>
            </title>
            <aug>
               <au>
                  <snm>Hagenbuchle</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Wellauer</snm>
                  <fnm>PK</fnm>
               </au>
               <au>
                  <snm>Cribbs</snm>
                  <fnm>DL</fnm>
               </au>
               <au>
                  <snm>Schibler</snm>
                  <fnm>U</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>1984</pubdate>
            <volume>38</volume>
            <fpage>737</fpage>
            <lpage>744</lpage>
            <xrefbib>
               <pubid idtype="pmpid">6091898</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>UCSC Genome Bioinformatics</p>
            </title>
            <url>http://genome.ucsc.edu</url>
         </bibl>
         <bibl id="B22">
            <title>
               <p>The sequence of the human genome.</p>
            </title>
            <aug>
               <au>
                  <snm>Venter</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Adams</snm>
                  <fnm>MD</fnm>
               </au>
               <au>
                  <snm>Myers</snm>
                  <fnm>EW</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>PW</fnm>
               </au>
               <au>
                  <snm>Mural</snm>
                  <fnm>RJ</fnm>
               </au>
               <au>
                  <snm>Sutton</snm>
                  <fnm>GG</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>HO</fnm>
               </au>
               <au>
                  <snm>Yandell</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Evans</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Holt</snm>
                  <fnm>RA</fnm>
               </au>
               <etal/>
            </aug>
            <source>Science</source>
            <pubdate>2001</pubdate>
            <volume>291</volume>
            <fpage>1304</fpage>
            <lpage>1351</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1058040</pubid>
                  <pubid idtype="pmpid" link="fulltext">11181995</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>Discovery of the human genome sequence in the public and private databases.</p>
            </title>
            <aug>
               <au>
                  <snm>Scherer</snm>
                  <fnm>SW</fnm>
               </au>
               <au>
                  <snm>Cheung</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Curr Biol</source>
            <pubdate>2001</pubdate>
            <volume>11</volume>
            <fpage>R808</fpage>
            <lpage>R811</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0960-9822(01)00490-0</pubid>
                  <pubid idtype="pmpid" link="fulltext">11676931</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>A mouse gene encoding an oocyte antigen associated with autoimmune premature ovarian failure.</p>
            </title>
            <aug>
               <au>
                  <snm>Tong</snm>
                  <fnm>ZB</fnm>
               </au>
               <au>
                  <snm>Nelson</snm>
                  <fnm>LM</fnm>
               </au>
            </aug>
            <source>Endocrinology</source>
            <pubdate>1999</pubdate>
            <volume>140</volume>
            <fpage>3720</fpage>
            <lpage>3726</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1210/en.140.8.3720</pubid>
                  <pubid idtype="pmpid" link="fulltext">10433232</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p><it>Mater</it>, a maternal effect gene required for early embryonic development in mice.</p>
            </title>
            <aug>
               <au>
                  <snm>Tong</snm>
                  <fnm>ZB</fnm>
               </au>
               <au>
                  <snm>Gold</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Pfeifer</snm>
                  <fnm>KE</fnm>
               </au>
               <au>
                  <snm>Dorward</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Lee</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Bondy</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Dean</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Nelson</snm>
                  <fnm>LM</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2000</pubdate>
            <volume>26</volume>
            <fpage>267</fpage>
            <lpage>268</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/81547</pubid>
                  <pubid idtype="pmpid" link="fulltext">11062459</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>Evolution by the birth-and-death process in multigene families of the vertebrate immune system.</p>
            </title>
            <aug>
               <au>
                  <snm>Nei</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Gu</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Sitnikova</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1997</pubdate>
            <volume>94</volume>
            <fpage>7799</fpage>
            <lpage>7806</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1073/pnas.94.15.7799</pubid>
                  <pubid idtype="pmpid" link="fulltext">9223266</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>Deficiencies of human complement component C4A and C4B and heterozygosity in length variants of RP-C4-CYP21-TNX (RCCX) modules in Caucasians. The load of RCCX genetic diversity on major histocompatibility complex-associated disease.</p>
            </title>
            <aug>
               <au>
                  <snm>Blanchong</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Zhou</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Rupert</snm>
                  <fnm>KL</fnm>
               </au>
               <au>
                  <snm>Chung</snm>
                  <fnm>EK</fnm>
               </au>
               <au>
                  <snm>Jones</snm>
                  <fnm>KN</fnm>
               </au>
               <au>
                  <snm>Sotos</snm>
                  <fnm>JF</fnm>
               </au>
               <au>
                  <snm>Zipf</snm>
                  <fnm>WB</fnm>
               </au>
               <au>
                  <snm>Rennebohm</snm>
                  <fnm>RM</fnm>
               </au>
               <au>
                  <snm>Yu</snm>
                  <fnm>CY</fnm>
               </au>
            </aug>
            <source>J Exp Med</source>
            <pubdate>2000</pubdate>
            <volume>191</volume>
            <fpage>2183</fpage>
            <lpage>2196</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1084/jem.191.12.2183</pubid>
                  <pubid idtype="pmpid" link="fulltext">10859342</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>An unequal crossover event in RCCX modules of the human MHC resulting in the formation of a TNXB/TNXA hybrid and deletion of the CYP21A.</p>
            </title>
            <aug>
               <au>
                  <snm>Jaatinen</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Chung</snm>
                  <fnm>EK</fnm>
               </au>
               <au>
                  <snm>Ruuskanen</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Lokki</snm>
                  <fnm>ML</fnm>
               </au>
            </aug>
            <source>Hum Immunol</source>
            <pubdate>2002</pubdate>
            <volume>63</volume>
            <fpage>683</fpage>
            <lpage>689</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0198-8859(02)00416-0</pubid>
                  <pubid idtype="pmpid" link="fulltext">12121677</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>Genetic sophistication of human complement components C4A and C4B and RP-C4-CYP21-TNX (RCCX) modules in the major histocompatibility complex.</p>
            </title>
            <aug>
               <au>
                  <snm>Chung</snm>
                  <fnm>EK</fnm>
               </au>
               <au>
                  <snm>Yang</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Rennebohm</snm>
                  <fnm>RM</fnm>
               </au>
               <au>
                  <snm>Lokki</snm>
                  <fnm>ML</fnm>
               </au>
               <au>
                  <snm>Higgins</snm>
                  <fnm>GC</fnm>
               </au>
               <au>
                  <snm>Jones</snm>
                  <fnm>KN</fnm>
               </au>
               <au>
                  <snm>Zhou</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Blanchong</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Yu</snm>
                  <fnm>CY</fnm>
               </au>
            </aug>
            <source>Am J Hum Genet</source>
            <pubdate>2002</pubdate>
            <volume>71</volume>
            <fpage>823</fpage>
            <lpage>837</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1086/342777</pubid>
                  <pubid idtype="pmpid" link="fulltext">12226794</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>Three extra copies of a C4-related gene in H-2w7 mice are C4/Slp hybrid genes generated by multiple recombinational events.</p>
            </title>
            <aug>
               <au>
                  <snm>Pattanakitsakul</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Nakayama</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Takahashi</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Nonaka</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Immunogenetics</source>
            <pubdate>1990</pubdate>
            <volume>32</volume>
            <fpage>431</fpage>
            <lpage>439</lpage>
            <xrefbib>
               <pubid idtype="pmpid">2272665</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>Chromosomal regions containing high-density and ambiguous-mapped single nucleotide polymorphisms (SNPs) correlate with segmental duplications in the human genome.</p>
            </title>
            <aug>
               <au>
                  <snm>Estivill</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Cheung</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Pujana</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Nakabayashi</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Scherer</snm>
                  <fnm>SW</fnm>
               </au>
               <au>
                  <snm>Tsui</snm>
                  <fnm>LC</fnm>
               </au>
            </aug>
            <source>Hum Mol Genet</source>
            <pubdate>2002</pubdate>
            <volume>11</volume>
            <fpage>1987</fpage>
            <lpage>1995</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/hmg/11.17.1987</pubid>
                  <pubid idtype="pmpid" link="fulltext">12165560</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>Engineering the mouse genome with bacterial artificial chromosomes to create multipurpose alleles.</p>
            </title>
            <aug>
               <au>
                  <snm>Testa</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Vintersten</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Benes</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Pijnappel</snm>
                  <fnm>WW</fnm>
               </au>
               <au>
                  <snm>Chambers</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>AJ</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>AG</fnm>
               </au>
               <au>
                  <snm>Stewart</snm>
                  <fnm>AF</fnm>
               </au>
            </aug>
            <source>Nat Biotechnol</source>
            <pubdate>2003</pubdate>
            <volume>21</volume>
            <fpage>443</fpage>
            <lpage>447</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nbt804</pubid>
                  <pubid idtype="pmpid" link="fulltext">12627172</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B33">
            <title>
               <p>The generic genome browser: a building block for a model organism system database.</p>
            </title>
            <aug>
               <au>
                  <snm>Stein</snm>
                  <fnm>LD</fnm>
               </au>
               <au>
                  <snm>Mungall</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Shu</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Caudy</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Mangone</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Day</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Nickerson</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Stajich</snm>
                  <fnm>JE</fnm>
               </au>
               <au>
                  <snm>Harris</snm>
                  <fnm>TW</fnm>
               </au>
               <au>
                  <snm>Arva</snm>
                  <fnm>A</fnm>
               </au>
               <etal/>
            </aug>
            <source>Genome Res</source>
            <pubdate>2002</pubdate>
            <volume>12</volume>
            <fpage>1599</fpage>
            <lpage>1610</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gr.403602</pubid>
                  <pubid idtype="pmpid" link="fulltext">12368253</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B34">
            <title>
               <p>GenomePixelizer--a visualization program for comparative genomics within and between species.</p>
            </title>
            <aug>
               <au>
                  <snm>Kozik</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Kochetkova</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Michelmore</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2002</pubdate>
            <volume>18</volume>
            <fpage>335</fpage>
            <lpage>336</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/18.2.335</pubid>
                  <pubid idtype="pmpid" link="fulltext">11847088</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B35">
            <title>
               <p>Ensembl Mouse Genome Server</p>
            </title>
            <url>http://www.ensembl.org/Mus_musculus/</url>
         </bibl>
         <bibl id="B36">
            <title>
               <p>Ensembl EnsMart</p>
            </title>
            <url>http://www.ensembl.org/EnsMart/</url>
         </bibl>
         <bibl id="B37">
            <title>
               <p>TCAG: human recent segmental duplication homepage</p>
            </title>
            <url>http://chr7.ocgc.ca/humandup</url>
         </bibl>
         <bibl id="B38">
            <title>
               <p>Modes of DAPI banding and simultaneous <it>in situ</it> hybridization.</p>
            </title>
            <aug>
               <au>
                  <snm>Heng</snm>
                  <fnm>HHQ</fnm>
               </au>
               <au>
                  <snm>Tsui</snm>
                  <fnm>LC</fnm>
               </au>
            </aug>
            <source>Chromosoma</source>
            <pubdate>1993</pubdate>
            <volume>102</volume>
            <fpage>325</fpage>
            <lpage>332</lpage>
            <xrefbib>
               <pubid idtype="pmpid">8325164</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B39">
            <title>
               <p>High resolution mapping of mammalian genes by <it>in situ</it> hybridization to free chromatin.</p>
            </title>
            <aug>
               <au>
                  <snm>Heng</snm>
                  <fnm>HHQ</fnm>
               </au>
               <au>
                  <snm>Squire</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Tsui</snm>
                  <fnm>L-C</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1992</pubdate>
            <volume>89</volume>
            <fpage>9509</fpage>
            <lpage>9513</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">1384055</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B40">
            <title>
               <p><it>Mater</it> encodes a maternal protein in mice with a leucine-rich repeat domain homologous to porcine ribonuclease inhibitor.</p>
            </title>
            <aug>
               <au>
                  <snm>Tong</snm>
                  <fnm>ZB</fnm>
               </au>
               <au>
                  <snm>Nelson</snm>
                  <fnm>LM</fnm>
               </au>
               <au>
                  <snm>Dean</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Mamm Genome</source>
            <pubdate>2000</pubdate>
            <volume>11</volume>
            <fpage>281</fpage>
            <lpage>287</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1007/s003350010053</pubid>
                  <pubid idtype="pmpid" link="fulltext">10754103</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B41">
            <title>
               <p>A human homologue of mouse <it>Mater</it>, a maternal effect gene essential for early embryonic development.</p>
            </title>
            <aug>
               <au>
                  <snm>Tong</snm>
                  <fnm>ZB</fnm>
               </au>
               <au>
                  <snm>Bondy</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Zhou</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Nelson</snm>
                  <fnm>LM</fnm>
               </au>
            </aug>
            <source>Hum Reprod</source>
            <pubdate>2002</pubdate>
            <volume>17</volume>
            <fpage>903</fpage>
            <lpage>911</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/humrep/17.4.903</pubid>
                  <pubid idtype="pmpid" link="fulltext">11925379</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B42">
            <title>
               <p>rVista for comparative sequence-based discovery of functional transcription factor binding sites.</p>
            </title>
            <aug>
               <au>
                  <snm>Loots</snm>
                  <fnm>GG</fnm>
               </au>
               <au>
                  <snm>Ovcharenko</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Pachter</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Dubchak</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Rubin</snm>
                  <fnm>EM</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2002</pubdate>
            <volume>12</volume>
            <fpage>832</fpage>
            <lpage>839</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gr.225502</pubid>
                  <pubid idtype="pmpid" link="fulltext">11997350</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B43">
            <title>
               <p>Comparative analysis of the gene-dense ACHE/TFR2 region on human chromosome 7q22 with the orthologous region on mouse chromosome 5.</p>
            </title>
            <aug>
               <au>
                  <snm>Wilson</snm>
                  <fnm>MD</fnm>
               </au>
               <au>
                  <snm>Riemer</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Martindale</snm>
                  <fnm>DW</fnm>
               </au>
               <au>
                  <snm>Schnupf</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Boright</snm>
                  <fnm>AP</fnm>
               </au>
               <au>
                  <snm>Cheung</snm>
                  <fnm>TL</fnm>
               </au>
               <au>
                  <snm>Hardy</snm>
                  <fnm>DM</fnm>
               </au>
               <au>
                  <snm>Schwartz</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Scherer</snm>
                  <fnm>SW</fnm>
               </au>
               <au>
                  <snm>Tsui</snm>
                  <fnm>L-C</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2001</pubdate>
            <volume>29</volume>
            <fpage>1352</fpage>
            <lpage>1365</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/nar/29.6.1352</pubid>
                  <pubid idtype="pmpid" link="fulltext">11239002</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B44">
            <title>
               <p>PipMaker - a web server for aligning two genomic DNA sequences.</p>
            </title>
            <aug>
               <au>
                  <snm>Schwartz</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Frazer</snm>
                  <fnm>KA</fnm>
               </au>
               <au>
                  <snm>Smit</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Riemer</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Bouck</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Gibbs</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Hardison</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>W</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2000</pubdate>
            <volume>10</volume>
            <fpage>577</fpage>
            <lpage>586</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gr.10.4.577</pubid>
                  <pubid idtype="pmpid" link="fulltext">10779500</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B45">
            <title>
               <p>VISTA: visualizing global DNA sequence alignments of arbitrary length.</p>
            </title>
            <aug>
               <au>
                  <snm>Mayor</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Brudno</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Schwartz</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Poliakov</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Rubin</snm>
                  <fnm>EM</fnm>
               </au>
               <au>
                  <snm>Frazer</snm>
                  <fnm>KA</fnm>
               </au>
               <au>
                  <snm>Pachter</snm>
                  <fnm>LS</fnm>
               </au>
               <au>
                  <snm>Dubchak</snm>
                  <fnm>I</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2000</pubdate>
            <volume>16</volume>
            <fpage>1046</fpage>
            <lpage>1047</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/16.11.1046</pubid>
                  <pubid idtype="pmpid" link="fulltext">11159318</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B46">
            <title>
               <p>NCBI BLAST 2 Sequences</p>
            </title>
            <url>http://www.ncbi.nlm.nih.gov/blast/bl2seq/bl2.html</url>
         </bibl>
      </refgrp>
   </bm>
</art>
