<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>gb-2003-4-11-r74</ui>
   <ji>GBJ</ji>
   <fm>
      <dochead>Research</dochead>
      <bibl>
         <title>
            <p>Whole-genome screening indicates a possible burst of formation of processed pseudogenes and Alu repeats by particular L1 subfamilies in ancestral primates</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Ohshima</snm>
               <fnm>Kazuhiko</fnm>
               <insr iid="I1"/>
            </au>
            <au id="A2">
               <snm>Hattori</snm>
               <fnm>Masahira</fnm>
               <insr iid="I2"/>
               <insr iid="I3"/>
            </au>
            <au id="A3">
               <snm>Yada</snm>
               <fnm>Tetsusi</fnm>
               <insr iid="I4"/>
            </au>
            <au id="A4">
               <snm>Gojobori</snm>
               <fnm>Takashi</fnm>
               <insr iid="I5"/>
            </au>
            <au id="A5">
               <snm>Sakaki</snm>
               <fnm>Yoshiyuki</fnm>
               <insr iid="I2"/>
               <insr iid="I4"/>
            </au>
            <au id="A6" ca="yes">
               <snm>Okada</snm>
               <fnm>Norihiro</fnm>
               <insr iid="I1"/>
               <email>nokada@bio.titech.ac.jp</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>School and Graduate School of Bioscience and Biotechnology, Tokyo Institute of Technology, 4259 Nagatsuta-cho, Midori-ku, Yokohama, Kanagawa 226-8501, Japan</p>
            </ins>
            <ins id="I2">
               <p>RIKEN Genomic Sciences Center, 1-7-22, Suehiro Tsurumi, Yokohama, Kanagawa 230-0045, Japan</p>
            </ins>
            <ins id="I3">
               <p>Laboratory of Genome Information, Kitasato Institute for Life Science, Kitasato University, 1-15-1, Kitasato, Sagamihara, Kanagawa 228-8555, Japan</p>
            </ins>
            <ins id="I4">
               <p>Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, Tokyo 108-8639, Japan</p>
            </ins>
            <ins id="I5">
               <p>Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, Yata 1111, Mishima, Shizuoka 411-8540, Japan</p>
            </ins>
         </insg>
         <source>Genome Biology</source>
         <issn>1465-6906</issn>
         <pubdate>2003</pubdate>
         <volume>4</volume>
         <issue>11</issue>
         <fpage>R74</fpage>
         <url>http://genomebiology.com/2003/4/11/R74</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">14611660</pubid>
               <pubid idtype="doi">10.1186/gb-2003-4-11-r74</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>22</day>
               <month>7</month>
               <year>2003</year>
            </date>
         </rec>
         <revrec>
            <date>
               <day>2</day>
               <month>9</month>
               <year>2003</year>
            </date>
         </revrec>
         <acc>
            <date>
               <day>25</day>
               <month>9</month>
               <year>2003</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>28</day>
               <month>10</month>
               <year>2003</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2003</year>
         <collab>Ohshima et al.; licensee BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.</collab>
      </cpyrt>
      <shorttitle>
         <p>Whole-genome screening indicates a possible burst of formation of processed pseudogenes and Alu repeats by particular L1 subfamilies in ancestral primates</p>
      </shorttitle>
      <shortabs>
         <p>The first comprehensive analysis of human processed pseudogenes (PPs) using all known human genes as queries is presented. The data suggest a nearly simultaneous burst of PP and Alu formation in the genomes of ancestral primates.</p>
      </shortabs>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Abundant pseudogenes are a feature of mammalian genomes. Processed pseudogenes (PPs) are reverse transcribed from mRNAs. Recent molecular biological studies show that mammalian long interspersed element 1 (L1)-encoded proteins may have been involved in PP reverse transcription. Here, we present the first comprehensive analysis of human PPs using all known human genes as queries.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>The human genome was queried and 3,664 candidate PPs were identified. The most abundant were copies of genes encoding keratin 18, glyceraldehyde-3-phosphate dehydrogenase and ribosomal protein L21. A simple method was developed to estimate the level of nucleotide substitutions (and therefore the age) of PPs. A Poisson-like age distribution was obtained with a mean age close to that of the Alu repeats, the predominant human short interspersed elements. These data suggest a nearly simultaneous burst of PP and Alu formation in the genomes of ancestral primates. The peak period of amplification of these two distinct retrotransposons was estimated to be 40-50 million years ago. Concordant amplification of certain L1 subfamilies with PPs and Alus was observed.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusions</p>
               </st>
               <p>We suggest that a burst of formation of PPs and Alus occurred in the genome of ancestral primates. One possible mechanism is that proteins encoded by members of particular L1 subfamilies acquired an enhanced ability to recognize cytosolic RNAs <it>in trans</it>.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="BMC" subtype="man_spc_id" id="30010010">Genome studies</classification>
         <classification type="BMC" subtype="man_spc_id" id="30010016">Molecular biology</classification>
         <classification type="BMC" subtype="man_spc_id" id="30010009">Genetics</classification>
         <classification type="BMC" subtype="man_spc_id" id="30010008">Evolution</classification>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>The abundance of pseudogenes is a remarkable feature of mammalian genomes. Aptly named, pseudogenes are copies of specific genes and are present in every mammalian chromosome <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr><abbr bid="B5">5</abbr></abbrgrp>. In general, pseudogenes are thought to be nonfunctional <abbrgrp><abbr bid="B2">2</abbr></abbrgrp> as they have accumulated vast numbers of mutations during evolution and have lost the ability to be transcribed. Pseudogenes fall into two distinct categories depending on the mechanism by which they are generated: processed pseudogenes (PPs) are reverse transcribed from mRNAs (and thus do not contain introns) whereas nonprocessed pseudogenes arise from duplications of genomic DNA <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B4">4</abbr></abbrgrp>. Among the abundant PPs, there are a substantial number of 'processed genes' or 'retrogenes' of novel function that also derive from mRNAs of various intron-containing genes <abbrgrp><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr></abbrgrp>.</p>
         <p>In addition to PPs, mammalian genomes contain a large number of retrotransposons (retroposons) that represent a reverse flow of genetic information via RNA <abbrgrp><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr></abbrgrp>. In humans, short interspersed elements (SINEs) and long interspersed elements (LINEs) occupy over 30% of the genome <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>. Progress in LINE1 (L1) molecular biology has enabled L1 'retrotransposition' studies in cultured HeLa cells <abbrgrp><abbr bid="B15">15</abbr><abbr bid="B16">16</abbr></abbrgrp>. Recent work <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr><abbr bid="B20">20</abbr><abbr bid="B21">21</abbr></abbrgrp> shows that mammalian L1-encoded proteins may have been involved in the reverse transcription of PPs and Alus <abbrgrp><abbr bid="B22">22</abbr><abbr bid="B23">23</abbr><abbr bid="B24">24</abbr><abbr bid="B25">25</abbr><abbr bid="B26">26</abbr></abbrgrp>. Furthermore, L1-encoded proteins predominantly mobilize the RNA that encodes them <abbrgrp><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr></abbrgrp>. This so-called '<it>cis </it>preference' explains why, despite the overwhelming number of nonfunctional L1 RNAs, recent mutagenic L1 insertions in humans and mice are derived from progenitor L1 RNAs that contained intact open reading frames (ORFs) <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. In fact, Moran's group estimated that a functional L1 mobilizes nonfunctional L1 RNAs and other cellular mRNAs <it>in trans </it>at frequencies of only 0.2%-0.9% and 0.01%-0.05%, respectively, relative to processes involving <it>cis </it>RNA <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>. This finding also raised the question of how human Alu repeats could have been amplified <it>in trans </it>to their present level of approximately 10% of the human genome, given that L1-encoded proteins preferentially mobilize their own transcripts. Boeke proposed that Alu RNA secondary structure could have positioned this RNA on the ribosome in a manner that promoted effective interactions with L1-encoded proteins <abbrgrp><abbr bid="B21">21</abbr><abbr bid="B27">27</abbr></abbrgrp>.</p>
         <p>The initial analysis of the human genome draft sequence by the International Human Genome Sequencing Consortium provided the first comprehensive view of retroposons such as LINEs and SINEs, although PPs were largely left undescribed <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>. The Celera report briefly described a preliminary analysis of PPs <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>. Here, we present the first comprehensive analysis of human PPs using all known human genes as queries. These PPs were derived from 6% of all annotated human genes, and our data suggest a possible burst of PP genesis early in primate evolution.</p>
      </sec>
      <sec>
         <st>
            <p>Results</p>
         </st>
         <sec>
            <st>
               <p>Whole-genome screening for human PPs and their content</p>
            </st>
            <p>We initially searched for PPs that exhibit sequence similarity to any of the transcripts from the 21,921 genes annotated by the Ensembl project <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>. The fact that PPs contain few if any introns enabled our search to generate 3,664 PP candidates (Table <tblr tid="T1">1</tblr> and Additional data file 1; pseudogenes generated by DNA duplication contained many introns and were eliminated). These candidate PPs represented a minimum set because not all human genes have yet been annotated <abbrgrp><abbr bid="B30">30</abbr></abbrgrp> and the search included only those PPs whose length is more than 90% of the respective mRNA. If the estimated 35,000 human genes <abbrgrp><abbr bid="B14">14</abbr><abbr bid="B28">28</abbr></abbrgrp> had been used in the search and shorter PPs included in the analysis, over 7,000 PPs would have been expected.</p>
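            <p>As a rough check on this extrapolation, the gene-count part of the scaling can be worked through directly. The sketch below assumes simple proportional scaling between the number of genes queried and the number of PPs found; this is an illustration only and not necessarily the procedure behind the estimate above, and the variable names are hypothetical. Under that assumption, gene number alone would account for roughly 5,850 PPs, so the rest of the 'over 7,000' figure would have to come from PPs shorter than the 90% length cutoff.</p>

```python
# Counts taken from the text: PP candidates found, Ensembl genes queried,
# and the estimated total number of human genes.
pp_candidates = 3664
genes_queried = 21921
genes_estimated = 35000

# Assumption: PPs scale in proportion to the number of genes queried.
pps_per_gene = pp_candidates / genes_queried
scaled_by_gene_count = pps_per_gene * genes_estimated

# Whatever is left of the 7,000 lower bound would need to come from
# PPs shorter than the 90% length cutoff (not modelled here).
remainder_for_short_pps = 7000 - scaled_by_gene_count

print(f"PPs per queried gene: {pps_per_gene:.3f}")
print(f"Scaled to {genes_estimated} genes: {scaled_by_gene_count:.0f}")
print(f"Left for shorter PPs: at least {remainder_for_short_pps:.0f}")
```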
            <tbl id="T1" hint_layout="single">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>Processed pseudogene content of the human genome</p>
               </caption>
               <tblbdy cols="3">
                  <r>
                     <c ca="left">
                        <p>Gene class*</p>
                     </c>
                     <c ca="center">
                        <p>Genes that generated PPs</p>
                     </c>
                     <c ca="center">
                        <p>PPs</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="3">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>Annotated genes<sup>&#8224;</sup></b>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>Enzymes</b>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Kinase</p>
                     </c>
                     <c ca="center">
                        <p>24</p>
                     </c>
                     <c ca="center">
                        <p>37</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Dehydrogenase</p>
                     </c>
                     <c ca="center">
                        <p>16</p>
                     </c>
                     <c ca="center">
                        <p>80</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Transferase</p>
                     </c>
                     <c ca="center">
                        <p>15</p>
                     </c>
                     <c ca="center">
                        <p>25</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Peptidase</p>
                     </c>
                     <c ca="center">
                        <p>10</p>
                     </c>
                     <c ca="center">
                        <p>15</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Phosphatase</p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                     <c ca="center">
                        <p>13</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Synthase</p>
                     </c>
                     <c ca="center">
                        <p>8</p>
                     </c>
                     <c ca="center">
                        <p>20</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Synthetase</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>23</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Translocase</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Protease</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Reductase</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Phospholipase</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>RNA polymerase</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Others</p>
                     </c>
                     <c ca="center">
                        <p>46</p>
                     </c>
                     <c ca="center">
                        <p>63</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Total</p>
                     </c>
                     <c ca="center">
                        <p>148</p>
                     </c>
                     <c ca="center">
                        <p>298</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>Structural proteins</b>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Ribosomal proteins</p>
                     </c>
                     <c ca="center">
                        <p>31</p>
                     </c>
                     <c ca="center">
                        <p>416</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Actin-related proteins</p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                     <c ca="center">
                        <p>23</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Keratin</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>57</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Ribosomal proteins (mitochondrial)</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Tubulin</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Histone</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Myosin</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Dynein</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Kinesin</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Total</p>
                     </c>
                     <c ca="center">
                        <p>60</p>
                     </c>
                     <c ca="center">
                        <p>522</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>Others</b>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Ligand-binding proteins<sup>&#8225;</sup></p>
                     </c>
                     <c ca="center">
                        <p>30</p>
                     </c>
                     <c ca="center">
                        <p>56</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Transcription factor<sup>&#8225;</sup></p>
                     </c>
                     <c ca="center">
                        <p>11</p>
                     </c>
                     <c ca="center">
                        <p>23</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>RNA-binding proteins<sup>&#8225;</sup></p>
                     </c>
                     <c ca="center">
                        <p>11</p>
                     </c>
                     <c ca="center">
                        <p>15</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Translation initiation/termination</p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                     <c ca="center">
                        <p>21</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Proteasome</p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                     <c ca="center">
                        <p>19</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Heat-shock protein</p>
                     </c>
                     <c ca="center">
                        <p>8</p>
                     </c>
                     <c ca="center">
                        <p>29</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Solute carrier</p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="center">
                        <p>14</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Zinc finger protein<sup>&#8225;</sup></p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="center">
                        <p>11</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Ring finger protein<sup>&#8225;</sup></p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="center">
                        <p>10</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Nuclear ribonucleoprotein<sup>&#8225;</sup></p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>19</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Autoantigen</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>12</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Receptor</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>8</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Splicing factor<sup>&#8225;</sup></p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>DEAD/H box polypeptide</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Carcinoma-associated antigen</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Channel</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>11</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Thioredoxin</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Others</p>
                     </c>
                     <c ca="center">
                        <p>295</p>
                     </c>
                     <c ca="center">
                        <p>464</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Total</p>
                     </c>
                     <c ca="center">
                        <p>431</p>
                     </c>
                     <c ca="center">
                        <p>732</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Total annotated genes</p>
                     </c>
                     <c ca="center">
                        <p>639</p>
                     </c>
                     <c ca="center">
                        <p>1,552</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>Hypothetical genes<sup>&#167;</sup></b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>660</p>
                     </c>
                     <c ca="center">
                        <p>2,112</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Grand total</p>
                     </c>
                     <c ca="center">
                        <p>1,299</p>
                     </c>
                     <c ca="center">
                        <p>3,664</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>*The functional annotation of the NCBI Reference Sequence (RefSeq) collection (v2003.01.06) was used for this classification <abbrgrp><abbr bid="B61">61</abbr></abbrgrp>. Each gene was classified into only one category. <sup>&#8224;</sup>Ensembl gene transcripts (v1.1.0) that correspond to the RefSeq collection (v2003.01.06). <sup>&#8225;</sup>These seven gene classes were classified as 'Ligand binding' in Figure <figr fid="F1">1a,b</figr> for simplicity. <sup>&#167;</sup>Ensembl gene transcripts (v1.1.0) that do not correspond to the RefSeq collection (v2003.01.06).</p>
               </tblfn>
            </tbl>
            <p>Parental genes of human PPs are of various types, including those for enzymes, structural proteins and regulatory proteins such as ligand-binding proteins and transcription factors (Table <tblr tid="T1">1</tblr>). Of the total PPs analyzed, the relative frequencies of those derived from genes encoding enzymes, structural proteins and ligand-binding proteins were 19%, 34% and 9%, respectively (Figure <figr fid="F1">1b</figr>), whereas the PP parental genes for structural proteins constituted only 9% of the total parental genes (Figure <figr fid="F1">1a</figr>). Among the 1,299 parental genes identified in this study, kinases, ribosomal proteins and ligand-binding proteins were predominant.</p>
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>Difference between the profiles of the PP parental genes and PPs in the human genome</p>
               </caption>
               <text>
                  <p>Difference between the profiles of the PP parental genes and PPs in the human genome. <b>(a) </b>Classifications of the PP parental genes. <b>(b) </b>Classifications of the PPs. Gene classes were based on the functional annotation of the NCBI Reference Sequence collection [61] for the respective genes (see Table <tblr tid="T1">1</tblr>) and were further integrated into four main classes. Ligand-binding proteins, transcription factors, RNA-binding proteins, zinc finger proteins, ring finger proteins, nuclear ribonucleoproteins and splicing factors were classified as 'Ligand binding'.</p>
               </text>
               <graphic file="gb-2003-4-11-r74-1"/>
            </fig>
            <p>Table <tblr tid="T2">2</tblr> shows a compilation of the abundant PPs in the human genome (see Additional data file 2). The three most abundant types of human PPs were derived from the genes for keratin 18, glyceraldehyde-3-phosphate dehydrogenase (GAPD) and ribosomal protein L21 (RPL21). These genes generated at least 52, 43 and 38 copies of PPs, respectively, in the genome. Keratin 18 is commonly expressed in internal epithelia and is one of the earliest intermediate filament proteins expressed during embryogenesis <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>. The genes for GAPD and ribosomal proteins are housekeeping genes. These data suggest that mRNAs for keratin 18, GAPD and RPL21 were highly expressed or stable either in germline cells or at an early stage of development, because heritable copies of these genes must have been reverse transcribed in one of those two settings.</p>
            <tbl id="T2" hint_layout="double">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p>The most abundant PPs in the human genome</p>
               </caption>
               <tblbdy cols="7">
                  <r>
                     <c ca="left">
                        <p>PP number*</p>
                     </c>
                     <c ca="left">
                        <p>Ensembl ID</p>
                     </c>
                     <c ca="left">
                        <p>RefSeq ID</p>
                     </c>
                     <c ca="left">
                        <p>Gene name</p>
                     </c>
                     <c ca="center">
                        <p>mRNA (bases)<sup>&#8224;</sup></p>
                     </c>
                     <c ca="center">
                        <p>GC content<sup>&#8225;</sup></p>
                     </c>
                     <c ca="center">
                        <p>Chromosome</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>52</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000228652</p>
                     </c>
                     <c ca="left">
                        <p>NM_000224</p>
                     </c>
                     <c ca="left">
                        <p>Keratin 18 (KRT18)</p>
                     </c>
                     <c ca="center">
                        <p>1,311</p>
                     </c>
                     <c ca="center">
                        <p>0.59</p>
                     </c>
                     <c ca="center">
                        <p>12</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>43</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000229239</p>
                     </c>
                     <c ca="left">
                        <p>NM_002046</p>
                     </c>
                     <c ca="left">
                        <p>Glyceraldehyde-3-phosphate dehydrogenase (GAPD)</p>
                     </c>
                     <c ca="center">
                        <p>975</p>
                     </c>
                     <c ca="center">
                        <p>0.55</p>
                     </c>
                     <c ca="center">
                        <p>12</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>38</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000241454</p>
                     </c>
                     <c ca="left">
                        <p>NM_000982</p>
                     </c>
                     <c ca="left">
                        <p>Ribosomal protein L21 (RPL21)</p>
                     </c>
                     <c ca="center">
                        <p>623</p>
                     </c>
                     <c ca="center">
                        <p>0.41</p>
                     </c>
                     <c ca="center">
                        <p>13</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>36</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000264258</p>
                     </c>
                     <c ca="left">
                        <p>NM_000993</p>
                     </c>
                     <c ca="left">
                        <p>Ribosomal protein L31 (RPL31)</p>
                     </c>
                     <c ca="center">
                        <p>412</p>
                     </c>
                     <c ca="center">
                        <p>0.46</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>32</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000226734</p>
                     </c>
                     <c ca="left">
                        <p>NM_000995</p>
                     </c>
                     <c ca="left">
                        <p>Ribosomal protein L34 (RPL34)</p>
                     </c>
                     <c ca="center">
                        <p>382</p>
                     </c>
                     <c ca="center">
                        <p>0.44</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>31</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000256818</p>
                     </c>
                     <c ca="left">
                        <p>NM_001019</p>
                     </c>
                     <c ca="left">
                        <p>Ribosomal protein S15a (RPS15A)</p>
                     </c>
                     <c ca="center">
                        <p>440</p>
                     </c>
                     <c ca="center">
                        <p>0.45</p>
                     </c>
                     <c ca="center">
                        <p>16</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>23</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000202773</p>
                     </c>
                     <c ca="left">
                        <p>NM_000970</p>
                     </c>
                     <c ca="left">
                        <p>Ribosomal protein L6 (RPL6)</p>
                     </c>
                     <c ca="center">
                        <p>861</p>
                     </c>
                     <c ca="center">
                        <p>0.47</p>
                     </c>
                     <c ca="center">
                        <p>12</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>23</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000241929</p>
                     </c>
                     <c ca="left">
                        <p>NM_000969</p>
                     </c>
                     <c ca="left">
                        <p>Ribosomal protein L5 (RPL5)</p>
                     </c>
                     <c ca="center">
                        <p>951</p>
                     </c>
                     <c ca="center">
                        <p>0.43</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>21</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000255320</p>
                     </c>
                     <c ca="left">
                        <p>NM_002128</p>
                     </c>
                     <c ca="left">
                        <p>High-mobility group box 1 (HMGB1)</p>
                     </c>
                     <c ca="center">
                        <p>971</p>
                     </c>
                     <c ca="center">
                        <p>0.41</p>
                     </c>
                     <c ca="center">
                        <p>13</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>20</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000245458</p>
                     </c>
                     <c ca="left">
                        <p>NM_001032</p>
                     </c>
                     <c ca="left">
                        <p>Ribosomal protein S29 (RPS29)</p>
                     </c>
                     <c ca="center">
                        <p>195</p>
                     </c>
                     <c ca="center">
                        <p>0.53</p>
                     </c>
                     <c ca="center">
                        <p>14</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>18</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000260896</p>
                     </c>
                     <c ca="left">
                        <p>NM_001026</p>
                     </c>
                     <c ca="left">
                        <p>Ribosomal protein S24 (RPS24)</p>
                     </c>
                     <c ca="center">
                        <p>390</p>
                     </c>
                     <c ca="center">
                        <p>0.44</p>
                     </c>
                     <c ca="center">
                        <p>10</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>17</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000009589</p>
                     </c>
                     <c ca="left">
                        <p>NM_001023</p>
                     </c>
                     <c ca="left">
                        <p>Ribosomal protein S20 (RPS20)</p>
                     </c>
                     <c ca="center">
                        <p>504</p>
                     </c>
                     <c ca="center">
                        <p>0.47</p>
                     </c>
                     <c ca="center">
                        <p>8</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>16</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000225430</p>
                     </c>
                     <c ca="left">
                        <p>NM_000981</p>
                     </c>
                     <c ca="left">
                        <p>Ribosomal protein L19 (RPL19)</p>
                     </c>
                     <c ca="center">
                        <p>667</p>
                     </c>
                     <c ca="center">
                        <p>0.52</p>
                     </c>
                     <c ca="center">
                        <p>17</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>14</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000230050</p>
                     </c>
                     <c ca="left">
                        <p>NM_001016</p>
                     </c>
                     <c ca="left">
                        <p>Ribosomal protein S12 (RPS12)</p>
                     </c>
                     <c ca="center">
                        <p>493</p>
                     </c>
                     <c ca="center">
                        <p>0.49</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>12</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000253004</p>
                     </c>
                     <c ca="left">
                        <p>NM_054012</p>
                     </c>
                     <c ca="left">
                        <p>Argininosuccinate synthetase (ASS)</p>
                     </c>
                     <c ca="center">
                        <p>1,245</p>
                     </c>
                     <c ca="center">
                        <p>0.56</p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>12</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000216296</p>
                     </c>
                     <c ca="left">
                        <p>NM_004500</p>
                     </c>
                     <c ca="left">
                        <p>Heterogeneous nuclear ribonucleoprotein C (C1/C2) (HNRPC)</p>
                     </c>
                     <c ca="center">
                        <p>1,588</p>
                     </c>
                     <c ca="center">
                        <p>0.43</p>
                     </c>
                     <c ca="center">
                        <p>14</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>12</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000211372</p>
                     </c>
                     <c ca="left">
                        <p>NM_022551</p>
                     </c>
                     <c ca="left">
                        <p>Ribosomal protein S18 (RPS18)</p>
                     </c>
                     <c ca="center">
                        <p>494</p>
                     </c>
                     <c ca="center">
                        <p>0.51</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>11</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000263097</p>
                     </c>
                     <c ca="left">
                        <p>NM_004368</p>
                     </c>
                     <c ca="left">
                        <p>Calponin 2 (CNN2)</p>
                     </c>
                     <c ca="center">
                        <p>882</p>
                     </c>
                     <c ca="center">
                        <p>0.61</p>
                     </c>
                     <c ca="center">
                        <p>19</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>11</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000253788</p>
                     </c>
                     <c ca="left">
                        <p>NM_000988</p>
                     </c>
                     <c ca="left">
                        <p>Ribosomal protein L27 (RPL27)</p>
                     </c>
                     <c ca="center">
                        <p>450</p>
                     </c>
                     <c ca="center">
                        <p>0.46</p>
                     </c>
                     <c ca="center">
                        <p>17</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>11</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000259689</p>
                     </c>
                     <c ca="left">
                        <p>NM_001010</p>
                     </c>
                     <c ca="left">
                        <p>Ribosomal protein S6 (RPS6)</p>
                     </c>
                     <c ca="center">
                        <p>784</p>
                     </c>
                     <c ca="center">
                        <p>0.46</p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>11</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000260379</p>
                     </c>
                     <c ca="left">
                        <p>NM_001003</p>
                     </c>
                     <c ca="left">
                        <p>Ribosomal protein, large, P1 (RPLP1)</p>
                     </c>
                     <c ca="center">
                        <p>510</p>
                     </c>
                     <c ca="center">
                        <p>0.56</p>
                     </c>
                     <c ca="center">
                        <p>15</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>11</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000011649</p>
                     </c>
                     <c ca="left">
                        <p>NM_007104</p>
                     </c>
                     <c ca="left">
                        <p>Ribosomal protein L10a (RPL10A)</p>
                     </c>
                     <c ca="center">
                        <p>682</p>
                     </c>
                     <c ca="center">
                        <p>0.51</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>10</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000255477</p>
                     </c>
                     <c ca="left">
                        <p>NM_003295</p>
                     </c>
                     <c ca="left">
                        <p>Tumor protein, translationally-controlled 1 (TPT1)</p>
                     </c>
                     <c ca="center">
                        <p>829</p>
                     </c>
                     <c ca="center">
                        <p>0.45</p>
                     </c>
                     <c ca="center">
                        <p>13</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>10</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000227378</p>
                     </c>
                     <c ca="left">
                        <p>NM_006597</p>
                     </c>
                     <c ca="left">
                        <p>Heat shock 70 kDa protein 8 (HSPA8)</p>
                     </c>
                     <c ca="center">
                        <p>1,938</p>
                     </c>
                     <c ca="center">
                        <p>0.46</p>
                     </c>
                     <c ca="center">
                        <p>11</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>10</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000218437</p>
                     </c>
                     <c ca="left">
                        <p>NM_001007</p>
                     </c>
                     <c ca="left">
                        <p>Ribosomal protein S4, X-linked (RPS4X)</p>
                     </c>
                     <c ca="center">
                        <p>853</p>
                     </c>
                     <c ca="center">
                        <p>0.48</p>
                     </c>
                     <c ca="center">
                        <p>X</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>9</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000220072</p>
                     </c>
                     <c ca="left">
                        <p>NM_001021</p>
                     </c>
                     <c ca="left">
                        <p>Ribosomal protein S17 (RPS17)</p>
                     </c>
                     <c ca="center">
                        <p>453</p>
                     </c>
                     <c ca="center">
                        <p>0.49</p>
                     </c>
                     <c ca="center">
                        <p>15</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>9</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000265385</p>
                     </c>
                     <c ca="left">
                        <p>NM_000883</p>
                     </c>
                     <c ca="left">
                        <p>IMP (inosine monophosphate) dehydrogenase 1 (IMPDH1)</p>
                     </c>
                     <c ca="center">
                        <p>1,425</p>
                     </c>
                     <c ca="center">
                        <p>0.59</p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>9</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000265264</p>
                     </c>
                     <c ca="left">
                        <p>NM_000986</p>
                     </c>
                     <c ca="left">
                        <p>Ribosomal protein L24 (RPL24)</p>
                     </c>
                     <c ca="center">
                        <p>447</p>
                     </c>
                     <c ca="center">
                        <p>0.48</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>9</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000216146</p>
                     </c>
                     <c ca="left">
                        <p>NM_000967</p>
                     </c>
                     <c ca="left">
                        <p>Ribosomal protein L3 (RPL3)</p>
                     </c>
                     <c ca="center">
                        <p>1,265</p>
                     </c>
                     <c ca="center">
                        <p>0.54</p>
                     </c>
                     <c ca="center">
                        <p>22</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>8</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000196551</p>
                     </c>
                     <c ca="left">
                        <p>NM_001009</p>
                     </c>
                     <c ca="left">
                        <p>Ribosomal protein S5 (RPS5)</p>
                     </c>
                     <c ca="center">
                        <p>720</p>
                     </c>
                     <c ca="center">
                        <p>0.58</p>
                     </c>
                     <c ca="center">
                        <p>19</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>8</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000228140</p>
                     </c>
                     <c ca="left">
                        <p>NM_001017</p>
                     </c>
                     <c ca="left">
                        <p>Ribosomal protein S13 (RPS13)</p>
                     </c>
                     <c ca="center">
                        <p>495</p>
                     </c>
                     <c ca="center">
                        <p>0.45</p>
                     </c>
                     <c ca="center">
                        <p>11</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>7</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000221267</p>
                     </c>
                     <c ca="left">
                        <p>NM_003333</p>
                     </c>
                     <c ca="left">
                        <p>Ubiquitin A-52 residue ribosomal protein fusion product 1 (UBA52)</p>
                     </c>
                     <c ca="center">
                        <p>384</p>
                     </c>
                     <c ca="center">
                        <p>0.53</p>
                     </c>
                     <c ca="center">
                        <p>19</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>7</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000233609</p>
                     </c>
                     <c ca="left">
                        <p>NM_001018</p>
                     </c>
                     <c ca="left">
                        <p>Ribosomal protein S15 (RPS15)</p>
                     </c>
                     <c ca="center">
                        <p>469</p>
                     </c>
                     <c ca="center">
                        <p>0.62</p>
                     </c>
                     <c ca="center">
                        <p>19</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>7</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000257522</p>
                     </c>
                     <c ca="left">
                        <p>NM_030940</p>
                     </c>
                     <c ca="left">
                        <p>Hypothetical protein MGC4276 similar to CG8198 (MGC4276)</p>
                     </c>
                     <c ca="center">
                        <p>255</p>
                     </c>
                     <c ca="center">
                        <p>0.38</p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>7</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000265333</p>
                     </c>
                     <c ca="left">
                        <p>NM_003374</p>
                     </c>
                     <c ca="left">
                        <p>Voltage-dependent anion channel 1 (VDAC1)</p>
                     </c>
                     <c ca="center">
                        <p>1,498</p>
                     </c>
                     <c ca="center">
                        <p>0.45</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>7</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000236900</p>
                     </c>
                     <c ca="left">
                        <p>NM_001028</p>
                     </c>
                     <c ca="left">
                        <p>Ribosomal protein S25 (RPS25)</p>
                     </c>
                     <c ca="center">
                        <p>426</p>
                     </c>
                     <c ca="center">
                        <p>0.45</p>
                     </c>
                     <c ca="center">
                        <p>11</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>7</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000264254</p>
                     </c>
                     <c ca="left">
                        <p>NM_024065</p>
                     </c>
                     <c ca="left">
                        <p>Hypothetical protein MGC3062 (MGC3062)</p>
                     </c>
                     <c ca="center">
                        <p>955</p>
                     </c>
                     <c ca="center">
                        <p>0.42</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>7</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000246201</p>
                     </c>
                     <c ca="left">
                        <p>NM_003908</p>
                     </c>
                     <c ca="left">
                        <p>Eukaryotic translation initiation factor 2, subunit 2 beta (EIF2S2)</p>
                     </c>
                     <c ca="center">
                        <p>1,300</p>
                     </c>
                     <c ca="center">
                        <p>0.39</p>
                     </c>
                     <c ca="center">
                        <p>20</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>6</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000245206</p>
                     </c>
                     <c ca="left">
                        <p>NM_002080</p>
                     </c>
                     <c ca="left">
                        <p>Glutamic-oxaloacetic transaminase 2, mitochondrial (GOT2)</p>
                     </c>
                     <c ca="center">
                        <p>2,331</p>
                     </c>
                     <c ca="center">
                        <p>0.49</p>
                     </c>
                     <c ca="center">
                        <p>16</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>6</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000238591</p>
                     </c>
                     <c ca="left">
                        <p>NM_015962</p>
                     </c>
                     <c ca="left">
                        <p>CGI-35 protein (CGI-35)</p>
                     </c>
                     <c ca="center">
                        <p>1,019</p>
                     </c>
                     <c ca="center">
                        <p>0.37</p>
                     </c>
                     <c ca="center">
                        <p>14</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>6</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000249380</p>
                     </c>
                     <c ca="left">
                        <p>NM_005000</p>
                     </c>
                     <c ca="left">
                        <p>NADH dehydrogenase 1 alpha subcomplex, 5, 13 kDa (NDUFA5)</p>
                     </c>
                     <c ca="center">
                        <p>339</p>
                     </c>
                     <c ca="center">
                        <p>0.41</p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>6</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000228825</p>
                     </c>
                     <c ca="left">
                        <p>NM_005719</p>
                     </c>
                     <c ca="left">
                        <p>Actin-related protein 2/3 complex, subunit 3, 21 kDa (ARPC3)</p>
                     </c>
                     <c ca="center">
                        <p>786</p>
                     </c>
                     <c ca="center">
                        <p>0.41</p>
                     </c>
                     <c ca="center">
                        <p>12</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>6</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000261565</p>
                     </c>
                     <c ca="left">
                        <p>NM_003187</p>
                     </c>
                     <c ca="left">
                        <p>TATA box binding protein (TBP)-associated factor, 32 kDa (TAF9)</p>
                     </c>
                     <c ca="center">
                        <p>833</p>
                     </c>
                     <c ca="center">
                        <p>0.34</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>6</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000227157</p>
                     </c>
                     <c ca="left">
                        <p>NM_005566</p>
                     </c>
                     <c ca="left">
                        <p>Lactate dehydrogenase A (LDHA)</p>
                     </c>
                     <c ca="center">
                        <p>1,589</p>
                     </c>
                     <c ca="center">
                        <p>0.43</p>
                     </c>
                     <c ca="center">
                        <p>11</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>6</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000264221</p>
                     </c>
                     <c ca="left">
                        <p>NM_006452</p>
                     </c>
                     <c ca="left">
                        <p>Phosphoribosylaminoimidazole carboxylase, (PAICS)</p>
                     </c>
                     <c ca="center">
                        <p>1,385</p>
                     </c>
                     <c ca="center">
                        <p>0.41</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>6</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000037869</p>
                     </c>
                     <c ca="left">
                        <p>NM_032822</p>
                     </c>
                     <c ca="left">
                        <p>Hypothetical protein FLJ14668 (FLJ14668)</p>
                     </c>
                     <c ca="center">
                        <p>414</p>
                     </c>
                     <c ca="center">
                        <p>0.56</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>6</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000235094</p>
                     </c>
                     <c ca="left">
                        <p>NM_001688</p>
                     </c>
                     <c ca="left">
                        <p>ATP synthase, mitochondrial F0 complex, subunit b (ATP5F1)</p>
                     </c>
                     <c ca="center">
                        <p>1,104</p>
                     </c>
                     <c ca="center">
                        <p>0.43</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>6</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000234875</p>
                     </c>
                     <c ca="left">
                        <p>NM_000983</p>
                     </c>
                     <c ca="left">
                        <p>Ribosomal protein L22 (RPL22)</p>
                     </c>
                     <c ca="center">
                        <p>541</p>
                     </c>
                     <c ca="center">
                        <p>0.41</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>6</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000005593</p>
                     </c>
                     <c ca="left">
                        <p>NM_001152</p>
                     </c>
                     <c ca="left">
                        <p>Solute carrier family 25, member 5 (SLC25A5)</p>
                     </c>
                     <c ca="center">
                        <p>894</p>
                     </c>
                     <c ca="center">
                        <p>0.52</p>
                     </c>
                     <c ca="center">
                        <p>X</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>5</p>
                     </c>
                     <c ca="left">
                        <p>ENST00000216252</p>
                     </c>
                     <c ca="left">
                        <p>NM_032758</p>
                     </c>
                     <c ca="left">
                        <p>PHD finger protein 5A (PHF5A)</p>
                     </c>
                     <c ca="center">
                        <p>330</p>
                     </c>
                     <c ca="center">
                        <p>0.48</p>
                     </c>
                     <c ca="center">
                        <p>22</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>*The number of PPs that were derived from respective genes. The top 50 genes are shown. <sup>&#8224;</sup>Length of the Ensembl gene transcripts (v1.1.0). <sup>&#8225;</sup>GC content of the Ensembl gene transcripts (v1.1.0). The list of all the genes is available as Additional data file 2.</p>
               </tblfn>
            </tbl>
            <p>As shown in Figures <figr fid="F1">1a</figr> and <figr fid="F1">1b</figr>, structural-protein PPs constitute the largest class (34%). The 50 most prolific PP parental genes include 25 ribosomal protein genes (Table <tblr tid="T2">2</tblr>), which contribute substantially to the high proportion of structural proteins among the PPs presented in Figure <figr fid="F1">1b</figr>.</p>
         </sec>
         <sec>
            <st>
               <p>GC content in PP parental genes</p>
            </st>
            <p>Human PPs are derived from mRNAs that exhibit a wide range of GC content. We examined the possible relationship between the number of PPs derived from a gene and the GC content of its mRNA (Figure <figr fid="F2">2</figr>). The rates of PP generation from parental genes do not differ significantly among the GC groups, except for genes of high GC content (> 0.62). This result differs from that of a previous study, in which an inverse correlation between the number of ribosomal protein PPs and the GC content of the parental genes was observed <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>. Because we analyzed a wide variety of PPs, including those of ribosomal protein genes, the correlation observed in the previous study probably reflects a specific correlation between GC content and either the expression level or the stability of ribosomal protein mRNAs.</p>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>GC content of the PP parental genes and the number of PP copies of those genes</p>
               </caption>
               <text>
                  <p>GC content of the PP parental genes and the number of PP copies of those genes. The total number of PP parental genes having a given GC content is shown as individual bars in increments of 4%. The PP-generation rate (the PP number/gene) is shown as a line that connects averages for respective groups. The vertical error bars indicate standard error of the mean.</p>
               </text>
               <graphic file="gb-2003-4-11-r74-2"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>Chromosomal distribution of human PPs</p>
            </st>
            <p>The 3,664 PP candidates are distributed throughout all 24 chromosomes and were derived from genes on various chromosomes (Table <tblr tid="T3">3</tblr> and Figure <figr fid="F3">3</figr>). No stringent bias of gene 'projections' (that is, the insertion of the PP of a gene in a specific chromosomal location) toward specific chromosomes was observed. In some chromosomes, however (for example, chromosome 19), the ratios of self-projection are relatively high. Interestingly, the PP density within each chromosome roughly parallels gene density (Figure <figr fid="F4">4</figr>). For example, chromosomes that are gene-rich, such as 19, 17 and 11 <abbrgrp><abbr bid="B14">14</abbr><abbr bid="B28">28</abbr></abbrgrp>, tend to be relatively PP-rich. On the other hand, chromosomes that are gene-poor, such as Y, 21, 13 and 4 <abbrgrp><abbr bid="B14">14</abbr><abbr bid="B28">28</abbr></abbrgrp>, also tend to be PP-poor. As human gene density shows a strong positive correlation with local GC content <abbrgrp><abbr bid="B14">14</abbr><abbr bid="B28">28</abbr></abbrgrp>, this result suggests that the integration of PPs into chromosomes may in general depend on aspects of the genomic environment that are strictly related to chromosomal gene density <abbrgrp><abbr bid="B14">14</abbr><abbr bid="B28">28</abbr></abbrgrp>, such as local GC content and an open chromatin structure that facilitates transcription.</p>
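            <p>As a rough check on the parallel between PP density and gene density, the following minimal Python sketch computes a Pearson correlation coefficient from the Genes/Mb and PPs/Mb columns of Table <tblr tid="T3">3</tblr> for chromosomes 1 to 22. It is an illustration only and not part of the original analysis; the choice of Pearson's r, the function name pearson_r and the restriction to the autosomes are assumptions made here.</p>

```python
from math import sqrt

# Genes/Mb and PPs/Mb for chromosomes 1 to 22, transcribed from Table 3.
# The sex chromosomes are left out of this sketch (an assumption, for brevity).
genes_per_mb = [8.90, 6.31, 5.94, 4.33, 5.61, 7.07, 7.59, 5.23, 6.57, 6.56, 9.98,
                8.25, 3.61, 7.33, 7.65, 10.15, 14.44, 4.43, 20.80, 10.15, 5.20, 12.14]
pps_per_mb = [1.28, 0.98, 1.04, 0.81, 0.99, 1.12, 1.06, 0.95, 1.05, 1.24, 1.57,
              1.59, 0.80, 1.36, 1.13, 0.92, 1.43, 0.88, 1.70, 0.93, 0.76, 1.30]


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


r = pearson_r(genes_per_mb, pps_per_mb)
print(f"Pearson r between genes/Mb and PPs/Mb (chromosomes 1-22): {r:.2f}")
```

            <p>A coefficient well above zero would be consistent with the rough parallel described above. Values for the sex chromosomes could be appended to both lists in the same way.</p>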
            <tbl id="T3" hint_layout="double">
               <title>
                  <p>Table 3</p>
               </title>
               <caption>
                  <p>Chromosomal distribution and density of human PPs</p>
               </caption>
               <tblbdy cols="6">
                  <r>
                     <c ca="left">
                        <p>Chromosome</p>
                     </c>
                     <c ca="center">
                        <p>PPs</p>
                     </c>
                     <c ca="center">
                        <p>Genes that generated PPs</p>
                     </c>
                     <c ca="center">
                        <p>Number of genes (Ensembl 4.28.1)</p>
                     </c>
                     <c ca="center">
                        <p>Genes/Mb*</p>
                     </c>
                     <c ca="center">
                        <p>PPs/Mb</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>All</p>
                     </c>
                     <c ca="center">
                        <p>3,664</p>
                     </c>
                     <c ca="center">
                        <p>1,299</p>
                     </c>
                     <c ca="center">
                        <p>23,863</p>
                     </c>
                     <c ca="center">
                        <p>7.33</p>
                     </c>
                     <c ca="center">
                        <p>1.12</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>359</p>
                     </c>
                     <c ca="center">
                        <p>117</p>
                     </c>
                     <c ca="center">
                        <p>2,482</p>
                     </c>
                     <c ca="center">
                        <p>8.90</p>
                     </c>
                     <c ca="center">
                        <p>1.28</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>241</p>
                     </c>
                     <c ca="center">
                        <p>84</p>
                     </c>
                     <c ca="center">
                        <p>1,550</p>
                     </c>
                     <c ca="center">
                        <p>6.31</p>
                     </c>
                     <c ca="center">
                        <p>0.98</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>225</p>
                     </c>
                     <c ca="center">
                        <p>76</p>
                     </c>
                     <c ca="center">
                        <p>1,277</p>
                     </c>
                     <c ca="center">
                        <p>5.94</p>
                     </c>
                     <c ca="center">
                        <p>1.04</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>163</p>
                     </c>
                     <c ca="center">
                        <p>57</p>
                     </c>
                     <c ca="center">
                        <p>868</p>
                     </c>
                     <c ca="center">
                        <p>4.33</p>
                     </c>
                     <c ca="center">
                        <p>0.81</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>193</p>
                     </c>
                     <c ca="center">
                        <p>72</p>
                     </c>
                     <c ca="center">
                        <p>1,093</p>
                     </c>
                     <c ca="center">
                        <p>5.61</p>
                     </c>
                     <c ca="center">
                        <p>0.99</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>207</p>
                     </c>
                     <c ca="center">
                        <p>64</p>
                     </c>
                     <c ca="center">
                        <p>1,297</p>
                     </c>
                     <c ca="center">
                        <p>7.07</p>
                     </c>
                     <c ca="center">
                        <p>1.12</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>7</p>
                     </c>
                     <c ca="center">
                        <p>176</p>
                     </c>
                     <c ca="center">
                        <p>72</p>
                     </c>
                     <c ca="center">
                        <p>1,251</p>
                     </c>
                     <c ca="center">
                        <p>7.59</p>
                     </c>
                     <c ca="center">
                        <p>1.06</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>8</p>
                     </c>
                     <c ca="center">
                        <p>144</p>
                     </c>
                     <c ca="center">
                        <p>47</p>
                     </c>
                     <c ca="center">
                        <p>787</p>
                     </c>
                     <c ca="center">
                        <p>5.23</p>
                     </c>
                     <c ca="center">
                        <p>0.95</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>9</p>
                     </c>
                     <c ca="center">
                        <p>150</p>
                     </c>
                     <c ca="center">
                        <p>49</p>
                     </c>
                     <c ca="center">
                        <p>934</p>
                     </c>
                     <c ca="center">
                        <p>6.57</p>
                     </c>
                     <c ca="center">
                        <p>1.05</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>10</p>
                     </c>
                     <c ca="center">
                        <p>178</p>
                     </c>
                     <c ca="center">
                        <p>45</p>
                     </c>
                     <c ca="center">
                        <p>939</p>
                     </c>
                     <c ca="center">
                        <p>6.56</p>
                     </c>
                     <c ca="center">
                        <p>1.24</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>11</p>
                     </c>
                     <c ca="center">
                        <p>238</p>
                     </c>
                     <c ca="center">
                        <p>79</p>
                     </c>
                     <c ca="center">
                        <p>1,506</p>
                     </c>
                     <c ca="center">
                        <p>9.98</p>
                     </c>
                     <c ca="center">
                        <p>1.57</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>12</p>
                     </c>
                     <c ca="center">
                        <p>234</p>
                     </c>
                     <c ca="center">
                        <p>78</p>
                     </c>
                     <c ca="center">
                        <p>1,212</p>
                     </c>
                     <c ca="center">
                        <p>8.25</p>
                     </c>
                     <c ca="center">
                        <p>1.59</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>13</p>
                     </c>
                     <c ca="center">
                        <p>94</p>
                     </c>
                     <c ca="center">
                        <p>24</p>
                     </c>
                     <c ca="center">
                        <p>425</p>
                     </c>
                     <c ca="center">
                        <p>3.61</p>
                     </c>
                     <c ca="center">
                        <p>0.80</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>14</p>
                     </c>
                     <c ca="center">
                        <p>146</p>
                     </c>
                     <c ca="center">
                        <p>45</p>
                     </c>
                     <c ca="center">
                        <p>785</p>
                     </c>
                     <c ca="center">
                        <p>7.33</p>
                     </c>
                     <c ca="center">
                        <p>1.36</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>15</p>
                     </c>
                     <c ca="center">
                        <p>114</p>
                     </c>
                     <c ca="center">
                        <p>41</p>
                     </c>
                     <c ca="center">
                        <p>770</p>
                     </c>
                     <c ca="center">
                        <p>7.65</p>
                     </c>
                     <c ca="center">
                        <p>1.13</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>16</p>
                     </c>
                     <c ca="center">
                        <p>95</p>
                     </c>
                     <c ca="center">
                        <p>54</p>
                     </c>
                     <c ca="center">
                        <p>1,040</p>
                     </c>
                     <c ca="center">
                        <p>10.15</p>
                     </c>
                     <c ca="center">
                        <p>0.92</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>17</p>
                     </c>
                     <c ca="center">
                        <p>126</p>
                     </c>
                     <c ca="center">
                        <p>63</p>
                     </c>
                     <c ca="center">
                        <p>1,272</p>
                     </c>
                     <c ca="center">
                        <p>14.44</p>
                     </c>
                     <c ca="center">
                        <p>1.43</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>18</p>
                     </c>
                     <c ca="center">
                        <p>74</p>
                     </c>
                     <c ca="center">
                        <p>20</p>
                     </c>
                     <c ca="center">
                        <p>370</p>
                     </c>
                     <c ca="center">
                        <p>4.43</p>
                     </c>
                     <c ca="center">
                        <p>0.88</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>19</p>
                     </c>
                     <c ca="center">
                        <p>123</p>
                     </c>
                     <c ca="center">
                        <p>78</p>
                     </c>
                     <c ca="center">
                        <p>1,504</p>
                     </c>
                     <c ca="center">
                        <p>20.80</p>
                     </c>
                     <c ca="center">
                        <p>1.70</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>20</p>
                     </c>
                     <c ca="center">
                        <p>59</p>
                     </c>
                     <c ca="center">
                        <p>27</p>
                     </c>
                     <c ca="center">
                        <p>640</p>
                     </c>
                     <c ca="center">
                        <p>10.15</p>
                     </c>
                     <c ca="center">
                        <p>0.93</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>21</p>
                     </c>
                     <c ca="center">
                        <p>34</p>
                     </c>
                     <c ca="center">
                        <p>11</p>
                     </c>
                     <c ca="center">
                        <p>232</p>
                     </c>
                     <c ca="center">
                        <p>5.20</p>
                     </c>
                     <c ca="center">
                        <p>0.76</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>22</p>
                     </c>
                     <c ca="center">
                        <p>62</p>
                     </c>
                     <c ca="center">
                        <p>34</p>
                     </c>
                     <c ca="center">
                        <p>577</p>
                     </c>
                     <c ca="center">
                        <p>12.14</p>
                     </c>
                     <c ca="center">
                        <p>1.30</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>X</p>
                     </c>
                     <c ca="center">
                        <p>207</p>
                     </c>
                     <c ca="center">
                        <p>51</p>
                     </c>
                     <c ca="center">
                        <p>922</p>
                     </c>
                     <c ca="center">
                        <p>5.84</p>
                     </c>
                     <c ca="center">
                        <p>1.31</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Y</p>
                     </c>
                     <c ca="center">
                        <p>22</p>
                     </c>
                     <c ca="center">
                        <p>11</p>
                     </c>
                     <c ca="center">
                        <p>130</p>
                     </c>
                     <c ca="center">
                        <p>2.53</p>
                     </c>
                     <c ca="center">
                        <p>0.42</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>*The number of Ensembl genes per megabase.</p>
               </tblfn>
            </tbl>
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>Chromosomal origins of human PPs</p>
               </caption>
               <text>
                  <p>Chromosomal origins of human PPs. Individual bars indicate the total number of PPs in each chromosome. The different colors represent the chromosomal origins of the PPs.</p>
               </text>
               <graphic file="gb-2003-4-11-r74-3"/>
            </fig>
            <fig id="F4">
               <title>
                  <p>Figure 4</p>
               </title>
               <caption>
                  <p>PP and gene density within each chromosome</p>
               </caption>
               <text>
                  <p>PP and gene density within each chromosome. For each chromosome, the number of PPs per megabase is plotted against the number of genes per megabase.</p>
               </text>
               <graphic file="gb-2003-4-11-r74-4"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>A simple method for estimating the level of nucleotide substitutions in PPs</p>
            </st>
            <p>To approximate the age of each PP, we developed a method for estimating the level of nucleotide substitutions relative to the parental gene. Initially, this method corrected for the sequence divergence value (a consequence of nucleotide-substitution processes) by removing the contribution of mutations at CpG sites. The C-to-T transition rate in CpG pairs is around 12-fold higher than the rate for other transitions <abbrgrp><abbr bid="B33">33</abbr></abbrgrp> and causes distortions when comparing different genomic elements of high (for example, Alus) or low (for example, L1s) CpG content. Assuming that CpG frequency (&#953;) in a genomic element that was generated by duplication of a functional gene of high CpG content decreases over time (<it>t</it>) and reaches a state of equilibrium (&#949;) (approximately 20% of the frequency <abbrgrp><abbr bid="B14">14</abbr><abbr bid="B33">33</abbr></abbrgrp> expected from the local fraction of cytosines and guanosines <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>), the time since the duplication (<it>T</it>) was calculated (see Materials and methods) from the given sequence divergence (<it>D</it>) and the neutral mutation rate (&#956;) of primates:</p>
            <p><it>D </it>= &#8747;<sub>0</sub><sup>T</sup>&#956;(1 + 11((&#953; - &#949;)/((0.01&#953;/(0.99&#953;-&#949;))<it>t </it>+ 1) + &#949;))dt</p>
            <p>Next, the quantity &#931; (=&#956;<it>T</it>) was corrected for multiple substitutions at the same site using the Jukes-Cantor model <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>, giving the average number of substitutions per 100 base-pairs (bp), (<it>K</it>). For PPs, sequence divergences were defined as the mismatch rates of respective PPs relative to the current parental gene sequences. Finally, the levels of substitution that accumulated only in PPs were estimated (see Materials and methods). The estimated levels of substitutions in PPs (K(&#968;)) were then calculated as K(&#968;) = 0.705 <it>K</it>.</p>
         </sec>
         <sec>
            <st>
               <p>Simultaneous burst of processed pseudogenes and Alu repeats in ancestral primates</p>
            </st>
            <p>Using the levels of nucleotide substitution in PPs estimated by K(&#968;), we next evaluated the total number of PPs having the same substitution value, thus approximating the age distribution of PPs. We initially presumed that if PPs were generated at a roughly constant rate during primate evolution, their age distribution would be nearly flat. Surprisingly, a Poisson-like distribution was obtained (Figure <figr fid="F5">5a</figr>). This result indicates that PPs in general may have been generated at extraordinarily high rates during some periods. If the rate of nucleotide substitution is assumed to be 1.5 &#215; 10<sup>-9 </sup>per nucleotide per year <abbrgrp><abbr bid="B34">34</abbr><abbr bid="B35">35</abbr></abbrgrp>, then our data estimate that the peak of PP generation occurred approximately 40 million years ago, coincident with the onset of the radiation of the higher primates <abbrgrp><abbr bid="B36">36</abbr><abbr bid="B37">37</abbr></abbrgrp>.</p>
            <fig id="F5">
               <title>
                  <p>Figure 5</p>
               </title>
               <caption>
                  <p>Age distribution of human retroposons represented by the level of nucleotide substitutions</p>
               </caption>
               <text>
                  <p>Age distribution of human retroposons represented by the level of nucleotide substitutions. <b>(a) </b>Human PPs. The number of nucleotide substitutions per 100 bases (except CpG sites) was calculated for each PP, and the total number of PPs having a given number of substitutions is shown as individual bars in one-nucleotide increments. For comparison, the line shows a Poisson distribution of the same average values for PPs. <b>(b) </b>Alu repeats, calculated and presented as in (a). The line shows a Poisson distribution of the same average values for Alus. <b>(c) </b>Alu subfamilies, calculated as in (a). The curves connect apices of respective bars calculated as in (a). For simplicity, subfamilies that contain fewer than 5,000 Alus, such as Alu Ya and Yb, are not shown. <b>(d) </b>L1s, calculated and presented as in (a). <b>(e) </b>L1 subfamilies, calculated and presented as in (c). For simplicity, subfamilies that contain fewer than 1,000 L1s, such as L1PA1 (L1Hs) and L1P1, are not shown. L1PA6, L1PA7 and L1PA8 are shown as bold blue lines.</p>
               </text>
               <graphic file="gb-2003-4-11-r74-5"/>
            </fig>
            <p>The above results are reminiscent of the amplification profile of Alu repeats. Alu elements comprise approximately 10% of the human genome <abbrgrp><abbr bid="B14">14</abbr></abbrgrp> and are restricted to primates <abbrgrp><abbr bid="B22">22</abbr><abbr bid="B23">23</abbr><abbr bid="B24">24</abbr><abbr bid="B25">25</abbr><abbr bid="B26">26</abbr></abbrgrp>. It has been proposed that the average age of Alu repeats is around 40 million years and that the majority of Alus were generated around this time <abbrgrp><abbr bid="B14">14</abbr><abbr bid="B22">22</abbr><abbr bid="B23">23</abbr><abbr bid="B26">26</abbr></abbrgrp>. We confirmed these previous results by re-estimating the age distribution of all human Alu repeats (Figure <figr fid="F5">5b</figr>). The Alus also showed a Poisson-like distribution with a sharp peak. Alus are classified into distinct subfamilies that can be identified on the basis of mutations shared among subfamily members <abbrgrp><abbr bid="B23">23</abbr><abbr bid="B26">26</abbr></abbrgrp>. Alu subfamilies were derived from a small number of source or master genes. Accordingly, a consensus sequence constructed from members of each subfamily represents each subfamily's source gene(s) <abbrgrp><abbr bid="B23">23</abbr><abbr bid="B26">26</abbr></abbrgrp>. To evaluate the contribution of each subfamily to the entire distribution of Alus, we estimated the age distribution of respective Alu subfamilies (Figure <figr fid="F5">5c</figr>). The peaks for respective subfamilies are grouped closely, and the subfamily Alu Sx strongly influences the overall distribution of Alus (compare with Figure <figr fid="F5">5b</figr>). Therefore, the Sx subfamily (and thus Alus in general) appears to have been amplified intensively over a relatively short period. 
To the best of our knowledge, many previous discussions of Alu amplification reflect this viewpoint of Alu evolution <abbrgrp><abbr bid="B14">14</abbr><abbr bid="B22">22</abbr><abbr bid="B23">23</abbr><abbr bid="B26">26</abbr></abbrgrp>. However, our results show that the intensive generation of two distinct elements, PPs and Alus, occurred almost simultaneously, suggesting that an unknown change in either the cellular environment or the proliferation mechanism itself enhanced the proliferation of such retroposons in ancestral primates 40-50 million years ago.</p>
         </sec>
         <sec>
            <st>
               <p>Concordant amplification of certain LINE1 subfamilies with PPs and Alus</p>
            </st>
            <p>Recent progress in L1 biology shows that mammalian L1-encoded proteins are likely to have been involved in the reverse transcription of Alus and PPs <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr><abbr bid="B20">20</abbr><abbr bid="B21">21</abbr></abbrgrp>. To elucidate the cause of the elevated retrotransposition of PPs and Alus, we analyzed the age distribution of all human L1s (Figure <figr fid="F5">5d</figr>). Curiously, the rate of amplification (retrotransposition in cells and fixation within a population) of L1s does not peak around 7%, as was the case for PPs and Alus (compare with Figure <figr fid="F5">5a,b</figr>), raising the issue of how the rate of PP/Alu retrotransposition became elevated during a period of moderate change in L1 retrotransposition. To address this problem, L1s were divided into around 80 subfamilies <abbrgrp><abbr bid="B38">38</abbr></abbrgrp>, and age distributions for representative subfamilies are shown in Figure <figr fid="F5">5e</figr>. Although the distributions of respective subfamilies overlap, each subfamily has emerged successively during approximately 150 million years of mammalian evolution. Merging the distribution profiles of all the L1s yields a curve that is rather flat (almost equal to the curve that connects the apices of the respective bars in Figure <figr fid="F5">5d</figr>). Among a large number of L1 subfamilies, certain subfamilies, namely L1PA6, L1PA7 and L1PA8, were amplified intensively around 47 million years ago (the time corresponding to the 7% score). These data suggest that only one or a few L1 subfamilies may have contributed to the increased level of Alu and PP amplification (see Discussion).</p>
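            <p>The correspondence between a substitution level and an age quoted above can be illustrated with a simple linear conversion. The following is a minimal sketch, assuming that an age is obtained by dividing the substitution level by the rate of 1.5 &#215; 10<sup>-9 </sup>per nucleotide per year; the function name is illustrative and is not taken from the scripts used in this study.</p>

```python
RATE_PER_YEAR = 1.5e-9  # substitutions per nucleotide per year, as assumed in the text


def age_in_million_years(subs_per_100_bases, rate=RATE_PER_YEAR):
    """Convert a substitution level (per 100 bases) into an age in million years.

    Assumption: a linear molecular clock, age = substitution level / rate.
    """
    per_site = subs_per_100_bases / 100.0
    return per_site / rate / 1e6


# A level of 7 substitutions per 100 bases (the 7% score discussed above).
print(round(age_in_million_years(7.0), 1))
```

            <p>Under this assumption, the 7% score corresponds to roughly 47 million years.</p>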
            <p>Figure <figr fid="F6">6</figr> shows phylogenetic relationships between L1 subfamilies. A considerable number of substitutions are evident that could explain a possible functional change in L1s between these subfamilies and the current L1 subfamily (L1Hs/L1PA1). There are several amino-acid substitutions within evolutionarily conserved domains (for example, 'C (cysteine)-rich domain') that result in altered residue polarity or charge (Figure <figr fid="F6">6</figr>). A key example is the highly variable amino-terminal half of the L1-encoded ORF1 protein, which contains residues that may be critical for interaction with other proteins <abbrgrp><abbr bid="B39">39</abbr></abbrgrp>. There is 41% (54/131) amino-acid divergence between the ORF1 amino-terminal halves of L1PA7 and L1Hs whereas the divergence is only 7% (14/207) in the carboxy-terminal half (data not shown).</p>
            <fig id="F6">
               <title>
                  <p>Figure 6</p>
               </title>
               <caption>
                  <p>Phylogenetic relationships between L1 subfamilies</p>
               </caption>
               <text>
                  <p>Phylogenetic relationships between L1 subfamilies. Amino-acid substitutions within the 'C domain' at particular stages of L1 evolution are denoted in boxes. The phylogenetic tree was constructed using the neighbor-joining method <abbrgrp><abbr bid="B62">62</abbr></abbrgrp> based on the last 900 bp of the consensus sequences of respective subfamilies.</p>
               </text>
               <graphic file="gb-2003-4-11-r74-6"/>
            </fig>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Discussion</p>
         </st>
         <sec>
            <st>
               <p>Possible mechanisms of a 'retrotranspositional explosion'</p>
            </st>
            <p>A recent extensive survey of the human genome revealed a large number of ribosomal protein pseudogenes derived from the 79 functional ribosomal protein genes <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>. The discussion of the ages of these pseudogenes is problematic, however, in that ages were calculated by simply dividing sequence divergences by mutation rate. As the sequence divergence of a PP relative to its parental gene (<it>K</it>) is dependent on substitutions in both the PP (K(&#968;)) and the gene (K(<it>f</it>)) (see Materials and methods), the ages of the ribosomal protein pseudogenes were overestimated. For example, with respect to RPL21 mRNAs (the most predominant source of ribosomal protein pseudogenes in humans), the sequence divergence between human and mouse or rat is approximately 11% (NM_000982, NM_019647, NM_053330, and <abbrgrp><abbr bid="B40">40</abbr></abbrgrp>). The previous ribosomal protein pseudogene calculations dismissed sequence divergences between the present-day and primordial genes, probably overestimating the ages by around 10 million years (a few percent per 8-10% of divergence). Therefore, it is difficult to compare such values with the ages of Alus/L1s. Our method provides a clear solution to this matter, enabling us to compare the ages of different classes of retroposons. Hence, our method led us to the finding that there was a simultaneous burst of PPs and Alus - a 'retrotranspositional explosion' - in the primate genome.</p>
            <p>Regarding the cause of the retrotranspositional explosion, it is worth considering the effect of a 'bottleneck' <abbrgrp><abbr bid="B34">34</abbr><abbr bid="B41">41</abbr></abbrgrp> during primate evolution. Only individuals that experienced extensive genomic retrotransposition might have propagated to become a majority within a population of ancestral primates, via a mechanism involving a rapid reduction in the general population. Studies on the molecular phylogeny and demographic history of humans show, however, that the primate lineage leading to humans never experienced an extensive bottleneck, at least since its divergence from the prosimian lineage <abbrgrp><abbr bid="B42">42</abbr></abbrgrp>. Therefore, the effect of a bottleneck can be largely ignored.</p>
            <p>The retrotranspositional explosion could be due to a change in the cellular environment of ancestral primates 40-50 million years ago, such as a higher transcriptional potential of parental (master) genes of PPs and Alus. A specific environment of the genome during the period of the retrotranspositional explosion, such as more available target sites of PPs and Alus, might have facilitated this event. Alternatively, a change in the proliferation mechanism of PPs and Alus, such as an increased amount of reverse transcriptase or an enhanced activity of enzymes for retrotransposition, might have promoted the explosion.</p>
            <p>Recent studies on the L1 retrotransposons show that mammalian L1-encoded proteins may have been involved in the reverse transcription of Alus and PPs <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr><abbr bid="B20">20</abbr><abbr bid="B21">21</abbr></abbrgrp>. Here, we have shown that the intensive amplification of distinct genetic elements, namely PPs and Alus, seems to have occurred almost simultaneously around 40-50 million years ago, and we suggest that only one or a few L1 subfamilies may have contributed to the observed high levels of Alu/PP retrotransposition.</p>
            <p>How could a specific L1 subfamily (or subfamilies) have generated Alus and PPs at such an accelerated rate? We propose that L1s within specific subfamilies mobilized RNAs <it>in trans </it>at accelerated rates in ancestral primate genomes. Thus, a specific L1 subfamily may have mediated the Alu/PP retrotranspositional explosion. The age distributions estimated in this study allow the prediction of the most probable L1 subfamilies responsible for the explosion (care must be exercised when comparing ages between distinct genetic elements; see Materials and methods). The most probable candidate subfamilies are L1PA6, L1PA7 and L1PA8 (Figure <figr fid="F5">5e</figr>). As mentioned above, although the youngest L1 subfamily mobilizes cellular RNAs <it>in trans </it>at very low frequencies (0.01-0.05%) in HeLa cells, the frequency is not necessarily intrinsic to L1s. In fact, in cultured feline cells the frequency of L1-mediated PP formation <it>in trans </it>is 5% relative to that of L1 retrotransposition in <it>cis </it><abbrgrp><abbr bid="B18">18</abbr></abbrgrp>. Moreover, an eel LINE family exhibits a high level of <it>trans </it>retrotransposition (up to 30% <abbrgrp><abbr bid="B43">43</abbr></abbrgrp>), and the frequency of L1-mediated Alu retrotransposition in HeLa cells is 100-1,000 times higher than that of control mRNAs <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>. Although L1 subfamilies such as L1PA6, L1PA7 and L1PA8 appear to have been extinguished by cumulative mutations, the possibility that an ancient L1 subfamily exhibited an enhanced ability to mobilize RNAs <it>in trans </it>could be verified experimentally in HeLa cells using reconstructed L1 subfamilies <abbrgrp><abbr bid="B44">44</abbr></abbrgrp> as sources for reverse transcription of <it>trans </it>RNAs.</p>
         </sec>
         <sec>
            <st>
               <p>The impact of the retrotranspositional explosion on the ancestral primate genome</p>
            </st>
            <p>Alu insertions mediate many genomic rearrangements, such as unequal crossing over, induction of alternative splicing, and the introduction of new promoters, poly(A) signals and even new exons <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>. Inactivation of CMP-<it>N</it>-acetylneuraminic acid hydroxylase (around 2.8 million years ago) before brain expansion during human evolution occurred by an Alu-mediated inactivating mutation <abbrgrp><abbr bid="B45">45</abbr></abbrgrp>, representing yet another example of the impact of the Alu expansion. The current frequency of human endogenous insertional mutations caused by Alu retrotransposition is estimated at around 1 in every 16-200 individuals <abbrgrp><abbr bid="B26">26</abbr><abbr bid="B46">46</abbr></abbrgrp>. The frequency of Alu insertion at the time of the retrotranspositional explosion is estimated to have been 30-200 times higher than the frequency over the last 10 million years (<abbrgrp><abbr bid="B26">26</abbr></abbrgrp> and data not shown). This implies that at least one in seven individuals at the time carried new Alu insertions in their genomes (a maximum of 12 insertions per individual). This high Alu insertion rate may have had a much greater impact on ancestral primate genomes compared with the impact of present-day mutations.</p>
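            <p>The bounds quoted above can be reproduced with simple arithmetic. The sketch below is one possible reading, assuming that the lowest present-day frequency (1 in 200 individuals) is paired with the 30-fold increase and the highest (1 in 16) with the 200-fold increase; this pairing and the function name are our illustrative assumptions.</p>

```python
def explosion_insertion_range(present_low=1 / 200, present_high=1 / 16,
                              fold_low=30, fold_high=200):
    """Return (lowest, highest) new Alu insertions per individual at the explosion.

    Assumption: the present-day bounds are scaled by the matching fold-increase bounds.
    """
    return present_low * fold_low, present_high * fold_high


low, high = explosion_insertion_range()
print("lowest rate: one new insertion per", round(1 / low, 1), "individuals")
print("highest rate:", high, "new insertions per individual")
```

            <p>With these inputs, the lower bound is about 0.15 insertions per individual (roughly one individual in seven) and the upper bound is about 12 insertions per individual.</p>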
            <p>Retrotransposition of PPs causes not only insertional mutations but also the propagation of new genes. These 'retrogenes' comprise PPs that inserted themselves next to resident promoter/enhancer elements and thereby escaped transcriptional silencing and PPs that were initially inactive but were reactivated at a later time when flanking regulatory elements became activated by mutation <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>. Retrogenes are often observed in primate genomes <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>, one example being the testis-specific human gene <it>CDY </it>(on the Y chromosome), which arose during primate evolution by retrotransposition of the ubiquitous mRNA of the gene <it>CDYL </it>located on chromosome 13 <abbrgrp><abbr bid="B7">7</abbr></abbrgrp>. From the observed distribution of <it>CDY </it>homologs in primates, this event appears to have occurred in the simian lineage after its divergence from prosimians but before the split between Old and New World monkeys <abbrgrp><abbr bid="B7">7</abbr></abbrgrp> during the period of the retrotranspositional explosion. We predict that further studies will demonstrate that many human retrogenes were generated during this period, and postulate that such retrogenes were involved in generating new characteristics that are specific to simian primates <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B47">47</abbr></abbrgrp>.</p>
            <p>Over the course of eukaryotic evolution, the extensive propagation of new genes preceded apparent bursts of new organisms or the emergence of new hierarchies of morphological complexity <abbrgrp><abbr bid="B48">48</abbr><abbr bid="B49">49</abbr></abbrgrp>. The time of the retrotranspositional explosion can be estimated at 40-50 million years ago, assuming a nucleotide substitution rate of 1.5 &#215; 10<sup>-9 </sup>per nucleotide per year <abbrgrp><abbr bid="B34">34</abbr><abbr bid="B35">35</abbr></abbrgrp>. Fossil records show that before this period, simian ancestors (monkeys, apes and humans) diverged from prosimians (lemurs and lorises) with the divergence of New World monkeys and the radiation of the remaining primates proceeding immediately thereafter <abbrgrp><abbr bid="B36">36</abbr><abbr bid="B37">37</abbr></abbrgrp> (Figure <figr fid="F7">7</figr>). The rapid amplification of Alus to the level of 10% of the primate genome and the creation of numerous replicas of various genes may have provided the molecular basis that led to the radiation of higher primates.</p>
            <fig id="F7">
               <title>
                  <p>Figure 7</p>
               </title>
               <caption>
                  <p>Timing of the retrotranspositional explosion during primate evolution</p>
               </caption>
               <text>
                  <p>Timing of the retrotranspositional explosion during primate evolution. Phylogenetic relationships among primates and the estimated timeframes are based on data from references <abbrgrp><abbr bid="B34">34</abbr><abbr bid="B36">36</abbr></abbrgrp> and <abbrgrp><abbr bid="B37">37</abbr></abbrgrp>, and references therein.</p>
               </text>
               <graphic file="gb-2003-4-11-r74-7"/>
            </fig>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Materials and methods</p>
         </st>
         <sec>
            <st>
               <p>Determining a set of processed pseudogenes</p>
            </st>
            <p>PPs were searched for in an assembled human genome sequence (Human Genome Project Working Draft, April 1 2001) <abbrgrp><abbr bid="B50">50</abbr></abbrgrp> using BLAT <abbrgrp><abbr bid="B51">51</abbr></abbrgrp>. The BLAT settings were as follows: Assembly: April 1, 2001; Query type: DNA; Sort output: query, score; Output type: hyperlink. 'Confirmed cDNAs' (23,929 entries) in Ensembl DB (v1.1.0) <abbrgrp><abbr bid="B29">29</abbr></abbrgrp> were used as queries. The subject with the highest score was regarded as the gene encoding the transcript. Multiple hits were subjected to analysis.</p>
            <p>Subjects that contained over 90% of the query length were used. The number of aligning blocks, which usually corresponds to the number of exons, was compared between a gene and other subjects. If the number of aligning blocks was smaller than that of the gene, the subject was further analyzed, thus eliminating pseudogenes generated by DNA duplications. Subjects that were identified by intronless genes (single exon genes) were not included in the analysis to avoid confusing PPs and pseudogenes generated by DNA duplications. To avoid confusing phylogenetic relationships, loci (subjects) that were identified by multiple query hits were not included in the analysis. A series of Perl scripts were designed to analyze the BLAT search results.</p>
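            <p>The selection rules described above can be sketched as follows. This is an illustrative Python outline only (the original analysis used Perl scripts that are not reproduced here); the <it>Hit </it>record, its field names and the choice to count shared loci among the remaining candidates are hypothetical assumptions.</p>

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    query: str       # cDNA identifier used as the BLAT query
    locus: str       # hypothetical key for the genomic location of the subject
    score: int       # BLAT score
    coverage: float  # fraction of the query length contained in the subject
    blocks: int      # number of aligning blocks (usually the number of exons)


def candidate_pps(hits):
    """Apply the filters described in the text to a list of Hit records."""
    by_query = {}
    for h in hits:
        by_query.setdefault(h.query, []).append(h)

    candidates = []
    for query_hits in by_query.values():
        ranked = sorted(query_hits, key=lambda h: h.score, reverse=True)
        gene, others = ranked[0], ranked[1:]  # top-scoring subject is taken as the gene
        if gene.blocks == 1:
            continue  # queries from intronless (single-exon) genes are skipped
        for h in others:
            # keep subjects with over 90% coverage and fewer blocks than the gene
            if h.coverage > 0.90 and gene.blocks > h.blocks:
                candidates.append(h)

    # drop loci identified by more than one query (assumed to be counted here)
    locus_counts = Counter(h.locus for h in candidates)
    return [h for h in candidates if locus_counts[h.locus] == 1]
```

            <p>A subject with the same number of aligning blocks as its gene is rejected by this filter, which corresponds to the elimination of pseudogenes generated by DNA duplication.</p>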
         </sec>
         <sec>
            <st>
               <p>Evaluating PP annotation</p>
            </st>
            <p>To evaluate our annotation of PPs, our results for chromosomes 21 and 22 were compared with those from other studies. For chromosome 21, the PP total in this study was 34 whereas previous studies reported 41 <abbrgrp><abbr bid="B52">52</abbr><abbr bid="B53">53</abbr></abbrgrp> and 57 <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>. The number of annotations common to two studies totaled 18 (this study and <abbrgrp><abbr bid="B52">52</abbr></abbrgrp>), 14 (this study and <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>) and 21 <abbrgrp><abbr bid="B4">4</abbr><abbr bid="B52">52</abbr></abbrgrp>. Annotations common to all studies totaled 10. For chromosome 22, the PP total in this study was 62, whereas previous studies reported 91 <abbrgrp><abbr bid="B54">54</abbr><abbr bid="B55">55</abbr></abbrgrp> and 73 <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>. The number of common annotations totaled 37 (this study and <abbrgrp><abbr bid="B54">54</abbr></abbrgrp>), 28 (this study and <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>) and 52 <abbrgrp><abbr bid="B4">4</abbr><abbr bid="B54">54</abbr></abbrgrp>. Annotations common to all studies totaled 27. Differences between the numbers appear to derive mainly from differences in the gene sets used for the analyses <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>.</p>
         </sec>
         <sec>
            <st>
               <p>Identification of Alus, L1s, and their subfamilies</p>
            </st>
            <p>For each Alu and L1 repeat, the genomic location and sequence divergence were obtained from the output file of the RepeatMasker program applied to the human genome draft sequence (22 December 2001 <abbrgrp><abbr bid="B56">56</abbr></abbrgrp>). Sequence divergences were defined as the mismatch rates of respective repeats relative to the consensus sequence of respective subfamilies.</p>
         </sec>
         <sec>
            <st>
               <p>Analysis of sequence divergence</p>
            </st>
            <p>The level of substitutions that accumulated in a PP (K(&#968;)) was estimated using the following method.</p>
            <p>First, the sequence divergence value (<it>D</it>) was corrected by removing the contribution of mutations at CpG sites. Sequence divergence (&#948;) of a sequence (at a given time point) of length (<it>N</it>) including the number of CpG dinucleotides (<it>n</it>) is given as a function of the mutation rate at non-CpG dinucleotides (&#945;) and CpG dinucleotides (&#946;) as follows:</p>
            <p>&#948; = &#945;(1/2 - <it>n</it>/<it>N</it>) + &#946;<it>n</it>/<it>N </it>&#160;&#160;&#160; (1)</p>
            <p>From the result of Sved and Bird <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>, the ratio of &#946; to &#945; is &#8764; 6.5. Therefore, designating &#945;/2 = &#956; and <it>n</it>/<it>N </it>= &#957; in Equation 1 gives the following:</p>
            <p>&#948; = &#956;(1 + 11&#957;) &#160;&#160;&#160; (2)</p>
            <p>Assuming that CpG frequency (&#953;) in a genomic element that was generated by duplication of a functional gene of high CpG content decreases over time (<it>t</it>) and reaches an equilibrium state (&#949;) (approximately 20% of the frequency <abbrgrp><abbr bid="B14">14</abbr><abbr bid="B33">33</abbr></abbrgrp> expected from the local fraction of cytosines and guanosines <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>), the CpG frequency (&#957;) at time (<it>t</it>) was calculated as follows:</p>
            <p>&#957; = 1/(<it>At </it>+ 1/(&#953; - &#949;)) + &#949; &#160;&#160;&#160; (3)</p>
            <p>If we accept the value of 1.5 &#215; 10<sup>-9 </sup>per nucleotide per year <abbrgrp><abbr bid="B34">34</abbr><abbr bid="B35">35</abbr></abbrgrp> as the neutral mutation rate <abbrgrp><abbr bid="B41">41</abbr></abbrgrp> and equate this to the mutation rate at non-CpG dinucleotides (&#956;) and use a time unit of 1 million years, then the mutation rate at CpG dinucleotides, &#946;/2, will be around 1 per 100 nucleotides per million years (that is, &#957; will be reduced by 1% every million years). Therefore, &#957;(<it>t </it>= 1)/&#957;(<it>t </it>= 0) in Equation 3 gives:</p>
            <p>(1/(<it>A </it>+ 1/(&#953; - &#949;)) + &#949;)/&#953; &#8776; 0.99</p>
            <p>Solving for <it>A </it>gives:</p>
            <p><it>A </it>= 0.01&#953;/((0.99&#953; - &#949;)(&#953; - &#949;)) &#160;&#160;&#160; (4)</p>
            <p>The sequence divergence value (<it>D</it>) is given as an integral of the sequence divergence (&#948;) from the time of the duplication (<it>t </it>= 0) to the present (<it>t </it>= <it>T</it>): <it>D </it>= &#8747;<sub>0</sub><sup>T</sup>&#948;dt. From Equations 2, 3 and 4,</p>
            <p><it>D </it>= &#8747;<sub>0</sub><sup>T</sup>&#956;(1 + 11((&#953; - &#949;)/((0.01&#953;/(0.99&#953; - &#949;))<it>t </it>+ 1) + &#949;))dt&#160;&#160;&#160;(5)</p>
            <p>Solving Equation 5 for <it>T </it>gives the time since the duplication. The following values of &#953; and &#949;, listed as (&#953;, &#949;), were used for each class of retroposon <abbrgrp><abbr bid="B14">14</abbr><abbr bid="B57">57</abbr></abbrgrp>:</p>
            <p>Alu (0.077, 0.020); L1 (0.012, 0.008); PPs (0.015, 0.010)</p>
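            <p>Equation 5 integrates to a closed form, <it>D</it> = &#956;[(1 + 11&#949;)<it>T</it> + (11(&#953; - &#949;)/<it>B</it>) ln(<it>BT</it> + 1)] with <it>B</it> = 0.01&#953;/(0.99&#953; - &#949;), which increases monotonically with <it>T</it> and can therefore be inverted numerically. The Python sketch below shows one possible way to do this by bisection. It is not the implementation used in this study; the function names, the search bound of 1,000 million years and the tolerance are assumptions made for illustration. Here &#956; is expressed per site per million years (1.5 &#215; 10<sup>-3</sup>) and <it>D</it> as a fraction of sites.</p>

```python
import math

MU = 1.5e-3  # neutral rate per site per million years (1.5e-9 per year)


def slope(iota, eps):
    """B = A * (iota - eps) = 0.01*iota / (0.99*iota - eps), from Equation 4."""
    return 0.01 * iota / (0.99 * iota - eps)


def divergence(T, iota, eps, mu=MU):
    """Closed form of Equation 5: divergence D accumulated over T million years."""
    b = slope(iota, eps)
    return mu * ((1.0 + 11.0 * eps) * T
                 + 11.0 * (iota - eps) / b * math.log(b * T + 1.0))


def time_since_duplication(D, iota, eps, mu=MU, t_max=1000.0, tol=1e-9):
    """Invert Equation 5 for T by bisection (D grows monotonically with T)."""
    if 0.0 > D or D > divergence(t_max, iota, eps, mu):
        raise ValueError("D is outside the range covered by [0, t_max]")
    lo, hi = 0.0, t_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if divergence(mid, iota, eps, mu) > D:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


# Example: a hypothetical Alu copy that differs from its source at 10% of sites.
print(time_since_duplication(0.10, 0.077, 0.020))
```

            <p>Bisection is used here only because it needs no derivative and cannot overshoot; any root finder would serve equally well for a monotonic function of this kind.</p>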
            <p>The substitution level (&#931;) at sites other than CpG is given from the time since the duplication (<it>T</it>) and the neutral mutation rate (&#956;) of primates <abbrgrp><abbr bid="B41">41</abbr></abbrgrp>: &#931; = &#956;<it>T</it>. The quantity &#931; was corrected for multiple substitutions at the same site using the Jukes-Cantor model <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>, giving the average number of substitutions per 100 bp (<it>K</it>): <it>K </it>= - (3/4)ln(1 - (4/3)&#931;).</p>
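            <p>The Jukes-Cantor correction can be illustrated with a few lines of Python. This is a minimal sketch, not the code used in the study: the function name and the example value of <it>T</it> are hypothetical, and &#931; is treated as a fraction of sites, so the result would be multiplied by 100 to express it per 100 bp.</p>

```python
import math


def jukes_cantor(sigma):
    """Jukes-Cantor correction: K = -(3/4) * ln(1 - (4/3) * sigma)."""
    arg = 1.0 - (4.0 / 3.0) * sigma
    if not arg > 0.0:
        raise ValueError("sigma must be below 0.75 for the correction to exist")
    return -0.75 * math.log(arg)


MU = 1.5e-3   # neutral rate per site per million years
T = 40.0      # hypothetical time since duplication, in million years
sigma = MU * T
print(sigma, jukes_cantor(sigma))
```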
            <p>For PPs, sequence divergences were defined as the mismatch rates of the respective PPs relative to the current sequences of their parental genes. The mismatch rate of a PP relative to its parental gene (<it>K</it>) consists of the level of substitutions that accumulated only in the PP (K(&#968;)) and the level of substitutions that accumulated only in the gene (K(<it>f</it>)): <it>K </it>= K(<it>f</it>) + K(&#968;). K(<it>f</it>) and K(&#968;) can be further subdivided into the number of synonymous (Ks) and nonsynonymous (Ka) substitutions <abbrgrp><abbr bid="B58">58</abbr><abbr bid="B59">59</abbr><abbr bid="B60">60</abbr></abbrgrp>: K(<it>f</it>) = Ks(<it>f</it>) + Ka(<it>f</it>), K(&#968;) = Ks(&#968;) + Ka(&#968;). Kuma and Miyata evaluated the average nucleotide substitution rates of 31 pairs of human PPs and their parental genes using homologs of other species as outgroups (K. Kuma and T. Miyata, personal communication). They used the following genes: ADP-ribosylation factor 1, aldolase A, aldose reductase, alpha-E-catenin, alpha-<smcaps>L</smcaps>-fucosidase, alpha-enolase, arylamine <it>N</it>-acetyltransferase, beta-tubulin, c-Raf protooncogene, cAMP-dependent protein kinase regulatory subunit, calmodulin, ceruloplasmin, creatine kinase, cyclophilin, cytochrome <it>b</it>5, cytochrome <it>c</it>, ferrochelatase, gamma-actin, glucocerebrosidase, glutamine synthetase, glyceraldehyde-3-phosphate dehydrogenase, histone H3.3, hsc70, hsp27, hsp60, lactate dehydrogenase-A, neurotrophin-4, phosphoglycerate kinase, prothymosin alpha, topoisomerase-I and triose phosphate isomerase. They calculated the following ratios: Rs(&#968;), the ratio of synonymous substitutions in PPs to synonymous substitutions in their parental genes; Ra(<it>f</it>), the ratio of nonsynonymous substitutions in genes to synonymous substitutions in genes; and Ra(&#968;), the ratio of nonsynonymous substitutions in PPs to synonymous substitutions in genes. The mean values of Rs(&#968;), Ra(&#968;) and Ra(<it>f</it>) were:</p>
            <p>Rs(&#968;) = Ks(&#968;)/Ks(<it>f</it>) = 1.40 &#160;&#160;&#160; (6.1)</p>
            <p>Ra(&#968;) = Ka(&#968;)/Ks(<it>f</it>) = 1.13 &#160;&#160;&#160; (6.2)</p>
            <p>Ra(<it>f</it>) = Ka(<it>f</it>)/Ks(<it>f</it>) = 0.06 &#160;&#160;&#160; (6.3)</p>
            <p>From Equations 6.1-6.3 and the relations <it>K </it>= K(<it>f</it>) + K(&#968;), K(<it>f</it>) = Ks(<it>f</it>) + Ka(<it>f</it>), and K(&#968;) = Ks(&#968;) + Ka(&#968;), the estimated level of substitutions in PPs (K(&#968;)) is given by:</p>
            <p>K(&#968;) = 0.705<it>K </it>&#160;&#160;&#160; (7)</p>
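            <p>The coefficient in Equation 7 follows from expressing every term in units of Ks(<it>f</it>): K(<it>f</it>) = (1 + 0.06)Ks(<it>f</it>) and K(&#968;) = (1.40 + 1.13)Ks(<it>f</it>), so that K(&#968;)/<it>K</it> = 2.53/3.59. A minimal Python sketch of this arithmetic is given below; the variable and function names are illustrative only.</p>

```python
RS_PSI = 1.40  # Equation 6.1: Ks(psi) / Ks(f)
RA_PSI = 1.13  # Equation 6.2: Ka(psi) / Ks(f)
RA_F = 0.06    # Equation 6.3: Ka(f) / Ks(f)


def pseudogene_fraction(rs_psi, ra_psi, ra_f):
    """Fraction of the total mismatch K that accumulated in the pseudogene."""
    k_f = 1.0 + ra_f          # K(f) in units of Ks(f)
    k_psi = rs_psi + ra_psi   # K(psi) in units of Ks(f)
    return k_psi / (k_f + k_psi)


print(round(pseudogene_fraction(RS_PSI, RA_PSI, RA_F), 3))  # 0.705
```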
         </sec>
      </sec>
      <sec>
         <st>
            <p>Additional data files</p>
         </st>
         <p>A table showing mapping coordinates for human PPs (Additional data file <supplr sid="s1">1</supplr>) and a list of the human genes that generated PPs (Additional data file <supplr sid="s2">2</supplr>) are available.</p>
         <suppl id="s1">
            <title>
               <p>Additional data file 1</p>
            </title>
            <caption>
               <p>A table showing mapping coordinates for human PPs</p>
            </caption>
            <text>
               <p>A table showing mapping coordinates for human PPs</p>
            </text>
            <file name="gb-2003-4-11-r74-s1.xls">
               <p>Click here for additional data file</p>
            </file>
         </suppl>
         <suppl id="s2">
            <title>
               <p>Additional data file 2</p>
            </title>
            <caption>
               <p>A list of the human genes that generated PPs</p>
            </caption>
            <text>
               <p>A list of the human genes that generated PPs</p>
            </text>
            <file name="gb-2003-4-11-r74-s2.pdf">
               <p>Click here for additional data file</p>
            </file>
         </suppl>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>We thank Katsuhiko Murakami (RIKEN-GSC) for helpful discussions and Kei-ichi Kuma and Takashi Miyata (Kyoto University) for providing the data on the average nucleotide substitution rates of 31 pairs of human PPs. This work was partially supported by a Grant-in-Aid for Scientific Research from the Ministry of Education, Culture, Sports, Science and Technology of Japan. This work was also supported by a grant from BIRD of the Japan Science and Technology Corporation (JST) to K.O.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Processed pseudogenes: characteristics and evolution.</p>
            </title>
            <aug>
               <au>
                  <snm>Vanin</snm>
                  <fnm>EF</fnm>
               </au>
            </aug>
            <source>Annu Rev Genet</source>
            <pubdate>1985</pubdate>
            <volume>19</volume>
            <fpage>253</fpage>
            <lpage>272</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1146/annurev.ge.19.120185.001345</pubid>
                  <pubid idtype="pmpid" link="fulltext">3909943</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Vertebrate pseudogenes.</p>
            </title>
            <aug>
               <au>
                  <snm>Mighell</snm>
                  <fnm>AJ</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>NR</fnm>
               </au>
               <au>
                  <snm>Robinson</snm>
                  <fnm>PA</fnm>
               </au>
               <au>
                  <snm>Markham</snm>
                  <fnm>AF</fnm>
               </au>
            </aug>
            <source>FEBS Lett</source>
            <pubdate>2000</pubdate>
            <volume>468</volume>
            <fpage>109</fpage>
            <lpage>114</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0014-5793(00)01199-6</pubid>
                  <pubid idtype="pmpid" link="fulltext">10692568</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Nature and structure of human genes that generate retropseudogenes.</p>
            </title>
            <aug>
               <au>
                  <snm>Gon&#231;alves</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Duret</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Mouchiroud</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2000</pubdate>
            <volume>10</volume>
            <fpage>672</fpage>
            <lpage>678</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gr.10.5.672</pubid>
                  <pubid idtype="pmpid" link="fulltext">10810090</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Molecular fossils in the human genome: identification and analysis of the pseudogenes in chromosomes 21 and 22.</p>
            </title>
            <aug>
               <au>
                  <snm>Harrison</snm>
                  <fnm>PM</fnm>
               </au>
               <au>
                  <snm>Hegyi</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Balasubramanian</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Luscombe</snm>
                  <fnm>NM</fnm>
               </au>
               <au>
                  <snm>Bertone</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Echols</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Johnson</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Gerstein</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2002</pubdate>
            <volume>12</volume>
            <fpage>272</fpage>
            <lpage>280</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">155275</pubid>
                  <pubid idtype="pmpid" link="fulltext">11827946</pubid>
                  <pubid idtype="doi">10.1101/gr.207102</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Genes, pseudogenes, and Alu sequence organization across human chromosomes 21 and 22.</p>
            </title>
            <aug>
               <au>
                  <snm>Chen</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Gentles</snm>
                  <fnm>AJ</fnm>
               </au>
               <au>
                  <snm>Jurka</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Karlin</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2002</pubdate>
            <volume>99</volume>
            <fpage>2930</fpage>
            <lpage>2935</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">122450</pubid>
                  <pubid idtype="pmpid" link="fulltext">11867739</pubid>
                  <pubid idtype="doi">10.1073/pnas.052692099</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>RNAs from all categories generate retrosequences that may be exapted as novel genes or regulatory elements.</p>
            </title>
            <aug>
               <au>
                  <snm>Brosius</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Gene</source>
            <pubdate>1999</pubdate>
            <volume>238</volume>
            <fpage>115</fpage>
            <lpage>134</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0378-1119(99)00227-9</pubid>
                  <pubid idtype="pmpid" link="fulltext">10570990</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Retroposition of autosomal mRNA yielded testis-specific gene family on human Y chromosome.</p>
            </title>
            <aug>
               <au>
                  <snm>Lahn</snm>
                  <fnm>BT</fnm>
               </au>
               <au>
                  <snm>Page</snm>
                  <fnm>DC</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>1999</pubdate>
            <volume>21</volume>
            <fpage>429</fpage>
            <lpage>433</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/7771</pubid>
                  <pubid idtype="pmpid" link="fulltext">10192397</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Evolution of the phosphoglycerate mutase processed gene in human and chimpanzee revealing the origin of a new primate gene.</p>
            </title>
            <aug>
               <au>
                  <snm>Betr&#225;n</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Jin</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Long</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>2002</pubdate>
            <volume>19</volume>
            <fpage>654</fpage>
            <lpage>663</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11961099</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Nonviral retroposons: genes, pseudogenes, and transposable elements generated by the reverse flow of genetic information.</p>
            </title>
            <aug>
               <au>
                  <snm>Weiner</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Deininger</snm>
                  <fnm>PL</fnm>
               </au>
               <au>
                  <snm>Efstratiadis</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Annu Rev Biochem</source>
            <pubdate>1986</pubdate>
            <volume>55</volume>
            <fpage>631</fpage>
            <lpage>661</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1146/annurev.bi.55.070186.003215</pubid>
                  <pubid idtype="pmpid" link="fulltext">2427017</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>SINEs: Short interspersed repeated elements of the eukaryotic genome.</p>
            </title>
            <aug>
               <au>
                  <snm>Okada</snm>
                  <fnm>N</fnm>
               </au>
            </aug>
            <source>Trends Ecol Evol</source>
            <pubdate>1991</pubdate>
            <volume>6</volume>
            <fpage>358</fpage>
            <lpage>361</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1016/0169-5347(91)90226-N</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>The origin of interspersed repeats in the human genome.</p>
            </title>
            <aug>
               <au>
                  <snm>Smit</snm>
                  <fnm>AF</fnm>
               </au>
            </aug>
            <source>Curr Opin Genet Dev</source>
            <pubdate>1996</pubdate>
            <volume>6</volume>
            <fpage>743</fpage>
            <lpage>748</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0959-437X(96)80030-X</pubid>
                  <pubid idtype="pmpid">8994846</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>SINEs and LINEs share common 3' sequences: a review.</p>
            </title>
            <aug>
               <au>
                  <snm>Okada</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Hamada</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Ogiwara</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Ohshima</snm>
                  <fnm>K</fnm>
               </au>
            </aug>
            <source>Gene</source>
            <pubdate>1997</pubdate>
            <volume>205</volume>
            <fpage>229</fpage>
            <lpage>243</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0378-1119(97)00409-5</pubid>
                  <pubid idtype="pmpid">9461397</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>SINEs and LINEs: the art of biting the hand that feeds you.</p>
            </title>
            <aug>
               <au>
                  <snm>Weiner</snm>
                  <fnm>AM</fnm>
               </au>
            </aug>
            <source>Curr Opin Cell Biol</source>
            <pubdate>2002</pubdate>
            <volume>14</volume>
            <fpage>343</fpage>
            <lpage>350</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0955-0674(02)00338-1</pubid>
                  <pubid idtype="pmpid" link="fulltext">12067657</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Initial sequencing and analysis of the human genome.</p>
            </title>
            <aug>
               <au>
                  <cnm>International Human Genome Sequencing Consortium</cnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2001</pubdate>
            <volume>409</volume>
            <fpage>860</fpage>
            <lpage>921</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/35057062</pubid>
                  <pubid idtype="pmpid" link="fulltext">11237011</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>High frequency retrotransposition in cultured mammalian cells.</p>
            </title>
            <aug>
               <au>
                  <snm>Moran</snm>
                  <fnm>JV</fnm>
               </au>
               <au>
                  <snm>Holmes</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Naas</snm>
                  <fnm>TP</fnm>
               </au>
               <au>
                  <snm>DeBerardinis</snm>
                  <fnm>RJ</fnm>
               </au>
               <au>
                  <snm>Boeke</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Kazazian</snm>
                  <fnm>HH</fnm>
                  <suf>Jr</suf>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>1996</pubdate>
            <volume>87</volume>
            <fpage>917</fpage>
            <lpage>927</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">8945518</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>The impact of L1 retrotransposons on the human genome.</p>
            </title>
            <aug>
               <au>
                  <snm>Kazazian</snm>
                  <fnm>HH</fnm>
                  <suf>Jr</suf>
               </au>
               <au>
                  <snm>Moran</snm>
                  <fnm>JV</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>1998</pubdate>
            <volume>19</volume>
            <fpage>19</fpage>
            <lpage>24</lpage>
            <xrefbib>
               <pubid idtype="pmpid">9590283</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Sequence patterns indicate an enzymatic involvement in integration of mammalian retroposons.</p>
            </title>
            <aug>
               <au>
                  <snm>Jurka</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1997</pubdate>
            <volume>94</volume>
            <fpage>1872</fpage>
            <lpage>1877</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">20010</pubid>
                  <pubid idtype="pmpid" link="fulltext">9050872</pubid>
                  <pubid idtype="doi">10.1073/pnas.94.5.1872</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Human LINE retrotransposons generate processed pseudogenes.</p>
            </title>
            <aug>
               <au>
                  <snm>Esnault</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Maestre</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Heidmann</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2000</pubdate>
            <volume>24</volume>
            <fpage>363</fpage>
            <lpage>367</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/74184</pubid>
                  <pubid idtype="pmpid" link="fulltext">10742098</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>Human L1 retrotransposition: <it>cis</it> preference versus <it>trans</it> complementation.</p>
            </title>
            <aug>
               <au>
                  <snm>Wei</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Gilbert</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Ooi</snm>
                  <fnm>SL</fnm>
               </au>
               <au>
                  <snm>Lawler</snm>
                  <fnm>JF</fnm>
               </au>
               <au>
                  <snm>Ostertag</snm>
                  <fnm>EM</fnm>
               </au>
               <au>
                  <snm>Kazazian</snm>
                  <fnm>HH</fnm>
               </au>
               <au>
                  <snm>Boeke</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Moran</snm>
                  <fnm>JV</fnm>
               </au>
            </aug>
            <source>Mol Cell Biol</source>
            <pubdate>2001</pubdate>
            <volume>21</volume>
            <fpage>1429</fpage>
            <lpage>1439</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">99594</pubid>
                  <pubid idtype="pmpid" link="fulltext">11158327</pubid>
                  <pubid idtype="doi">10.1128/MCB.21.4.1429-1439.2001</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>Processed pseudogenes of human endogenous retroviruses generated by LINEs: their integration, stability, and distribution.</p>
            </title>
            <aug>
               <au>
                  <snm>Pavl&#237;cek</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Paces</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Elleder</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Hejnar</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2002</pubdate>
            <volume>12</volume>
            <fpage>391</fpage>
            <lpage>399</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">155283</pubid>
                  <pubid idtype="pmpid" link="fulltext">11875026</pubid>
                  <pubid idtype="doi">10.1101/gr.216902</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>LINE-mediated retrotransposition of marked Alu sequences.</p>
            </title>
            <aug>
               <au>
                  <snm>Dewannieux</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Esnault</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Heidmann</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2003</pubdate>
            <volume>35</volume>
            <fpage>41</fpage>
            <lpage>48</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng1223</pubid>
                  <pubid idtype="pmpid" link="fulltext">12897783</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>Evidence that most human Alu sequences were inserted in a process that ceased about 30 million years ago.</p>
            </title>
            <aug>
               <au>
                  <snm>Britten</snm>
                  <fnm>RJ</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1994</pubdate>
            <volume>91</volume>
            <fpage>6148</fpage>
            <lpage>6150</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">44155</pubid>
                  <pubid idtype="pmpid" link="fulltext">8016128</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>The age of Alu subfamilies.</p>
            </title>
            <aug>
               <au>
                  <snm>Kapitonov</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Jurka</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>J Mol Evol</source>
            <pubdate>1996</pubdate>
            <volume>42</volume>
            <fpage>59</fpage>
            <lpage>65</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">8576965</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>The decline in human Alu retroposition was accompanied by an asymmetric decrease in SRP9/14 binding to dimeric Alu RNA and increased expression of small cytoplasmic Alu RNA.</p>
            </title>
            <aug>
               <au>
                  <snm>Sarrowa</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Chang</snm>
                  <fnm>DY</fnm>
               </au>
               <au>
                  <snm>Maraia</snm>
                  <fnm>RJ</fnm>
               </au>
            </aug>
            <source>Mol Cell Biol</source>
            <pubdate>1997</pubdate>
            <volume>17</volume>
            <fpage>1144</fpage>
            <lpage>1151</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">231839</pubid>
                  <pubid idtype="pmpid" link="fulltext">9032241</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Does SINE evolution preclude Alu function?</p>
            </title>
            <aug>
               <au>
                  <snm>Schmid</snm>
                  <fnm>CW</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1998</pubdate>
            <volume>26</volume>
            <fpage>4541</fpage>
            <lpage>4550</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">147893</pubid>
                  <pubid idtype="pmpid" link="fulltext">9753719</pubid>
                  <pubid idtype="doi">10.1093/nar/26.20.4541</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>Alu repeats and human genomic diversity.</p>
            </title>
            <aug>
               <au>
                  <snm>Batzer</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Deininger</snm>
                  <fnm>PL</fnm>
               </au>
            </aug>
            <source>Nat Rev Genet</source>
            <pubdate>2002</pubdate>
            <volume>3</volume>
            <fpage>370</fpage>
            <lpage>379</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nrg798</pubid>
                  <pubid idtype="pmpid" link="fulltext">11988762</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>LINEs and Alus - the polyA connection.</p>
            </title>
            <aug>
               <au>
                  <snm>Boeke</snm>
                  <fnm>JD</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>1997</pubdate>
            <volume>16</volume>
            <fpage>6</fpage>
            <lpage>7</lpage>
            <xrefbib>
               <pubid idtype="pmpid">9140383</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>The sequence of the human genome.</p>
            </title>
            <aug>
               <au>
                  <snm>Venter</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Adams</snm>
                  <fnm>MD</fnm>
               </au>
               <au>
                  <snm>Myers</snm>
                  <fnm>EW</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>PW</fnm>
               </au>
               <au>
                  <snm>Mural</snm>
                  <fnm>RJ</fnm>
               </au>
               <au>
                  <snm>Sutton</snm>
                  <fnm>GG</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>HO</fnm>
               </au>
               <au>
                  <snm>Yandell</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Evans</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Holt</snm>
                  <fnm>RA</fnm>
               </au>
               <etal/>
            </aug>
            <source>Science</source>
            <pubdate>2001</pubdate>
            <volume>291</volume>
            <fpage>1304</fpage>
            <lpage>1351</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1058040</pubid>
                  <pubid idtype="pmpid" link="fulltext">11181995</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>The Ensembl genome database project.</p>
            </title>
            <aug>
               <au>
                  <snm>Hubbard</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Barker</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Birney</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Cameron</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Clark</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Cox</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Cuff</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Curwen</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Down</snm>
                  <fnm>T</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2002</pubdate>
            <volume>30</volume>
            <fpage>38</fpage>
            <lpage>41</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">99161</pubid>
                  <pubid idtype="pmpid" link="fulltext">11752248</pubid>
                  <pubid idtype="doi">10.1093/nar/30.1.38</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>A comparison of the Celera and Ensembl predicted gene sets reveals little overlap in novel genes.</p>
            </title>
            <aug>
               <au>
                  <snm>Hogenesch</snm>
                  <fnm>JB</fnm>
               </au>
               <au>
                  <snm>Ching</snm>
                  <fnm>KA</fnm>
               </au>
               <au>
                  <snm>Batalov</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Su</snm>
                  <fnm>AI</fnm>
               </au>
               <au>
                  <snm>Walker</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Zhou</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Kay</snm>
                  <fnm>SA</fnm>
               </au>
               <au>
                  <snm>Schultz</snm>
                  <fnm>PG</fnm>
               </au>
               <au>
                  <snm>Cooke</snm>
                  <fnm>MP</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>2001</pubdate>
            <volume>106</volume>
            <fpage>413</fpage>
            <lpage>415</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11534548</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>Genes for intermediate filament proteins and the draft sequence of the human genome: novel keratin genes and a surprisingly high number of pseudogenes related to keratin genes 8 and 18.</p>
            </title>
            <aug>
               <au>
                  <snm>Hesse</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Magin</snm>
                  <fnm>TM</fnm>
               </au>
               <au>
                  <snm>Weber</snm>
                  <fnm>K</fnm>
               </au>
            </aug>
            <source>J Cell Sci</source>
            <pubdate>2001</pubdate>
            <volume>114</volume>
            <fpage>2569</fpage>
            <lpage>2575</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11683385</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>Identification and analysis of over 2000 ribosomal protein pseudogenes in the human genome.</p>
            </title>
            <aug>
               <au>
                  <snm>Zhang</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Harrison</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Gerstein</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2002</pubdate>
            <volume>12</volume>
            <fpage>1466</fpage>
            <lpage>1482</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">187539</pubid>
                  <pubid idtype="pmpid" link="fulltext">12368239</pubid>
                  <pubid idtype="doi">10.1101/gr.331902</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B33">
            <title>
               <p>The expected equilibrium of the CpG dinucleotide in vertebrate genomes under a mutation model.</p>
            </title>
            <aug>
               <au>
                  <snm>Sved</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Bird</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1990</pubdate>
            <volume>87</volume>
            <fpage>4692</fpage>
            <lpage>4696</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">54183</pubid>
                  <pubid idtype="pmpid" link="fulltext">2352943</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B34">
            <aug>
               <au>
                  <snm>Graur</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>W-H</fnm>
               </au>
            </aug>
            <source>Fundamentals of Molecular Evolution</source>
            <publisher>Sunderland, MA: Sinauer Associates</publisher>
            <edition>2</edition>
            <pubdate>2000</pubdate>
         </bibl>
         <bibl id="B35">
            <title>
               <p>Estimate of the mutation rate per nucleotide in humans.</p>
            </title>
            <aug>
               <au>
                  <snm>Nachman</snm>
                  <fnm>MW</fnm>
               </au>
               <au>
                  <snm>Crowell</snm>
                  <fnm>SL</fnm>
               </au>
            </aug>
            <source>Genetics</source>
            <pubdate>2000</pubdate>
            <volume>156</volume>
            <fpage>297</fpage>
            <lpage>304</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10978293</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B36">
            <title>
               <p>Primate phylogeny: morphological vs. molecular results.</p>
            </title>
            <aug>
               <au>
                  <snm>Shoshani</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Groves</snm>
                  <fnm>CP</fnm>
               </au>
               <au>
                  <snm>Simons</snm>
                  <fnm>EL</fnm>
               </au>
               <au>
                  <snm>Gunnell</snm>
                  <fnm>GF</fnm>
               </au>
            </aug>
            <source>Mol Phylogenet Evol</source>
            <pubdate>1996</pubdate>
            <volume>5</volume>
            <fpage>102</fpage>
            <lpage>154</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/mpev.1996.0009</pubid>
                  <pubid idtype="pmpid" link="fulltext">8673281</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B37">
            <title>
               <p>Anthropoid origins.</p>
            </title>
            <aug>
               <au>
                  <snm>Kay</snm>
                  <fnm>RF</fnm>
               </au>
               <au>
                  <snm>Ross</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Williams</snm>
                  <fnm>BA</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1997</pubdate>
            <volume>275</volume>
            <fpage>797</fpage>
            <lpage>804</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.275.5301.797</pubid>
                  <pubid idtype="pmpid" link="fulltext">9012340</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B38">
            <title>
               <p>Ancestral, mammalian-wide subfamilies of LINE-1 repetitive sequences.</p>
            </title>
            <aug>
               <au>
                  <snm>Smit</snm>
                  <fnm>AF</fnm>
               </au>
               <au>
                  <snm>T&#243;th</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Riggs</snm>
                  <fnm>AD</fnm>
               </au>
               <au>
                  <snm>Jurka</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1995</pubdate>
            <volume>246</volume>
            <fpage>401</fpage>
            <lpage>417</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/jmbi.1994.0095</pubid>
                  <pubid idtype="pmpid" link="fulltext">7877164</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B39">
            <title>
               <p>Adaptive evolution in LINE-1 retrotransposons.</p>
            </title>
            <aug>
               <au>
                  <snm>Boissinot</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Furano</snm>
                  <fnm>AV</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>2001</pubdate>
            <volume>18</volume>
            <fpage>2186</fpage>
            <lpage>2194</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11719568</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B40">
            <title>
               <p>The human ribosomal protein genes: sequencing and comparative analysis of 73 genes.</p>
            </title>
            <aug>
               <au>
                  <snm>Yoshihama</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Uechi</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Asakawa</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kawasaki</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Kato</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Higa</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Maeda</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Minoshima</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Tanaka</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Shimizu</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Kenmochi</snm>
                  <fnm>N</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2002</pubdate>
            <volume>12</volume>
            <fpage>379</fpage>
            <lpage>390</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">155282</pubid>
                  <pubid idtype="pmpid" link="fulltext">11875025</pubid>
                  <pubid idtype="doi">10.1101/gr.214202</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B41">
            <aug>
               <au>
                  <snm>Kimura</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>The Neutral Theory of Molecular Evolution</source>
            <publisher>Cambridge: Cambridge University Press</publisher>
            <pubdate>1983</pubdate>
         </bibl>
         <bibl id="B42">
            <title>
               <p>Molecular phylogeny and demographic history of humans.</p>
            </title>
            <aug>
               <au>
                  <snm>Takahata</snm>
                  <fnm>N</fnm>
               </au>
            </aug>
            <source>Humanity from African Naissance to Coming Millennia</source>
            <publisher>Firenze: Firenze University Press</publisher>
            <editor>Tobias PV, Raath MA, Moggi-Cecchi J, Doyle GA</editor>
            <pubdate>2001</pubdate>
            <fpage>299</fpage>
            <lpage>305</lpage>
         </bibl>
         <bibl id="B43">
            <title>
               <p>LINEs mobilize SINEs in the eel through a shared 3' sequence.</p>
            </title>
            <aug>
               <au>
                  <snm>Kajikawa</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Okada</snm>
                  <fnm>N</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>2002</pubdate>
            <volume>111</volume>
            <fpage>433</fpage>
            <lpage>444</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12419252</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B44">
            <title>
               <p>Molecular reconstruction of Sleeping Beauty, a Tc1-like transposon from fish, and its transposition in human cells.</p>
            </title>
            <aug>
               <au>
                  <snm>Ivics</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Hackett</snm>
                  <fnm>PB</fnm>
               </au>
               <au>
                  <snm>Plasterk</snm>
                  <fnm>RH</fnm>
               </au>
               <au>
                  <snm>Izsv&#225;k</snm>
                  <fnm>Z</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>1997</pubdate>
            <volume>91</volume>
            <fpage>501</fpage>
            <lpage>510</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">9390559</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B45">
            <title>
               <p>Inactivation of CMP-N-acetylneuraminic acid hydroxylase occurred prior to brain expansion during human evolution.</p>
            </title>
            <aug>
               <au>
                  <snm>Chou</snm>
                  <fnm>H-H</fnm>
               </au>
               <au>
                  <snm>Hayakawa</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Diaz</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Krings</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Indriati</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Leakey</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Paabo</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Satta</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Takahata</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Varki</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2002</pubdate>
            <volume>99</volume>
            <fpage>11736</fpage>
            <lpage>11741</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">129338</pubid>
                  <pubid idtype="pmpid" link="fulltext">12192086</pubid>
                  <pubid idtype="doi">10.1073/pnas.182257399</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B46">
            <title>
               <p>An estimated frequency of endogenous insertional mutations in humans.</p>
            </title>
            <aug>
               <au>
                  <snm>Kazazian</snm>
                  <fnm>HH</fnm>
                  <suf>Jr</suf>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>1999</pubdate>
            <volume>22</volume>
            <fpage>130</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/9638</pubid>
                  <pubid idtype="pmpid" link="fulltext">10369250</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B47">
            <title>
               <p>Positive Darwinian selection after gene duplication in primate ribonuclease genes.</p>
            </title>
            <aug>
               <au>
                  <snm>Zhang</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Rosenberg</snm>
                  <fnm>HF</fnm>
               </au>
               <au>
                  <snm>Nei</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1998</pubdate>
            <volume>95</volume>
            <fpage>3708</fpage>
            <lpage>3713</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">19901</pubid>
                  <pubid idtype="pmpid" link="fulltext">9520431</pubid>
                  <pubid idtype="doi">10.1073/pnas.95.7.3708</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B48">
            <title>
               <p>Extensive gene duplication in the early evolution of animals before the parazoan-eumetazoan split demonstrated by G proteins and protein tyrosine kinases from sponge and hydra.</p>
            </title>
            <aug>
               <au>
                  <snm>Suga</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Koyanagi</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Hoshiyama</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Ono</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Iwabe</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Kuma</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Miyata</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>J Mol Evol</source>
            <pubdate>1999</pubdate>
            <volume>48</volume>
            <fpage>646</fpage>
            <lpage>653</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10229568</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B49">
            <title>
               <p>Age distribution of human gene families shows significant roles of both large- and small-scale duplications in vertebrate evolution.</p>
            </title>
            <aug>
               <au>
                  <snm>Gu</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Gu</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2002</pubdate>
            <volume>31</volume>
            <fpage>205</fpage>
            <lpage>209</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng902</pubid>
                  <pubid idtype="pmpid" link="fulltext">12032571</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B50">
            <title>
               <p>Assembly of the working draft of the human genome with GigAssembler.</p>
            </title>
            <aug>
               <au>
                  <snm>Kent</snm>
                  <fnm>WJ</fnm>
               </au>
               <au>
                  <snm>Haussler</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2001</pubdate>
            <volume>11</volume>
            <fpage>1541</fpage>
            <lpage>1548</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gr.183201</pubid>
                  <pubid idtype="pmpid" link="fulltext">11544197</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B51">
            <title>
               <p>BLAT - the BLAST-like alignment tool.</p>
            </title>
            <aug>
               <au>
                  <snm>Kent</snm>
                  <fnm>WJ</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2002</pubdate>
            <volume>12</volume>
            <fpage>656</fpage>
            <lpage>664</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">187518</pubid>
                  <pubid idtype="pmpid" link="fulltext">11932250</pubid>
                  <pubid idtype="doi">10.1101/gr.229202</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B52">
            <title>
               <p>The DNA sequence of human chromosome 21.</p>
            </title>
            <aug>
               <au>
                  <snm>Hattori</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Fujiyama</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Taylor</snm>
                  <fnm>TD</fnm>
               </au>
               <au>
                  <snm>Watanabe</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Yada</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Park</snm>
                  <fnm>HS</fnm>
               </au>
               <au>
                  <snm>Toyoda</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Ishii</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Totoki</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Choi</snm>
                  <fnm>DK</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nature</source>
            <pubdate>2000</pubdate>
            <volume>405</volume>
            <fpage>311</fpage>
            <lpage>319</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/35012518</pubid>
                  <pubid idtype="pmpid" link="fulltext">10830953</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B53">
            <title>
               <p>Human Genome Research Group: Chromosome 21</p>
            </title>
            <url>http://hgp.gsc.riken.go.jp/data_tools/chr21.html</url>
         </bibl>
         <bibl id="B54">
            <title>
               <p>The DNA sequence of human chromosome 22.</p>
            </title>
            <aug>
               <au>
                  <snm>Dunham</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Shimizu</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Roe</snm>
                  <fnm>BA</fnm>
               </au>
               <au>
                  <snm>Chissoe</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Hunt</snm>
                  <fnm>AR</fnm>
               </au>
               <au>
                  <snm>Collins</snm>
                  <fnm>JE</fnm>
               </au>
               <au>
                  <snm>Bruskiewich</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Beare</snm>
                  <fnm>DM</fnm>
               </au>
               <au>
                  <snm>Clamp</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Smink</snm>
                  <fnm>LJ</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nature</source>
            <pubdate>1999</pubdate>
            <volume>402</volume>
            <fpage>489</fpage>
            <lpage>495</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/990031</pubid>
                  <pubid idtype="pmpid" link="fulltext">10591208</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B55">
            <title>
               <p>Human chromosome 22 project overview</p>
            </title>
            <url>http://www.sanger.ac.uk/HGP/Chr22</url>
         </bibl>
         <bibl id="B56">
            <title>
               <p>UCSC genome bioinformatics</p>
            </title>
            <url>http://www.genome.ucsc.edu</url>
         </bibl>
         <bibl id="B57">
            <title>
               <p>Repbase update</p>
            </title>
            <url>http://www.girinst.org/Repbase_Update.html</url>
         </bibl>
         <bibl id="B58">
            <title>
               <p>Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions.</p>
            </title>
            <aug>
               <au>
                  <snm>Nei</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Gojobori</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>1986</pubdate>
            <volume>3</volume>
            <fpage>418</fpage>
            <lpage>426</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">3444411</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B59">
            <title>
               <p>A simple method for estimating the intensity of purifying selection in protein-coding genes.</p>
            </title>
            <aug>
               <au>
                  <snm>Ophir</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Itoh</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Graur</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Gojobori</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>1999</pubdate>
            <volume>16</volume>
            <fpage>49</fpage>
            <lpage>53</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10331251</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B60">
            <title>
               <p>A maximum likelihood method for analyzing pseudogene evolution: implications for silent site evolution in humans and rodents.</p>
            </title>
            <aug>
               <au>
                  <snm>Bustamante</snm>
                  <fnm>CD</fnm>
               </au>
               <au>
                  <snm>Nielsen</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Hartl</snm>
                  <fnm>DL</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>2002</pubdate>
            <volume>19</volume>
            <fpage>110</fpage>
            <lpage>117</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11752196</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B61">
            <title>
               <p>NCBI Reference sequences</p>
            </title>
            <url>http://www.ncbi.nlm.nih.gov/RefSeq/</url>
         </bibl>
         <bibl id="B62">
            <title>
               <p>The neighbor-joining method: a new method for reconstructing phylogenetic trees.</p>
            </title>
            <aug>
               <au>
                  <snm>Saitou</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Nei</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>1987</pubdate>
            <volume>4</volume>
            <fpage>406</fpage>
            <lpage>425</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">3447015</pubid>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>
