<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>gb-2003-4-8-r48</ui>
   <ji>GBJ</ji>
   <fm>
      <dochead>Research</dochead>
      <bibl>
         <title>
            <p>Duplication is more common among laterally transferred genes than among indigenous genes</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Hooper</snm>
               <mi>D</mi>
               <fnm>Sean</fnm>
               <insr iid="I1"/>
            </au>
            <au id="A2" ca="yes">
               <snm>Berg</snm>
               <mi>G</mi>
               <fnm>Otto</fnm>
               <insr iid="I1"/>
               <email>otto.berg@ebc.uu.se</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Department of Molecular Evolution, Uppsala University, Norbyvagen 18C, SE-75236 Uppsala, Sweden</p>
            </ins>
         </insg>
         <source>Genome Biology</source>
         <issn>1465-6906</issn>
         <pubdate>2003</pubdate>
         <volume>4</volume>
         <issue>8</issue>
         <fpage>R48</fpage>
         <url>http://genomebiology.com/2003/4/8/R48</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="doi">10.1186/gb-2003-4-8-r48</pubid>
               <pubid idtype="pmpid">12914657</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>25</day>
               <month>3</month>
               <year>2003</year>
            </date>
         </rec>
         <revrec>
            <date>
               <day>4</day>
               <month>6</month>
               <year>2003</year>
            </date>
         </revrec>
         <acc>
            <date>
               <day>10</day>
               <month>6</month>
               <year>2003</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>11</day>
               <month>7</month>
               <year>2003</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2003</year>
         <collab>Hooper and Berg; licensee BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.</collab>
      </cpyrt>
      <shorttitle>
         <p>Duplication is more common among laterally transferred genes than among indigenous genes</p>
      </shorttitle>
      <shortabs>
         <p>Using both a compositional method and a gene-tree approach, a number of proposed laterally transferred genes have been identified and their nucleotide composition and frequency of duplication studied.</p>
      </shortabs>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Recent developments in the understanding of paralogous evolution have prompted a focus not only on obviously advantageous genes, but also on genes that can be considered to have a weak or sporadic impact on the survival of the organism. Here we examine the duplicative behavior of a category of genes that can be considered to be mostly transient in the genome, namely laterally transferred genes. Using both a compositional method and a gene-tree approach, we identify a number of proposed laterally transferred genes and study their nucleotide composition and frequency of duplication.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>Duplications are significantly overrepresented among potential laterally transferred genes compared with indigenous ones. Furthermore, the GC<sub>3 </sub>distribution of potential laterally transferred genes is largely uniform in some genomes, suggesting import from a broad range of donors.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusions</p>
               </st>
               <p>The results are discussed not in the context of strongly optimized, established genes, but rather of genes with weak or ancillary functions. The importance of duplication may depend on the variability and availability of weak genes for which novel functions may be discovered. Lateral transfer may therefore accelerate the evolutionary process of duplication by bringing foreign genes that have mainly weak or no function into the genome.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="BMC" subtype="man_spc_id" id="30010008">Evolution</classification>
         <classification type="BMC" subtype="man_spc_id" id="30010010">Genome studies</classification>
         <classification type="BMC" subtype="man_spc_id" id="30010002">Bioinformatics</classification>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>There are few natural niches left on Earth that have yet to be colonized by microbes. The adaptability of prokaryotic life in particular is arguably a major reason for its success. In order to use new metabolites, or to survive new environments, microbes need genes that code for products that facilitate survival. There are basically two methods by which genomes expand their repertoire of genes: creating them through duplication and adaptation or taking existing genes from outside sources. The method of duplication has been examined extensively, starting as a model of gene evolution where a copy of a gene is free to evolve and diverge, until it may attain a novel function <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. Recently, several authors have challenged the period of neutrality implied by the old model, and shifted the focus more towards how new paralogs avoid neutrality by gene-amplification effects <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>. A gene amplification will be either selected, neutral or counter-selected, depending on the original gene function. If noncoding DNA is duplicated, then both paralogs are probably transient in the genome. If the paralogs have a weak but slightly selected product, amplification may be selected. However, if gene products are strong, well-established, highly expressed or part of a delicate balance of proteins, then amplification may be counter-selected <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>. Furthermore, fragments of genes can be duplicated, possibly amplifying secondary functions in established genes. Through amplification, new functions can be 'discovered' in extant ones.</p>
         <p>As an alternative to developing new gene functions through duplication, an organism can incorporate DNA from outside sources through the process of lateral transfer. This process of gene transfer has been observed not only for eubacteria (for example <abbrgrp><abbr bid="B4">4</abbr><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr></abbrgrp>), but also for archaea <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr></abbrgrp> and eukaryotes. Over the years, systematic studies of lateral transfer (for example <abbrgrp><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr><abbr bid="B12">12</abbr></abbrgrp>) have sparked an animated debate on the extent of lateral transfer. Some authors propose that it is a major force in genome evolution <abbrgrp><abbr bid="B13">13</abbr><abbr bid="B14">14</abbr><abbr bid="B15">15</abbr></abbrgrp>, whereas others downplay its role. One factor limiting the impact of lateral transfer may be the high level of complexity in a large number of pathways; a single transferred gene will have difficulty outcompeting an indigenous gene that is part of an adapted system of complex protein interactions <abbrgrp><abbr bid="B16">16</abbr><abbr bid="B17">17</abbr></abbrgrp>. Certainly, highly optimized genes are difficult to replace in the genome through lateral transfer, even if they are not a part of a complex pathway. However, duplication of such genes could also be disruptive and thereby reduce the impact of duplication on gene innovation. Of course, clusters of genes as well as single genes can be imported. Some gene families, such as the phosphonate metabolism <it>phn </it>group, are widely considered to have been imported into <it>Escherichia coli </it>(<abbrgrp><abbr bid="B9">9</abbr></abbrgrp> and references therein), and are organized as clusters of genes. This may, perhaps, be the only way, however improbable, to circumvent conservation caused by complexity. 
For reasons of complexity and optimization, it may be more interesting from an evolutionary point of view to shift the focus from well-established genes with highly optimized functions to the weaker ones, as it is here that evolutionary mechanisms such as duplication and lateral transfer may have more significant effects.</p>
         <p>We previously studied lateral transfer of genes between species in the <it>Salmonella</it>/<it>Escherichia </it>clade <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>, and also the turnover and fixation of duplications within various species <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>. We found that the rates of both lateral transfer and duplication in, for instance, <it>E. coli </it>are considerable, but that only a few imports and duplications survive deletion. Also, very few recently imported genes seem to have a defined product. However, the results also suggest that the contribution to gene innovation is mainly the result of gene-dosage effects of weak or ancillary gene functions. Even though paralogs can be selected for amplification of strong, established functions, we found that these events rarely result in new gene functions.</p>
         <p>Here we examine whether an influx of weak or nonfunctional genes into a genome can actually contribute to gene innovation through duplication. Furthermore, we examine the general patterns of lateral transfer among a selection of organisms of different levels of divergence, including distributions of GC content.</p>
      </sec>
      <sec>
         <st>
            <p>Results</p>
         </st>
         <sec>
            <st>
               <p>Summary of gene categories</p>
            </st>
            <p>Genes are grouped by their presence or absence in a selection of organisms (Table <tblr tid="T1">1</tblr>). In category A (Table <tblr tid="T2">2</tblr>), we expect to find the most recent additions to the genome through contributions from both lateral transfer and duplication. Because these genes are present only in the subject organism, it is probable that lateral transfer occurred after the last divergence. In the case of relatively recent divergence, such as between <it>Salmonella typhimurium </it>and <it>S. typhi</it>, the total number of genes (<it>N</it>) in category A is relatively small, whereas organisms that are more distantly related (for example, <it>Bacillus subtilis </it>and <it>B. halodurans</it>) have larger numbers of unique genes. However, there is probably no constant rate at which genomes acquire genes. This is illustrated by the comparison between the two <it>E. coli </it>strains, where category A is almost twice as large in O157:H7 as in K12. This suggests that strain O157:H7 is more active than K12 in gene acquisition, either mechanistically or because of added selective pressure. Other pairs of organisms, such as <it>E. coli </it>K12 and <it>S. typhimurium</it>, have more similar patterns of gene acquisition. Imports in category B (Table <tblr tid="T3">3</tblr>) are on average older than those in category A, and many genes are likely to have been lost in the outgroup genomes as a result of different lifestyles and requirements. Genes in category D are present in all four organisms in the group, and are probably not recently transferred genes.</p>
            <tbl id="T1" hint_layout="single">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>Overview of gene categories</p>
               </caption>
               <tblbdy cols="6">
                  <r>
                     <c ca="left">
                        <p>Category</p>
                     </c>
                     <c ca="center">
                        <p>G</p>
                     </c>
                     <c ca="center">
                        <p>Cr</p>
                     </c>
                     <c ca="center">
                        <p>O1</p>
                     </c>
                     <c ca="center">
                        <p>O2</p>
                     </c>
                     <c ca="left">
                        <p>Suggested content of imports</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>A</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>Recent imports or loss of imports from B</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>B</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>Less recent imports</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>C</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                     <c ca="left">
                        <p>Loss of genes in Cr, or pervasive imports</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>D</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                     <c ca="left">
                        <p>Only highly pervasive imports expected</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>G, genome under consideration; Cr, paired close relative; O1, first outgroup organism; O2, second outgroup organism. +, Present; -, absent.</p>
               </tblfn>
            </tbl>
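            <p>The presence/absence scheme of Table 1 can be sketched as a simple lookup. The Python fragment below is only an illustration, assuming that presence calls for a gene in the close relative (Cr) and the two outgroups (O1, O2) are already available as booleans. The names <it>CATEGORY_BY_PATTERN </it>and <it>classify_gene </it>are hypothetical, and returning no category for patterns not listed in Table 1 is an assumption made here, not part of the original analysis.</p>

```python
from typing import Optional

# Presence (True) or absence (False) in (Cr, O1, O2) for a gene that is
# present in the genome under consideration (G), following the rows of Table 1.
CATEGORY_BY_PATTERN = {
    (False, False, False): "A",  # only in G: recent imports or loss of imports from B
    (True, False, False): "B",   # G and close relative only: less recent imports
    (False, True, True): "C",    # G and both outgroups, absent from Cr
    (True, True, True): "D",     # present in all four organisms
}


def classify_gene(in_cr: bool, in_o1: bool, in_o2: bool) -> Optional[str]:
    """Return the Table 1 category, or None for a pattern Table 1 does not list."""
    return CATEGORY_BY_PATTERN.get((in_cr, in_o1, in_o2))


if __name__ == "__main__":
    # A gene found only in G falls into category A.
    print(classify_gene(False, False, False))
    # A gene found in G, Cr and O2 but not O1 is not covered by Table 1.
    print(classify_gene(True, False, True))
```

            <p>Keeping the four patterns in a dictionary makes the mapping easy to compare row by row with Table 1.</p>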
            <tbl id="T2" hint_layout="single">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p>Category A</p>
               </caption>
               <tblbdy cols="6">
                  <r>
                     <c ca="left">
                        <p>Pair</p>
                     </c>
                     <c ca="left">
                        <p>Outgroups</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>N</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Ndev</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Pdev</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Ndup</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>ecol, edl</p>
                     </c>
                     <c ca="left">
                        <p>stym, klebs</p>
                     </c>
                     <c ca="left">
                        <p>95</p>
                     </c>
                     <c ca="left">
                        <p>17</p>
                     </c>
                     <c ca="left">
                        <p>0.179</p>
                     </c>
                     <c ca="left">
                        <p>7</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>ecol, stym</p>
                     </c>
                     <c ca="left">
                        <p>kpne, paer</p>
                     </c>
                     <c ca="left">
                        <p>248</p>
                     </c>
                     <c ca="left">
                        <p>43</p>
                     </c>
                     <c ca="left">
                        <p>0.173</p>
                     </c>
                     <c ca="left">
                        <p>25</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>edl, ecol</p>
                     </c>
                     <c ca="left">
                        <p>stym, klebs</p>
                     </c>
                     <c ca="left">
                        <p>180</p>
                     </c>
                     <c ca="left">
                        <p>43</p>
                     </c>
                     <c ca="left">
                        <p>0.239</p>
                     </c>
                     <c ca="left">
                        <p>51</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>sfle, ecol</p>
                     </c>
                     <c ca="left">
                        <p>stym, klebs</p>
                     </c>
                     <c ca="left">
                        <p>139</p>
                     </c>
                     <c ca="left">
                        <p>30</p>
                     </c>
                     <c ca="left">
                        <p>0.216</p>
                     </c>
                     <c ca="left">
                        <p>37</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>stym, ecol</p>
                     </c>
                     <c ca="left">
                        <p>kpne, paer</p>
                     </c>
                     <c ca="left">
                        <p>270</p>
                     </c>
                     <c ca="left">
                        <p>47</p>
                     </c>
                     <c ca="left">
                        <p>0.174</p>
                     </c>
                     <c ca="left">
                        <p>8</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>stym, styp</p>
                     </c>
                     <c ca="left">
                        <p>ecol, klebs</p>
                     </c>
                     <c ca="left">
                        <p>89</p>
                     </c>
                     <c ca="left">
                        <p>20</p>
                     </c>
                     <c ca="left">
                        <p>0.225</p>
                     </c>
                     <c ca="left">
                        <p>5</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>styp, stym</p>
                     </c>
                     <c ca="left">
                        <p>ecol, klebs</p>
                     </c>
                     <c ca="left">
                        <p>138</p>
                     </c>
                     <c ca="left">
                        <p>31</p>
                     </c>
                     <c ca="left">
                        <p>0.225</p>
                     </c>
                     <c ca="left">
                        <p>5</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>bsub, bhal</p>
                     </c>
                     <c ca="left">
                        <p>cace, cper</p>
                     </c>
                     <c ca="left">
                        <p>575</p>
                     </c>
                     <c ca="left">
                        <p>51</p>
                     </c>
                     <c ca="left">
                        <p>0.089</p>
                     </c>
                     <c ca="left">
                        <p>16</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>bhal, bsub</p>
                     </c>
                     <c ca="left">
                        <p>cace, cper</p>
                     </c>
                     <c ca="left">
                        <p>603</p>
                     </c>
                     <c ca="left">
                        <p>40</p>
                     </c>
                     <c ca="left">
                        <p>0.066</p>
                     </c>
                     <c ca="left">
                        <p>16</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>cace, cper</p>
                     </c>
                     <c ca="left">
                        <p>bhal, bsub</p>
                     </c>
                     <c ca="left">
                        <p>779</p>
                     </c>
                     <c ca="left">
                        <p>49</p>
                     </c>
                     <c ca="left">
                        <p>0.063</p>
                     </c>
                     <c ca="left">
                        <p>20</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>samu, samw</p>
                     </c>
                     <c ca="left">
                        <p>bsub, cace</p>
                     </c>
                     <c ca="left">
                        <p>70</p>
                     </c>
                     <c ca="left">
                        <p>9</p>
                     </c>
                     <c ca="left">
                        <p>0.129</p>
                     </c>
                     <c ca="left">
                        <p>6</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p><it>N</it>, total number of genes; <it>Ndev</it>, number of genes with cT<sup>2 </sup>scores > 38; <it>Pdev</it>, proportion of deviant genes; <it>Ndup</it>, number of duplicated genes. See Materials and methods for species abbreviations. All <it>Pdev </it>values are significant at 99% when compared to group D (Table <tblr tid="T4">4</tblr>).</p>
               </tblfn>
            </tbl>
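            <p>The <it>Pdev </it>column of Table 2 can be recomputed from the other columns. The short Python sketch below assumes that <it>Pdev </it>is simply <it>Ndev </it>divided by <it>N</it>, rounded to three decimals; the footnote describes it as a proportion but does not spell out the formula. The function name <it>proportion_deviant </it>is hypothetical, and only a few rows of the table are copied for illustration.</p>

```python
# (N, Ndev, reported Pdev) copied from selected rows of Table 2.
TABLE2_ROWS = {
    "ecol, edl": (95, 17, 0.179),
    "edl, ecol": (180, 43, 0.239),
    "bsub, bhal": (575, 51, 0.089),
    "samu, samw": (70, 9, 0.129),
}


def proportion_deviant(n_total: int, n_deviant: int) -> float:
    """Proportion of deviant genes, assumed here to be Ndev / N (three decimals)."""
    if n_total == 0:
        raise ValueError("n_total must be positive")
    return round(n_deviant / n_total, 3)


if __name__ == "__main__":
    for pair, (n_total, n_deviant, reported) in TABLE2_ROWS.items():
        computed = proportion_deviant(n_total, n_deviant)
        print(pair, computed, reported)
```
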
            <tbl id="T3" hint_layout="single">
               <title>
                  <p>Table 3</p>
               </title>
               <caption>
                  <p>Category B</p>
               </caption>
               <tblbdy cols="6">
                  <r>
                     <c ca="left">
                        <p>Pair</p>
                     </c>
                     <c ca="left">
                        <p>Outgroups</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>N</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Ndev</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Pdev</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Ndup</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>ecol, edl</p>
                     </c>
                     <c ca="left">
                        <p>stym, klebs</p>
                     </c>
                     <c ca="left">
                        <p>190</p>
                     </c>
                     <c ca="left">
                        <p>27</p>
                     </c>
                     <c ca="left">
                        <p>0.142</p>
                     </c>
                     <c ca="left">
                        <p>18</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>ecol, stym</p>
                     </c>
                     <c ca="left">
                        <p>kpne, paer</p>
                     </c>
                     <c ca="left">
                        <p>163</p>
                     </c>
                     <c ca="left">
                        <p>22</p>
                     </c>
                     <c ca="left">
                        <p>0.135</p>
                     </c>
                     <c ca="left">
                        <p>11</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>edl, ecol</p>
                     </c>
                     <c ca="left">
                        <p>stym, klebs</p>
                     </c>
                     <c ca="left">
                        <p>189</p>
                     </c>
                     <c ca="left">
                        <p>15</p>
                     </c>
                     <c ca="left">
                        <p>0.079</p>
                     </c>
                     <c ca="left">
                        <p>32</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>sfle, ecol</p>
                     </c>
                     <c ca="left">
                        <p>stym, klebs</p>
                     </c>
                     <c ca="left">
                        <p>169</p>
                     </c>
                     <c ca="left">
                        <p>27</p>
                     </c>
                     <c ca="left">
                        <p>0.160</p>
                     </c>
                     <c ca="left">
                        <p>34</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>stym, ecol</p>
                     </c>
                     <c ca="left">
                        <p>kpne, paer</p>
                     </c>
                     <c ca="left">
                        <p>171</p>
                     </c>
                     <c ca="left">
                        <p>18</p>
                     </c>
                     <c ca="left">
                        <p>0.105</p>
                     </c>
                     <c ca="left">
                        <p>3</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>stym, styp</p>
                     </c>
                     <c ca="left">
                        <p>ecol, klebs</p>
                     </c>
                     <c ca="left">
                        <p>249</p>
                     </c>
                     <c ca="left">
                        <p>34</p>
                     </c>
                     <c ca="left">
                        <p>0.137</p>
                     </c>
                     <c ca="left">
                        <p>10</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>styp, stym</p>
                     </c>
                     <c ca="left">
                        <p>ecol, klebs</p>
                     </c>
                     <c ca="left">
                        <p>233</p>
                     </c>
                     <c ca="left">
                        <p>31</p>
                     </c>
                     <c ca="left">
                        <p>0.133</p>
                     </c>
                     <c ca="left">
                        <p>5</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>bsub, bhal</p>
                     </c>
                     <c ca="left">
                        <p>cace, cper</p>
                     </c>
                     <c ca="left">
                        <p>582</p>
                     </c>
                     <c ca="left">
                        <p>26</p>
                     </c>
                     <c ca="left">
                        <p>0.085</p>
                     </c>
                     <c ca="left">
                        <p>18</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>bhal, bsub</p>
                     </c>
                     <c ca="left">
                        <p>cace, cper</p>
                     </c>
                     <c ca="left">
                        <p>602</p>
                     </c>
                     <c ca="left">
                        <p>28</p>
                     </c>
                     <c ca="left">
                        <p>0.047*</p>
                     </c>
                     <c ca="left">
                        <p>6</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>cace, cper</p>
                     </c>
                     <c ca="left">
                        <p>bhal, bsub</p>
                     </c>
                     <c ca="left">
                        <p>293</p>
                     </c>
                     <c ca="left">
                        <p>19</p>
                     </c>
                     <c ca="left">
                        <p>0.065</p>
                     </c>
                     <c ca="left">
                        <p>8</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>samu, samw</p>
                     </c>
                     <c ca="left">
                        <p>bsub, cace</p>
                     </c>
                     <c ca="left">
                        <p>489</p>
                     </c>
                     <c ca="left">
                        <p>31</p>
                     </c>
                     <c ca="left">
                        <p>0.063</p>
                     </c>
                     <c ca="left">
                        <p>12</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>*All <it>Pdev </it>values are significant at 99% when compared to group D (Table <tblr tid="T3">3</tblr>), except for bhal which is not significant. See Materials and methods for species abbreviations.</p>
               </tblfn>
            </tbl>
            <p>In previous work <abbrgrp><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr></abbrgrp> we developed a compositional measure (cT<sup>2</sup>; see Materials and methods) to find genes with atypical dinucleotide frequencies. This measure is used in combination with the gene categories. Through all categories and in all comparisons, the number of atypical genes is highest in groups that are associated with recent lateral transfer. The proportion of deviant genes (<it>Pdev</it>; P(cT<sup>2 </sup>> 38)) is consistently higher in categories A and B than in D (Table <tblr tid="T4">4</tblr>), which supports the notion that there are many candidates for lateral transfer in groups A and B. In <it>B. subtilis </it>and <it>B. halodurans</it>, which appear to have diverged long ago, <it>Pdev </it>in categories A and B is low, although still higher than in D. The numbers of deviant genes in category A are 51 and 40 in <it>B. subtilis </it>and <it>B. halodurans </it>respectively, similar to those of, say, <it>E. coli </it>or <it>S. typhi</it>. Thus the rates of import may be similar in magnitude, and the low <it>Pdev </it>scores could be due to category A containing a large number of older genes that have either been ameliorated in <it>Bacillus </it>or lost in <it>Clostridium</it>. The older the A group, the lower <it>Pdev </it>will be, until it approaches the <it>Pdev </it>of category D. The comparison between the <it>Staphylococcus aureus </it>strains mu50 and mw2 shows an intermediate pattern, with a relatively high <it>Pdev </it>in categories A and B compared to category D. This would suggest that there is no great difference between the proteobacteria and the <it>Bacillus/Clostridium </it>group in lateral transfer patterns.</p>
            <tbl id="T4" hint_layout="double">
               <title>
                  <p>Table 4</p>
               </title>
               <caption>
                  <p>Category D</p>
               </caption>
               <tblbdy cols="7">
                  <r>
                     <c ca="left">
                        <p>Pair</p>
                     </c>
                     <c ca="left">
                        <p>Outgroups</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>N</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Ndev</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Pdev</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Ndup</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p><it>Ndev </it>and <it>dup</it>*</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>ecol, edl</p>
                     </c>
                     <c ca="left">
                        <p>stym, klebs</p>
                     </c>
                     <c ca="left">
                        <p>2,823</p>
                     </c>
                     <c ca="left">
                        <p>91</p>
                     </c>
                     <c ca="left">
                        <p>0.032</p>
                     </c>
                     <c ca="left">
                        <p>46</p>
                     </c>
                     <c ca="left">
                        <p>3</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>ecol, stym</p>
                     </c>
                     <c ca="left">
                        <p>kpne, paer</p>
                     </c>
                     <c ca="left">
                        <p>2,163</p>
                     </c>
                     <c ca="left">
                        <p>49</p>
                     </c>
                     <c ca="left">
                        <p>0.023</p>
                     </c>
                     <c ca="left">
                        <p>34</p>
                     </c>
                     <c ca="left">
                        <p>2</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>edl, ecol</p>
                     </c>
                     <c ca="left">
                        <p>stym, klebs</p>
                     </c>
                     <c ca="left">
                        <p>2,865</p>
                     </c>
                     <c ca="left">
                        <p>87</p>
                     </c>
                     <c ca="left">
                        <p>0.030</p>
                     </c>
                     <c ca="left">
                        <p>71</p>
                     </c>
                     <c ca="left">
                        <p>8</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>sfle, ecol</p>
                     </c>
                     <c ca="left">
                        <p>stym, klebs</p>
                     </c>
                     <c ca="left">
                        <p>2,684</p>
                     </c>
                     <c ca="left">
                        <p>113</p>
                     </c>
                     <c ca="left">
                        <p>0.042</p>
                     </c>
                     <c ca="left">
                        <p>214</p>
                     </c>
                     <c ca="left">
                        <p>20</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>stym, ecol</p>
                     </c>
                     <c ca="left">
                        <p>kpne, paer</p>
                     </c>
                     <c ca="left">
                        <p>2,129</p>
                     </c>
                     <c ca="left">
                        <p>74</p>
                     </c>
                     <c ca="left">
                        <p>0.035</p>
                     </c>
                     <c ca="left">
                        <p>35</p>
                     </c>
                     <c ca="left">
                        <p>14</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>stym, styp</p>
                     </c>
                     <c ca="left">
                        <p>ecol, klebs</p>
                     </c>
                     <c ca="left">
                        <p>2,752</p>
                     </c>
                     <c ca="left">
                        <p>95</p>
                     </c>
                     <c ca="left">
                        <p>0.035</p>
                     </c>
                     <c ca="left">
                        <p>35</p>
                     </c>
                     <c ca="left">
                        <p>13</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>styp, stym</p>
                     </c>
                     <c ca="left">
                        <p>ecol, klebs</p>
                     </c>
                     <c ca="left">
                        <p>2,662</p>
                     </c>
                     <c ca="left">
                        <p>96</p>
                     </c>
                     <c ca="left">
                        <p>0.036</p>
                     </c>
                     <c ca="left">
                        <p>29</p>
                     </c>
                     <c ca="left">
                        <p>15</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>bsub, bhal</p>
                     </c>
                     <c ca="left">
                        <p>cace, cper</p>
                     </c>
                     <c ca="left">
                        <p>1,329</p>
                     </c>
                     <c ca="left">
                        <p>32</p>
                     </c>
                     <c ca="left">
                        <p>0.024</p>
                     </c>
                     <c ca="left">
                        <p>10</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>bhal, bsub</p>
                     </c>
                     <c ca="left">
                        <p>cace, cper</p>
                     </c>
                     <c ca="left">
                        <p>1,354</p>
                     </c>
                     <c ca="left">
                        <p>51</p>
                     </c>
                     <c ca="left">
                        <p>0.038</p>
                     </c>
                     <c ca="left">
                        <p>15</p>
                     </c>
                     <c ca="left">
                        <p>2</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>cace, cper</p>
                     </c>
                     <c ca="left">
                        <p>bhal, bsub</p>
                     </c>
                     <c ca="left">
                        <p>1,290</p>
                     </c>
                     <c ca="left">
                        <p>47</p>
                     </c>
                     <c ca="left">
                        <p>0.036</p>
                     </c>
                     <c ca="left">
                        <p>11</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>samu, samw</p>
                     </c>
                     <c ca="left">
                        <p>bsub, cace</p>
                     </c>
                     <c ca="left">
                        <p>1,055</p>
                     </c>
                     <c ca="left">
                        <p>42</p>
                     </c>
                     <c ca="left">
                        <p>0.040</p>
                     </c>
                     <c ca="left">
                        <p>9</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>* The number of duplicated genes with cT<sup>2 </sup>> 38. See Materials and methods for species abbreviations.</p>
               </tblfn>
            </tbl>
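            <p>The comparison of <it>Pdev </it>between a category and category D can be illustrated with a short calculation. The sketch below is not the procedure used in this study: the test behind the 99% significance statements is not specified in this section, so a pooled two-proportion z-test is assumed purely for illustration. The function name <it>two_proportion_z </it>and the constant <it>Z_99 </it>are hypothetical. The counts are copied from Table 4 and from the table preceding it.</p>
            <p>
```python
import math


def two_proportion_z(n_dev_x, n_x, n_dev_d, n_d):
    """Pooled two-proportion z statistic for Pdev in one category versus category D.

    Assumption: this test is chosen for illustration only; the source does
    not state which test underlies its significance statements.
    """
    p_x = n_dev_x / n_x
    p_d = n_dev_d / n_d
    pooled = (n_dev_x + n_dev_d) / (n_x + n_d)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_x + 1.0 / n_d))
    return (p_x - p_d) / se


# One-sided critical value of the standard normal distribution at the 99% level.
Z_99 = 2.326

# (Ndev, N) from the table preceding Table 4, then (Ndev, N) from Table 4.
rows = {
    "styp, stym": ((31, 233), (96, 2662)),
    "bhal, bsub": ((28, 602), (51, 1354)),
}

for pair, ((dev_x, n_x), (dev_d, n_d)) in rows.items():
    z = two_proportion_z(dev_x, n_x, dev_d, n_d)
    # Print Pdev for both categories and whether z exceeds the assumed cutoff.
    print(pair, round(dev_x / n_x, 3), round(dev_d / n_d, 3), z > Z_99)
```
            </p>
            <p>Under this assumed test, the styp, stym comparison exceeds the cutoff whereas the bhal, bsub comparison does not, which agrees with the table footnote stating that bhal is not significant.</p>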
         </sec>
         <sec>
            <st>
               <p>GC<sub>3 </sub>distributions in gene-categories A and D</p>
            </st>
            <p>The distributions of guanine and cytosine at the third codon position (GC<sub>3</sub>) in categories A and D are summarized in Figure <figr fid="F1">1</figr> and Table <tblr tid="T5">5</tblr>. The GC<sub>3 </sub>distribution of category A is generally 'flatter' than that of category D. In some genomes, particularly in the <it>Escherichia/Salmonella </it>clade, GC<sub>3 </sub>is almost uniformly distributed. Given that genes in category A are good candidates for lateral transfer, a flat distribution would suggest that the GC<sub>3 </sub>range of imports is wide. Accordingly, the number of donor genomes must be large enough to incorporate a broad spectrum of gene GC<sub>3 </sub>content. In the <it>Bacillus</it>/<it>Clostridium </it>clade, the GC<sub>3 </sub>pattern of the A groups resembles that of the D groups to varying degrees. <it>B. halodurans </it>in particular appears to have a narrower GC<sub>3 </sub>range of potential imports, which could be explained in one of three ways. First, <it>B. subtilis </it>and <it>B. halodurans </it>have been diverging for so long that the A group could be dominated by old imports that have had time to ameliorate to the genomic GC<sub>3 </sub>distribution. Second, it is possible that the number of donor organisms is lower in the halophilic environment of <it>B. halodurans</it>. If so, there could be less variation in the GC<sub>3 </sub>range of imports. Third, genes in category A could be primarily genes that were present in the ancestor but have been lost in <it>B. subtilis</it>, although this explanation also requires that the same genes have been lost in <it>Clostridium</it>.</p>
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>Distribution of GC<sub>3 </sub>content of genes in categories A (dashed line) and D (solid line)</p>
               </caption>
               <text>
                  <p>Distribution of GC<sub>3 </sub>content of genes in categories A (dashed line) and D (solid line). <b>(a) </b><it>E. coli </it>K12; <b>(b) </b><it>E. coli </it>O157:H7; <b>(c) </b><it>S. flexneri</it>; <b>(d) </b><it>S. typhimurium</it>; <b>(e) </b><it>S. typhi</it>; <b>(f) </b><it>B. subtilis</it>; <b>(g) </b><it>B. halodurans</it>; <b>(h) </b><it>C. acetobutylicum</it>; <b>(i) </b><it>S. aureus</it>.</p>
               </text>
               <graphic file="gb-2003-4-8-r48-1"/>
            </fig>
            <tbl id="T5" hint_layout="single">
               <title>
                  <p>Table 5</p>
               </title>
               <caption>
                  <p>GC<sub>3 </sub>% distribution data</p>
               </caption>
               <tblbdy cols="6">
                  <r>
                     <c ca="left">
                        <p>Pair</p>
                     </c>
                     <c ca="left">
                        <p>Outgroups</p>
                     </c>
                     <c ca="left">
                        <p>m(A)</p>
                     </c>
                     <c ca="left">
                        <p>SD(A)</p>
                     </c>
                     <c ca="left">
                        <p>m(D)</p>
                     </c>
                     <c ca="left">
                        <p>SD(D)</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>ecol, edl</p>
                     </c>
                     <c ca="left">
                        <p>stym, klebs</p>
                     </c>
                     <c ca="left">
                        <p>0.451</p>
                     </c>
                     <c ca="left">
                        <p>0.119</p>
                     </c>
                     <c ca="left">
                        <p>0.523</p>
                     </c>
                     <c ca="left">
                        <p>0.035</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>edl, ecol</p>
                     </c>
                     <c ca="left">
                        <p>stym, klebs</p>
                     </c>
                     <c ca="left">
                        <p>0.425</p>
                     </c>
                     <c ca="left">
                        <p>0.130</p>
                     </c>
                     <c ca="left">
                        <p>0.544</p>
                     </c>
                     <c ca="left">
                        <p>0.068</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>sfle, ecol</p>
                     </c>
                     <c ca="left">
                        <p>stym, klebs</p>
                     </c>
                     <c ca="left">
                        <p>0.455</p>
                     </c>
                     <c ca="left">
                        <p>0.108</p>
                     </c>
                     <c ca="left">
                        <p>0.546</p>
                     </c>
                     <c ca="left">
                        <p>0.062</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>stym, styp</p>
                     </c>
                     <c ca="left">
                        <p>ecol, klebs</p>
                     </c>
                     <c ca="left">
                        <p>0.441</p>
                     </c>
                     <c ca="left">
                        <p>0.138</p>
                     </c>
                     <c ca="left">
                        <p>0.589</p>
                     </c>
                     <c ca="left">
                        <p>0.069</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>styp, stym</p>
                     </c>
                     <c ca="left">
                        <p>ecol, klebs</p>
                     </c>
                     <c ca="left">
                        <p>0.441</p>
                     </c>
                     <c ca="left">
                        <p>0.123</p>
                     </c>
                     <c ca="left">
                        <p>0.589</p>
                     </c>
                     <c ca="left">
                        <p>0.069</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>bsub, bhal</p>
                     </c>
                     <c ca="left">
                        <p>cace, cper</p>
                     </c>
                     <c ca="left">
                        <p>0.387</p>
                     </c>
                     <c ca="left">
                        <p>0.091</p>
                     </c>
                     <c ca="left">
                        <p>0.437</p>
                     </c>
                     <c ca="left">
                        <p>0.061</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>bhal, bsub</p>
                     </c>
                     <c ca="left">
                        <p>cace, cper</p>
                     </c>
                     <c ca="left">
                        <p>0.383</p>
                     </c>
                     <c ca="left">
                        <p>0.061</p>
                     </c>
                     <c ca="left">
                        <p>0.408</p>
                     </c>
                     <c ca="left">
                        <p>0.049</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>cace, cper</p>
                     </c>
                     <c ca="left">
                        <p>bhal, bsub</p>
                     </c>
                     <c ca="left">
                        <p>0.186</p>
                     </c>
                     <c ca="left">
                        <p>0.037</p>
                     </c>
                     <c ca="left">
                        <p>0.182</p>
                     </c>
                     <c ca="left">
                        <p>0.032</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>samu, samw</p>
                     </c>
                     <c ca="left">
                        <p>bsub, cace</p>
                     </c>
                     <c ca="left">
                        <p>0.231</p>
                     </c>
                     <c ca="left">
                        <p>0.049</p>
                     </c>
                     <c ca="left">
                        <p>0.195</p>
                     </c>
                     <c ca="left">
                        <p>0.035</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>m, Mean; SD, standard deviation. A, category A; D, category D. See Materials and methods for species abbreviations.</p>
               </tblfn>
            </tbl>
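            <p>For readers who wish to reproduce summary values of the kind shown in Table 5, the following minimal sketch computes GC<sub>3 </sub>for a coding sequence and the mean and standard deviation over a set of genes. The function names <it>gc3 </it>and <it>summarize </it>and the toy sequences are hypothetical. The use of the sample standard deviation is an assumption, as the text does not state which estimator was used.</p>
            <p>
```python
from statistics import mean, stdev


def gc3(cds):
    """Fraction of G or C at the third position of each complete codon."""
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    thirds = seq[2:usable:3]          # bases at positions 3, 6, 9, ...
    if not thirds:
        raise ValueError("sequence is shorter than one codon")
    return sum(base in "GC" for base in thirds) / len(thirds)


def summarize(cds_list):
    """Mean and sample standard deviation of GC3 over a list of coding sequences."""
    values = [gc3(cds) for cds in cds_list]
    return mean(values), stdev(values)


# Hypothetical toy sequences, not genes from the study.
toy_category = ["ATGGCCAAA", "ATGGCGTGC", "ATGAAATTT"]
m, sd = summarize(toy_category)
print(round(m, 3), round(sd, 3))
```
            </p>
            <p>A flatter distribution, as described for category A, would appear in such a summary as a larger standard deviation.</p>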
            <p>To focus on the genes that were likely to be more recent imports in <it>Clostridium acetobutylicum</it>, <it>B. subtilis </it>and <it>B. halodurans</it>, which have long divergence times, we studied the GC<sub>3 </sub>distribution of the subset of A genes with high cT<sup>2 </sup>values. Genes with cT<sup>2 </sup>scores greater than 30 were compared to the rest of the set (Figure <figr fid="F2">2</figr>). The numbers of genes with scores over 30 were 91, 97 and 78 for <it>C. acetobutylicum</it>, <it>B. subtilis </it>and <it>B. halodurans </it>respectively. The GC<sub>3 </sub>distribution of the high-cT<sup>2 </sup>genes in <it>C. acetobutylicum </it>differs neither from the rest of the genes in the A group, nor from those in the D group (Figure <figr fid="F1">1h</figr>). This could be due either to a lower rate of import (all imports have ameliorated) or to an already low GC content. <it>B. subtilis </it>and <it>B. halodurans</it>, on the other hand, have a wider GC<sub>3 </sub>range in the subsets of high-cT<sup>2 </sup>genes relative to the rest of the A genes. <it>B. subtilis </it>has a significant overrepresentation of high-cT<sup>2 </sup>genes in the GC-poor terminus region, which could be an effect of either a higher recombination rate in this area <abbrgrp><abbr bid="B9">9</abbr></abbrgrp> or local nucleotide variations, or both.</p>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>Distribution of GC<sub>3 </sub>content of genes of high (dashed line) and low (solid line) cT<sup>2 </sup>in category A</p>
               </caption>
               <text>
                  <p>Distribution of GC<sub>3 </sub>content of genes of high (dashed line) and low (solid line) cT<sup>2 </sup>in category A. <b>(a) </b><it>B. subtilis</it>; <b>(b) </b><it>B. halodurans</it>; <b>(c) </b><it>C. acetobutylicum</it>.</p>
               </text>
               <graphic file="gb-2003-4-8-r48-2"/>
            </fig>
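            <p>The partition underlying Figure 2 can be sketched as follows, assuming each gene is represented by a pair of its cT<sup>2 </sup>score and its GC<sub>3 </sub>value. The names <it>split_by_ct2 </it>and <it>gc3_range </it>are hypothetical, and the records are invented toy values rather than data from this study. Only the cutoff of 30 is taken from the text.</p>
            <p>
```python
CT2_CUTOFF = 30.0  # cutoff quoted in the text for this comparison


def split_by_ct2(genes, cutoff=CT2_CUTOFF):
    """Split (cT2, GC3) records into high-cT2 and remaining subsets."""
    high = [g for g in genes if g[0] > cutoff]
    rest = [g for g in genes if not g[0] > cutoff]
    return high, rest


def gc3_range(genes):
    """Width of the GC3 interval covered by a set of (cT2, GC3) records."""
    values = [gc3_value for _, gc3_value in genes]
    return max(values) - min(values)


# Hypothetical (cT2, GC3) records for illustration only.
toy = [(12.0, 0.41), (45.5, 0.22), (8.3, 0.39), (61.0, 0.58), (30.0, 0.40)]

high, rest = split_by_ct2(toy)
print(len(high), len(rest), gc3_range(high) > gc3_range(rest))
```
            </p>
            <p>A score exactly equal to the cutoff is placed in the remaining subset here, because the text specifies scores greater than 30.</p>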
            <p>In <it>B. subtilis </it>group D, there are a number of genes with high cT<sup>2 </sup>and low GC<sub>3 </sub>around the <it>ori </it>region. These are most likely not imports, as they have defined (and often ribosomal) functions and a high codon adaptation index (CAI) <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>, indicating that there is translational selection on these genes. Genes in group A around the terminus region also have high CAI, but this is probably due to the fact that low GC<sub>3 </sub>coincides with the choice of major codons in <it>B. subtilis </it><abbrgrp><abbr bid="B21">21</abbr></abbrgrp>.</p>
            <p>Comparing GC<sub>3 </sub>skews of category D in Figure <figr fid="F1">1</figr>, we note that all <it>Escherichia/Salmonella </it>clade genomes are skewed towards low GC<sub>3</sub>. This skew could be due to one or both of the following: there may be local, although small, regions that are biased towards lower GC<sub>3</sub>; or there may be a number of pervasive low-GC<sub>3 </sub>imports into the <it>Escherichia/Salmonella </it>clade.</p>
            <p>Both <it>E. coli </it>and <it>B. subtilis </it>have distinct regions around the terminus with low GC content <abbrgrp><abbr bid="B19">19</abbr><abbr bid="B22">22</abbr></abbrgrp>. The genome of <it>B. halodurans </it>also has a low-GC region around the terminus, although not as distinct as that of <it>B. subtilis</it>. The regional bias of <it>S. aureus </it>mu50 is intermediate between <it>B. halodurans </it>and <it>B. subtilis</it>. This regional bias around the terminus is not a general feature, however: <it>Clostridium </it>has no low-GC region around the terminus.</p>
         </sec>
         <sec>
            <st>
               <p>Recent duplications</p>
            </st>
            <p>The fate of a recent duplication is selection or redundancy followed by deletion, with the vast majority of recent duplications being deleted <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>. In several organisms it was found that the persistence of neutral material is extremely low. Few neutral paralogs survive long enough to undergo any significant mutational degradation. In all cases considered we find that duplications are significantly overrepresented in the A groups compared to category D (Table <tblr tid="T6">6</tblr>).</p>
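            <p>The size of this overrepresentation can be read from Table 6 as the ratio of the two duplication frequencies. The sketch below computes that ratio for the pairs listed in the table. The name <it>fold_enrichment </it>is hypothetical, and the ratio is offered only as a descriptive summary. It is not the significance test used in this study.</p>
            <p>
```python
# (Ndup(A)/N(A), Ndup(D)/N(D)) as listed in Table 6 for each pair.
table6 = {
    "ecol, edl": (0.074, 0.016),
    "ecol, stym": (0.101, 0.018),
    "edl, ecol": (0.283, 0.025),
    "sfle, ecol": (0.266, 0.080),
    "stym, ecol": (0.030, 0.015),
    "stym, styp": (0.056, 0.014),
    "styp, stym": (0.036, 0.011),
    "bsub, bhal": (0.029, 0.008),
    "bhal, bsub": (0.027, 0.011),
    "cace, cper": (0.026, 0.009),
}


def fold_enrichment(freq_a, freq_d):
    """Ratio of the duplication frequency in category A to that in category D."""
    return freq_a / freq_d


for pair, (freq_a, freq_d) in table6.items():
    # A value above 1 means duplications are more frequent in category A.
    print(pair, round(fold_enrichment(freq_a, freq_d), 1))
```
            </p>
            <p>Because the tabulated frequencies are rounded to three decimals, the ratios are approximate.</p>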
            <tbl id="T6" hint_layout="single">
               <title>
                  <p>Table 6</p>
               </title>
               <caption>
                  <p>Frequency of duplications by category</p>
               </caption>
               <tblbdy cols="4">
                  <r>
                     <c ca="left">
                        <p>Pair</p>
                     </c>
                     <c ca="left">
                        <p>Outgroups</p>
                     </c>
                     <c ca="left">
                        <p><it>Ndup</it>(A)/ <it>N</it>(A)</p>
                     </c>
                     <c ca="left">
                        <p><it>Ndup</it>(D)/ <it>N</it>(D)</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="4">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>ecol, edl</p>
                     </c>
                     <c ca="left">
                        <p>stym, klebs</p>
                     </c>
                     <c ca="left">
                        <p>0.074**</p>
                     </c>
                     <c ca="left">
                        <p>0.016</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>ecol, stym</p>
                     </c>
                     <c ca="left">
                        <p>kpne, paer</p>
                     </c>
                     <c ca="left">
                        <p>0.101**</p>
                     </c>
                     <c ca="left">
                        <p>0.018</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>edl, ecol</p>
                     </c>
                     <c ca="left">
                        <p>stym, klebs</p>
                     </c>
                     <c ca="left">
                        <p>0.283**</p>
                     </c>
                     <c ca="left">
                        <p>0.025</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>sfle, ecol</p>
                     </c>
                     <c ca="left">
                        <p>stym, klebs</p>
                     </c>
                     <c ca="left">
                        <p>0.266**</p>
                     </c>
                     <c ca="left">
                        <p>0.080</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>stym, ecol</p>
                     </c>
                     <c ca="left">
                        <p>kpne, paer</p>
                     </c>
                     <c ca="left">
                        <p>0.030*</p>
                     </c>
                     <c ca="left">
                        <p>0.015</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>stym, styp</p>
                     </c>
                     <c ca="left">
                        <p>ecol, klebs</p>
                     </c>
                     <c ca="left">
                        <p>0.056**</p>
                     </c>
                     <c ca="left">
                        <p>0.014</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>styp, stym</p>
                     </c>
                     <c ca="left">
                        <p>ecol, klebs</p>
                     </c>
                     <c ca="left">
                        <p>0.036*</p>
                     </c>
                     <c ca="left">
                        <p>0.011</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>bsub, bhal</p>
                     </c>
                     <c ca="left">
                        <p>cace, cper</p>
                     </c>
                     <c ca="left">
                        <p>0.029**</p>
                     </c>
                     <c ca="left">
                        <p>0.008</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>bhal, bsub</p>
                     </c>
                     <c ca="left">
                        <p>cace, cper</p>
                     </c>
                     <c ca="left">
                        <p>0.027**</p>
                     </c>
                     <c ca="left">
                        <p>0.011</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>cace, cper</p>
                     </c>
                     <c ca="left">
                        <p>bhal, bsub</p>
                     </c>
                     <c ca="left">
                        <p>0.026**</p>
                     </c>
                     <c ca="left">
                        <p>0.009</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>samu, samw</p>
                     </c>
                     <c ca="left">
                        <p>bsub, cace</p>
                     </c>
                     <c ca="left">
                        <p>0.086**</p>
                     </c>
                     <c ca="left">
                        <p>0.009</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>*Significant at 95%; **significant at 99%.</p>
               </tblfn>
            </tbl>
            <p>As a result of both stoichiometric and expression-level constraints on gene duplication, we expect the number of recent duplications among genes that are either highly expressed or involved in complex protein interactions to be low. While the interaction level can be hard to assess, expression can be estimated from the codon bias <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>. In <it>E. coli </it>K12, we see that the paralogs in category A are all hypothetical proteins with low codon bias, CAI &lt; 0.3. In category D, 33 of 47 recent duplications are hypothetical or putative genes, with a low average CAI score. Two exceptions are the <it>tuf </it>copies. In <it>E. coli </it>O157:H7, hypothetical genes dominate the duplications in category A and none of the paralogs has a known function. The duplications in category D of O157:H7 are similar to those in category D of K12, with mostly hypothetical genes. <it>S. typhimurium </it>has only five recent duplications in category A, and all are putative. In category D, we find 15 copies of a number of ABC transporter-related genes, along with putative genes and two <it>tuf </it>copies. <it>C. acetobutylicum </it>also follows this pattern, with mostly hypothetical genes, along with ABC transporters. In <it>B. subtilis</it>, all 16 duplications in category A are hypothetical. Ten of sixteen have CAI scores lower than average. Genes are better defined in category D, with only 6 of 10 duplicated genes having CAI scores lower than average. <it>S. aureus </it>mu50 has few recent paralogs in category D, and only one is annotated as hypothetical. In category A, four out of six are hypothetical. If duplication is mainly dependent on function, then it may in part explain the overrepresentation of duplications in category A for almost all organisms. This category should contain the highest proportion of weakly functional or nonfunctional genes, which are less disruptive to amplify.</p>
            <p>In general, duplicated genes seem more likely to have high cT<sup>2 </sup>scores. This correlation is dominated by the D category (Table <tblr tid="T4">4</tblr>) where there is a higher proportion of deviant genes among the duplications in all but three cases. This would suggest a general link between atypical sequences and duplication, which may be due to unusual DNA sequences such as transposon-like structures. Even though annotated transposons have been removed from the datasets, some such sequences may still persist. Furthermore, the overrepresentation could also be due to higher rates of recombination in certain chromosomal regions. Finally, it is possible that the paralogs were imported as a cluster - the duplication therefore taking place before the transfer. This is not supported by the chromosomal positions of paralogs in category A of the paired organisms in Table <tblr tid="T6">6</tblr>, suggesting that duplication is likely to have taken place after transfer. Another possibility could be a multiple and probably simultaneous import of the same gene from a single source. It would be difficult to tell this event apart from an indigenous duplication event.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Discussion</p>
         </st>
         <sec>
            <st>
               <p>Patterns of lateral transfer</p>
            </st>
            <p>The context bias (cT<sup>2</sup>) has been shown to be a useful measure to identify recently imported genes <abbrgrp><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr></abbrgrp>. This is further corroborated in this study where the gene categories that are expected to contain more recently imported genes always have a larger fraction of deviant genes (high cT<sup>2</sup>). The connection is strengthened further by the observation that the deviant genes have a broader and more uniform GC<sub>3 </sub>distribution in keeping with the expectation that recent imports have not had time to ameliorate to the genomic bias. Thus, the number of recent imports in each genome can be assumed to be related to the number of deviant genes in category A. The absolute numbers of deviant genes in the A and B groups are very similar for all genomes considered (Tables <tblr tid="T2">2</tblr>,<tblr tid="T3">3</tblr>). A possible exception is <it>S. aureus</it>, which seems to have a lower number of deviant genes in category A. This suggests that the intensity of lateral transfer is roughly of the same magnitude in all cases.</p>
         </sec>
         <sec>
            <st>
               <p>GC<sub>3 </sub>distribution of proposed imports</p>
            </st>
            <p>The general flatness of the distributions of GC<sub>3 </sub>content in category A may support the notion that recently laterally transferred genes dominate in this group. This is the kind of distribution we would expect under the assumption that the multitude of potential donors have a wide range of GC<sub>3 </sub>content. The distribution of GC<sub>3 </sub>content in category D is almost always more peaked and has more weight in its tails, reflecting a non-Gaussian probability distribution. Among the organisms with approximately 50% GC<sub>3 </sub>content, there is a slight skew to low GC<sub>3</sub>. Furthermore, in the two organisms with low genomic GC<sub>3 </sub>content, both categories A and D are skewed slightly to higher GC<sub>3</sub>.</p>
            <p>Whereas recent imports generally seem to have broad, nearly uniform, GC<sub>3 </sub>distributions, there seem to be few genes with high GC<sub>3 </sub>content. This is particularly evident in the genomes with low overall GC<sub>3 </sub>content. Thus, the recent imports in the <it>Escherichia/Salmonella </it>clade with intermediate overall GC<sub>3 </sub>content have broad tails to low GC<sub>3</sub>, but the low-GC<sub>3 </sub>genomes have no corresponding tails to high GC<sub>3 </sub>(Figure <figr fid="F1">1</figr>). Either the potential donors for these genomes are all of low GC<sub>3 </sub>content or low-GC<sub>3 </sub>genes are more easily exchanged. <it>E. coli </it>has regions in the genome with low GC<sub>3 </sub>content <abbrgrp><abbr bid="B19">19</abbr><abbr bid="B22">22</abbr></abbrgrp> as does <it>B. subtilis</it>. Possibly such regions are more likely to accept imports of similar GC content.</p>
            <p>In the <it>Escherichia/Salmonella </it>clade, the chromosomal position of category A genes was studied. For <it>E. coli </it>K12 and <it>S. typhimurium</it>, there is a tendency to have a high number of A genes at gene position 2,000 to 3,000. <it>E. coli </it>O157:H7 has a peak at 1,000 to 2,000. In general, there is no overrepresentation of potential laterally transferred genes in the immediate vicinity of the low-GC terminus region in contrast to previous proposals <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B18">18</abbr></abbrgrp>.</p>
         </sec>
         <sec>
            <st>
               <p>Duplication bias</p>
            </st>
            <p>Genes that are likely to have been imported into the organism from an outside source are also more likely to be duplicated. This is the overall impression from studying bacteria both from the proteobacteria group and from the <it>Bacillus/Clostridium </it>clade, although the relation between cT<sup>2 </sup>scores and propensity to duplicate is weaker in the latter group. This observed bias may be due to the following:</p>
            <p>(1) Genes that have been imported into the genome may by their nature be more mobile or 'promiscuous'.</p>
            <p>(2) Imported genes that are required in a new niche may be needed in larger amounts until control of expression levels has become established.</p>
            <p>(3) Duplications are more likely to occur, or to be retained, in genes that are poorly optimized. Thus, amplification is more likely to succeed in category A since there is a lower risk of counter-selection and a greater chance of discovering a weak but novel function.</p>
            <p>(4) Genes may be imported in multiple copies into a genome. Even though this is not an effect of the duplication mechanism of the new host, the multiple imports still act as if they were actual paralogs, with the same restrictions. This explanation would be largely indistinguishable from the others.</p>
            <p>Although all genes or open reading frames (ORFs) annotated as being transposon- or phage-related have been removed from this analysis, some such sequences may remain unrecognized. Therefore, explanation (1) cannot be fully ruled out. However, we believe that explanation (3) is more pertinent, judging from observations in category D, where duplication occurs more often among genes with poorly understood function, and also from previous observations <abbrgrp><abbr bid="B4">4</abbr></abbrgrp> where we found that highly expressed genes are less likely to be duplicated. This suggests that amplification of a gene product with a weak function is generally less disruptive and costly than amplification of a stronger gene product. If this is the case, the bias for duplications in category A need not be explained by areas of high recombination or increased mobility, but simply by a larger proportion of poorly optimized genes and gene functions among the potential imports.</p>
            <p>As virtually all duplicated recent imports are annotated as hypothetical or putative, it is not easy to suggest functions. It is possible, and even likely, that some recent imports have strong gene functions that are required in a new niche and could be duplicated to provide an appropriate gene-dosage effect. However, recent imports that have identifiable functions are rarely duplicated. Thus it seems reasonable to suggest that most of the duplicated genes among the recent imports have poorly optimized functions and that the apparent bias for duplication is due to selection for gene dosage. Nevertheless, we expect that most of these duplicated imports are also transient, as are most other imports and other duplications. The main point is that neutral and near-neutral genes have a limited persistence in the genome and that the pool of weakly functional or nonfunctional genes is dominated by the recent imports. This is where new functions seem most likely to be discovered.</p>
         </sec>
         <sec>
            <st>
               <p>Lateral transfer and gene innovation</p>
            </st>
            <p>If every duplication event has roughly the same chance of being selected by amplifying a gene product, then the contribution of lateral transfer to gene innovation can be estimated. Here we assume that the observed duplications are essentially neutral or weakly selected, and that deleterious amplifications are quickly purged from the population. In <it>E. coli </it>K12 categories A and B, when compared to both <it>E. coli </it>O157:H7 and <it>S. typhimurium</it>, there are roughly half as many duplication events as in category D. In the corresponding <it>S. typhimurium </it>comparisons, a quarter of all duplications are in category A or B. For the proteobacteria, <it>E. coli </it>O157:H7 tops the list, with almost half of its recent duplications in category A or B. In the <it>Bacillus/Clostridium </it>group, more than half of all duplications are in these categories.</p>
            <p>As a conservative estimate, under the assumption that novel gene functions arise through modifying paralogs that are retained through an amplification of primarily weak functions, we propose that at least a quarter of all gene-innovation events are a direct consequence of lateral transfer. The estimate is conservative as there may be duplications in D that are 'stuck' in the amplified function, such as elongation factor and, possibly, ABC transporter genes. In any case, lateral transfer in combination with duplication may make a considerable contribution to gene innovation. This contribution is not always apparent when focusing solely on the transfer of established gene functions.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Materials and methods</p>
         </st>
         <sec>
            <st>
               <p>Genome data</p>
            </st>
            <p>The following genomes were downloaded from GenBank and are abbreviated as follows: <it>Escherichia coli </it>K12 (ecol) <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>; <it>E. coli </it>O157:H7 EDL 933 (edl) <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>; <it>Shigella flexneri </it>(sfle) <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>; <it>Salmonella typhimurium </it>(stym) <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>; <it>S. typhi </it>(styp) <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>; <it>Bacillus subtilis </it>(bsub) <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>; <it>B. halodurans </it>(bhal) <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>; <it>Clostridium perfringens </it>(cper) <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>; <it>C. acetobutylicum </it>(cace) <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>; <it>Staphylococcus aureus </it>strains mu50 (samu) <abbrgrp><abbr bid="B32">32</abbr></abbrgrp> and mw2 (samw) <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>; and <it>Pseudomonas aeruginosa </it>(paer) <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>. Furthermore, contigs from <it>Klebsiella pneumoniae </it>(klebs), whose genome sequence is nearing completion, were downloaded from the Genome Sequencing Center <abbrgrp><abbr bid="B35">35</abbr></abbrgrp>.</p>
            <p>ORFs annotated as related to either transposons or phage sequences were removed, because occurrences of these sequences may prove confounding. Only genes longer than 400 nucleotides were considered, because of small-sample effects when calculating cT<sup>2 </sup>scores (see below).</p>
            <p>The bacteria studied here fall loosely into two clades: the <it>Bacillus/Clostridium </it>and the <it>Salmonella/Escherichia </it>clades. Although it is often difficult to produce good estimates of bacterial divergence, 16S rRNA trees (data not shown) suggest that the bacteria in the <it>Salmonella/Escherichia </it>clade diverged much later than the bacteria in the <it>Bacillus/Clostridium </it>clade, with the possible exception of the <it>S. aureus </it>strains, which may be comparable to the <it>Salmonella/Escherichia </it>clade.</p>
         </sec>
         <sec>
            <st>
               <p>Lateral transfer</p>
            </st>
            <p>As a measure of lateral transfer into various genomes, we use a phylogenetic approach among a group of related bacteria. For instance, as the divergence of <it>Salmonella </it>and <it>Escherichia </it>is more recent than the divergence of their ancestor from <it>Klebsiella </it>and <it>Pseudomonas</it>, we assume that a gene present only in <it>Escherichia </it>is a likely lateral transfer (gene import) that occurred after the divergence of <it>Salmonella </it>and <it>Escherichia</it>, or a transfer that occurred before divergence but was subsequently lost in <it>Salmonella</it>. This category of genes contains likely candidates for lateral transfer. As support, we use a method developed to detect atypical nucleotide context biases (cT<sup>2 </sup><abbrgrp><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr></abbrgrp>). As defined, this context bias is independent of the nucleotide usage of the gene considered and is expected to reflect a mutation bias that deviates from the average of the genome. The values of cT<sup>2 </sup>are low when genes appear typical and high when they appear atypical. When a foreign gene is first introduced, it may appear atypical because it has been adapted to a different mutation bias. In time it is expected to ameliorate and approach the context bias of its new host.</p>
            <p>Organisms are organized into groups of four. The first group comprises the subject organism (for example, <it>E. coli </it>K12) and a close relative (<it>S. typhimurium</it>). These are then compared with two organisms that are more distant (<it>K. pneumoniae </it>and <it>P. aeruginosa</it>). There are four categories of genes that are considered: A, B, C and D, according to Table <tblr tid="T1">1</tblr> (compare <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>). Category A is assumed to include very recent gene imports, entering the genome after the last divergence. Category B is presumed to also include imports, although to a lesser extent than category A, as there is the increased probability of genes being lost in the outlier genomes. Category C, which is only briefly addressed in this study, is likely to be composed of genes that have been lost in the paired genome and retained in the others. Category D consists mainly of genes that have been present for a long time and would include imports only if they are very pervasive, that is, they have been lost and reimported or imported several times.</p>
            <p>To determine presence or absence, we used the <it>blastx </it>program <abbrgrp><abbr bid="B36">36</abbr></abbrgrp>, which compares the six-frame translation of a nucleotide sequence against amino acid sequences. A hit with an E-value of less than 10<sup>-10 </sup>indicates presence in the studied genome. This threshold was set at a relatively low level of significance because of the highly varying degrees of divergence among the organisms studied. Furthermore, we chose the same threshold for all organisms, as any variable value would imply that the relative ages of divergence were known.</p>
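            <p>The four-way classification described above can be sketched in code. This is a minimal illustration rather than the implementation used in the study. The function names and the input layout (one best <it>blastx </it>E-value per comparison genome, or None when there is no hit) are hypothetical. The category rules are an assumed reading of Table <tblr tid="T1">1</tblr>: A is absent from the close relative and both outliers, B is shared with the close relative only, C is absent from the close relative but present in at least one outlier, and D is present in the close relative and in at least one outlier.</p>

```python
from typing import Optional, Sequence

# Presence threshold from the text: E-value below 1e-10.
PRESENCE_E_VALUE = 1e-10


def is_present(best_e_value: Optional[float]) -> bool:
    """Return True if the best hit (None means no hit) passes the threshold."""
    return best_e_value is not None and PRESENCE_E_VALUE > best_e_value


def categorize(relative_e: Optional[float],
               outlier_es: Sequence[Optional[float]]) -> str:
    """Assign a subject-genome gene to category A, B, C or D.

    Assumed rules (a hypothetical reading of Table 1):
      A: absent from the close relative and from all outliers
      B: present in the close relative, absent from all outliers
      C: absent from the close relative, present in some outlier
      D: present in the close relative and in some outlier
    """
    in_relative = is_present(relative_e)
    in_outlier = any(is_present(e) for e in outlier_es)
    if in_relative:
        return "D" if in_outlier else "B"
    return "C" if in_outlier else "A"
```

            <p>For example, a gene with no hit in the close relative or in either outlier would be placed in category A under these assumed rules. Treating a hit in either outlier as sufficient is a simplification that may differ from the criteria actually applied.</p>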
         </sec>
         <sec>
            <st>
               <p>Duplication</p>
            </st>
            <p>As a simple measure of recent duplications, genomes were searched against themselves using <it>blastn </it><abbrgrp><abbr bid="B36">36</abbr></abbrgrp>. Gene pairs that met both a more stringent E-value threshold of less than 10<sup>-20 </sup>and a nucleotide identity of &#8805; 95% were considered to be recent duplications. It is always possible that such genes are not recent paralogs but are instead highly conserved, for instance through gene conversion. However, if these paralogs were so highly conserved, they would more likely be found in categories C or D.</p>
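            <p>The two duplication criteria can be illustrated with a short filter over self-comparison hits. This is a sketch under stated assumptions, not the procedure used in the study. It assumes the <it>blastn </it>results are available as tab-separated lines in the common 12-column tabular layout (query ID, subject ID and percent identity in columns 1 to 3, E-value in column 11); the function name and gene identifiers are hypothetical.</p>

```python
# Thresholds from the text: E-value below 1e-20 and identity of at least 95%.
MAX_E_VALUE = 1e-20
MIN_IDENTITY = 95.0


def recent_duplicates(tabular_lines):
    """Return the set of unordered gene pairs that pass both thresholds.

    Each line is assumed to be tab-separated, with the query ID, subject ID
    and percent identity in columns 1 to 3 and the E-value in column 11.
    """
    pairs = set()
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        identity = float(fields[2])
        e_value = float(fields[10])
        if query == subject:
            continue  # skip each gene's hit to itself
        if MAX_E_VALUE > e_value and identity >= MIN_IDENTITY:
            # Sort the IDs so that A-B and B-A count as one pair.
            pairs.add(tuple(sorted((query, subject))))
    return pairs
```

            <p>Self-hits are discarded and reciprocal hits are collapsed into one pair, so each candidate duplication is counted once. Multi-copy families would still appear as several pairs and would need further grouping.</p>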
         </sec>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>This work was supported by the Swedish Research Council.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <aug>
               <au>
                  <snm>Ohno</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Evolution by Gene Duplication</source>
            <publisher>Heidelberg: Springer-Verlag</publisher>
            <pubdate>1970</pubdate>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Selection in the evolution of gene duplications.</p>
            </title>
            <aug>
               <au>
                  <snm>Kondrashov</snm>
                  <fnm>FA</fnm>
               </au>
               <au>
                  <snm>Rogozin</snm>
                  <fnm>IB</fnm>
               </au>
               <au>
                  <snm>Wolf</snm>
                  <fnm>YI</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2002</pubdate>
            <volume>3</volume>
            <fpage>research0008.1</fpage>
            <lpage>0008.9</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1186/gb-2002-3-2-research0008</pubid>
                  <pubid idtype="pmpid" link="fulltext">11864370</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>On the nature of gene innovation: duplication patterns in microbial genomes.</p>
            </title>
            <aug>
               <au>
                  <snm>Hooper</snm>
                  <fnm>SD</fnm>
               </au>
               <au>
                  <snm>Berg</snm>
                  <fnm>OG</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>2003</pubdate>
            <volume>20</volume>
            <fpage>945</fpage>
            <lpage>954</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/molbev/msg101</pubid>
                  <pubid idtype="pmpid" link="fulltext">12716994</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Complete genome sequence of <it>Neisseria meningitidis </it>serogroup B strain MC58.</p>
            </title>
            <aug>
               <au>
                  <snm>Tettelin</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Saunders</snm>
                  <fnm>NJ</fnm>
               </au>
               <au>
                  <snm>Heidelberg</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Jeffries</snm>
                  <fnm>AC</fnm>
               </au>
               <au>
                  <snm>Nelson</snm>
                  <fnm>KE</fnm>
               </au>
               <au>
                  <snm>Eisen</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Ketchum</snm>
                  <fnm>KA</fnm>
               </au>
               <au>
                  <snm>Hood</snm>
                  <fnm>DW</fnm>
               </au>
               <au>
                  <snm>Peden</snm>
                  <fnm>JF</fnm>
               </au>
               <au>
                  <snm>Dodson</snm>
                  <fnm>RJ</fnm>
               </au>
               <etal/>
            </aug>
            <source>Science</source>
            <pubdate>2000</pubdate>
            <volume>287</volume>
            <fpage>1809</fpage>
            <lpage>1815</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.287.5459.1809</pubid>
                  <pubid idtype="pmpid" link="fulltext">10710307</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Codon usage and lateral gene transfer in <it>Bacillus subtilis</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Moszer</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Rocha</snm>
                  <fnm>EP</fnm>
               </au>
               <au>
                  <snm>Danchin</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Curr Opin Microbiol</source>
            <pubdate>1999</pubdate>
            <volume>2</volume>
            <fpage>524</fpage>
            <lpage>528</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S1369-5274(99)00011-9</pubid>
                  <pubid idtype="pmpid" link="fulltext">10508724</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Evidence for horizontal gene transfer in <it>Escherichia coli </it>speciation.</p>
            </title>
            <aug>
               <au>
                  <snm>Medigue</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Rouxel</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Vigier</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Henaut</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Danchin</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1991</pubdate>
            <volume>222</volume>
            <fpage>851</fpage>
            <lpage>856</lpage>
            <xrefbib>
               <pubid idtype="pmpid">1762151</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Phylogenetic analyses of two "archaeal" genes in <it>Thermotoga maritima </it>reveal multiple transfers between archaea and bacteria.</p>
            </title>
            <aug>
               <au>
                  <snm>Nesbo</snm>
                  <fnm>CL</fnm>
               </au>
               <au>
                  <snm>L'Haridon</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Stetter</snm>
                  <fnm>KO</fnm>
               </au>
               <au>
                  <snm>Doolittle</snm>
                  <fnm>WF</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>2001</pubdate>
            <volume>18</volume>
            <fpage>362</fpage>
            <lpage>375</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11230537</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Evidence for massive gene exchange between archaeal and bacterial hyperthermophiles.</p>
            </title>
            <aug>
               <au>
                  <snm>Aravind</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Tatusov</snm>
                  <fnm>RL</fnm>
               </au>
               <au>
                  <snm>Wolf</snm>
                  <fnm>YI</fnm>
               </au>
               <au>
                  <snm>Walker</snm>
                  <fnm>DR</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>1998</pubdate>
            <volume>14</volume>
            <fpage>442</fpage>
            <lpage>444</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0168-9525(98)01553-4</pubid>
                  <pubid idtype="pmpid" link="fulltext">9825671</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Amelioration of bacterial genomes: rates of change and exchange.</p>
            </title>
            <aug>
               <au>
                  <snm>Lawrence</snm>
                  <fnm>JG</fnm>
               </au>
               <au>
                  <snm>Ochman</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>J Mol Evol</source>
            <pubdate>1997</pubdate>
            <volume>44</volume>
            <fpage>383</fpage>
            <lpage>397</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">9089078</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Molecular archaeology of the <it>Escherichia coli </it>genome.</p>
            </title>
            <aug>
               <au>
                  <snm>Lawrence</snm>
                  <fnm>JG</fnm>
               </au>
               <au>
                  <snm>Ochman</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1998</pubdate>
            <volume>95</volume>
            <fpage>9413</fpage>
            <lpage>9417</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1073/pnas.95.16.9413</pubid>
                  <pubid idtype="pmpid" link="fulltext">9689094</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Horizontal gene transfer in bacterial and archaeal complete genomes.</p>
            </title>
            <aug>
               <au>
                  <snm>Garcia-Vallve</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Romeu</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Palau</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2000</pubdate>
            <volume>10</volume>
            <fpage>1719</fpage>
            <lpage>1725</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gr.130000</pubid>
                  <pubid idtype="pmpid" link="fulltext">11076857</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Genomes in flux: the evolution of archaeal and proteobacterial gene content.</p>
            </title>
            <aug>
               <au>
                  <snm>Snel</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Bork</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Huynen</snm>
                  <fnm>MA</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2002</pubdate>
            <volume>12</volume>
            <fpage>17</fpage>
            <lpage>25</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gr.176501</pubid>
                  <pubid idtype="pmpid" link="fulltext">11779827</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Phylogenetic classification and the universal tree.</p>
            </title>
            <aug>
               <au>
                  <snm>Doolittle</snm>
                  <fnm>WF</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1999</pubdate>
            <volume>284</volume>
            <fpage>2124</fpage>
            <lpage>2129</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.284.5423.2124</pubid>
                  <pubid idtype="pmpid" link="fulltext">10381871</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Gene transfer in bacteria: speciation without species?</p>
            </title>
            <aug>
               <au>
                  <snm>Lawrence</snm>
                  <fnm>JG</fnm>
               </au>
            </aug>
            <source>Theor Popul Biol</source>
            <pubdate>2002</pubdate>
            <volume>61</volume>
            <fpage>449</fpage>
            <lpage>460</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/tpbi.2002.1587</pubid>
                  <pubid idtype="pmpid" link="fulltext">12167364</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Prokaryotic evolution in light of gene transfer.</p>
            </title>
            <aug>
               <au>
                  <snm>Gogarten</snm>
                  <fnm>JP</fnm>
               </au>
               <au>
                  <snm>Doolittle</snm>
                  <fnm>WF</fnm>
               </au>
               <au>
                  <snm>Lawrence</snm>
                  <fnm>JG</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>2002</pubdate>
            <volume>19</volume>
            <fpage>2226</fpage>
            <lpage>2238</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12446813</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Something for everyone: horizontal gene transfer in evolution.</p>
            </title>
            <aug>
               <au>
                  <snm>Kurland</snm>
                  <fnm>CG</fnm>
               </au>
            </aug>
            <source>EMBO Rep</source>
            <pubdate>2000</pubdate>
            <volume>1</volume>
            <fpage>92</fpage>
            <lpage>95</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/embo-reports/kvd042</pubid>
                  <pubid idtype="pmpid">11265763</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Evolution of microbial genomes: sequence acquisition and loss.</p>
            </title>
            <aug>
               <au>
                  <snm>Berg</snm>
                  <fnm>OG</fnm>
               </au>
               <au>
                  <snm>Kurland</snm>
                  <fnm>CG</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>2002</pubdate>
            <volume>19</volume>
            <fpage>2265</fpage>
            <lpage>2276</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12446817</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Gene import or deletion - a study of the different genes in <it>Escherichia coli </it>strains K12 and O157:H7.</p>
            </title>
            <aug>
               <au>
                  <snm>Hooper</snm>
                  <fnm>SD</fnm>
               </au>
               <au>
                  <snm>Berg</snm>
                  <fnm>OG</fnm>
               </au>
            </aug>
            <source>J Mol Evol</source>
            <pubdate>2002</pubdate>
            <volume>55</volume>
            <fpage>734</fpage>
            <lpage>744</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1007/s00239-002-2369-2</pubid>
                  <pubid idtype="pmpid" link="fulltext">12486532</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>Detection of genes with atypical nucleotide sequence in microbial genomes.</p>
            </title>
            <aug>
               <au>
                  <snm>Hooper</snm>
                  <fnm>SD</fnm>
               </au>
               <au>
                  <snm>Berg</snm>
                  <fnm>OG</fnm>
               </au>
            </aug>
            <source>J Mol Evol</source>
            <pubdate>2002</pubdate>
            <volume>54</volume>
            <fpage>365</fpage>
            <lpage>375</lpage>
            <xrefbib>
               <pubid idtype="pmpid">11847562</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>The Codon Adaptation Index - a measure of directional synonymous codon usage bias, and its potential applications.</p>
            </title>
            <aug>
               <au>
                  <snm>Sharp</snm>
                  <fnm>PM</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>W-H</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1987</pubdate>
            <volume>15</volume>
            <fpage>1281</fpage>
            <lpage>1295</lpage>
            <xrefbib>
               <pubid idtype="pmpid">3547335</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Codon usage patterns in <it>Escherichia coli</it>, <it>Bacillus subtilis</it>, <it>Saccharomyces cerevisiae</it>, <it>Schizosaccharomyces pombe</it>, <it>Drosophila melanogaster </it>and <it>Homo sapiens</it>: a review of the considerable within-species diversity.</p>
            </title>
            <aug>
               <au>
                  <snm>Sharp</snm>
                  <fnm>PM</fnm>
               </au>
               <au>
                  <snm>Cowe</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Higgins</snm>
                  <fnm>DG</fnm>
               </au>
               <au>
                  <snm>Shields</snm>
                  <fnm>DC</fnm>
               </au>
               <au>
                  <snm>Wolfe</snm>
                  <fnm>KH</fnm>
               </au>
               <au>
                  <snm>Wright</snm>
                  <fnm>F</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1988</pubdate>
            <volume>16</volume>
            <fpage>8207</fpage>
            <lpage>8211</lpage>
            <xrefbib>
               <pubid idtype="pmpid">3138659</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>Intragenomic base content variation is a potential source of biases when searching for horizontally transferred genes.</p>
            </title>
            <aug>
               <au>
                  <snm>Guindon</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Perriere</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>2001</pubdate>
            <volume>18</volume>
            <fpage>1838</fpage>
            <lpage>1840</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11504864</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>The complete genome sequence of <it>Escherichia coli </it>K-12.</p>
            </title>
            <aug>
               <au>
                  <snm>Blattner</snm>
                  <fnm>FR</fnm>
               </au>
               <au>
                  <snm>Plunkett</snm>
                  <fnm>G</fnm>
                  <suf>III</suf>
               </au>
               <au>
                  <snm>Bloch</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Perna</snm>
                  <fnm>NT</fnm>
               </au>
               <au>
                  <snm>Burland</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Riley</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Collado-Vides</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Glasner</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Rode</snm>
                  <fnm>CK</fnm>
               </au>
               <au>
                  <snm>Mayhew</snm>
                  <fnm>GF</fnm>
               </au>
               <etal/>
            </aug>
            <source>Science</source>
            <pubdate>1997</pubdate>
            <volume>277</volume>
            <fpage>1453</fpage>
            <lpage>1474</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.277.5331.1453</pubid>
                  <pubid idtype="pmpid" link="fulltext">9278503</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>Genome sequence of enterohaemorrhagic <it>Escherichia coli </it>O157:H7.</p>
            </title>
            <aug>
               <au>
                  <snm>Perna</snm>
                  <fnm>NT</fnm>
               </au>
               <au>
                  <snm>Plunkett</snm>
                  <fnm>G</fnm>
                  <suf>III</suf>
               </au>
               <au>
                  <snm>Burland</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Mau</snm>
                  <fnm>BM</fnm>
               </au>
               <au>
                  <snm>Glasner</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Rose</snm>
                  <fnm>DJ</fnm>
               </au>
               <au>
                  <snm>Mayhew</snm>
                  <fnm>GF</fnm>
               </au>
               <au>
                  <snm>Evans</snm>
                  <fnm>PS</fnm>
               </au>
               <au>
                  <snm>Gregor</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Kirkpatrick</snm>
                  <fnm>HA</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nature</source>
            <pubdate>2001</pubdate>
            <volume>409</volume>
            <fpage>529</fpage>
            <lpage>533</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/35054089</pubid>
                  <pubid idtype="pmpid" link="fulltext">11206551</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Genome sequence of <it>Shigella flexneri </it>2a, insights into pathogenicity through comparison with genomes of <it>Escherichia coli </it>K12 and O157.</p>
            </title>
            <aug>
               <au>
                  <snm>Jin</snm>
                  <fnm>Q</fnm>
               </au>
               <au>
                  <snm>Yuan</snm>
                  <fnm>ZH</fnm>
               </au>
               <au>
                  <snm>Xu</snm>
                  <fnm>JG</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Shen</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Lu</snm>
                  <fnm>WC</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>JH</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Yang</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Yang</snm>
                  <fnm>F</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2002</pubdate>
            <volume>30</volume>
            <fpage>4432</fpage>
            <lpage>4441</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/nar/gkf566</pubid>
                  <pubid idtype="pmpid" link="fulltext">12384590</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>Complete genome sequence of <it>Salmonella enterica </it>serovar <it>Typhimurium </it>LT2.</p>
            </title>
            <aug>
               <au>
                  <snm>McClelland</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Sanderson</snm>
                  <fnm>KE</fnm>
               </au>
               <au>
                  <snm>Spieth</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Clifton</snm>
                  <fnm>SW</fnm>
               </au>
               <au>
                  <snm>Latreille</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Courtney</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Porwollik</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Ali</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Dante</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Du</snm>
                  <fnm>F</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nature</source>
            <pubdate>2001</pubdate>
            <volume>413</volume>
            <fpage>852</fpage>
            <lpage>856</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/35101614</pubid>
                  <pubid idtype="pmpid" link="fulltext">11677609</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>Complete genome sequence of a multiple drug resistant <it>Salmonella enterica </it>serovar <it>Typhi </it>CT18.</p>
            </title>
            <aug>
               <au>
                  <snm>Parkhill</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Dougan</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>James</snm>
                  <fnm>KD</fnm>
               </au>
               <au>
                  <snm>Thomson</snm>
                  <fnm>NR</fnm>
               </au>
               <au>
                  <snm>Pickard</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Wain</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Churcher</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Mungall</snm>
                  <fnm>KL</fnm>
               </au>
               <au>
                  <snm>Bentley</snm>
                  <fnm>SD</fnm>
               </au>
               <au>
                  <snm>Holden</snm>
                  <fnm>MTG</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nature</source>
            <pubdate>2001</pubdate>
            <volume>413</volume>
            <fpage>848</fpage>
            <lpage>852</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/35101607</pubid>
                  <pubid idtype="pmpid" link="fulltext">11677608</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>The complete genome sequence of the gram-positive bacterium <it>Bacillus subtilis</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Kunst</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Ogasawara</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Moszer</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Albertini</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Alloni</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Azevedo</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Bertero</snm>
                  <fnm>MG</fnm>
               </au>
               <au>
                  <snm>Bessieres</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Bolotin</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Borchert</snm>
                  <fnm>S</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nature</source>
            <pubdate>1997</pubdate>
            <volume>390</volume>
            <fpage>249</fpage>
            <lpage>256</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/36786</pubid>
                  <pubid idtype="pmpid" link="fulltext">9384377</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>Complete genome sequence of the alkaliphilic bacterium <it>Bacillus halodurans </it>and genomic sequence comparison with <it>Bacillus subtilis</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Takami</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Nakasone</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Takaki</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Maeno</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Sasaki</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Masui</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Fuji</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Hirama</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Nakamura</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Ogasawara</snm>
                  <fnm>N</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2000</pubdate>
            <volume>28</volume>
            <fpage>4317</fpage>
            <lpage>4331</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/nar/28.21.4317</pubid>
                  <pubid idtype="pmpid" link="fulltext">11058132</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>Complete genome sequence of <it>Clostridium perfringens</it>, an anaerobic flesh-eater.</p>
            </title>
            <aug>
               <au>
                  <snm>Shimizu</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Ohtani</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Hirakawa</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Ohshima</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Yamashita</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Shiba</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Ogasawara</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Hattori</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Kuhara</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Hayashi</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2002</pubdate>
            <volume>99</volume>
            <fpage>996</fpage>
            <lpage>1001</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1073/pnas.022493799</pubid>
                  <pubid idtype="pmpid" link="fulltext">11792842</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>Genome sequence and comparative analysis of the solvent-producing bacterium <it>Clostridium acetobutylicum</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Nolling</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Breton</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Omelchenko</snm>
                  <fnm>MV</fnm>
               </au>
               <au>
                  <snm>Makarova</snm>
                  <fnm>KS</fnm>
               </au>
               <au>
                  <snm>Zeng</snm>
                  <fnm>Q</fnm>
               </au>
               <au>
                  <snm>Gibson</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Lee</snm>
                  <fnm>HM</fnm>
               </au>
               <au>
                  <snm>Dubois</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Qiu</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Hitti</snm>
                  <fnm>J</fnm>
               </au>
               <etal/>
            </aug>
            <source>J Bacteriol</source>
            <pubdate>2001</pubdate>
            <volume>183</volume>
            <fpage>4823</fpage>
            <lpage>4838</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1128/JB.183.16.4823-4838.2001</pubid>
                  <pubid idtype="pmpid" link="fulltext">11466286</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>Whole genome sequencing of methicillin-resistant <it>Staphylococcus aureus</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Kuroda</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Ohta</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Uchiyama</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Baba</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Yuzawa</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Kobayashi</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Cui</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Oguchi</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Aoki</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Nagai</snm>
                  <fnm>Y</fnm>
               </au>
               <etal/>
            </aug>
            <source>Lancet</source>
            <pubdate>2001</pubdate>
            <volume>357</volume>
            <fpage>1225</fpage>
            <lpage>1240</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0140-6736(00)04403-2</pubid>
                  <pubid idtype="pmpid" link="fulltext">11418146</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B33">
            <title>
               <p>Genome and virulence determinants of high virulence community-acquired MRSA.</p>
            </title>
            <aug>
               <au>
                  <snm>Baba</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Takeuchi</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Kuroda</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Yuzawa</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Aoki</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Oguchi</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Nagai</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Iwama</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Asano</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Naimi</snm>
                  <fnm>T</fnm>
               </au>
               <etal/>
            </aug>
            <source>Lancet</source>
            <pubdate>2002</pubdate>
            <volume>359</volume>
            <fpage>1819</fpage>
            <lpage>1827</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0140-6736(02)08713-5</pubid>
                  <pubid idtype="pmpid" link="fulltext">12044378</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B34">
            <title>
               <p>Complete genome sequence of <it>Pseudomonas aeruginosa </it>PAO1, an opportunistic pathogen.</p>
            </title>
            <aug>
               <au>
                  <snm>Stover</snm>
                  <fnm>CK</fnm>
               </au>
               <au>
                  <snm>Pham</snm>
                  <fnm>X-QT</fnm>
               </au>
               <au>
                  <snm>Erwin</snm>
                  <fnm>AL</fnm>
               </au>
               <au>
                  <snm>Mizoguchi</snm>
                  <fnm>SD</fnm>
               </au>
               <au>
                  <snm>Warrener</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Hickey</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Brinkman</snm>
                  <fnm>FSL</fnm>
               </au>
               <au>
                  <snm>Hufnagle</snm>
                  <fnm>WO</fnm>
               </au>
               <au>
                  <snm>Kowalik</snm>
                  <fnm>DJ</fnm>
               </au>
               <au>
                  <snm>Lagrou</snm>
                  <fnm>M</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nature</source>
            <pubdate>2000</pubdate>
            <volume>406</volume>
            <fpage>959</fpage>
            <lpage>964</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/35023079</pubid>
                  <pubid idtype="pmpid" link="fulltext">10984043</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B35">
            <title>
               <p>Genome Sequencing Center</p>
            </title>
            <url>http://genome.wustl.edu</url>
         </bibl>
         <bibl id="B36">
            <title>
               <p>Basic local alignment search tool.</p>
            </title>
            <aug>
               <au>
                  <snm>Altschul</snm>
                  <fnm>SF</fnm>
               </au>
               <au>
                  <snm>Gish</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Myers</snm>
                  <fnm>EW</fnm>
               </au>
               <au>
                  <snm>Lipman</snm>
                  <fnm>DJ</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1990</pubdate>
            <volume>215</volume>
            <fpage>403</fpage>
            <lpage>410</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/jmbi.1990.9999</pubid>
                  <pubid idtype="pmpid" link="fulltext">2231712</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>
