<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>gb-2003-4-2-r14</ui>
   <ji>GBJ</ji>
   <fm>
      <dochead>Research</dochead>
      <bibl>
         <title>
            <p>Lateral gene transfer and ancient paralogy of operons containing redundant copies of tryptophan-pathway genes in <it>Xylella </it>species and in heterocystous cyanobacteria</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Xie</snm>
               <fnm>Gary</fnm>
               <insr iid="I1"/>
               <insr iid="I2"/>
            </au>
            <au id="A2">
               <snm>Bonner</snm>
               <mi>A</mi>
               <fnm>Carol</fnm>
               <insr iid="I1"/>
            </au>
            <au id="A3">
               <snm>Brettin</snm>
               <fnm>Tom</fnm>
               <insr iid="I2"/>
            </au>
            <au id="A4">
               <snm>Gottardo</snm>
               <fnm>Raphael</fnm>
               <insr iid="I2"/>
            </au>
            <au id="A5" ca="yes">
               <snm>Keyhani</snm>
               <mi>O</mi>
               <fnm>Nemat</fnm>
               <insr iid="I1"/>
               <email>keyhani@ufl.edu</email>
            </au>
            <au id="A6">
               <snm>Jensen</snm>
               <mi>A</mi>
               <fnm>Roy</fnm>
               <insr iid="I1"/>
               <insr iid="I2"/>
               <insr iid="I3"/>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Department of Microbiology and Cell Science, University of Florida, PO Box 110700, Gainesville, FL 32611, USA</p>
            </ins>
            <ins id="I2">
               <p>BioScience Division, Los Alamos National Laboratory, Los Alamos, NM 87544, USA</p>
            </ins>
            <ins id="I3">
               <p>Department of Chemistry, City College of New York, New York, NY 10031, USA</p>
            </ins>
         </insg>
         <source>Genome Biology</source>
         <issn>1465-6906</issn>
         <pubdate>2003</pubdate>
         <volume>4</volume>
         <issue>2</issue>
         <fpage>R14</fpage>
         <url>http://genomebiology.com/2003/4/2/R14</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="doi">10.1186/gb-2003-4-2-r14</pubid>
               <pubid idtype="pmpid">12620124</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>27</day>
               <month>9</month>
               <year>2002</year>
            </date>
         </rec>
         <revrec>
            <date>
               <day>4</day>
               <month>11</month>
               <year>2002</year>
            </date>
         </revrec>
         <acc>
            <date>
               <day>26</day>
               <month>11</month>
               <year>2002</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>29</day>
               <month>1</month>
               <year>2003</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2003</year>
         <collab>2003 Xie et al.; licensee BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.</collab>
      </cpyrt>
      <shorttitle>
         <p>Lateral gene transfer and ancient paralogy of operons containing redundant copies of tryptophan-pathway genes in <it>Xylella </it>species and in heterocystous cyanobacteria</p>
      </shorttitle>
      <shortabs>
         <p>Tryptophan-pathway genes that exist within an apparent operon-like organization were evaluated. A seven-gene cluster in <it>Xylella fastidiosa </it>exhibits a sharply delineated low-GC content. This strongly implicates lateral gene transfer. In contrast, parametric studies and protein tree phylogenies did not support the origination of a gene block in the <it>Anabaena/Nostoc </it>lineage by lateral gene transfer.</p>
      </shortabs>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Tryptophan-pathway genes that exist within an apparent operon-like organization were evaluated as examples of multi-genic genomic regions that contain phylogenetically incongruous genes and coexist with genes outside the operon that are congruous. A seven-gene cluster in <it>Xylella fastidiosa </it>includes genes encoding the two subunits of anthranilate synthase, an aryl-CoA synthetase, and <it>trpR. </it>A second gene block, present in the <it>Anabaena/Nostoc </it>lineage, but not in other cyanobacteria, contains a near-complete tryptophan operon nested within an apparent supraoperon containing other aromatic-pathway genes.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>The gene block in <it>X. fastidiosa </it>exhibits a sharply delineated low-GC content. This, as well as bias of codon usage and 3:1 dinucleotide analysis, strongly implicates lateral gene transfer (LGT). In contrast, parametric studies and protein tree phylogenies did not support the origination of the <it>Anabaena/Nostoc </it>gene block by LGT.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusions</p>
               </st>
               <p>Judging from the apparent minimal amelioration, the low-GC gene block in <it>X. fastidiosa </it>probably originated by LGT at a relatively recent time. The surprising inability to pinpoint a donor lineage still leaves room for alternative, albeit less likely, explanations other than LGT. On the other hand, the large <it>Anabaena/Nostoc </it>gene block does not seem to have arisen by LGT. We suggest that the contemporary <it>Anabaena/Nostoc </it>array of divergent paralogs represents an ancient ancestral state of paralog divergence, with extensive streamlining by gene loss occurring in the lineage of descent representing other (unicellular) cyanobacteria.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="BMC" subtype="man_spc_id" id="30010009">Genetics</classification>
         <classification type="BMC" subtype="man_spc_id" id="30010008">Evolution</classification>
         <classification type="BMC" subtype="man_spc_id" id="30010014">Microbiology and parasitology</classification>
         <classification type="BMC" subtype="man_spc_id" id="30010001">Biochemistry and structural biology</classification>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <sec>
            <st>
               <p>Lateral gene transfer</p>
            </st>
            <p>Lateral gene transfer (LGT) has been generally accepted for some time, as exemplified by the endosymbiotic hypothesis of organelle origin [<abbr bid="B1">1</abbr>,<abbr bid="B2">2</abbr>]. Nevertheless, a long-standing background of general conviction has held that LGT is rare, especially between distant organisms. However, the modern era of genomics has been accompanied by increasingly numerous claims that LGT is frequent [<abbr bid="B3">3</abbr>,<abbr bid="B4">4</abbr>,<abbr bid="B5">5</abbr>,<abbr bid="B6">6</abbr>], and there now seems little doubt that LGT exerts a significant influence upon evolutionary histories. Indeed, it has even been asserted that vertical evolutionary patterns of descent might be irretrievably masked by rampant events of LGT and that, in fact, instead of bifurcating phylogenetic trees, a reticulate (net-like) pattern exists [<abbr bid="B7">7</abbr>,<abbr bid="B8">8</abbr>,<abbr bid="B9">9</abbr>]. On the other hand, others urge a more balanced perspective, pointing out that alternative explanations for apparent cases of LGT have not always been considered [<abbr bid="B10">10</abbr>,<abbr bid="B11">11</abbr>,<abbr bid="B12">12</abbr>,<abbr bid="B13">13</abbr>,<abbr bid="B14">14</abbr>]. The rationale for explanations other than LGT for genealogical incongruities (such as hidden paralogies and reconstruction artifacts) has been presented in comprehensive detail by Glansdorff [<abbr bid="B15">15</abbr>].</p>
            <p>Woese [<abbr bid="B16">16</abbr>] contends that the rRNA tree is a valid representation of organismal genealogy, that LGT was rampant only before the initial bifurcation of the universal phylogenetic tree, and that LGT has become progressively more restricted as a function of elapsed evolutionary time. Using the aminoacyl-tRNA synthetases as an example of the modular-type entities asserted to be most amenable to LGT, Woese concludes that the genealogical trace of vertical gene flow is readable, despite a significant jumbling influence of LGT. If correct, this allows the optimistic viewpoint that the complex interplay of vertical gene descent and LGT can be deciphered to yield correct evolutionary histories, provided that sufficiently detailed studies are done.</p>
            <p>Approaches for detection of LGT events are either phylogenetic or parametric. Phylogenetic approaches depend on congruence of phylogenetic trees. Aside from technical difficulties of inferring high-quality trees, conflicts between trees under comparison are not necessarily due to LGT, but can arise from coincidental loss of divergent paralogs in different, widely spaced lineages or from convergent evolution. Parametric approaches for detection of LGT include (but are not limited to) the analysis of nucleotide composition, dinucleotide frequencies and codon usage biases. Lawrence and Ochman [<abbr bid="B17">17</abbr>] used such parametric analysis to identify a set of <it>Escherichia coli </it>genes (17.5% of the genome) having putative origin by LGT, and this has stimulated much discussion. High rates of both false positives and false negatives have been asserted by others [<abbr bid="B18">18</abbr>,<abbr bid="B19">19</abbr>], but this is tempered by presentation of a rationale for why phylogenetic and different parametric methods detect different gene subsets [<abbr bid="B20">20</abbr>,<abbr bid="B21">21</abbr>,<abbr bid="B22">22</abbr>]. A consensus seems to be emerging that the most proficient attempts to reconstruct evolutionary events will employ a multifaceted approach that combines tree inference with parametric analysis in a biological context [<abbr bid="B21">21</abbr>,<abbr bid="B22">22</abbr>]. Lawrence and Ochman [<abbr bid="B21">21</abbr>] provide a number of examples of how the context of biological information can assist the analysis, and this approach is implemented herein.</p>
            <p>If each member of a linked group of genes is already represented elsewhere in a genome, their origin by LGT is a distinct possibility, as their transfer <it>en bloc </it>as an operon unit would have required only a single evolutionary event. During an ongoing analysis of the genomic distribution of tryptophan-pathway genes, we observed two such cases, that is, where one set of genes was phylogenetically congruent, in contrast to the incongruence of redundant gene copies that were linked to one another. We have evaluated the evidence for the alternative possibilities of LGT or ancient paralogy, as reported here.</p>
         </sec>
         <sec>
            <st>
               <p>A block of Trp-pathway genes in <it>Xylella</it></p>
            </st>
            <p>The phylogenetic incongruence of <it>trpR</it>, a regulatory gene in <it>Xylella fastidiosa</it>, led to recognition of a low-GC gene block in <it>X. fastidiosa. </it>The tryptophan repressor (TrpR) is quite limited in its phylogenetic distribution, being consistently present only within the enteric lineage, as shown in the protein tree of Figure <figr fid="F1">1</figr>. Here TrpR of <it>Shewanella putrefaciens </it>marks the outlying sequence of the enteric lineage (shown in gray). Outside the boundaries of the enteric lineage, only <it>Coxiella burnetii, X. fastidiosa </it>and two chlamydial species are thus far known to possess <it>trpR</it>. The distribution of <it>trpR </it>in the latter three lineages is phylogenetically incongruent because they are widely spaced from one another on the 16S rRNA tree.</p>
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>Protein tree for TrpR</p>
               </caption>
               <text>
                  <p>Protein tree for TrpR. Bootstrap values are shown at internal branch positions as percentages (1,000 replicates).</p>
               </text>
               <graphic file="gb-2003-4-2-r14-1"/>
            </fig>
            <p>In <it>Chlamydia trachomatis </it>and <it>Chlamydophila psittaci, trpR </it>is positioned near structural genes of tryptophan biosynthesis, but no indication of recent origin by LGT of genes in this region was obtained [<abbr bid="B23">23</abbr>]. <it>X. fastidiosa trpR </it>is separated by three genes from two structural genes of tryptophan biosynthesis. These latter genes do not appear to be essential for the primary task of tryptophan biosynthesis as all seven genes of tryptophan biosynthesis are represented elsewhere in the genome within one of two operons. Thus, in <it>X. fastidiosa </it>the incongruous phylogenetic position of <it>trpR</it>, the redundancy of the <it>trp</it>-linked genes encoding <it>trpAa </it>and <it>trpAb</it>, and the distinct phylogenetic incongruence of the latter gene pair all supported a reasonable possibility of origin by LGT.</p>
         </sec>
         <sec>
            <st>
               <p>The tryptophan supraoperon of <it>Anabaena/Nostoc</it></p>
            </st>
            <p>All cyanobacteria possess each of the seven Trp-pathway genes at dispersed loci, and individual trees of proteins corresponding to these dispersed genes are phylogenetically congruent. Although this generalization also applies to <it>Anabaena/Nostoc</it>, this latter lineage is unique among cyanobacteria in its possession of an additional set of Trp-pathway genes (lacking only <it>trpC</it>) that coexist within an apparent operon. As shown in Figure <figr fid="F2">2</figr>, both <it>Anabaena </it>and <it>Nostoc </it>exhibit the same relative order of operonic <it>trp </it>genes: <it>trpAa</it>&#8226;<it>trpAb </it>&#8594; <it>trpD </it>&#8594; <it>trpEa </it>&#8594; <it>trpEb </it>&#8594; <it>trpB</it>. The <it>trpAa </it>and <it>trpAb </it>genes are fused, as indicated in Figure <figr fid="F2">2</figr> with a filled bar and in the text by the bullet in the notation: <it>trpAa</it>&#8226;<it>trpAb</it>. In <it>Anabaena, qor </it>(encoding NADPH:quinone reductase) has been inserted between <it>trpD </it>and <it>trpEa</it>. Another <it>qor </it>paralog is present elsewhere in the genome of <it>Anabaena</it>. <it>Nostoc </it>also has two <it>qor </it>paralogs, but neither resides within the tryptophan operon. Other cyanobacteria lack <it>qor </it>homologs altogether. In <it>Nostoc, tyrP1 </it>(encoding tyrosinase) has been inserted between <it>trpEa </it>and <it>trpEb</it>. All other cyanobacteria, including <it>Anabaena</it>, lack <it>tyrP1</it>. The two <it>trp </it>operons are less compact than frequently observed elsewhere, and relatively large intergenic spacing exists, especially in <it>N. punctiforme</it>. The only instance of translational coupling is between <it>trpAa</it>&#8226;<it>trpAb </it>and <it>trpD </it>in <it>N. punctiforme</it>.</p>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>Genomic organization of aromatic-pathway genes in cyanobacteria</p>
               </caption>
               <text>
                  <p>Genomic organization of aromatic-pathway genes in cyanobacteria. Genes relevant to the common pathway segment, the tryptophan branch, the tyrosine branch, and the phenylalanine branch are color-coded, as indicated. A system for uniform genomic naming of Trp-pathway genes or domains has been used as previously implemented [<abbr bid="B23">23</abbr>,<abbr bid="B57">57</abbr>]. Fused catalytic domains are joined by solid black linkers. Gene positions along the entire chromosomes of <it>Synechocystis </it>sp. PCC 6803 and <it>Anabaena </it>sp. PCC 7120 are shown. The qualitative presence or absence of genes in <it>Nostoc punctiforme</it>, an unfinished genome, is also indicated. Detailed zoom-in schematics are shown for the gene organizations within the supraoperons of <it>Anabaena </it>and <it>Nostoc</it>, regions spanning 13,000-14,000 bp. In the latter regions, intergenic spacing is shown, with negative values indicating the extent of genic overlap.</p>
               </text>
               <graphic file="gb-2003-4-2-r14-2"/>
            </fig>
            <p>The tryptophan operons appear to be nested within what could be a larger unit of transcription that is reminiscent of what has been called a supraoperon in <it>Bacillus subtilis </it>[<abbr bid="B24">24</abbr>]. The genes comprising the supraoperon of <it>B. subtilis </it>are <it>aroG </it>&#8594; <it>aroB </it>&#8594; <it>aroH </it>&#8594; <it>trpAaBDCEbEa </it>&#8594; <it>hisH</it><sub>b </sub>&#8594; <it>tyrA</it><sub>p </sub>&#8594; <it>aroF</it>. A hierarchy of internal promoters and terminators exists for differential control of the <it>B. subtilis </it>supraoperon. The <it>Anabaena/Nostoc </it>linkage group is additionally reminiscent of the <it>B. subtilis </it>supraoperon in the presence of <it>aroB </it>and <it>tyrA</it>. Although <it>B. subtilis </it>does not have <it>aroA</it><sub>I&#946; </sub>represented in its supraoperon (as do <it>Anabaena </it>and <it>Nostoc</it>), <it>aroA</it><sub>I&#946; </sub>is the homology class (of three possible DAHP synthase homologs distributed in nature [<abbr bid="B25">25</abbr>,<abbr bid="B26">26</abbr>,<abbr bid="B27">27</abbr>]) that is utilized by <it>B. subtilis</it>. A number of supraoperon gene insertions have occurred outside of the <it>trp </it>operon as well. These differ for <it>Anabaena </it>and <it>Nostoc </it>as depicted in Figure <figr fid="F2">2</figr>. <it>Anabaena </it>has <it>aph </it>and a hypothetical gene (open reading frame, ORF) inserted between <it>aroB </it>and the <it>trpAa</it>&#8226;<it>trpAb </it>fusion. The <it>aph </it>gene encodes an uncharacterized protein of the alkaline phosphatase (metalloenzyme) superfamily (group COG1524 in the COGs database). Among cyanobacteria, only <it>Nostoc </it>has homologs of these two <it>Anabaena </it>genes, although they are not inserted in the <it>Nostoc </it>supraoperon. <it>Nostoc </it>has <it>frnE </it>(encoding a thiol-disulfide isomerase) inserted between <it>tyrA</it><sub>(p) </sub>and <it>aroB</it>; among all cyanobacteria, only <it>Nostoc </it>possesses <it>frnE</it>. Four subclasses of <it>tyrA </it>are defined according to the substrate specificities of the TyrA gene product: <it>tyrA</it><sub>p</sub>, specific for prephenate; <it>tyrA</it><sub>a</sub>, specific for arogenate; <it>tyrA</it><sub>c</sub>, which accepts either prephenate or arogenate; and <it>tyrA</it><sub>(p)</sub>, which has broad specificity but exhibits a distinct preference for prephenate.</p>
            <p>In their genomes outside the supraoperon boundaries, <it>Nostoc </it>and <it>Anabaena </it>possess a full complement of genes for biosynthesis of tryptophan, tyrosine and phenylalanine. Even these extra-supraoperonic genes of the <it>Anabaena/Nostoc </it>lineage are represented by multiple paralogs in many cases (Figure <figr fid="F2">2</figr>). If one considers the single-copy assemblage of aromatic-pathway genes present in the <it>Synechocystis/Synechococcus/Prochlorococcus </it>lineage as a fundamental complement of genes common to all cyanobacteria, the <it>Anabaena/Nostoc </it>genomic repertoire contains substantial redundancy. Thus, <it>Anabaena </it>has two additional extra-operonic paralogs of <it>aroA</it><sub>I&#946; </sub>and <it>trpD</it>. In addition to extra-operonic, free-standing copies of <it>trpAa </it>and <it>trpAb</it>, a second fused gene (<it>trpAa</it>&#8226;<it>trpAb_2</it>) encoding the two domains of anthranilate synthase is present in <it>Anabaena</it>. <it>Nostoc </it>has two extra-operonic copies of <it>aroA</it><sub>I&#946;</sub>, <it>aroB </it>and <it>trpD</it>. All cyanobacteria possess AroA of the I&#946; class (<it>aroA</it><sub>I&#946;</sub>). While this is also true of the <it>Anabaena/Nostoc </it>lineage (in fact, having multiple copies), both <it>Anabaena </it>and <it>Nostoc </it>possess an additional gene encoding AroA of the I&#945; class (<it>aroA</it><sub>I&#945;</sub>). All cyanobacteria possess a <it>tyrA </it>gene of the arogenate-specificity class (<it>tyrA</it><sub>a</sub>), but the <it>Anabaena/Nostoc </it>supraoperons also possess a <it>tyrA </it>gene deemed to be a cyclohexadienyl dehydrogenase [<abbr bid="B28">28</abbr>] with a favored specificity for prephenate (<it>tyrA</it><sub>(p)</sub>) (C.A.B., R.A.J., N.K. and McNally A., unpublished observation).</p>
            <p>Figure <figr fid="F3">3</figr> shows an evolutionary scenario, using a Fitch diagram [<abbr bid="B29">29</abbr>], that depicts the suggested origin of <it>trpD </it>paralogs via two gene duplication events (Dp1 and Dp2) that preceded the node of speciation divergence (Sp4) to <it>Nostoc </it>(Npu) and <it>Anabaena </it>(Asp). Consistent with the latter conclusion, Npu TrpD_1 exhibits greater identity with its ortholog Asp TrpD_1 than with its paralogs Npu TrpD_2 and Npu TrpD_3. Likewise, Npu TrpD_2 and Npu TrpD_3 exhibit greater identity with their Asp orthologs than with their Npu paralogs.</p>
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>Fitch diagram [<abbr bid="B29">29</abbr>] illustrating the origin and distribution of orthologs and paralogs of <it>trpD </it>in cyanobacteria</p>
               </caption>
               <text>
                  <p>Fitch diagram [<abbr bid="B29">29</abbr>] illustrating the origin and distribution of orthologs and paralogs of <it>trpD </it>in cyanobacteria. Paralogs, originating by gene duplication events (Dp1 and Dp2), track back to a horizontal line, whereas orthologs, originating by speciation (Sp1, Sp2, Sp3 and Sp4), track back to an inverted Y. The six <it>trpD </it>genes of <it>Nostoc </it>(Npu) and <it>Anabaena </it>(Asp) comprise a paralog set, and each of those comprises a four-member ortholog set with respect to the <it>trpD </it>genes from <it>P. marinus </it>(Pmu), <it>Synechococcus </it>sp. (Syn), and <it>Synechocystis </it>sp. (Ssp).</p>
               </text>
               <graphic file="gb-2003-4-2-r14-3"/>
            </fig>
            <p>Since the basic single-copy repertoire of dispersed aromatic-pathway genes shown in Figure <figr fid="F2">2</figr> for <it>Synechocystis </it>(Ssp) is representative of other cyanobacteria such as <it>Synechococcus </it>(Syn) and <it>Prochlorococcus </it>(Pmu) and is also present at dispersed extra-operonic loci of <it>Anabaena </it>and <it>Nostoc</it>, an obvious possibility would seem to be that the genes of the supraoperon originated by LGT in a common ancestor of <it>Anabaena </it>and <it>Nostoc</it>. If so, speciation was followed by different species-specific gene-insertion events. Because the divergence of <it>Anabaena </it>and <it>Nostoc </it>was relatively recent, evidence for LGT by analysis of GC content, codon usage, or dinucleotide frequency might be forthcoming. A number of distinctive properties of the supraoperon gene block represent items of biological context (as discussed by Lawrence and Ochman [<abbr bid="B21">21</abbr>]) that potentially could provide excellent tracking clues about the identity of the putative donor in LGT. These include the overall gene organization of the <it>trp </it>operon, for which many microbial patterns are known; the extremely rare gene order of <it>trpEa trpEb </it>instead of the typical order <it>trpEb trpEa</it>; the fusion of genes encoding the alpha (<it>trpAa</it>) and beta (<it>trpAb</it>) subunits of anthranilate synthase, a fusion that exists in only a limited number of other taxa; and the presence of operonic genes exhibiting distinctive homology subtypes (<it>aroA</it><sub>I&#946; </sub>and <it>tyrA</it><sub>(p)</sub>).</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Results and discussion</p>
         </st>
         <sec>
            <st>
               <p>Lateral gene transfer of a block of genes in <it>Xylella</it></p>
            </st>
            <p>The <it>trpR </it>gene in <it>X. fastidiosa </it>was previously noted [<abbr bid="B23">23</abbr>] to have anomalously low GC content, relative to that of the genome. Low-GC blocks of genes have been attributed to LGT before; for example, <it>argF </it>(present in <it>E. coli </it>K-12 but not in other strains) is bracketed with unidentified high-GC (59%) genes that together comprise a distinctive block of LGT genes [<abbr bid="B30">30</abbr>]. The flanking genes of <it>trpR </it>were accordingly analyzed for GC content. Figure <figr fid="F4">4</figr> shows that <it>trpR </it>in <it>X. fastidiosa </it>is at one end of a block of seven genes, all of which have a distinctively low GC content (highlighted in green), compared to the flanking genes (highlighted in yellow).</p>
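            <p>The per-gene GC comparison described above can be illustrated with a minimal Python sketch. It is not the procedure used in this study: the function names, the toy sequences, the genomic GC value of 52% and the five-point margin are all hypothetical choices made only for illustration.</p>

```python
def gc_percent(seq):
    """Return the GC content of a nucleotide sequence as a percentage."""
    seq = seq.upper()
    counted = [b for b in seq if b in "ACGT"]  # ignore ambiguity codes
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for b in counted if b in "GC")
    return 100.0 * gc / len(counted)


def flag_low_gc(genes, genome_gc, margin=5.0):
    """Return names of genes whose GC% lies at least `margin` points below genome_gc.

    `margin` is an arbitrary illustrative threshold, not a published cutoff.
    """
    return [name for name, seq in genes if genome_gc - gc_percent(seq) >= margin]


# Hypothetical toy sequences, not real X. fastidiosa genes.
toy_genes = [
    ("flankA", "GCGCGGCCATGC"),
    ("blockB", "ATATTTAAGCAT"),
    ("flankC", "GGCCGCGATCGC"),
]
print(flag_low_gc(toy_genes, genome_gc=52.0))
```

            <p>In this toy input only the middle gene falls below the assumed genomic value, which mirrors the kind of sharply delineated block that a gene-by-gene scan is meant to reveal.</p>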
            <fig id="F4">
               <title>
                  <p>Figure 4</p>
               </title>
               <caption>
                  <p>Block of genes acquired by lateral gene transfer (LGT) in <it>Xylella fastidiosa</it></p>
               </caption>
               <text>
                  <p>Block of genes acquired by lateral gene transfer (LGT) in <it>Xylella fastidiosa</it>. The gene map at the top shows the LGT block of genes with a green bar. The gene block begins with <it>trpAa </it>on the left and ends with <it>trpR </it>on the right. Intergenic spacing is given. The vertical pale green bar in the lower panel shows the corresponding genes from bottom to top. The GC% for each gene is shown, and the gene products are named. The hypothetical protein belongs to pfam00583, the acetyltransferase (GNAT) family. The low-GC gene block of the <it>X. fastidiosa </it>genome corresponds to gene numbers XF1914 (<it>trpAa</it>)-XF1920 (<it>trpR</it>).</p>
               </text>
               <graphic file="gb-2003-4-2-r14-4"/>
            </fig>
            <p>If the block of low-GC genes in <it>Xylella </it>really reflects an alien origin, differences in dinucleotide frequencies might be expected, as such context biases differ from organism to organism. A 3:1 dinucleotide analysis (the third nucleotide position in a codon followed by the first nucleotide position in the succeeding codon) was utilized, as this is the dinucleotide that is least restricted by amino-acid preference and codon usage in individual genes [<abbr bid="B31">31</abbr>]. The 3:1 dinucleotide frequencies were calculated for the entire block of low-GC genes, as well as for the immediately flanking genes. The results, presented in Figure <figr fid="F5">5</figr> for a set of four selected dinucleotides, show that dinucleotide frequencies of the flanking genes were within a variance of about 4% from genomic frequencies, whereas the low-GC block of genes exhibited recognizably greater variances from the genomic dinucleotide frequencies of <it>X. fastidiosa</it>.</p>
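            <p>As a hedged illustration of how 3:1 dinucleotide frequencies can be tallied, the following Python sketch counts the third base of each codon together with the first base of the next codon in an in-frame coding sequence. The function names and the toy sequence are hypothetical, and the sketch is not claimed to reproduce the algorithm of reference 31 or the exact normalization used for Figure 5.</p>

```python
from collections import Counter
from itertools import product


def dinuc_3_1_freqs(cds):
    """Percent frequency of each 3:1 dinucleotide in an in-frame coding sequence.

    A 3:1 dinucleotide is the third base of one codon followed by the
    first base of the next codon.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # drop any trailing partial codon
    counts = Counter()
    # Index i is the third base of a codon; i + 1 starts the next codon.
    for i in range(2, usable - 1, 3):
        pair = cds[i:i + 2]
        if set(pair).issubset("ACGT"):  # skip pairs with ambiguous bases
            counts[pair] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no codon junctions found")
    return {a + b: 100.0 * counts[a + b] / total
            for a, b in product("ACGT", repeat=2)}


def deviation_from_genome(gene_freqs, genome_freqs):
    """Gene-minus-genome difference per dinucleotide, in percentage points."""
    return {d: gene_freqs[d] - genome_freqs[d] for d in genome_freqs}


toy_cds = "ATGAAATTTGGGTAA"  # hypothetical sequence, five codons
freqs = dinuc_3_1_freqs(toy_cds)
print({d: f for d, f in freqs.items() if f > 0})
```

            <p>Applying the same function to a concatenation of all coding sequences of a genome would give the genomic reference frequencies, and the second helper then expresses each gene as a positive or negative deviation from that reference.</p>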
            <fig id="F5">
               <title>
                  <p>Figure 5</p>
               </title>
               <caption>
                  <p>Three-to-one dinucleotide analysis of the putative LGT-block of <it>X. fastidiosa </it>genes shown in Figure <figr fid="F4">4</figr></p>
               </caption>
               <text>
                  <p>Three-to-one dinucleotide analysis of the putative LGT-block of <it>X. fastidiosa </it>genes shown in Figure <figr fid="F4">4</figr>. For easier viewing, four of the 16 dinucleotide combinations have been selected. The frequency variation of each gene is shown as positive variation (upward-pointing bars) or negative variation (downward-pointing bars) with respect to the average genomic frequencies (set to a value of zero at the midline), the absolute values of which can be seen in Table <tblr tid="T1">1</tblr>. treg, transcriptional regulator; hypo, hypothetical gene.</p>
               </text>
               <graphic file="gb-2003-4-2-r14-5"/>
            </fig>
            <p>The co-variation of 3:1 dinucleotide frequencies of genes in the low-GC gene block of <it>Xylella </it>with the corresponding genomic frequencies was also evaluated using the Spearman rank correlation coefficient. Table <tblr tid="T1">1</tblr> illustrates the data used to compare the <it>Xylella trpR </it>gene and the <it>Xylella </it>genome. A <it>p</it>-value of 0.730 indicated that the 3:1 dinucleotide frequencies of <it>trpR </it>from <it>Xylella </it>did not exhibit significant co-variation with the frequencies of the <it>Xylella </it>genome. In contrast, the 3:1 dinucleotide frequencies of <it>trpR </it>from <it>Chlamydia trachomatis </it>did exhibit significant co-variation with the frequencies characteristic of the <it>C. trachomatis </it>genome (<it>p</it>-value = 0.031). These analyses are consistent with the occurrence of recent LGT in <it>X. fastidiosa</it>.</p>
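            <p>For readers unfamiliar with the statistic, the Spearman rank correlation coefficient is the Pearson correlation computed on ranks. The Python sketch below is a minimal illustration with hypothetical function names and made-up input values; it does not use the data of Table 1 and it returns only the coefficient. A p-value such as those reported above would additionally require a significance test, for example from a statistics library such as SciPy (scipy.stats.spearmanr).</p>

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start != len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 != len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: the Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) in (0, 1):
        raise ValueError("need two equal-length samples with at least two points")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one sample is constant")
    return cov / (var_x * var_y) ** 0.5


# Made-up frequencies for illustration only (not taken from Table 1).
genome_freqs = [4.0, 5.0, 6.0, 7.0]
gene_freqs = [5.0, 4.0, 7.0, 6.0]
print(spearman_rho(genome_freqs, gene_freqs))
```

            <p>A coefficient near 1 means that the gene ranks its dinucleotides in much the same order as the genome does, whereas a coefficient near 0 indicates little co-variation.</p>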
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>Statistical test of co-variation of 3:1 dinucleotide frequencies of <it>trpR </it>and its cognate genome</p>
               </caption>
               <tblbdy cols="5">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c cspan="2" ca="center">
                        <p>
                           <it>Xylella fastidiosa</it>
                        </p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>
                           <it>Chlamydia trachomatis</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c cspan="2">
                        <hr/>
                     </c>
                     <c cspan="2">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>3:1 Dinucleotide frequencies</p>
                     </c>
                     <c ca="center">
                        <p>Genome</p>
                     </c>
                     <c ca="center">
                        <p><it>trpR</it></p>
                     </c>
                     <c ca="center">
                        <p>Genome</p>
                     </c>
                     <c ca="center">
                        <p><it>trpR</it></p>
                     </c>
                  </r>
                  <r>
                     <c cspan="5">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>TT</p>
                     </c>
                     <c ca="center">
                        <p>4.5</p>
                     </c>
                     <c ca="center">
                        <p>4.3</p>
                     </c>
                     <c ca="center">
                        <p>9.3</p>
                     </c>
                     <c ca="center">
                        <p>9.6</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>TC</p>
                     </c>
                     <c ca="center">
                        <p>5.0</p>
                     </c>
                     <c ca="center">
                        <p>6.5</p>
                     </c>
                     <c ca="center">
                        <p>8.4</p>
                     </c>
                     <c ca="center">
                        <p>9.6</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>TA</p>
                     </c>
                     <c ca="center">
                        <p>4.2</p>
                     </c>
                     <c ca="center">
                        <p>16.3</p>
                     </c>
                     <c ca="center">
                        <p>8.6</p>
                     </c>
                     <c ca="center">
                        <p>9.6</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>TG</p>
                     </c>
                     <c ca="center">
                        <p>11.4</p>
                     </c>
                     <c ca="center">
                        <p>7.6</p>
                     </c>
                     <c ca="center">
                        <p>10.2</p>
                     </c>
                     <c ca="center">
                        <p>3.2</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>CT</p>
                     </c>
                     <c ca="center">
                        <p>4.5</p>
                     </c>
                     <c ca="center">
                        <p>4.3</p>
                     </c>
                     <c ca="center">
                        <p>4.7</p>
                     </c>
                     <c ca="center">
                        <p>4.3</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>CC</p>
                     </c>
                     <c ca="center">
                        <p>6.1</p>
                     </c>
                     <c ca="center">
                        <p>2.2</p>
                     </c>
                     <c ca="center">
                        <p>3.4</p>
                     </c>
                     <c ca="center">
                        <p>2.1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>CA</p>
                     </c>
                     <c ca="center">
                        <p>8.9</p>
                     </c>
                     <c ca="center">
                        <p>8.7</p>
                     </c>
                     <c ca="center">
                        <p>4.5</p>
                     </c>
                     <c ca="center">
                        <p>4.3</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>CG</p>
                     </c>
                     <c ca="center">
                        <p>9.6</p>
                     </c>
                     <c ca="center">
                        <p>2.2</p>
                     </c>
                     <c ca="center">
                        <p>4.0</p>
                     </c>
                     <c ca="center">
                        <p>4.3</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>AT</p>
                     </c>
                     <c ca="center">
                        <p>3.2</p>
                     </c>
                     <c ca="center">
                        <p>1.1</p>
                     </c>
                     <c ca="center">
                        <p>4.9</p>
                     </c>
                     <c ca="center">
                        <p>6.4</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>AC</p>
                     </c>
                     <c ca="center">
                        <p>5.2</p>
                     </c>
                     <c ca="center">
                        <p>6.5</p>
                     </c>
                     <c ca="center">
                        <p>4.4</p>
                     </c>
                     <c ca="center">
                        <p>4.3</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>AA</p>
                     </c>
                     <c ca="center">
                        <p>3.7</p>
                     </c>
                     <c ca="center">
                        <p>15.2</p>
                     </c>
                     <c ca="center">
                        <p>7.5</p>
                     </c>
                     <c ca="center">
                        <p>7.4</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>AG</p>
                     </c>
                     <c ca="center">
                        <p>5.5</p>
                     </c>
                     <c ca="center">
                        <p>9.8</p>
                     </c>
                     <c ca="center">
                        <p>12.2</p>
                     </c>
                     <c ca="center">
                        <p>18.1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>GT</p>
                     </c>
                     <c ca="center">
                        <p>4.6</p>
                     </c>
                     <c ca="center">
                        <p>3.3</p>
                     </c>
                     <c ca="center">
                        <p>3.3</p>
                     </c>
                     <c ca="center">
                        <p>5.3</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>GC</p>
                     </c>
                     <c ca="center">
                        <p>8.1</p>
                     </c>
                     <c ca="center">
                        <p>2.2</p>
                     </c>
                     <c ca="center">
                        <p>4.1</p>
                     </c>
                     <c ca="center">
                        <p>4.3</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>GA</p>
                     </c>
                     <c ca="center">
                        <p>6.4</p>
                     </c>
                     <c ca="center">
                        <p>5.4</p>
                     </c>
                     <c ca="center">
                        <p>5.6</p>
                     </c>
                     <c ca="center">
                        <p>5.3</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>GG</p>
                     </c>
                     <c ca="center">
                        <p>9.0</p>
                     </c>
                     <c ca="center">
                        <p>4.3</p>
                     </c>
                     <c ca="center">
                        <p>5.0</p>
                     </c>
                     <c ca="center">
                        <p>2.1</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>Xfa genome/<it>trpR p</it>-value = 0.730; Ctr genome/<it>trpR p</it>-value = 0.031.</p>
               </tblfn>
            </tbl>
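            <p>The comparison summarized in Table <tblr tid="T1">1</tblr> can be illustrated with a minimal sketch that ranks the 16 dinucleotide frequencies of each gene and genome and correlates the ranks. The frequency lists are transcribed from Table <tblr tid="T1">1</tblr>; the function names, the permutation-based significance estimate, the number of permutations and the random seed are assumptions made for illustration only, because the procedure used to obtain the reported <it>p</it>-values is not described here. The sketch is therefore expected to show the same qualitative contrast between the two organisms rather than to reproduce the reported values exactly.</p>
```python
import random

# 3:1 dinucleotide frequencies (%) transcribed from Table 1, in row order
# TT, TC, TA, TG, CT, CC, CA, CG, AT, AC, AA, AG, GT, GC, GA, GG.
XFA_GENOME = [4.5, 5.0, 4.2, 11.4, 4.5, 6.1, 8.9, 9.6,
              3.2, 5.2, 3.7, 5.5, 4.6, 8.1, 6.4, 9.0]
XFA_TRPR = [4.3, 6.5, 16.3, 7.6, 4.3, 2.2, 8.7, 2.2,
            1.1, 6.5, 15.2, 9.8, 3.3, 2.2, 5.4, 4.3]
CTR_GENOME = [9.3, 8.4, 8.6, 10.2, 4.7, 3.4, 4.5, 4.0,
              4.9, 4.4, 7.5, 12.2, 3.3, 4.1, 5.6, 5.0]
CTR_TRPR = [9.6, 9.6, 9.6, 3.2, 4.3, 2.1, 4.3, 4.3,
            6.4, 4.3, 7.4, 18.1, 5.3, 4.3, 5.3, 2.1]


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start != len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 != len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of tie-averaged ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def permutation_p_value(x, y, n_perm=5000, seed=0):
    """Two-sided p-value estimated by shuffling one set of ranks.

    Assumption: this resampling estimate stands in for whatever
    significance test was used for the values reported in the text.
    """
    rng = random.Random(seed)
    rx, ry = average_ranks(x), average_ranks(y)
    observed = abs(pearson(rx, ry))
    shuffled = list(ry)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(rx, shuffled)) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    for label, genome, gene in (("X. fastidiosa", XFA_GENOME, XFA_TRPR),
                                ("C. trachomatis", CTR_GENOME, CTR_TRPR)):
        print(label, round(spearman(genome, gene), 3),
              round(permutation_p_value(genome, gene), 3))
```
            <p>Tied frequencies, which are common in the short <it>trpR </it>gene, are given averaged ranks so that the coefficient remains well defined.</p>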
         </sec>
         <sec>
            <st>
               <p>What is the origin of the LGT gene block?</p>
            </st>
            <p>Gene organization is subject to constant change. For precisely this reason, the overall gene organization within the low-GC gene block might implicate a donor organism because the LGT event is inferred to be recent. Because the enteric lineage is a reasonable source of the LGT gene block, it is pertinent that the gene organization around <it>trpR </it>is highly conserved in the enteric lineage. Without exception, <it>trpR </it>in the enteric lineage is preceded upstream by a gene encoding soluble lytic murein transglycosylase (<it>slt</it>). <it>hemK </it>is usually positioned directly downstream, except for the <it>Haemophilus actinomycetemcomitans/H. influenzae/Pasteurella multocida </it>grouping (where the downstream gene encodes a monofunctional biosynthetic peptidoglycan transglycosylase (<it>mtgA</it>)). No genomes of the enteric lineage were found to possess <it>trpR </it>in a context of flanking genes that resembled the <it>X. fastidiosa </it>gene organization.</p>
            <p>The LGT-block of <it>Xylella </it>genes conceivably could have originated from a donor similar to a common ancestor of the chlamydiae before the massive gene reduction associated with the chlamydial lifestyle. This would be consistent with the low GC content of both the chlamydial genome and the LGT-block of genes, as well as with the observation that chlamydiae and <it>Xylella </it>are the only two known taxa where <it>trpR </it>is positioned near structural genes of the tryptophan pathway. Direct comparison of chlamydial <it>trpAa </it>and <it>trpAb </it>genes with those of the <it>Xylella </it>operon is not possible because all chlamydial genomes thus far mapped lack <it>trpAa </it>and <it>trpAb </it>[<abbr bid="B23">23</abbr>]. In this context, sequencing of genomes from closely related free-living relatives of the chlamydiae could be informative. The currently available chlamydial genomes also lack other genes of the low-GC block.</p>
            <p><it>C. burnetii </it>was also considered as a possible source of the low-GC gene block in <it>X. fastidiosa </it>because it possesses <it>trpR</it>. This potential LGT event seems to be ruled out for three reasons: <it>trpR </it>is not near any structural genes encoding TrpAa and TrpAb in <it>C. burnetii</it>; <it>C. burnetii </it>TrpAa and TrpAb are not close to the corresponding <it>X. fastidiosa </it>enzymes on phylogenetic trees; and <it>C. burnetii </it>lacks the remaining genes in the low-GC gene block of <it>X. fastidiosa</it>.</p>
            <p>If LGT accounts for the low-GC gene block in <it>X. fastidiosa</it>, how recent was this event? Presumably, it was sufficiently recent that significant amelioration to the genomic GC content has not yet occurred. The closest sequenced genomes to <it>Xylella </it>are those of <it>Xanthomonas</it>. Genomes representing two species of the latter genus have been sequenced, and both lack the low-GC gene block. Therefore, the putative LGT event occurred some time after lineage divergence of <it>Xylella </it>and <it>Xanthomonas</it>. On the other hand, the LGT event presumably predated speciation in the <it>Xylella </it>genus, as all three strains of <it>Xylella </it>in the National Center for Biotechnology Information (NCBI) database possess the low-GC gene block. The T<sup>2 </sup>score of Hooper and Berg [<abbr bid="B31">31</abbr>] measures the covariance of 3:1 dinucleotide signatures, and is designed to recognize very recent imports of alien genes by LGT. T<sup>2 </sup>scores calculated for the low-GC gene block of <it>X. fastidiosa </it>were not above the required threshold for very recent gene imports.</p>
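            <p>As a minimal sketch of the signature that underlies both Table <tblr tid="T1">1</tblr> and the T<sup>2 </sup>score, the following function tabulates 3:1 dinucleotide frequencies for a single coding sequence. It assumes that a 3:1 dinucleotide is the pair formed by the third position of one codon and the first position of the next codon, and that the input is an in-frame coding sequence. The function name and the toy sequence in the usage note are hypothetical, and the T<sup>2 </sup>statistic itself is not computed here.</p>
```python
from collections import Counter
from itertools import product

# The 16 dinucleotides in the row order used in Table 1 (TT, TC, TA, TG, CT, ...).
DINUCLEOTIDES = ["".join(pair) for pair in product("TCAG", repeat=2)]


def dinucleotide_31_frequencies(cds):
    """Percentage of each 3:1 dinucleotide in an in-frame coding sequence.

    Assumption: a 3:1 dinucleotide is codon position 3 followed by
    position 1 of the next codon.
    """
    seq = cds.upper()
    counts = Counter()
    # Position 3 of codon k is index 3k + 2; position 1 of codon k + 1 is index 3k + 3.
    for i in range(2, len(seq) - 1, 3):
        pair = seq[i:i + 2]
        # Skip pairs that contain ambiguous bases such as N.
        if set(pair).issubset("ACGT"):
            counts[pair] += 1
    total = sum(counts.values())
    return {d: (100.0 * counts[d] / total if total else 0.0) for d in DINUCLEOTIDES}


if __name__ == "__main__":
    # Hypothetical toy sequence (codons ATG AAA TAG), not a real gene.
    freqs = dinucleotide_31_frequencies("ATGAAATAG")
    print({d: f for d, f in freqs.items() if f})
```
            <p>Applied to every gene of a genome and to a single gene of interest, a tabulation of this kind would yield the two columns that are compared for each organism in Table <tblr tid="T1">1</tblr>.</p>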
         </sec>
         <sec>
            <st>
               <p>What is the function of the low-GC block of genes in <it>Xylella</it>?</p>
            </st>
            <p>Within the low-GC block, <it>trpR </it>is separated by four ORFs from genes encoding the two subunits of anthranilate synthase (<it>trpAa </it>and <it>trpAb</it>). These probably do not function for general tryptophan biosynthesis because paralogs of these genes, which exhibit a phylogenetically congruent context of gene organization, exist elsewhere in the genome (Figure <figr fid="F6">6</figr>). The latter genes are located within one or the other of two separate operon clusters (Figure <figr fid="F6">6</figr>) with the GC content characteristic of <it>X. fastidiosa</it>. The GC-content values for the latter genes (<it>trpAa, trpAb, trpB, trpC, trpD, trpEa</it>, and <it>trpEb</it>) are 52%, 49%, 54%, 55%, 51%, 59% and 55%, respectively. Furthermore, Figure <figr fid="F6">6</figr> shows that the organization of the full complement of <it>trp</it>-pathway genes into two operons in <it>X. fastidiosa </it>is similar or identical to that of some of its nearest neighbors on the 16S rRNA tree, although the <it>Xylella </it>operons exhibit atypically large intergenic spacings. None of these neighbors possesses the low-GC block of <it>Xylella </it>genes illustrated in Figure <figr fid="F4">4</figr>. Hence, the two operons shown in Figure <figr fid="F6">6</figr> can be inferred to be responsible for primary tryptophan biosynthesis throughout this clade.</p>
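            <p>GC-content percentages such as those quoted above are simple base counts, and a low-GC block becomes visible when the same count is taken over successive windows along a chromosome. The following is a minimal sketch under stated assumptions: the function names and the default window and step sizes are illustrative choices, not parameters taken from the analysis reported here.</p>
```python
def gc_content(seq):
    """Percentage of G and C among the unambiguous bases of a DNA sequence."""
    s = seq.upper()
    gc = s.count("G") + s.count("C")
    at = s.count("A") + s.count("T")
    total = gc + at
    return 100.0 * gc / total if total else 0.0


def gc_profile(seq, window=1000, step=500):
    """GC percentage of successive windows, as (start index, GC %) pairs.

    Assumption: window and step are arbitrary illustrative defaults.
    """
    last_start = max(len(seq) - window, 0)
    return [(start, gc_content(seq[start:start + window]))
            for start in range(0, last_start + 1, step)]


if __name__ == "__main__":
    # Hypothetical toy sequence: a GC-rich stretch followed by an AT-rich stretch.
    toy = "GGCGCC" * 10 + "ATTAAT" * 10
    print(round(gc_content(toy), 1))
    print(gc_profile(toy, window=30, step=30))
```
            <p>A gene or window whose value falls well below the genomic average, as the low-GC block does relative to the 49% to 59% range quoted for the <it>trp </it>operons, is a candidate for closer inspection rather than proof of foreign origin.</p>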
            <fig id="F6">
               <title>
                  <p>Figure 6</p>
               </title>
               <caption>
                  <p>Organization of <it>trp</it>-pathway genes in <it>X. fastidiosa </it>and its nearest phylogenetic neighbors</p>
               </caption>
               <text>
                  <p>Organization of <it>trp</it>-pathway genes in <it>X. fastidiosa </it>and its nearest phylogenetic neighbors. The position of the organisms indicated on a 16S rRNA subtree is shown at the far left. To enhance the presentation, the <it>trp</it>-gene acronyms are shortened. Thus, <it>trpAa </it>is shown as Aa, etc. Intergenic spacing is indicated. <it>dmt </it>refers to a putative DNA methyltransferase. <it>trpAa </it>in <it>Nitrosomonas europaea </it>and <it>trpC </it>in <it>Bordetella parapertussis </it>are located in other chromosomal positions, unlinked to other <it>trp</it>-pathway genes. <it>X. fastidiosa </it>and <it>N. europaea</it>, but not the other organisms shown in the figure, possess <it>truA </it>(encoding tRNA pseudouridine synthase A) upstream of <it>trpC. truA </it>and <it>trpC </it>are translationally coupled with 31-bp and 105-bp overlaps in <it>X. fastidiosa </it>and <it>N. europaea</it>, respectively. The gene organization shown for a given organism is identical to that of the other organisms shown in parentheses, as follows: <it>Ralstonia metallidurans </it>(<it>R. solanacearum</it>), <it>Burkholderia fungorum </it>(<it>B. pseudomallei</it>, <it>B. mallei</it>), and <it>B. parapertussis </it>(<it>B. pertussis</it>, <it>B. bronchiseptica</it>). <it>R. solanacearum</it>, in addition to the genes shown, has adjacent paralogs of <it>trpB </it>and <it>trpD </it>located on a large plasmid. The <it>trpAaAbBD </it>and <it>trpCEbEa </it>operons of the <it>X. fastidiosa </it>9a5c genome correspond to gene numbers XF0210-XF0213 and XF1374-XF1376, respectively.</p>
               </text>
               <graphic file="gb-2003-4-2-r14-6"/>
            </fig>
            <p>Genes encoding the two anthranilate synthase subunits (<it>trpAa </it>and <it>trpAb</it>) and aryl-CoA ligase (<it>acl</it>) surely belong to an operon, as translational coupling is evident from the overlap of start and stop codons (Figure <figr fid="F4">4</figr>). Acl exhibits strong similarity to coenzyme F390 synthetase of methanogenic archaea, as well as to phenylacetate-CoA ligase of <it>E. coli</it>. As <it>Xylella </it>does not appear to make the F420 cofactor that is the substrate of F390 synthetase, the function of Acl is likely to be closer to phenylacetate-CoA ligase. The aromatic ring is highly stable, and CoA thioesterification can provide chemical activation, allowing cleavage of the aromatic ring, as exemplified by catabolism of benzoate, 4-hydroxybenzoate, and anthranilate [<abbr bid="B32">32</abbr>]. Because <it>acl </it>is tightly organized with <it>trpAa </it>and <it>trpAb</it>, it seems feasible that anthranilate might be the substrate of <it>acl</it>. An anthranilate-CoA ligase has been described recently in <it>Azoarcus evansii </it>by Schuhle <it>et al</it>. [<abbr bid="B33">33</abbr>]. The <it>Xylella </it>Acl exhibited greater identity with phenylacetate-CoA ligase of <it>E. coli </it>than with anthranilate-CoA ligase of <it>A. evansii</it>, but a given substrate specificity within homology groups often can be associated with different subgroupings [<abbr bid="B25">25</abbr>,<abbr bid="B34">34</abbr>].</p>
            <p>If anthranilate is indeed the substrate of Acl in <it>Xylella</it>, it would be a futile cycle if anthranilate were formed biosynthetically, only to be subsequently catabolized. Therefore, it seems more likely that the activation of anthranilate could be a step in the formation of a siderophore or antibiotic compound that is assembled by a nonribosomal peptide synthetase mechanism (see Quadri <it>et al</it>. [<abbr bid="B35">35</abbr>] and references therein for numerous examples). Pyochelin from <it>Pseudomonas aeruginosa </it>exemplifies an iron siderophore whose peptide-based synthesis depends on CoA-activated salicylate (closely related to anthranilate) as a starter unit [<abbr bid="B36">36</abbr>].</p>
            <p>While it appears likely that <it>trpR</it>, aryl-CoA ligase, <it>trpAa </it>and <it>trpAb </it>belong to a common functional unit, the possible roles of the remaining three genes downstream of <it>acl </it>remain unclear at present.</p>
         </sec>
         <sec>
            <st>
               <p>The <it>Anabaena/Nostoc </it>gene blocks</p>
            </st>
            <p>The large gene blocks in <it>Anabaena </it>and <it>Nostoc </it>that begin with <it>aroA</it><sub>I&#946; </sub>and end with <it>tyrAr</it><sub>(p) </sub>exhibited GC ratios that were similar to that of the host genome (Table <tblr tid="T2">2</tblr>). This is not necessarily inconsistent with their possible origin by LGT because the GC ratio of a putative donor genome could have been coincidentally similar to that of <it>Anabaena </it>and <it>Nostoc</it>.</p>
            <tbl id="T2">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p>Did operonic genes originate by LGT?</p>
               </caption>
               <tblbdy cols="7">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c cspan="2" ca="center">
                        <p>First BLAST hit</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>Second BLAST hit</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c cspan="2">
                        <hr/>
                     </c>
                     <c cspan="2">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Gene product</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>% GC</p>
                     </c>
                     <c ca="left">
                        <p>Organism</p>
                     </c>
                     <c ca="center">
                        <p>% Identity</p>
                     </c>
                     <c ca="left">
                        <p>Organism</p>
                     </c>
                     <c ca="center">
                        <p>% Identity</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Anabaena</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>AroA<sub>I&#946;</sub></p>
                     </c>
                     <c ca="center">
                        <p>46</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Nostoc punctiforme</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>75</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Nostoc punctiforme</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>71</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>TrpB</p>
                     </c>
                     <c ca="center">
                        <p>47</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Nostoc punctiforme</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>76</p>
                     </c>
                     <c ca="left">
                        <p><it>Anabaena </it>sp.</p>
                     </c>
                     <c ca="center">
                        <p>60</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>TrpEb</p>
                     </c>
                     <c ca="center">
                        <p>46</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Nostoc punctiforme</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>88</p>
                     </c>
                     <c ca="left">
                        <p><it>Anabaena </it>sp.</p>
                     </c>
                     <c ca="center">
                        <p>88</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>TrpEa</p>
                     </c>
                     <c ca="center">
                        <p>46</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Nostoc punctiforme</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>85</p>
                     </c>
                     <c ca="left">
                        <p><it>Anabaena </it>sp.</p>
                     </c>
                     <c ca="center">
                        <p>74</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Qor</p>
                     </c>
                     <c ca="center">
                        <p>45</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Enterococcus faecalis</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>40</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Streptomyces coelicolor</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>37</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>TrpD</p>
                     </c>
                     <c ca="center">
                        <p>41</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Nostoc punctiforme</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>72</p>
                     </c>
                     <c ca="left">
                        <p><it>Anabaena </it>sp.</p>
                     </c>
                     <c ca="center">
                        <p>56</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>TrpAa&#8226;TrpAb</p>
                     </c>
                     <c ca="center">
                        <p>41</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Nostoc punctiforme</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>81</p>
                     </c>
                     <c ca="left">
                        <p><it>Anabaena </it>sp.</p>
                     </c>
                     <c ca="center">
                        <p>77</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Gpm1</p>
                     </c>
                     <c ca="center">
                        <p>41</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Nostoc punctiforme</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>70</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Streptomyces coelicolor</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>51</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>ORF</p>
                     </c>
                     <c ca="center">
                        <p>41</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Nostoc punctiforme</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>58</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Streptomyces coelicolor</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>30</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>AroB</p>
                     </c>
                     <c ca="center">
                        <p>41</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Nostoc punctiforme</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>68</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Nostoc punctiforme</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>62</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>TyrA<sub>(p)</sub></p>
                     </c>
                     <c ca="center">
                        <p>38</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Nostoc punctiforme</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>72</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Yersinia pseudotuberculosis</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>43</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Nostoc<sup>&#8224;</sup></p>
                     </c>
                     <c ca="left">
                        <p>AroA<sub>I&#946;</sub></p>
                     </c>
                     <c ca="center">
                        <p>46</p>
                     </c>
                     <c ca="left">
                        <p><it>Anabaena </it>sp.</p>
                     </c>
                     <c ca="center">
                        <p>80</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Nostoc punctiforme</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>79</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>TrpB</p>
                     </c>
                     <c ca="center">
                        <p>47</p>
                     </c>
                     <c ca="left">
                        <p><it>Anabaena </it>sp.</p>
                     </c>
                     <c ca="center">
                        <p>77</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Nostoc punctiforme</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>64</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>TrpEb</p>
                     </c>
                     <c ca="center">
                        <p>46</p>
                     </c>
                     <c ca="left">
                        <p><it>Anabaena </it>sp.</p>
                     </c>
                     <c ca="center">
                        <p>88</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Nostoc punctiforme</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>88</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>TyrP_1</p>
                     </c>
                     <c ca="center">
                        <p>44</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Nostoc punctiforme</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>56</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Nitrosomonas europaea</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>30</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>TrpEa</p>
                     </c>
                     <c ca="center">
                        <p>46</p>
                     </c>
                     <c ca="left">
                        <p><it>Anabaena </it>sp.</p>
                     </c>
                     <c ca="center">
                        <p>85</p>
                     </c>
                     <c ca="left">
                        <p><it>Anabaena </it>sp.</p>
                     </c>
                     <c ca="center">
                        <p>74</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>TrpD</p>
                     </c>
                     <c ca="center">
                        <p>42</p>
                     </c>
                     <c ca="left">
                        <p><it>Anabaena </it>sp.</p>
                     </c>
                     <c ca="center">
                        <p>72</p>
                     </c>
                     <c ca="left">
                        <p><it>Anabaena </it>sp.</p>
                     </c>
                     <c ca="center">
                        <p>68</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>TrpAa&#8226;TrpAb</p>
                     </c>
                     <c ca="center">
                        <p>42</p>
                     </c>
                     <c ca="left">
                        <p><it>Anabaena </it>sp.</p>
                     </c>
                     <c ca="center">
                        <p>81</p>
                     </c>
                     <c ca="left">
                        <p><it>Anabaena </it>sp.</p>
                     </c>
                     <c ca="center">
                        <p>76</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>AroB</p>
                     </c>
                     <c ca="center">
                        <p>43</p>
                     </c>
                     <c ca="left">
                        <p><it>Anabaena </it>sp.</p>
                     </c>
                     <c ca="center">
                        <p>68</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Nostoc punctiforme</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>68</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>FrnE</p>
                     </c>
                     <c ca="center">
                        <p>40</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Deinococcus radiodurans</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>30</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Rhodobacter capsulatus</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>29</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>TyrA<sub>(p)</sub></p>
                     </c>
                     <c ca="center">
                        <p>39</p>
                     </c>
                     <c ca="left">
                        <p><it>Anabaena </it>sp.</p>
                     </c>
                     <c ca="center">
                        <p>72</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Yersinia pseudotuberculosis</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>38</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>*<it>Anabaena </it>sp. PCC 7120 has a genomic GC ratio of 42.82%. <sup>&#8224;</sup><it>Nostoc punctiforme </it>has a genomic GC ratio of 43.90%.</p>
               </tblfn>
            </tbl>
            <p>Accordingly, the 3:1 dinucleotide frequencies of the <it>aroA</it><sub>I&#946;</sub>-<it>tyrA</it><sub>c </sub>gene block and the immediately flanking genes were analyzed, but these dinucleotide frequencies also did not suggest LGT. Figure <figr fid="F7">7</figr> shows that dinucleotide frequencies did not deviate more than 5% from the genomic frequencies across the <it>aroA</it><sub>I&#946;</sub>-<it>tyrA</it><sub>c </sub>gene block. This contrasts with the distinctly greater deviation of 3:1 dinucleotide frequencies within the low-GC gene block of <it>Xylella</it>, which is shown on the same scale as the <it>Anabaena </it>data.</p>
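            <p>The comparison described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure actually used for Figure 7: it assumes that a 3:1 dinucleotide is the pair formed by the third position of one codon and the first position of the next codon, that genomic frequencies are pooled over a supplied list of coding sequences, and that deviation is a signed difference in percentage points. The function names are hypothetical.</p>

```python
from collections import Counter
from itertools import product

# All 16 dinucleotides over the unambiguous DNA alphabet.
DINUCLEOTIDES = tuple("".join(pair) for pair in product("ACGT", repeat=2))


def three_one_counts(cds):
    """Count 3:1 dinucleotides in an in-frame coding sequence.

    Assumption: a 3:1 dinucleotide pairs codon position 3 with
    position 1 of the following codon (string indices 2, 5, 8, ...).
    """
    cds = cds.upper()
    pairs = (cds[i:i + 2] for i in range(2, len(cds) - 1, 3))
    # Pairs containing ambiguous bases (e.g. N) are ignored.
    return Counter(p for p in pairs if p in DINUCLEOTIDES)


def to_percent(counts):
    """Convert raw counts into percent frequencies over all 16 dinucleotides."""
    total = sum(counts.values())
    return {d: (100.0 * counts[d] / total if total else 0.0) for d in DINUCLEOTIDES}


def deviation_from_genome(gene_cds, genome_cds_list):
    """Signed deviation (percentage points) of one gene from pooled genomic values."""
    gene = to_percent(three_one_counts(gene_cds))
    pooled = Counter()
    for cds in genome_cds_list:
        pooled.update(three_one_counts(cds))
    genome = to_percent(pooled)
    return {d: gene[d] - genome[d] for d in DINUCLEOTIDES}
```

            <p>Under these assumptions, a gene whose values all lie within a few percentage points of zero would resemble the <it>Anabaena </it>gene block, whereas larger positive or negative values would resemble the low-GC gene block of <it>Xylella</it>.</p>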
            <fig id="F7">
               <title>
                  <p>Figure 7</p>
               </title>
               <caption>
                  <p>Three-to-one dinucleotide analysis</p>
               </caption>
               <text>
                  <p>Three-to-one dinucleotide analysis. <b>(a) </b>The <it>aroA</it><sub>I&#946;</sub>-<it>tyrA</it><sub>c </sub>gene block in <it>Anabaena</it>. Deviations from genomic frequencies are expressed as positive (upward-pointing bars) or negative (downward-pointing bars) percentages. <b>(b) </b>For comparison, the results obtained for the low-GC gene block of <it>X. fastidiosa </it>(of which Figure <figr fid="F4">4</figr> is a subset). The gene blocks of interest are highlighted in yellow, and the flanking genes are indicated by numbers.</p>
               </text>
               <graphic file="gb-2003-4-2-r14-7"/>
            </fig>
            <p>Codon usage was analyzed throughout the gene block and also failed to implicate LGT. Figure <figr fid="F8">8</figr> exemplifies this with a comparison of the pair of TrpAa domains in <it>Anabaena</it>, one encoded within the gene block and the other outside it. As <it>Xylella </it>also possesses a TrpAa pair, one encoded within the low-GC gene block and the other outside it, the analyses of these are also included in Figure <figr fid="F8">8</figr> as a kind of positive control. In <it>Anabaena </it>(two bars on the right of each panel) the codon usage for leucine, serine, arginine, glycine, valine and proline was very similar for the TrpAa domain of TrpAa&#8226;TrpAb in the <it>aroA</it><sub>I&#946;</sub>-<it>tyrA</it><sub>(p) </sub>gene block and for the stand-alone TrpAa protein. This contrasts with the results obtained for the two <it>Xylella </it>TrpAa proteins, one in the low-GC gene block (on the far left) and the other (second bar in each panel) encoded by the gene in the <it>trpAaAbBD </it>operon (Figure <figr fid="F6">6</figr>). Thus, in contrast to the <it>Anabaena </it>TrpAa pair, the <it>Xylella </it>TrpAa pair exhibited distinctly different codon usage. Although this result is certainly consistent with an explanation of LGT in <it>Xylella</it>, one cannot exclude the possibility that the different functional roles of TrpAa domains are associated with differing intra-genomic patterns of codon usage that are not yet well characterized [<abbr bid="B37">37</abbr>].</p>
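            <p>As an illustration of the kind of comparison summarized in Figure 8, the following Python sketch computes the percent usage of each synonymous codon for the six amino acids named above. It is a minimal sketch rather than the software used in this study: it assumes an in-frame coding sequence given as a plain DNA string and the standard genetic code, and the function name is hypothetical.</p>

```python
from collections import Counter

# Synonymous codon families (standard genetic code) for the six
# amino acids compared in the text.
CODON_FAMILIES = {
    "Leu": ("TTA", "TTG", "CTT", "CTC", "CTA", "CTG"),
    "Ser": ("TCT", "TCC", "TCA", "TCG", "AGT", "AGC"),
    "Arg": ("CGT", "CGC", "CGA", "CGG", "AGA", "AGG"),
    "Gly": ("GGT", "GGC", "GGA", "GGG"),
    "Val": ("GTT", "GTC", "GTA", "GTG"),
    "Pro": ("CCT", "CCC", "CCA", "CCG"),
}


def codon_usage_percent(cds):
    """Percent usage of each synonymous codon, per amino acid.

    Assumption: cds starts in frame; a trailing partial codon is ignored.
    """
    cds = cds.upper()
    codons = Counter(cds[i:i + 3] for i in range(0, len(cds) - 2, 3))
    usage = {}
    for amino_acid, family in CODON_FAMILIES.items():
        total = sum(codons[c] for c in family)
        usage[amino_acid] = {
            c: (100.0 * codons[c] / total if total else 0.0) for c in family
        }
    return usage
```

            <p>Applying such a function to the coding regions of two paralogous domains and comparing the resulting percentages codon by codon gives one bar per domain in each panel of a plot of this type.</p>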
            <fig id="F8">
               <title>
                  <p>Figure 8</p>
               </title>
               <caption>
                  <p>Codon usage for the pairs of TrpAa domains in the genomes of <it>Anabaena </it>sp. (Asp) and <it>Xylella fastidiosa </it>(Xfa)</p>
               </caption>
               <text>
                  <p>Codon usage for the pairs of TrpAa domains in the genomes of <it>Anabaena </it>sp. (Asp) and <it>Xylella fastidiosa </it>(Xfa). <b>(a) </b>Leucine; <b>(b) </b>serine; <b>(c) </b>arginine; <b>(d) </b>glycine; <b>(e) </b>valine and <b>(f) </b>proline. From left to right, Xfa TrpAa_1 is encoded from the low-GC gene block (In) and Xfa TrpAa_2 is encoded from outside (Out) the gene block; Asp TrpAa_1 is encoded from within the <it>aroA</it><sub>I&#946;</sub>-<it>tyrA</it><sub>(p) </sub>gene block (In) and Asp TrpAa_2 is encoded from outside (Out) the latter gene block. Synonymous codons are shown at the right of each amino acid set and color-coded to match the percent usage indicated by the bars.</p>
               </text>
               <graphic file="gb-2003-4-2-r14-8"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>Analysis of protein trees</p>
            </st>
            <p>We evaluated whether the closest BLAST hits, using as queries the amino-acid sequences corresponding to the operonic genes of <it>Anabaena </it>or <it>Nostoc</it>, would be with other cyanobacteria (and therefore consistent with origin by gene duplication) or with another taxon grouping (consistent with LGT). In either case, one would expect that the sequences encoded by the operonic genes of <it>Anabaena </it>would be the best matches for the operonic genes of <it>Nostoc</it>, as was indeed the case. For all of the operonic <it>Anabaena/Nostoc </it>Trp-pathway proteins used as queries, homolog sequences from other cyanobacteria (<it>Synechocystis, Synechococcus, Prochlorococcus</it>) were the remaining top hits returned in the BLAST queue. As BLAST hits must be considered imperfect indicators of nearest-neighbor homologs [<abbr bid="B38">38</abbr>], the conclusion that the operonic <it>trp</it>-pathway genes are of cyanobacterial lineage origin was confirmed more rigorously by examination of extensive trees (available upon request) constructed for each <it>trp </it>protein of <it>Anabaena </it>and <it>Nostoc</it>. For the Trp-pathway proteins, all the cyanobacterial proteins clustered together, regardless of whether they were <it>Anabaena </it>or <it>Nostoc </it>paralogs or whether they were the singly represented proteins of <it>Synechocystis, Synechococcus</it>, or <it>Prochlorococcus</it>. The same result was obtained for AroA<sub>I&#946; </sub>protein trees. All the redundant genes exhibited identity relationships that suggested their origin by one or more gene-duplication events in the common ancestor of <it>Anabaena </it>and <it>Nostoc</it>; that is, exactly as diagrammed in Figure <figr fid="F3">3</figr>.</p>
            <p>A different result was obtained for genes encoding AroB and TyrA. AroB sequences in nature are rather divergent. All of the cyanobacterial AroB proteins form a compact cluster in the AroB tree (including the non-operonic <it>Anabaena/Nostoc aroB </it>genes), except for those encoded by the <it>Anabaena/Nostoc </it>supraoperons. The supraoperonic AroB proteins occupy a tree position that is not particularly close to other AroB proteins (the closest matches being on the order of 30-35% identity with some enteric bacteria). A similar situation applies to TyrA<sub>(p)</sub>. All cyanobacteria possess the arogenate dehydrogenase specificity class (denoted TyrA<sub>a</sub>) of the TyrA superfamily. The additional TyrA<sub>(p) </sub>present only in <it>Anabaena </it>and <it>Nostoc </it>and located as the carboxy-terminal gene of the supraoperon exhibits identities of 39-43% with the TyrA<sub>(p) </sub>proteins of some enteric bacteria. These results for supraoperonic <it>aroB </it>and <it>tyrA</it><sub>(p) </sub>could be consistent with LGT, but with no clear donor candidates available. On the other hand, origin as ancient paralogs is also a possibility.</p>
         </sec>
         <sec>
            <st>
               <p>The <it>trpAa</it>&#8226;<it>trpAb </it>fusion</p>
            </st>
            <p>A particularly fortuitous gene that could favor or disfavor the hypothesis of LGT of the <it>aroA</it><sub>I&#946;</sub>-<it>tyrA</it><sub>(p) </sub>gene block in <it>Anabaena/Nostoc </it>is <it>trpAa</it>&#8226;<it>trpAb</it>, a fusion corresponding to two genes that are usually separate (free-standing). As only a limited number of <it>trpAa</it>&#8226;<it>trpAb </it>fusions are known, possible LGT donors can be evaluated. Organisms known to possess the <it>trpAa</it>&#8226;<it>trpAb </it>fusion are listed at the top of Table <tblr tid="T3">3</tblr>. Another small group of <it>trpAa</it>&#8226;<it>trpAb </it>fusions is known; these are dedicated to phenazine biosynthesis and form a distinct cluster. They are denoted <it>trpAa</it>&#8226;<it>trpAb_phz </it>in Table <tblr tid="T3">3</tblr>. Thus far, the <it>trpAa</it>&#8226;<it>trpAb_phz </it>fusions are limited to species of <it>Pseudomonas </it>and <it>Streptomyces</it>. The genes <it>pabAa </it>and <it>pabAb </it>are homologs of <it>trpAa </it>and <it>trpAb</it>, and the distribution of fusions involving these domains is also listed in Table <tblr tid="T3">3</tblr> to give a general sense of the frequencies of such gene fusions. A variety of data (G.X. and R.A.J., unpublished observation) indicates that equivalent fusions often arise independently of one another in widely spaced lineages.</p>
            <tbl id="T3">
               <title>
                  <p>Table 3</p>
               </title>
               <caption>
                  <p>Gene fusions involving <it>trpAa </it>and/or <it>trpAb </it>homologs</p>
               </caption>
               <tblbdy cols="2">
                  <r>
                     <c ca="left">
                        <p>
                           <b><it>trpAa</it>&#8226;<it>trpAb</it></b>
                        </p>
                     </c>
                     <c ca="left">
                        <p><it>Brucella melitensis</it>; <it>Sinorhizobium meliloti; Agrobacterium tumefaciens</it>; <it>Azospirillum brasilense; Nostoc punctiforme; Thermomonospora fusca; Rhodopseudomonas palustris; Rhizobium loti</it>; <it>Legionella pneumophila; Anabaena </it>sp._1; <it>Anabaena </it>sp._2</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p><b><it>trpAa</it>&#8226;<it>trpAb_phz</it></b>*</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Pseudomonas aureofaciens; Pseudomonas aeruginosa; Pseudomonas chlororaphis; Pseudomonas fluorescens; Streptomyces venezuelae; Streptomyces coelicolor</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b><it>trpAb</it>&#8226;<it>trpB</it></b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Escherichia coli; Salmonella typhi; Campylobacter jejuni; Thermotoga maritima</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b><it>pabAa</it>&#8226;<it>pabAb</it></b>
                           <sup>&#8224;</sup>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Deinococcus radiodurans</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b><it>pabAa</it>&#8226;<it>pabAc</it></b>
                           <sup>&#8224;</sup>
                        </p>
                     </c>
                     <c ca="left">
                        <p><it>Neisseria meningitidis; Neisseria gonorrhoeae; Chlorobium tepidum; Helicobacter pylori; Campylobacter jejuni; Streptococcus pneumoniae; Streptococcus pyogenes; Streptococcus equi; Streptococcus gordonii; Listeria innocua; Listeria monocytogenes; Geobacter sulfurreducens; Ralstonia solanacearum; Burkholderia fungorum; Novosphingobium aromaticivorans; Ralstonia metallidurans; Lactococcus lactis; Burkholderia pseudomallei; Magnetococcus </it>sp.</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b><it>pabAb</it>&#8226;<it>pabAa</it></b>
                           <sup>&#8224;</sup>
                        </p>
                     </c>
                     <c ca="left">
                        <p><it>Streptomyces griseus; Streptomyces venezuelae; Streptomyces pristinaespiralis; Thermomonospora fusca; Anabaena </it>sp.; <it>Nostoc punctiforme; Corynebacterium glutamicum; Saccharomyces cerevisiae; Aspergillus fumigatus; Plasmodium falciparum; Coprinus cinereus; Schizosaccharomyces pombe</it></p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>*Also known as <it>phzE</it>. <sup>&#8224;</sup><it>pabAa</it>, <it>pabAb </it>and <it>pabAc </it>are also known as <it>pabB</it>, <it>pabA </it>and <it>pabC</it>.</p>
               </tblfn>
            </tbl>
            <p>Figure <figr fid="F9">9</figr> shows a segment of the 16S rRNA tree that contains all of the <it>trpAa</it>&#8226;<it>trpAb </it>fusions known so far. Cyanobacteria other than <it>Anabaena/Nostoc </it>lack the fusion, as do all nearby lineages. The fusion is present in the cluster that includes <it>Rhodopseudomonas palustris, Rhizobium loti, Brucella melitensis, Agrobacterium tumefaciens </it>and <it>Sinorhizobium meliloti </it>(<it>A. tumefaciens</it>, which is not shown in Figure <figr fid="F9">9</figr>, is virtually identical to <it>S. meliloti</it>). Additional phylogenetically spaced fusions are present in <it>Thermomonospora fusca, Azospirillum brasilense</it>, and <it>Legionella pneumophila</it>. Other fusions that involve <it>trpAa </it>or <it>trpAb </it>homologs also occur in nature, as shown in Table <tblr tid="T3">3</tblr>, and a degree of care is needed to avoid confusion between them.</p>
            <fig id="F9">
               <title>
                  <p>Figure 9</p>
               </title>
               <caption>
                  <p>16S rRNA tree showing the phylogenetic distribution (highlighted in yellow) of <it>trpAa</it>&#8226;<it>trpAb </it>fusions</p>
               </caption>
               <text>
                  <p>16S rRNA tree showing the phylogenetic distribution (highlighted in yellow) of <it>trpAa</it>&#8226;<it>trpAb </it>fusions. The gene fusions unlinked to any other <it>trp </it>genes are shown to the right of the highlighted name. The remaining <it>trp</it>-operon gene organizations are shown at the right. The white arrows indicate gene insertions that encode the following: <it>Thermomonospora</it>, integral membrane protein; <it>Streptomyces</it>, three membrane proteins; <it>Corynebacterium</it>, membrane protein, pantoate-&#946;-alanine ligase (<it>panC</it>), and 3-methyl-2-oxobutanoate hydroxymethyltransferase (<it>panB</it>); <it>Mycobacterium</it>, conserved hypothetical protein; <it>Cytophaga</it>, conserved hypothetical protein; <it>Sphingomonas</it>, conserved hypothetical protein and outer-membrane protein; <it>Rhodobacter</it>, and acetyltransferase <it>yibQ</it>; <it>Ralstonia</it>, DNA methyltransferase (<it>dmt</it>); <it>Burkholderia</it>, DNA methyltransferase (<it>dmt</it>). In addition, <it>aroR </it>in <it>R. sphaeroides </it>is a putative regulatory gene [<abbr bid="B58">58</abbr>]. The lineage relationships of three organisms that have maintained the putative ancestral <it>trp </it>operon are shown with heavy, gray lines.</p>
               </text>
               <graphic file="gb-2003-4-2-r14-9"/>
            </fig>
            <p>A phylogenetic tree consisting of all free-standing TrpAa and TrpAb proteins was constructed, together with the corresponding two domains of the TrpAa&#8226;TrpAb fusions (available upon request). Surprisingly, each of the 10 fusion domains clustered tightly on the TrpAa and TrpAb trees, to the exclusion of the free-standing TrpAa and TrpAb domains. This is consistent with a single ancestral fusion event, but requires the assumption of multiple LGT events. However, it is surprising that no free-standing domains (that is, close homologs of the original fusion partners) cluster with either of the two sets of 10 fusion domains. This might suggest an alternative to LGT, namely that there has been extreme sequence convergence because of strong selection for appropriate residues mediating domain-domain interactions. If so, it is possible that <it>trpAa</it>&#8226;<it>trpAb </it>fusions occurred as a number of independent events, followed by strong convergence.</p>
            <p>Figure <figr fid="F9">9</figr> shows the individual genomic organization of <it>trp</it>-pathway genes in the 16S rRNA tree sector that is relevant to the <it>trpAa</it>&#8226;<it>trpAb </it>fusion. The <it>Anabaena/Nostoc </it>lineage is unique in having <it>trpAa</it>&#8226;<it>trpAb </it>linked to other <it>trp</it>-pathway genes and is further unique in having an additional set of free-standing genes encoding TrpAa and TrpAb. Although generally uncommon, complete dispersal of <it>trp</it>-pathway genes is characteristic of the non-filamentous cyanobacteria, of <it>Aquifex aeolicus </it>and of <it>Chlorobium tepidum</it>. The ancestral state of <it>trp </it>gene organization has been asserted (G.X., C.B., N.K. and R.J., unpublished work) to be <it>trpAa/Ab/B/D/C/Eb/Ea</it>, an operon organization seen in contemporary <it>Cytophaga hutchinsonii, Desulfovibrio vulgaris </it>and <it>Coxiella burnetii </it>(Figure <figr fid="F9">9</figr>). Dynamic gene reorganization events that involve gene insertions, gene scrambling, gene duplications and gene dispersal are apparent from inspection of Figure <figr fid="F9">9</figr>.</p>
            <p>It is expected that LGT would most easily be recognized if it occurred relatively recently, before the passage of sufficient time for amelioration of alien characteristics (for example, GC content) to those of the host genome. In the case of each of the known <it>trpAa</it>&#8226;<it>trpAb </it>gene fusions, the absence of the gene fusion in a closely related genome implies that the gene-fusion event (or the LGT event) occurred recently, that is, in the one lineage following the time of its separation from the other by speciation. Thus, the acquisition of <it>trpAa</it>&#8226;<it>trpAb </it>by <it>Thermomonospora fusca </it>must have occurred by fusion or by LGT relatively recently, that is, after the speciation event that generated the <it>Streptomyces </it>lineage (see Figure <figr fid="F9">9</figr>). In each of the remaining cases of <it>trpAa</it>&#8226;<it>trpAb </it>fusion, a relatively recent time of fusion origin can be identified (Figure <figr fid="F9">9</figr>). These are defined by points of speciation divergence between <it>Anabaena/Nostoc </it>and other cyanobacteria, between the <it>Rhodopseudomonas/Sinorhizobium </it>cluster (fusion) and <it>Caulobacter </it>(no fusion), between <it>Azospirillum brasilense </it>(fusion) and <it>Magnetospirillum magnetotacticum </it>(no fusion), and between <it>Legionella pneumophila </it>(fusion) and <it>Coxiella burnetii </it>(no fusion).</p>
            <p>If any of the <it>trpAa</it>&#8226;<it>trpAb </it>fusions, other than the <it>Nostoc/Anabaena </it>pair, have a common origin, similar flanking regions of gene organization might be expected since all of the fusions are of relatively recent origin. On this criterion, only <it>R. loti, B. melitensis, A. tumefaciens </it>and <it>S. meliloti </it>exhibited similarities of flanking-gene organization, and this is phylogenetically congruent. These observations imply that within the span of phylogeny shown in Figure <figr fid="F9">9</figr>, the <it>trpAa</it>&#8226;<it>trpAb </it>fusion may have occurred independently as many as seven times.</p>
         </sec>
         <sec>
            <st>
               <p>Interdomain linker regions</p>
            </st>
            <p>In fusion proteins an interdomain linker region of critical length and mobility is important to facilitate specific domain-domain interactions. Fusions of independent origin might be expected to exhibit a variety of linker regions. Particular constraints undoubtedly limit this variety, and such constraints might be more stringent for some domain combinations than others. (In the case of particularly stringent constraints, similar linker regions would not necessarily demonstrate a common origin). Figure <figr fid="F10">10</figr> shows an alignment of the carboxy-terminal region of the TrpAa domain, the linker region, and the amino-terminal region of the TrpAb domain for all of the fusion proteins depicted in Figure <figr fid="F9">9</figr> (as well as that from <it>A. tumefaciens</it>).</p>
            <fig id="F10">
               <title>
                  <p>Figure 10</p>
               </title>
               <caption>
                  <p>Comparison of TrpAa&#8226;TrpAb linker regions</p>
               </caption>
               <text>
                  <p>Comparison of TrpAa&#8226;TrpAb linker regions. The seven independent fusions that are suggested were aligned with free-standing TrpAa and TrpAb proteins in order to visualize the inter-domain linker regions. Amino-acid residue numbering is indicated at the left and right margins.</p>
               </text>
               <graphic file="gb-2003-4-2-r14-10"/>
            </fig>
            <p>Only the two operonic fusion proteins from <it>Anabaena </it>and <it>Nostoc </it>and the four rhizobial fusion proteins (Mlo, Bme, Rme and Atu) exhibit linker regions of identical length and obvious similarity. The paralog TrpAa&#8226;TrpAb protein of <it>Anabaena </it>sp. (Asp_2) seems to have a distinctly different linker, and it may be that the two fusions in <it>Anabaena </it>arose as two independent events. The partial sequences shown in Figure <figr fid="F10">10</figr> are spaced to indicate the seven independent events of gene fusion that are suggested.</p>
         </sec>
         <sec>
            <st>
               <p>Function of the <it>Anabaena/Nostoc </it>gene blocks?</p>
            </st>
            <p>The gene blocks shown in Figure <figr fid="F2">2</figr> encode the entire tryptophan pathway (except for <it>trpC</it>), as well as the first two enzymes of the common aromatic pathway, and the key enzyme of tyrosine biosynthesis. Multiple enzymes catalyzing the same reaction have been described in developmental systems, where differentially regulated isoenzymes are deployed in different temporal and spatial contexts. Filamentous cyanobacteria (such as <it>Anabaena </it>and <it>Nostoc</it>) subscribe to a developmental program of heterocyst formation that is widely considered the primitive state and that correlates with their exceedingly large genomes. Unicellular cyanobacteria such as <it>Synechocystis, Synechococcus </it>and <it>Prochlorococcus </it>have far smaller genomes and lack the ability to fix nitrogen (heterocyst formation). It therefore seems a distinct possibility that the gene blocks diagrammed in Figure <figr fid="F2">2</figr> (as well as additional gene duplicates) are specifically involved in specialized capabilities of <it>Nostoc/Anabaena </it>that do not exist in other cyanobacteria. In terms of the evolutionary scenario, the <it>Anabaena/Nostoc </it>lineage may reflect the ancestral state, and modern unicellular cyanobacteria may be derived genomes that are smaller and more streamlined (reductive evolution).</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Conclusions</p>
         </st>
         <sec>
            <st>
               <p>Operon displacement</p>
            </st>
            <p>Alien genes introduced by LGT can generally expect a hostile reception, in that they lack a history of functional integration with the resident genome. Genes that offer immediate selective advantages (for example, antibiotic resistance) are likely to persist. The acquisition of a completely new functional capability will often require an entire suite of novel genes, and such recruitment is certainly easier to envision if all of the genes arrive <it>en bloc </it>(that is, as an operon). Once a primary biosynthetic pathway, such as that responsible for tryptophan formation, has been established and integrated with the individualistic metabolic circuitry of a given organism, one does not expect facile displacement of resident genes. This should apply even if the incoming genes all coexist as an operon. We have found only two examples of LGT of whole <it>trp </it>operons, that of <it>trpAa/Ab/B/D</it>&#8226;<it>C/Eb/Ea </it>from the enteric lineage to coryneform bacteria and to <it>Helicobacter</it>, as discussed earlier.</p>
         </sec>
         <sec>
            <st>
               <p>Has there been separate lateral gene transfer of individual genes?</p>
            </st>
            <p>According to the foregoing rationale, isolated genes that participate in multi-step processes would not generally be expected to have much success in LGT. In some cases analog genes encode enzymes that catalyze the same reaction in a multi-step pathway, and one analog gene might conceivably displace another. Insufficient information about the genomic representation of such analog genes can lead to incorrect inferences of LGT. For example, the initial discovery of "plant-type" AroA<sub>II </sub>in bacteria led to the assumption of LGT from plant to bacterium. Elucidation of the fuller genomic representation of <it>aroA</it><sub><it>II </it></sub>([<abbr bid="B27">27</abbr>] and refs therein) demonstrated the origin of <it>aroA</it><sub><it>II </it></sub>in Bacteria, and plants have probably received <it>aroA</it><sub><it>II </it></sub>from the Bacteria via endosymbiosis. A similar outcome seems quite possible with respect to the "eukaryotic" fructose-1,6-bisphosphate aldolase in <it>Xylella </it>species. Phylogenetic incongruities that involve such analogs can make it very difficult to distinguish LGT from vertical descent accompanied by differential loss of analogs in different lineages.</p>
         </sec>
         <sec>
            <st>
               <p>Specialized <it>trp </it>genes not required for primary biosynthesis</p>
            </st>
            <p>In this article we focus on a number of cases where at least several <it>trp </it>genes are linked, which offers the analytical advantage of examining more than one gene. These genes are also redundant and phylogenetically incongruent, in contrast to coexisting homolog genes that are part of a full, phylogenetically congruent set. Redundancy and phylogenetic incongruence are both consistent with origin by LGT, but unrecognized ancient paralogy is also possible. In the first case, the homologs coexisting in one organism are xenologs, whereas in the second case, they are paralogs. A relatively simple example is the <it>trpAa/trpAb </it>pair originally denoted <it>phnA/phnB </it>in <it>Pseudomonas aeruginosa </it>[<abbr bid="B39">39</abbr>]. This pair encodes an anthranilate synthase that is not strictly required for primary tryptophan biosynthesis and that is uniquely expressed during stationary-phase physiology [<abbr bid="B40">40</abbr>]. Why the generation of anthranilate under these conditions would be of value is unknown, but phylogenetic trees clearly show <it>phnA/phnB </it>to be xenologs originating from the enteric lineage via LGT (G.X. and R.A.J., unpublished data). In this case, genes that function for primary biosynthesis in the donor genome did not displace the corresponding genes in the recipient genome, but have instead been recruited to a specialized function. In <it>Streptomyces coelicolor, trpAa/trpAb/trpB/trpD/aroA</it><sub>II </sub>are contained within a large cluster dedicated to antibiotic synthesis [<abbr bid="B41">41</abbr>]. Calcium-dependent antibiotic (CDA) contains tryptophan, and presumably the feedback-resistant variety of enzyme encoded by <it>aroA</it><sub>II </sub>ensures enhanced precursor flow to tryptophan during antibiotic production. Detailed studies have not yet been done to see whether the CDA gene cluster originated via LGT or reflects ancient paralogy.</p>
            <p>In this article, we have discussed at length the <it>Xylella </it>and cyanobacterial gene blocks that seem likely to have specialized functional roles other than primary biosynthesis. The <it>Xylella </it>genes are associated with other genes that presumably dictate a fate for anthranilate other than as a primary precursor of tryptophan. We suspect that selective advantages conferred by this specialized operon accommodated successful LGT to <it>Xylella</it>. The <it>Anabaena/Nostoc </it>supraoperon is reminiscent of the <it>S. coelicolor </it>system in the inclusion of AroA<sub>I&#946;</sub>, which might enhance precursor flow to chorismate. Although the <it>Anabaena/Nostoc </it>operon lacks only <it>trpC</it>, its features of gene fusion and gene organization are novel. It may have an unknown physiological function related to the complex developmental programs unique to heterocystous cyanobacteria. We conclude that in this case the operonic <it>trp </it>genes are ancient paralogs of a dispersed set of <it>trp </it>genes engaged in primary biosynthesis.</p>
            <p>Against a backdrop where organisms generally possess highly efficient and integrated pathways of tryptophan biosynthesis, displacement of resident genes by LGT of the corresponding genes is relatively infrequent. Aside from the broadly distributed primary pathway, highly specialized pathways are known that utilize some or all tryptophan-pathway enzymes, and these pathways can originate by recruitment of paralog genes derived from the primary-pathway genes [<abbr bid="B42">42</abbr>]. The genes of such specialized operons may diverge considerably to meet the demands of a novel functional role. In a contemporary organism this might have the status of unrecognized (or recognized) paralogy, as we suggest for the <it>Anabaena/Nostoc </it>gene block. However, such an operon module also has strong potential for xenologous transfer because of its specialized functional potential.</p>
            <p>The tryptophan pathway exemplifies the situation where paralogs can be engaged in primary amino-acid biosynthesis (widespread) or in a variety of specialized pathways (narrowly distributed). Aside from the extent to which the specialized pathways may be individually intriguing and important, this study illustrates that case-by-case analysis can distinguish paralogs (or xenologs) from their homologs engaged in primary biosynthesis. This conclusion is encouraging as it shows that both vertical and horizontal events of gene transfer can be deduced to track evolutionary history.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Materials and methods</p>
         </st>
         <sec>
            <st>
               <p>Dinucleotide frequencies</p>
            </st>
            <p>The CODONW program [<abbr bid="B43">43</abbr>] was used to calculate 3:1 dinucleotide frequencies (third base of a given codon followed by the first base of the next codon). For whole-genome calculations, genome nucleotide sequences (.ffn file) were obtained from GenBank [<abbr bid="B44">44</abbr>]. Perl scripts were used to eliminate the defline and assemble all genomic ORFs together for CODONW calculation. Divisibility of the total sequence length (obtained with the UNIX wc command) by 3 was used as a check for frameshift errors. Pairwise covariation of 3:1 dinucleotide frequencies was assessed by the Spearman rank correlation coefficient [<abbr bid="B45">45</abbr>], a nonparametric rank statistic for testing monotonic relationships. T<sup>2 </sup>values were kindly provided by Hooper [<abbr bid="B31">31</abbr>].</p>
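            <p>For illustration only, the 3:1 dinucleotide tally, the divisibility-by-3 check and the Spearman rank comparison described above can be sketched in Python. This is not the CODONW implementation; the function names and the toy ORF sequences are assumptions introduced for the sketch, and tied values are assigned average ranks.</p>

```python
from collections import Counter
from itertools import product

# All 16 dinucleotides over the unambiguous bases.
DINUCS = ["".join(p) for p in product("ACGT", repeat=2)]
VALID = set(DINUCS)


def dinuc31_frequencies(orfs):
    """Relative frequencies of 3:1 dinucleotides (third base of a codon
    followed by the first base of the next codon), pooled over ORFs."""
    counts = Counter()
    for orf in orfs:
        seq = orf.upper()
        if len(seq) % 3 != 0:
            raise ValueError("ORF length is not a multiple of 3 (possible frameshift)")
        # Index 2 is the third base of the first codon; advance one codon per step.
        for i in range(2, len(seq) - 1, 3):
            pair = seq[i:i + 2]
            if pair in VALID:  # skip pairs containing ambiguous bases
                counts[pair] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no 3:1 dinucleotides found")
    return {d: counts[d] / total for d in DINUCS}


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start != len(order):
        end = start
        while end + 1 != len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in order[start:end + 1]:
            ranks[k] = mean_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation, computed as the Pearson correlation of ranks."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


# Toy example: two hypothetical ORF sets compared across the 16 dinucleotides.
set_a = ["ATGGCCAAAGGTTGA", "ATGCCCGGGACTTAA"]
set_b = ["ATGGCAAAGGGATGA", "ATGCCAGGTACGTAA"]
fa, fb = dinuc31_frequencies(set_a), dinuc31_frequencies(set_b)
rho = spearman([fa[d] for d in DINUCS], [fb[d] for d in DINUCS])
```

            <p>In this sketch the frequency vectors of two gene sets are compared across the same 16 dinucleotides, so a coefficient near 1 indicates similar 3:1 dinucleotide preferences.</p>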
         </sec>
         <sec>
            <st>
               <p>Codon usage</p>
            </st>
            <p>Codon usage for individual genes was computed with the CDONTREE program [<abbr bid="B46">46</abbr>]. Codon-usage values for whole genomes were obtained from the Codon Usage Database [<abbr bid="B47">47</abbr>,<abbr bid="B48">48</abbr>].</p>
         </sec>
         <sec>
            <st>
               <p>Phylogenetic trees</p>
            </st>
            <p>16S rRNA subtrees were derived from the Ribosomal Database site [<abbr bid="B49">49</abbr>,<abbr bid="B50">50</abbr>]. Unrooted phylogenetic protein trees were derived by input of the indicated homolog amino-acid sequences into the ClustalW program (Version 1.4) [<abbr bid="B51">51</abbr>]. Manual alignment adjustments were made as needed with the assistance of the BioEdit multiple alignment tool of Hall [<abbr bid="B52">52</abbr>]. The refined multiple alignment was used as input for generation of a phylogenetic tree using the program package PHYLIP [<abbr bid="B53">53</abbr>]. The neighbor-joining and Fitch programs [<abbr bid="B51">51</abbr>] were used to obtain distance-based trees. The distance matrix was obtained using Protdist with a Dayhoff PAM matrix. The Seqboot and Consense programs were then used to assess the statistical strength of the tree using bootstrap resampling. Neighbor-joining and Fitch trees yielded similar clusters and similar arrangements of taxa within them. Bootstrap values indicate the number of times a node was supported in 1,000 resampling replications.</p>
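            <p>The resampling idea behind the bootstrap step can be sketched as follows. This is a minimal Python illustration of column resampling with replacement, not the Seqboot implementation; the function names, the fixed seed and the toy alignment are assumptions made for the example. Distance calculation, tree building and consensus counting are not shown.</p>

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one pseudo-replicate: alignment columns drawn with replacement.

    alignment maps a taxon name to its aligned sequence; all sequences
    must have the same length.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_cols = lengths.pop()
    # The same sampled column indices are applied to every taxon,
    # so each resampled column is an intact column of the original.
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}


def bootstrap_replicates(alignment, n_replicates=1000, seed=0):
    """Generate n_replicates resampled alignments (reproducible via seed)."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]


# Toy alignment with hypothetical taxon names.
toy = {"taxA": "MKVLA", "taxB": "MKILA", "taxC": "MRVLG"}
replicates = bootstrap_replicates(toy, n_replicates=1000, seed=1)
```

            <p>Each replicate would then be passed through the same distance and tree-building steps as the original alignment, and the support for a node is the number of replicate trees in which it recurs.</p>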
         </sec>
         <sec>
            <st>
               <p>Identification of linker regions</p>
            </st>
            <p>Fusion proteins were aligned (ClustalW) with one another and with the assemblage of free-standing proteins corresponding to the amino-terminal and the carboxy-terminal domains of the fusion proteins. The boundaries of each domain were defined by the last highly conserved residues of the amino-terminal domain and the early highly conserved residues of the carboxy-terminal domain. The Conserved Domain Database was useful as a reference guide [<abbr bid="B54">54</abbr>,<abbr bid="B55">55</abbr>].</p>
         </sec>
         <sec>
            <st>
               <p>Comparative genome analysis</p>
            </st>
            <p>Most of the comparative genome analysis was carried out using the database and tools of ERGO [<abbr bid="B56">56</abbr>].</p>
         </sec>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>G.X. was partially supported in this work through the STDGEN project at Los Alamos National Laboratory (NIH/NIAIDAGY1-A1-8228-05). We thank Sean Hooper (Department of Molecular Evolution, Uppsala University, Sweden) for assistance with dinucleotide frequency calculations. We are indebted to A. Osterman of Integrated Genomics, Inc. (Chicago, IL) for provision of access to ERGO [<abbr bid="B56">56</abbr>]. This is Florida Agricultural Experiment Station Journal series no. R-09159.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <aug>
               <au>
                  <snm>Margulis</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>Symbiosis in Cell Evolution</source>
            <publisher>San Francisco: WH Freeman</publisher>
            <pubdate>1981</pubdate>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Evolution of organellar genomes.</p>
            </title>
            <aug>
               <au>
                  <snm>Gray</snm>
                  <fnm>MW</fnm>
               </au>
            </aug>
            <source>Curr Opin Genet Dev</source>
            <pubdate>1999</pubdate>
            <volume>9</volume>
            <fpage>678</fpage>
            <lpage>687</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0959-437X(99)00030-1</pubid>
                  <pubid idtype="pmpid" link="fulltext">10607615</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Evidence for massive gene exchange between archaeal and bacterial hyperthermophiles.</p>
            </title>
            <aug>
               <au>
                  <snm>Aravind</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Tatusov</snm>
                  <fnm>RL</fnm>
               </au>
               <au>
                  <snm>Wolf</snm>
                  <fnm>YI</fnm>
               </au>
               <au>
                  <snm>Walker</snm>
                  <fnm>DR</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>1998</pubdate>
            <volume>14</volume>
            <fpage>442</fpage>
            <lpage>444</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0168-9525(98)01553-4</pubid>
                  <pubid idtype="pmpid" link="fulltext">9825671</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Molecular archaeology of the <it>Escherichia coli </it>genome.</p>
            </title>
            <aug>
               <au>
                  <snm>Lawrence</snm>
                  <fnm>JG</fnm>
               </au>
               <au>
                  <snm>Ochman</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1998</pubdate>
            <volume>95</volume>
            <fpage>9413</fpage>
            <lpage>9417</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">21352</pubid>
                  <pubid idtype="pmpid" link="fulltext">9689094</pubid>
                  <pubid idtype="doi">10.1073/pnas.95.16.9413</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Evidence for lateral gene transfer between archaea and bacteria from genome sequence of <it>Thermotoga maritima</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Nelson</snm>
                  <fnm>KE</fnm>
               </au>
               <au>
                  <snm>Clayton</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Gill</snm>
                  <fnm>SR</fnm>
               </au>
               <au>
                  <snm>Gwinn</snm>
                  <fnm>ML</fnm>
               </au>
               <au>
                  <snm>Dodson</snm>
                  <fnm>RJ</fnm>
               </au>
               <au>
                  <snm>Haft</snm>
                  <fnm>DH</fnm>
               </au>
               <au>
                  <snm>Hickey</snm>
                  <fnm>EK</fnm>
               </au>
               <au>
                  <snm>Peterson</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Nelson</snm>
                  <fnm>WC</fnm>
               </au>
               <au>
                  <snm>Ketchum</snm>
                  <fnm>KA</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nature</source>
            <pubdate>1999</pubdate>
            <volume>399</volume>
            <fpage>323</fpage>
            <lpage>329</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/20601</pubid>
                  <pubid idtype="pmpid" link="fulltext">10360571</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Lateral gene transfer and the nature of bacterial innovation.</p>
            </title>
            <aug>
               <au>
                  <snm>Ochman</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Lawrence</snm>
                  <fnm>JG</fnm>
               </au>
               <au>
                  <snm>Groisman</snm>
                  <fnm>EA</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2000</pubdate>
            <volume>405</volume>
            <fpage>299</fpage>
            <lpage>304</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/35012500</pubid>
                  <pubid idtype="pmpid" link="fulltext">10830951</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Phylogenetic classification and the universal tree.</p>
            </title>
            <aug>
               <au>
                  <snm>Doolittle</snm>
                  <fnm>WF</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1999</pubdate>
            <volume>284</volume>
            <fpage>2124</fpage>
            <lpage>2129</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.284.5423.2124</pubid>
                  <pubid idtype="pmpid" link="fulltext">10381871</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Mosaic bacterial chromosomes: a challenge <it>en route </it>to a tree of genomes.</p>
            </title>
            <aug>
               <au>
                  <snm>Martin</snm>
                  <fnm>W</fnm>
               </au>
            </aug>
            <source>BioEssays</source>
            <pubdate>1999</pubdate>
            <volume>21</volume>
            <fpage>99</fpage>
            <lpage>104</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/(SICI)1521-1878(199902)21:2&lt;99::AID-BIES3>3.3.CO;2-2</pubid>
                  <pubid idtype="pmpid" link="fulltext">10193183</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>On the occurrence of horizontal gene transfer among an arbitrarily chosen group of 26 genes.</p>
            </title>
            <aug>
               <au>
                  <snm>Syvanen</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>J Mol Evol</source>
            <pubdate>2002</pubdate>
            <volume>54</volume>
            <fpage>258</fpage>
            <lpage>266</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1007/s0023901-0007-z</pubid>
                  <pubid idtype="pmpid">11821918</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Archaeal and bacterial hyperthermophiles: horizontal gene exchange or common ancestry?</p>
            </title>
            <aug>
               <au>
                  <snm>Kyrpides</snm>
                  <fnm>NC</fnm>
               </au>
               <au>
                  <snm>Olsen</snm>
                  <fnm>GJ</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>1999</pubdate>
            <volume>15</volume>
            <fpage>298</fpage>
            <lpage>299</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0168-9525(99)01811-9</pubid>
                  <pubid idtype="pmpid" link="fulltext">10431189</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Lateral gene transfer, genome surveys, and the phylogeny of prokaryotes.</p>
            </title>
            <aug>
               <au>
                  <snm>Stiller</snm>
                  <fnm>JW</fnm>
               </au>
               <au>
                  <snm>Hall</snm>
                  <fnm>BD</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1999</pubdate>
            <volume>286</volume>
            <fpage>1443</fpage>
            <xrefbib>
               <pubid idtype="doi">10.1126/science.286.5444.1443a</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Something for everyone: Horizontal gene transfer in evolution.</p>
            </title>
            <aug>
               <au>
                  <snm>Kurland</snm>
                  <fnm>CG</fnm>
               </au>
            </aug>
            <source>EMBO Rep</source>
            <pubdate>2000</pubdate>
            <volume>1</volume>
            <fpage>92</fpage>
            <lpage>95</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/embo-reports/kvd042</pubid>
                  <pubid idtype="pmpid">11265763</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Microbial genes in the human genome: lateral transfer or gene loss?</p>
            </title>
            <aug>
               <au>
                  <snm>Salzberg</snm>
                  <fnm>SL</fnm>
               </au>
               <au>
                  <snm>White</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Peterson</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Eisen</snm>
                  <fnm>JA</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2001</pubdate>
            <volume>292</volume>
            <fpage>1903</fpage>
            <lpage>1906</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1061036</pubid>
                  <pubid idtype="pmpid" link="fulltext">11358996</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Phylogenetic analyses do not support horizontal gene transfers from bacteria to vertebrates.</p>
            </title>
            <aug>
               <au>
                  <snm>Stanhope</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Lupas</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Italia</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Koretke</snm>
                  <fnm>KK</fnm>
               </au>
               <au>
                  <snm>Volker</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>JR</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2001</pubdate>
            <volume>411</volume>
            <fpage>940</fpage>
            <lpage>944</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/35082058</pubid>
                  <pubid idtype="pmpid" link="fulltext">11418856</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>About the last common ancestor, the universal life-tree and lateral gene transfer: a reappraisal.</p>
            </title>
            <aug>
               <au>
                  <snm>Glansdorff</snm>
                  <fnm>N</fnm>
               </au>
            </aug>
            <source>Mol Microbiol</source>
            <pubdate>2000</pubdate>
            <volume>38</volume>
            <fpage>177</fpage>
            <lpage>185</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1046/j.1365-2958.2000.02126.x</pubid>
                  <pubid idtype="pmpid" link="fulltext">11069646</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Interpreting the universal phylogenetic tree.</p>
            </title>
            <aug>
               <au>
                  <snm>Woese</snm>
                  <fnm>CR</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2000</pubdate>
            <volume>97</volume>
            <fpage>8392</fpage>
            <lpage>8396</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">26958</pubid>
                  <pubid idtype="pmpid" link="fulltext">10900003</pubid>
                  <pubid idtype="doi">10.1073/pnas.97.15.8392</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Molecular archaeology of the <it>Escherichia coli </it>genome.</p>
            </title>
            <aug>
               <au>
                  <snm>Lawrence</snm>
                  <fnm>JG</fnm>
               </au>
               <au>
                  <snm>Ochman</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1998</pubdate>
            <volume>95</volume>
            <fpage>9413</fpage>
            <lpage>9417</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">21352</pubid>
                  <pubid idtype="pmpid" link="fulltext">9689094</pubid>
                  <pubid idtype="doi">10.1073/pnas.95.16.9413</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Limitations of compositional approach to identifying horizontally transferred genes.</p>
            </title>
            <aug>
               <au>
                  <snm>Wang</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>J Mol Evol</source>
            <pubdate>2001</pubdate>
            <volume>53</volume>
            <fpage>244</fpage>
            <lpage>250</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1007/s002390010214</pubid>
                  <pubid idtype="pmpid" link="fulltext">11523011</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>Codon bias and base composition are poor indicators of horizontally transferred genes.</p>
            </title>
            <aug>
               <au>
                  <snm>Koski</snm>
                  <fnm>LB</fnm>
               </au>
               <au>
                  <snm>Morton</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Golding</snm>
                  <fnm>GB</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>2001</pubdate>
            <volume>18</volume>
            <fpage>404</fpage>
            <lpage>412</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11230541</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>On surrogate methods for detecting lateral gene transfer.</p>
            </title>
            <aug>
               <au>
                  <snm>Ragan</snm>
                  <fnm>MA</fnm>
               </au>
            </aug>
            <source>FEMS Microbiol Lett</source>
            <pubdate>2001</pubdate>
            <volume>201</volume>
            <fpage>187</fpage>
            <lpage>191</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0378-1097(01)00262-2</pubid>
                  <pubid idtype="pmpid" link="fulltext">11470360</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Reconciling the many facets of lateral gene transfer.</p>
            </title>
            <aug>
               <au>
                  <snm>Lawrence</snm>
                  <fnm>JG</fnm>
               </au>
               <au>
                  <snm>Ochman</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Trends Microbiol</source>
            <pubdate>2002</pubdate>
            <volume>10</volume>
            <fpage>1</fpage>
            <lpage>4</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0966-842X(01)02282-X</pubid>
                  <pubid idtype="pmpid" link="fulltext">11755071</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>Detection of lateral gene transfer among microbial genomes.</p>
            </title>
            <aug>
               <au>
                  <snm>Ragan</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Curr Opin Genet Dev</source>
            <pubdate>2001</pubdate>
            <volume>11</volume>
            <fpage>620</fpage>
            <lpage>626</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0959-437X(00)00244-6</pubid>
                  <pubid idtype="pmpid" link="fulltext">11682304</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>Dynamic diversity of the tryptophan pathway in the chlamydiae: reductive evolution and a novel operon for tryptophan recapture.</p>
            </title>
            <aug>
               <au>
                  <snm>Xie</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Bonner</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Jensen</snm>
                  <fnm>RA</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2002</pubdate>
            <volume>3</volume>
            <fpage>research0005.1</fpage>
            <lpage>0005.13</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">150452</pubid>
                  <pubid idtype="pmpid" link="fulltext">11806828</pubid>
                  <pubid idtype="doi">10.1186/gb-2002-3-9-research0051</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>Biosynthesis of aromatic amino acids.</p>
            </title>
            <aug>
               <au>
                  <snm>Henner</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Yanofsky</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>In Bacillus subtilis and other Gram-positive Bacteria: Biochemistry, Physiology, and Molecular Genetics</source>
            <publisher>Washington, DC: ASM Press</publisher>
            <editor>Sonenshein AL, Hoch J, Losick R</editor>
            <pubdate>1993</pubdate>
            <fpage>269</fpage>
            <lpage>280</lpage>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Substrate ambiguity of 3-deoxy-<it>D-manno</it>-octulosonate 8-phosphate synthase from <it>Neisseria gonorrhoeae </it>in the context of its membership in a protein family containing a subset of 3-deoxy-<it>D-arabino</it>-heptulosonate 7-phosphate synthases.</p>
            </title>
            <aug>
               <au>
                  <snm>Subramaniam</snm>
                  <fnm>PS</fnm>
               </au>
               <au>
                  <snm>Xie</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Xia</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Jensen</snm>
                  <fnm>RA</fnm>
               </au>
            </aug>
            <source>J Bacteriol</source>
            <pubdate>1998</pubdate>
            <volume>180</volume>
            <fpage>119</fpage>
            <lpage>127</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">106857</pubid>
                  <pubid idtype="pmpid" link="fulltext">9422601</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>The correct phylogenetic relationship of KdsA (3-deoxy-<it>D-manno</it>-octulosonate 8-phosphate synthase) with one of two independently evolved classes of AroA (3-deoxy-<it>D-arabino</it>-heptulosonate 7-phosphate synthase).</p>
            </title>
            <aug>
               <au>
                  <snm>Jensen</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Xie</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Bonner</snm>
                  <fnm>CA</fnm>
               </au>
            </aug>
            <source>J Mol Evol</source>
            <pubdate>2002</pubdate>
            <volume>54</volume>
            <fpage>416</fpage>
            <lpage>423</lpage>
            <xrefbib>
               <pubid idtype="pmpid">11847568</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>Microbial origin of plant-type 2-keto-3-deoxy-<it>D-arabino</it>-heptulosonate 7-phosphate synthases, exemplified by the chorismate- and tryptophan-regulated enzyme from <it>Xanthomonas campestris</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Gosset</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Bonner</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Jensen</snm>
                  <fnm>RA</fnm>
               </au>
            </aug>
            <source>J Bacteriol</source>
            <pubdate>2001</pubdate>
            <volume>183</volume>
            <fpage>4061</fpage>
            <lpage>4070</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">95290</pubid>
                  <pubid idtype="pmpid" link="fulltext">11395471</pubid>
                  <pubid idtype="doi">10.1128/JB.183.13.4061-4070.2001</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>Cyclohexadienyl dehydrogenase from <it>Pseudomonas stutzeri </it>exemplifies a widespread type of tyrosine-pathway dehydrogenase in the TyrA protein family.</p>
            </title>
            <aug>
               <au>
                  <snm>Xie</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Bonner</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Jensen</snm>
                  <fnm>RA</fnm>
               </au>
            </aug>
            <source>Comp Biochem Physiol C Toxicol Pharmacol</source>
            <pubdate>2000</pubdate>
            <volume>125</volume>
            <fpage>65</fpage>
            <lpage>83</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0742-8413(99)00090-0</pubid>
                  <pubid idtype="pmpid">11790331</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>Homology: a personal view on some of the problems.</p>
            </title>
            <aug>
               <au>
                  <snm>Fitch</snm>
                  <fnm>WM</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>2000</pubdate>
            <volume>16</volume>
            <fpage>227</fpage>
            <lpage>231</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0168-9525(00)02005-9</pubid>
                  <pubid idtype="pmpid" link="fulltext">10782117</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>On interspecies gene transfer: the case of the <it>argF </it>gene of <it>Escherichia coli</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Van Vliet</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Boyen</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Glansdorff</snm>
                  <fnm>N</fnm>
               </au>
            </aug>
            <source>Ann Inst Pasteur Microbiol</source>
            <pubdate>1988</pubdate>
            <volume>139</volume>
            <fpage>493</fpage>
            <lpage>496</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0769-2609(88)90111-1</pubid>
                  <pubid idtype="pmpid">3052532</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>Detection of genes with atypical nucleotide sequence in microbial genomes.</p>
            </title>
            <aug>
               <au>
                  <snm>Hooper</snm>
                  <fnm>SD</fnm>
               </au>
               <au>
                  <snm>Berg</snm>
                  <fnm>OG</fnm>
               </au>
            </aug>
            <source>J Mol Evol</source>
            <pubdate>2002</pubdate>
            <volume>54</volume>
            <fpage>365</fpage>
            <lpage>375</lpage>
            <xrefbib>
               <pubid idtype="pmpid">11847562</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>Reinvestigation of a new type of aerobic benzoate metabolism in the proteobacterium <it>Azoarcus evansii</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Mohamed</snm>
                  <fnm>MES</fnm>
               </au>
               <au>
                  <snm>Zaar</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Ebenau-Jehle</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Fuchs</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>J Bacteriol</source>
            <pubdate>2001</pubdate>
            <volume>183</volume>
            <fpage>1899</fpage>
            <lpage>1908</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">95084</pubid>
                  <pubid idtype="pmpid" link="fulltext">11222587</pubid>
                  <pubid idtype="doi">10.1128/JB.183.6.1899-1908.2001</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B33">
            <title>
               <p>Two similar gene clusters coding for enzymes of a new type of aerobic 2-aminobenzoate (anthranilate) metabolism in the bacterium <it>Azoarcus evansii</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Sch&#252;hle</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Jahn</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Ghisla</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Fuchs</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>J Bacteriol</source>
            <pubdate>2001</pubdate>
            <volume>183</volume>
            <fpage>5268</fpage>
            <lpage>5278</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">95408</pubid>
                  <pubid idtype="pmpid" link="fulltext">11514509</pubid>
                  <pubid idtype="doi">10.1128/JB.183.18.5268-5278.2001</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B34">
            <title>
               <p>The emerging periplasm-localized subclass of AroQ chorismate mutases, exemplified by those from <it>Salmonella typhimurium </it>and <it>Pseudomonas aeruginosa</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Calhoun</snm>
                  <fnm>DH</fnm>
               </au>
               <au>
                  <snm>Bonner</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Gu</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Xie</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Jensen</snm>
                  <fnm>RA</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2001</pubdate>
            <volume>2</volume>
            <fpage>research0030.1</fpage>
            <lpage>0030.16</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">55327</pubid>
                  <pubid idtype="pmpid" link="fulltext">11532214</pubid>
                  <pubid idtype="doi">10.1186/gb-2001-2-8-research0030</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B35">
            <title>
               <p>Assembly of the <it>Pseudomonas aeruginosa </it>nonribosomal peptide siderophore pyochelin: <it>In vitro </it>reconstitution of aryl-4,2-bisthiazoline synthetase activity from PchD, PchE, and PchF.</p>
            </title>
            <aug>
               <au>
                  <snm>Quadri</snm>
                  <fnm>LEN</fnm>
               </au>
               <au>
                  <snm>Keating</snm>
                  <fnm>TA</fnm>
               </au>
               <au>
                  <snm>Patel</snm>
                  <fnm>HM</fnm>
               </au>
               <au>
                  <snm>Walsh</snm>
                  <fnm>CT</fnm>
               </au>
            </aug>
            <source>Biochemistry</source>
            <pubdate>1999</pubdate>
            <volume>38</volume>
            <fpage>14941</fpage>
            <lpage>14954</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1021/bi991787c</pubid>
                  <pubid idtype="pmpid" link="fulltext">10555976</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B36">
            <title>
               <p><it>In vitro </it>reconstitution of the <it>Pseudomonas aeruginosa </it>nonribosomal peptide synthesis of pyochelin: characterization of backbone tailoring thiazoline reductase and <it>N</it>-methyltransferase activities.</p>
            </title>
            <aug>
               <au>
                  <snm>Patel</snm>
                  <fnm>HM</fnm>
               </au>
               <au>
                  <snm>Walsh</snm>
                  <fnm>CT</fnm>
               </au>
            </aug>
            <source>Biochemistry</source>
            <pubdate>2001</pubdate>
            <volume>40</volume>
            <fpage>9023</fpage>
            <lpage>9031</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1021/bi010519n</pubid>
                  <pubid idtype="pmpid" link="fulltext">11467965</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B37">
            <title>
               <p>Intragenomic base content variation is a potential source of biases when searching for horizontally transferred genes.</p>
            </title>
            <aug>
               <au>
                  <snm>Guindon</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Perri&#232;re</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>2001</pubdate>
            <volume>18</volume>
            <fpage>1838</fpage>
            <lpage>1840</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11504864</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B38">
            <title>
               <p>The closest BLAST hit is often not the nearest neighbor.</p>
            </title>
            <aug>
               <au>
                  <snm>Koski</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Golding</snm>
                  <fnm>GB</fnm>
               </au>
            </aug>
            <source>J Mol Evol</source>
            <pubdate>2001</pubdate>
            <volume>52</volume>
            <fpage>540</fpage>
            <lpage>542</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11443357</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B39">
            <title>
               <p>Identification and characterization of genes for a second anthranilate synthase in <it>Pseudomonas aeruginosa</it>: interchangeability of the two anthranilate synthases and evolutionary implications.</p>
            </title>
            <aug>
               <au>
                  <snm>Essar</snm>
                  <fnm>DW</fnm>
               </au>
               <au>
                  <snm>Eberly</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Hadero</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Crawford</snm>
                  <fnm>IP</fnm>
               </au>
            </aug>
            <source>J Bacteriol</source>
            <pubdate>1990</pubdate>
            <volume>172</volume>
            <fpage>884</fpage>
            <lpage>900</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">208517</pubid>
                  <pubid idtype="pmpid">2153661</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B40">
            <title>
               <p>Functional analysis of genes for biosynthesis of pyocyanin and phenazine-1-carboxamide from <it>Pseudomonas aeruginosa </it>PAO1.</p>
            </title>
            <aug>
               <au>
                  <snm>Mavrodi</snm>
                  <fnm>DV</fnm>
               </au>
               <au>
                  <snm>Bonsall</snm>
                  <fnm>RF</fnm>
               </au>
               <au>
                  <snm>Delaney</snm>
                  <fnm>SM</fnm>
               </au>
               <au>
                  <snm>Soule</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Phillips</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Thomashow</snm>
                  <fnm>LS</fnm>
               </au>
            </aug>
            <source>J Bacteriol</source>
            <pubdate>2001</pubdate>
            <volume>183</volume>
            <fpage>6454</fpage>
            <lpage>6465</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">100142</pubid>
                  <pubid idtype="pmpid" link="fulltext">11591691</pubid>
                  <pubid idtype="doi">10.1128/JB.183.21.6454-6465.2001</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B41">
            <title>
               <p>Regulation of the <it>Streptomyces coelicolor </it>calcium-dependent antibiotic by <it>absA</it>, encoding a cluster-linked two-component system.</p>
            </title>
            <aug>
               <au>
                  <snm>Ryding</snm>
                  <fnm>JJ</fnm>
               </au>
               <au>
                  <snm>Anderson</snm>
                  <fnm>TB</fnm>
               </au>
               <au>
                  <snm>Champness</snm>
                  <fnm>WC</fnm>
               </au>
            </aug>
            <source>J Bacteriol</source>
            <pubdate>2002</pubdate>
            <volume>184</volume>
            <fpage>794</fpage>
            <lpage>805</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">139508</pubid>
                  <pubid idtype="pmpid" link="fulltext">11790750</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B42">
            <title>
               <p>Enzyme recruitment in the evolution of new function.</p>
            </title>
            <aug>
               <au>
                  <snm>Jensen</snm>
                  <fnm>RA</fnm>
               </au>
            </aug>
            <source>Annu Rev Microbiol</source>
            <pubdate>1976</pubdate>
            <volume>30</volume>
            <fpage>409</fpage>
            <lpage>425</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1146/annurev.mi.30.100176.002205</pubid>
                  <pubid idtype="pmpid" link="fulltext">791073</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B43">
            <title>
               <p>CodonW as a freeware release for codon usage analysis.</p>
            </title>
            <aug>
               <au>
                  <snm>Peden</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <url>http://www.molbiol.ox.ac.uk/cu/culong.html#Codonw</url>
         </bibl>
         <bibl id="B44">
            <title>
               <p>GenBank Database</p>
            </title>
            <url>http://www.ncbi.nlm.nih.gov/Genbank/GenbankOverview.html</url>
         </bibl>
         <bibl id="B45">
            <aug>
               <au>
                  <snm>Lehmann</snm>
                  <fnm>EL</fnm>
               </au>
               <au>
                  <snm>D'Abrera</snm>
                  <fnm>HJM</fnm>
               </au>
            </aug>
            <source>Nonparametrics: Statistical Methods Based on Ranks, Rev Edn</source>
            <publisher>Englewood Cliffs, NJ: Prentice-Hall</publisher>
            <pubdate>1998</pubdate>
            <fpage>292</fpage>
            <lpage>323</lpage>
         </bibl>
         <bibl id="B46">
            <title>
               <p>A backtranslation method based on codon usage strategy.</p>
            </title>
            <aug>
               <au>
                  <snm>Pesole</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Attimonelli</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Liuni</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1988</pubdate>
            <volume>16</volume>
            <fpage>1715</fpage>
            <lpage>1728</lpage>
            <xrefbib>
               <pubid idtype="pmpid">3281142</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B47">
            <title>
               <p>Codon usage tabulated from the international DNA sequence databases: status for the year 2000.</p>
            </title>
            <aug>
               <au>
                  <snm>Nakamura</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Gojobori</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Ikemura</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2000</pubdate>
            <volume>28</volume>
            <fpage>292</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">102460</pubid>
                  <pubid idtype="pmpid" link="fulltext">10592250</pubid>
                  <pubid idtype="doi">10.1093/nar/28.1.292</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B48">
            <title>
               <p>Codon Usage Database</p>
            </title>
            <url>http://www.kazusa.or.jp/codon</url>
         </bibl>
         <bibl id="B49">
            <title>
               <p>The RDP-II (Ribosomal Database Project).</p>
            </title>
            <aug>
               <au>
                  <snm>Maidak</snm>
                  <fnm>BL</fnm>
               </au>
               <au>
                  <snm>Cole</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Lilburn</snm>
                  <fnm>TG</fnm>
               </au>
               <au>
                  <snm>Parker</snm>
                  <fnm>CT</fnm>
                  <suf>Jr</suf>
               </au>
               <au>
                  <snm>Saxman</snm>
                  <fnm>PR</fnm>
               </au>
               <au>
                  <snm>Farris</snm>
                  <fnm>RJ</fnm>
               </au>
               <au>
                  <snm>Garrity</snm>
                  <fnm>GM</fnm>
               </au>
               <au>
                  <snm>Olsen</snm>
                  <fnm>GJ</fnm>
               </au>
               <au>
                  <snm>Schmidt</snm>
                  <fnm>TM</fnm>
               </au>
               <au>
                  <snm>Tiedje</snm>
                  <fnm>JM</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2001</pubdate>
            <volume>29</volume>
            <fpage>173</fpage>
            <lpage>174</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">29785</pubid>
                  <pubid idtype="pmpid" link="fulltext">11125082</pubid>
                  <pubid idtype="doi">10.1093/nar/29.1.173</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B50">
            <title>
               <p>Ribosomal Database Project II</p>
            </title>
            <url>http://rdp.cme.msu.edu/html</url>
         </bibl>
         <bibl id="B51">
            <title>
               <p>CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice.</p>
            </title>
            <aug>
               <au>
                  <snm>Thompson</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Higgins</snm>
                  <fnm>DG</fnm>
               </au>
               <au>
                  <snm>Gibson</snm>
                  <fnm>TJ</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1994</pubdate>
            <volume>22</volume>
            <fpage>4673</fpage>
            <lpage>4680</lpage>
            <xrefbib>
               <pubid idtype="pmpid">7984417</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B52">
            <aug>
               <au>
                  <snm>Hall</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Biological Sequence Alignment Editor for Windows 95/98/NT, 5.0.9 Edn</source>
            <publisher>Raleigh: North Carolina State University</publisher>
            <url>http://www.mbio.ncsu.edu/BioEdit/bioedit.html</url>
         </bibl>
         <bibl id="B53">
            <title>
               <p>PHYLIP - Phylogeny inference package (version 3.2).</p>
            </title>
            <aug>
               <au>
                  <snm>Felsenstein</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Cladistics</source>
            <pubdate>1989</pubdate>
            <volume>5</volume>
            <fpage>164</fpage>
            <lpage>166</lpage>
         </bibl>
         <bibl id="B54">
            <title>
               <p>CDD: a database of conserved domain alignments with links to domain three-dimensional structure.</p>
            </title>
            <aug>
               <au>
                  <snm>Marchler-Bauer</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Panchenko</snm>
                  <fnm>AR</fnm>
               </au>
               <au>
                  <snm>Shoemaker</snm>
                  <fnm>BA</fnm>
               </au>
               <au>
                  <snm>Thiessen</snm>
                  <fnm>PA</fnm>
               </au>
               <au>
                  <snm>Geer</snm>
                  <fnm>LY</fnm>
               </au>
               <au>
                  <snm>Bryant</snm>
                  <fnm>SH</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2002</pubdate>
            <volume>30</volume>
            <fpage>281</fpage>
            <lpage>283</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">99109</pubid>
                  <pubid idtype="pmpid" link="fulltext">11752315</pubid>
                  <pubid idtype="doi">10.1093/nar/30.1.281</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B55">
            <title>
               <p>NCBI Conserved Domain Database</p>
            </title>
            <url>http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml</url>
         </bibl>
         <bibl id="B56">
            <title>
               <p>ERGO</p>
            </title>
            <url>http://wit.integratedgenomics.com/ERGO</url>
         </bibl>
         <bibl id="B57">
            <title>
               <p>Significance of two distinct types of tryptophan synthase beta chain in Bacteria, Archaea and higher plants.</p>
            </title>
            <aug>
               <au>
                  <snm>Xie</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Forst</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Bonner</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Jensen</snm>
                  <fnm>RA</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2002</pubdate>
            <volume>3</volume>
            <fpage>research0004.1</fpage>
            <lpage>0004.13</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">150451</pubid>
                  <pubid idtype="pmpid" link="fulltext">11806827</pubid>
                  <pubid idtype="doi">10.1186/gb-2001-3-1-research0004</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B58">
            <title>
               <p>Multiple chromosomes in bacteria: The yin and yang of <it>trp </it>gene localization in <it>Rhodobacter sphaeroides </it>2.4.1.</p>
            </title>
            <aug>
               <au>
                  <snm>Mackenzie</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Simmons</snm>
                  <fnm>AE</fnm>
               </au>
               <au>
                  <snm>Kaplan</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Genetics</source>
            <pubdate>1999</pubdate>
            <volume>153</volume>
            <fpage>525</fpage>
            <lpage>538</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10511537</pubid>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>
