<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2164-4-37</ui>
   <ji>1471-2164</ji>
   <fm>
      <dochead>Methodology article</dochead>
      <bibl>
         <title>
            <p>An improved probability mapping approach to assess genome mosaicism</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Zhaxybayeva</snm>
               <fnm>Olga</fnm>
               <insr iid="I1"/>
               <email>olga@carrot.mcb.uconn.edu</email>
            </au>
            <au id="A2" ca="yes">
               <snm>Gogarten</snm>
               <fnm>J Peter</fnm>
               <insr iid="I1"/>
               <email>gogarten@uconn.edu</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Department of Molecular and Cell Biology, University of Connecticut, 91 North Eagleville Road, Storrs, CT, 06269-3125, USA</p>
            </ins>
         </insg>
         <source>BMC Genomics</source>
         <issn>1471-2164</issn>
         <pubdate>2003</pubdate>
         <volume>4</volume>
         <issue>1</issue>
         <fpage>37</fpage>
         <url>http://www.biomedcentral.com/1471-2164/4/37</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="doi" link="fulltext">10.1186/1471-2164-4-37</pubid>
               <pubid idtype="pmpid" link="fulltext">12974984</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>19</day>
               <month>6</month>
               <year>2003</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>15</day>
               <month>9</month>
               <year>2003</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>15</day>
               <month>9</month>
               <year>2003</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2003</year>
         <collab>Zhaxybayeva and Gogarten; licensee BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.</collab>
      </cpyrt>
      <kwdg>
         <kwd>maximum likelihood mapping</kwd>
         <kwd>long-branch attraction</kwd>
         <kwd>horizontal gene transfer</kwd>
         <kwd>taxon sampling</kwd>
         <kwd>bootstrap support values mapping</kwd>
      </kwdg>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Maximum likelihood and posterior probability mapping are useful visualization techniques that are used to ascertain the mosaic nature of prokaryotic genomes. However, posterior probabilities, especially when calculated for four-taxon cases, tend to overestimate the support for tree topologies. Furthermore, because of poor taxon sampling, four-taxon analyses are sensitive to the long branch attraction artifact. Here we extend the probability mapping approach by improving taxon sampling of the analyzed datasets, and by using bootstrap support values, a more conservative tool to assess reliability.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>Quartets of orthologous proteins were complemented with homologs from selected reference genomes. The mapping of bootstrap support values from these extended datasets gives results similar to the original maximum likelihood and posterior probability mapping. The more conservative nature of the plotted support values allows further analyses to focus on those protein families that strongly disagree with the majority or plurality of genes present in the analyzed genomes.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>Posterior probability is a non-conservative measure for support, and posterior probability mapping only provides a quick estimation of phylogenetic information content of four genomes. This approach can be utilized as a pre-screen to select genes that might have been horizontally transferred. Better taxon sampling combined with subtree analyses prevents the inconsistencies associated with four-taxon analyses, but retains the power of visual representation. Nevertheless, a case-by-case inspection of individual multi-taxon phylogenies remains necessary to differentiate unrecognized paralogy and shared phylogenetic reconstruction artifacts from horizontal gene transfer events.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="bmc" subtype="user_supplied_xml" id="endnote"/>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>The analysis of four-taxon trees promises to provide valuable insight and visual documentation of genome mosaicism <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr><abbr bid="B5">5</abbr></abbrgrp>. However, like other four-taxon analyses, our probability mapping approach for comparative genome analyses <abbrgrp><abbr bid="B4">4</abbr></abbrgrp> is vulnerable to the long branch attraction (LBA) artifact because it analyzes datasets consisting of only four sequences. LBA is a well-known phylogenetic artifact <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>. It is especially well studied for the case of four-taxon trees (e.g., see <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr></abbrgrp>). In short, regardless of the reconstruction method and model used, if the branches are long enough, the reconstructed tree might be affected by LBA, although to different degrees. Furthermore, four-taxon analyses were shown to be unstable and misleading under some circumstances <abbrgrp><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr></abbrgrp>. Adding more taxa can break up the long branches and increase reliability. Simulation studies have shown that increasing the size of a dataset by introducing additional homologous sequences improves the accuracy of the reconstruction <abbrgrp><abbr bid="B14">14</abbr></abbrgrp> (see <abbrgrp><abbr bid="B15">15</abbr></abbrgrp> and <abbrgrp><abbr bid="B16">16</abbr></abbrgrp> for recent discussion). An increase in the sequence lengths of the analyzed data can also improve the reliability of phylogenetic reconstruction <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>, but lumping different putative orthologs into a single dataset would defeat the purpose of the probability mapping approach, i.e., the detection of genes that have incompatible evolutionary histories. Merging proteins with different histories into concatenated datasets would not help to resolve their phylogenies.</p>
         <p>Here we report an extension of probability mapping that increases the number of homologous sequences per dataset, referred to throughout the rest of the article as Operational Taxonomic Unit (OTU) sampling, while retaining the power of the original approach to visualize genomic mosaicism. A quartet of orthologous proteins (QuartOP) is defined as four homologs from four genomes that pick each other as top-scoring reciprocal hits in BLAST searches of the respective genomes (for more details see <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>). For each QuartOP detected in a genome quartet we add homologous sequences and evaluate the branching order of the QuartOP in 100 bootstrap samples. The bootstrap support values are then mapped into a barycentric coordinate system. We compare the mapping results with previously reported ones <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>, and give examples that illustrate the utility of this approach in detecting horizontally transferred genes.</p>
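         <p>The reciprocal top-hit criterion that defines a QuartOP can be illustrated with a minimal Python sketch. It assumes that the top-scoring BLAST hit of every protein in every other genome has already been tabulated; the function name find_quartops and the layout of the best_hit dictionary are hypothetical choices made for this illustration and are not taken from the original implementation.</p>
         <p><![CDATA[
```python
from itertools import combinations


def find_quartops(genomes, best_hit):
    """Return tuples of four proteins, one per genome, that are mutual top hits.

    genomes  -- list of four genome labels (hypothetical names)
    best_hit -- dict mapping (query_genome, target_genome) to a dict
                {query_protein: top_scoring_protein_in_target_genome}
    """
    first = genomes[0]
    rest = genomes[1:]
    quartops = []
    # Seed each candidate quartet with a protein from the first genome.
    for protein in sorted(best_hit[(first, rest[0])]):
        member = {first: protein}
        complete = True
        for g in rest:
            hit = best_hit[(first, g)].get(protein)
            if hit is None:
                complete = False
                break
            member[g] = hit
        if not complete:
            continue
        # Every pair of genomes must pick the same partners in both directions.
        reciprocal = all(
            best_hit[(a, b)].get(member[a]) == member[b]
            and best_hit[(b, a)].get(member[b]) == member[a]
            for a, b in combinations(genomes, 2)
        )
        if reciprocal:
            quartops.append(tuple(member[g] for g in genomes))
    return quartops
```
]]></p>
         <p>In this sketch a candidate quartet is discarded as soon as any one of the twelve directed searches picks a different partner, which mirrors the requirement that all four homologs pick each other as top-scoring hits.</p>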
      </sec>
      <sec>
         <st>
            <p>Results and Discussion</p>
         </st>
         <sec>
            <st>
               <p>Interdomain Genome Quartets</p>
            </st>
            <p>In <abbrgrp><abbr bid="B4">4</abbr></abbrgrp> we described the analyses of several interdomain genome quartets. Some of the analyses were performed using a posterior probability mapping approach referred to as Maximum Likelihood (ML) mapping, a name that was coined in the original description of this approach <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>. We will use this term throughout the manuscript. In ML mapping posterior probabilities are calculated from the maximum likelihood values (see <abbrgrp><abbr bid="B17">17</abbr></abbrgrp> and <abbrgrp><abbr bid="B4">4</abbr></abbrgrp> for the details). One noteworthy finding was that in the genome quartet including <it>Synechocystis </it>sp., <it>Halobacterium </it>sp., <it>Aquifex aeolicus </it>and <it>Thermotoga maritima </it>the grouping of <it>Halobacterium </it>sp. with <it>Synechocystis </it>sp. was recovered by many more QuartOPs than the grouping expected from 16S rRNA phylogeny (see Fig. <figr fid="F1">1A</figr>). Note that throughout the manuscript we refer to a particular tree by mentioning two species out of four (e.g., in this case grouping of <it>Halobacterium </it>sp. with <it>Synechocystis </it>sp.); however, the trees are unrooted and therefore grouping of the other two taxa is implied. To test whether this association was specific for <it>Synechocystis </it>sp., we repeated the analyses replacing <it>Synechocystis </it>sp. with <it>Bacillus subtilis</it>. The results were qualitatively the same (data not shown). To test for the possibility that LBA <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> might be the reason for the strong support of <it>Halobacterium </it>sp. grouping with <it>Synechocystis </it>sp., we repeated the analyses replacing the <it>Halobacterium </it>sp. genome with that from <it>Archaeoglobus fulgidus</it>, another archaeon. 
The majority of QuartOPs supported the grouping of the thermophilic archaeon <it>Archaeoglobus </it>with the thermophilic bacteria <it>Aquifex </it>and <it>Thermotoga </it>(see Fig. <figr fid="F1">1B</figr>). In this study, we reanalyzed the above-mentioned genome quartets by adding homologous sequences from sixty reference genomes to each QuartOP creating what we call "extended datasets". The dataflow is depicted in Figure <figr fid="F2">2</figr>. For each extended dataset we obtained bootstrap support values for each of the three four-taxon "subtrees" and we plotted the bootstrap support values into barycentric coordinates. Throughout this manuscript we use a graph theory definition of a subtree, i.e. "A tree G' whose graph vertices and graph edges form subsets of the graph vertices and graph edges of a given tree G" <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>. In particular, sequences (OTUs) included in the subtree are not required to be neighbors in the original tree. Subtrees defined according to these rules are different from subclades (see figure <figr fid="F2">2</figr> for an illustration). For example, if the topology ((A,D),(B,C)) is supported by a given bootstrap sample, this means that in the tree calculated from this sample the sequence from genome A groups closer to the one from D than to the one from B or C (figure <figr fid="F2">2</figr>).</p>
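            <p>The subtree definition used above can be illustrated with a short Python sketch that reads the four-taxon topology induced by a larger unrooted tree and tallies it over bootstrap trees. The sketch is an assumption-laden illustration and not the authors' implementation: trees are represented as plain adjacency dictionaries, the function names are hypothetical, and the induced topology is identified with the four-point condition on edge counts (the pairing with the strictly smallest sum of path lengths).</p>
            <p><![CDATA[
```python
from collections import Counter, deque


def path_length(tree, start, goal):
    """Number of edges between two nodes of an unrooted tree (adjacency dict)."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return seen[node]
        for neighbour in tree[node]:
            if neighbour not in seen:
                seen[neighbour] = seen[node] + 1
                queue.append(neighbour)
    raise ValueError("nodes are not connected")


def quartet_topology(tree, a, b, c, d):
    """Return the pairing of a, b, c, d induced by the tree, or None if unresolved."""
    sums = {
        ((a, b), (c, d)): path_length(tree, a, b) + path_length(tree, c, d),
        ((a, c), (b, d)): path_length(tree, a, c) + path_length(tree, b, d),
        ((a, d), (b, c)): path_length(tree, a, d) + path_length(tree, b, c),
    }
    ranked = sorted(sums.items(), key=lambda item: item[1])
    # Four-point condition: the induced pairing has the strictly smallest sum.
    if ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


def subtree_support(bootstrap_trees, a, b, c, d):
    """Percentage of bootstrap trees that induce each of the three pairings."""
    counts = Counter(quartet_topology(t, a, b, c, d) for t in bootstrap_trees)
    total = len(bootstrap_trees)
    return {topology: 100.0 * n / total for topology, n in counts.items() if topology}
```
]]></p>
            <p>Because the pairing is read from path lengths rather than from shared parent nodes, two sequences can be reported as grouping together even when other OTUs branch off between them, which is the sense in which a subtree differs from a subclade.</p>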
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>Posterior probability maps of genome quartets containing <it>Synechocystis </it>sp</p>
               </caption>
               <text>
                  <p><b>Posterior probability maps of genome quartets containing <it>Synechocystis </it>sp. </b>Posterior probabilities were calculated according to the maximum likelihood mapping approach described in <abbrgrp><abbr bid="B4">4</abbr><abbr bid="B17">17</abbr></abbrgrp>. Tree topologies assigned to the vertices are depicted in New Hampshire tree format near the corresponding vertex of the triangle and should be considered as unrooted tree topologies. The three numbers associated with each tree topology indicate how many QuartOPs fall into each of the three zones: "total" (i.e. posterior probability for the tree topology is larger than posterior probabilities for the other two topologies), 90% and 99% posterior probability respectively. <b>A) </b>Genome quartet consisting of <it>Synechocystis </it>sp., <it>Halobacterium </it>sp., <it>Aquifex aeolicus </it>and <it>Thermotoga maritima. </it>The majority of the QuartOPs support the grouping of <it>Halobacterium </it>sp. with <it>Synechocystis </it>sp. <b>B) </b>Genome quartet consisting of <it>Synechocystis </it>sp., <it>Archaeoglobus fulgidus</it>, <it>Aquifex aeolicus </it>and <it>Thermotoga maritima. </it>The archaeon &#8211; <it>Synechocystis </it>sp. grouping is supported by fewer QuartOPs than in panel A.</p>
               </text>
               <graphic file="1471-2164-4-37-1" hint_layout="double"/>
            </fig>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>Dataflow for construction and mapping of extended datasets for all QuartOPs in a genome quartet</p>
               </caption>
               <text>
                  <p><b>Dataflow for construction and mapping of extended datasets for all QuartOPs in a genome quartet. </b>See Materials and Methods for details.</p>
               </text>
               <graphic file="1471-2164-4-37-2" hint_layout="double"/>
            </fig>
            <p>Maps of bootstrap support values calculated from the extended datasets are shown in figure <figr fid="F3">3</figr>. The results are similar to the analyses using ML mapping (see <abbrgrp><abbr bid="B4">4</abbr></abbrgrp> and figure <figr fid="F1">1</figr>). At every level of support the plurality consensus groups the mesophilic archaeon with the mesophilic bacterium and the two thermophilic bacteria with one another (figures <figr fid="F1">1</figr> and <figr fid="F3">3</figr>, panels A). The plurality support changes in favor of the archaeon &#8211; <it>Aquifex </it>grouping when the genome of the mesophilic archaeon is replaced with that of the extremely thermophilic archaeon <it>Archaeoglobus fulgidus </it>(figures <figr fid="F1">1</figr> and <figr fid="F3">3</figr>, panels B). The main difference between ML maps and bootstrap support maps from extended datasets is that the confidence values are much lower for the extended datasets evaluated with bootstrap. The cause of these lower support values is discussed below.</p>
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>Maps of bootstrap support from extended datasets for genome quartets containing <it>Synechocystis </it>sp</p>
               </caption>
               <text>
                  <p><b>Maps of bootstrap support from extended datasets for genome quartets containing <it>Synechocystis </it>sp. </b>The three numbers associated with each tree topology indicate how many QuartOPs fall into each of the three zones: "total", 70% and 90% bootstrap support respectively. For other figure notations see figure <figr fid="F1">1</figr>. <b>A) </b>Genome quartet consisting of <it>Synechocystis </it>sp., <it>Halobacterium </it>sp., <it>Aquifex aeolicus </it>and <it>Thermotoga maritima. </it><b>B) </b>Genome quartet consisting of <it>Synechocystis </it>sp., <it>Archaeoglobus fulgidus</it>, <it>Aquifex aeolicus </it>and <it>Thermotoga maritima. </it>These maps are similar to the ML maps depicted in figure <figr fid="F1">1</figr>.</p>
               </text>
               <graphic file="1471-2164-4-37-3" hint_layout="double"/>
            </fig>
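            <p>How three support values become a point in a triangular map, and how the "total", 70% and 90% zones can be counted, is illustrated by the following Python sketch. It is a minimal illustration under stated assumptions rather than the plotting code used for the figures: the corner coordinates, the function names and the rule that ties are left uncounted are choices made here for the example, and the three support values are rescaled so that they sum to one.</p>
            <p><![CDATA[
```python
import math

# Corners of an equilateral triangle with unit side; one corner per topology.
CORNERS = ((0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3.0) / 2.0))


def barycentric_point(supports):
    """Map three non-negative support values to (x, y) inside the triangle."""
    total = float(sum(supports))
    if total == 0:
        raise ValueError("at least one support value must be positive")
    weights = [s / total for s in supports]
    x = sum(w * cx for w, (cx, cy) in zip(weights, CORNERS))
    y = sum(w * cy for w, (cx, cy) in zip(weights, CORNERS))
    return x, y


def zone_counts(datasets, thresholds=(70.0, 90.0)):
    """Count datasets per topology: plurality ("total") and at or above each threshold."""
    counts = [dict.fromkeys(("total",) + tuple(thresholds), 0) for _ in range(3)]
    for supports in datasets:
        ranked = sorted(range(3), key=lambda i: supports[i], reverse=True)
        best, runner_up = ranked[0], ranked[1]
        if supports[best] == supports[runner_up]:
            continue  # no single topology has the plurality
        counts[best]["total"] += 1
        for t in thresholds:
            if supports[best] >= t:
                counts[best][t] += 1
    return counts
```
]]></p>
            <p>A dataset with no preference among the three topologies lands at the center of the triangle, whereas a dataset in which every bootstrap sample supports the same topology lands on the corresponding corner.</p>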
         </sec>
         <sec>
            <st>
               <p>Interdomain transfer, interphylum transfer, or shared artifact?</p>
            </st>
            <p>The relation between the different bacterial phyla and the placement of the bacterial root remain uncertain (e.g., see <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>). Therefore, it is not clear which of the three unrooted trees for the genome quartet in question represents the true organismal phylogeny of the four genomes. The lack of phylogenetic signal alone should result in QuartOPs that map to the center of the triangle. However, we observe many genes that prefer one topology to the other two. The genes in figures <figr fid="F1">1A</figr> and <figr fid="F3">3A</figr> that group the ortholog from <it>Halobacterium </it>with its putative ortholog from <it>Synechocystis </it>might do so for a variety of different reasons:</p>
            <p>A) Horizontal gene transfer</p>
            <p>&#160;&#160;&#160;A<sub>1</sub>) between a mesophilic bacterium and a mesophilic archaeon</p>
            <p>&#160;&#160;&#160;A<sub>2</sub>) between the extremely thermophilic bacteria <it>Aquifex </it>and <it>Thermotoga</it></p>
            <p>B) Phylogenetic reconstruction artifacts</p>
            <p>&#160;&#160;&#160;B<sub>1</sub>) due to long branch attraction</p>
            <p>&#160;&#160;&#160;B<sub>2</sub>) due to compositional bias</p>
            <p>&#160;&#160;&#160;B<sub>3</sub>) due to the lack of phylogenetic signal</p>
            <p>C) Unrecognized paralogy</p>
            <p>D) This grouping reflects organismal evolutionary history.</p>
            <p>It is important to realize that any of the processes listed under A through C, alone or in combination, might result in well-supported QuartOPs that group the halobacterial homolog with the cyanobacterial counterpart. Some of these possibilities can be distinguished in the individual phylogenies of the extended datasets. Table <tblr tid="T1">1</tblr> summarizes these findings. There are many datasets that do not conform to the 16S rRNA based expectation in more than one respect: <it>Aquifex </it>and <it>Thermotoga </it>are recovered as sister groups, and either the cyanobacterial sequence groups within a cluster of Archaeal sequences, or the halobacterial sequence groups within a well-supported cluster of Bacterial homologs. We considered these unexpected phylogenies supported if the branch that separated this group from the other homologs had at least 70% bootstrap support (note: in this case we used the bootstrap support for individual branches, not the support for subtrees). At first sight this might appear to be a rather low level of support; however, adding more sequences tends to shorten the internal branches and thus lowers their individual bootstrap support value (for discussion see <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>). The support for the subtree topology (see figure <figr fid="F2">2</figr>) is larger than 90% for all entries. There is only one case where the observed subtree ((<it>Aquifex</it>, <it>Thermotoga</it>), (<it>Halobacterium</it>, <it>Synechocystis</it>)) appears to be due to unrecognized paralogy. In four instances a well-supported branch suggests an interdomain gene transfer and in 11 instances an exchange between <it>Thermotoga </it>and <it>Aquifex</it>. These findings apparently support the notion that genes are frequently transferred between divergent prokaryotes; however, <it>Halobacterium </it>has an amino acid composition that often deviates significantly from that of the other homologs. 
The instances where both the halobacterial sequence and its phylogenetic neighbor failed a test for homogeneous composition are indicated in table <tblr tid="T1">1</tblr>. This leaves eight well-supported instances of at least one horizontal gene transfer event out of the twelve datasets without shared compositional biases.</p>
            <tbl id="T1" hint_layout="double">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>Analyses of the tree topologies for 18 candidates for horizontal gene transfer. Support for putative transfers between Bacteria and Archaea is shown in the B&amp;A column, and support for putative transfers between <it>Aquifex </it>and <it>Thermotoga </it>is shown in the A&amp;T column. The compositional bias is listed as "strong" if both the halobacterial sequence and its nearest phylogenetic neighbor failed the test for homogeneous composition. See Materials and Methods for details of the analyses performed.</p>
               </caption>
               <tblbdy cols="6">
                  <r>
                     <c ca="left">
                        <p>
                           <b>ID</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>Functional Assignment</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>B&amp;A</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>A&amp;T</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>Comments</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>Compositional bias</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>008</p>
                     </c>
                     <c ca="left">
                        <p>thymidylate kinase</p>
                     </c>
                     <c ca="left">
                        <p>weak</p>
                     </c>
                     <c ca="left">
                        <p>strong</p>
                     </c>
                     <c>
                        <p>-</p>
                     </c>
                     <c>
                        <p>-</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>020</p>
                     </c>
                     <c ca="left">
                        <p>seryl-tRNA synthetase</p>
                     </c>
                     <c ca="left">
                        <p>strong</p>
                     </c>
                     <c ca="left">
                        <p>strong</p>
                     </c>
                     <c>
                        <p>-</p>
                     </c>
                     <c>
                        <p>-</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>054</p>
                     </c>
                     <c ca="left">
                        <p>valyl-tRNA synthetase</p>
                     </c>
                     <c ca="left">
                        <p>none</p>
                     </c>
                     <c ca="left">
                        <p>strong</p>
                     </c>
                     <c>
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>Strong</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>062</p>
                     </c>
                     <c ca="left">
                        <p>excision nuclease chain A</p>
                     </c>
                     <c ca="left">
                        <p>none</p>
                     </c>
                     <c ca="left">
                        <p>strong</p>
                     </c>
                     <c>
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>Strong</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>072</p>
                     </c>
                     <c ca="left">
                        <p>chromosome segregation SMC protein</p>
                     </c>
                     <c ca="left">
                        <p>strong</p>
                     </c>
                     <c ca="left">
                        <p>none</p>
                     </c>
                     <c ca="left">
                        <p>Cyanobacteria, <it>Rickettsia </it>and <it>Aquifex </it>group within Archaea</p>
                     </c>
                     <c ca="left">
                        <p>Strong</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>076</p>
                     </c>
                     <c ca="left">
                        <p>hypothetical protein</p>
                     </c>
                     <c ca="left">
                        <p>weak</p>
                     </c>
                     <c ca="left">
                        <p>strong</p>
                     </c>
                     <c ca="left">
                        <p><it>Halobacterium </it>groups within Bacteria</p>
                     </c>
                     <c ca="left">
                        <p>Strong</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>080</p>
                     </c>
                     <c ca="left">
                        <p>prolyl-tRNA synthetase</p>
                     </c>
                     <c ca="left">
                        <p>none</p>
                     </c>
                     <c ca="left">
                        <p>strong</p>
                     </c>
                     <c ca="left">
                        <p>Archaeal type homologs are found in some bacteria</p>
                     </c>
                     <c>
                        <p>-</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>100</p>
                     </c>
                     <c ca="left">
                        <p>hypothetical protein</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>Uninterpretable: no resemblance with assumed organismal phylogeny</p>
                     </c>
                     <c>
                        <p>-</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>105</p>
                     </c>
                     <c ca="left">
                        <p>DNA gyrase, subunit B</p>
                     </c>
                     <c ca="left">
                        <p>none</p>
                     </c>
                     <c ca="left">
                        <p>strong</p>
                     </c>
                     <c ca="left">
                        <p>Archaea do not form a group</p>
                     </c>
                     <c>
                        <p>-</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>106</p>
                     </c>
                     <c ca="left">
                        <p>arginyl-tRNA synthetase</p>
                     </c>
                     <c ca="left">
                        <p>strong</p>
                     </c>
                     <c ca="left">
                        <p>weak</p>
                     </c>
                     <c>
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>Strong</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>107</p>
                     </c>
                     <c ca="left">
                        <p>DNA gyrase, subunit A</p>
                     </c>
                     <c ca="left">
                        <p>none</p>
                     </c>
                     <c ca="left">
                        <p>weak</p>
                     </c>
                     <c ca="left">
                        <p>Both <it>Thermotoga </it>and <it>Aquifex </it>group with Archaea</p>
                     </c>
                     <c>
                        <p>-</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>110</p>
                     </c>
                     <c ca="left">
                        <p>cysteinyl-tRNA synthetase</p>
                     </c>
                     <c ca="left">
                        <p>strong</p>
                     </c>
                     <c ca="left">
                        <p>strong</p>
                     </c>
                     <c ca="left">
                        <p>Both <it>Thermotoga </it>and <it>Aquifex </it>group within Archaea</p>
                     </c>
                     <c>
                        <p>-</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>113</p>
                     </c>
                     <c ca="left">
                        <p>hypothetical protein</p>
                     </c>
                     <c ca="left">
                        <p>weak</p>
                     </c>
                     <c ca="left">
                        <p>none</p>
                     </c>
                     <c ca="left">
                        <p>Both <it>Thermotoga </it>and <it>Aquifex </it>group within Archaea</p>
                     </c>
                     <c ca="left">
                        <p>Strong</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>121</p>
                     </c>
                     <c ca="left">
                        <p>chorismate synthase</p>
                     </c>
                     <c ca="left">
                        <p>weak</p>
                     </c>
                     <c ca="left">
                        <p>none</p>
                     </c>
                     <c>
                        <p>-</p>
                     </c>
                     <c>
                        <p>-</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>122</p>
                     </c>
                     <c ca="left">
                        <p>3-phosphoshikimate-1-carboxyvinyltransferase</p>
                     </c>
                     <c ca="left">
                        <p>none</p>
                     </c>
                     <c ca="left">
                        <p>strong</p>
                     </c>
                     <c>
                        <p>-</p>
                     </c>
                     <c>
                        <p>-</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>134</p>
                     </c>
                     <c ca="left">
                        <p>histidinol dehydrogenase</p>
                     </c>
                     <c ca="left">
                        <p>weak</p>
                     </c>
                     <c ca="left">
                        <p>weak</p>
                     </c>
                     <c ca="left">
                        <p>Putative paralogs</p>
                     </c>
                     <c>
                        <p>-</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>144</p>
                     </c>
                     <c ca="left">
                        <p>50S ribosomal protein L2</p>
                     </c>
                     <c ca="left">
                        <p>none</p>
                     </c>
                     <c ca="left">
                        <p>strong</p>
                     </c>
                     <c>
                        <p>-</p>
                     </c>
                     <c>
                        <p>-</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>145</p>
                     </c>
                     <c ca="left">
                        <p>30S ribosomal protein S19</p>
                     </c>
                     <c ca="left">
                        <p>weak</p>
                     </c>
                     <c ca="left">
                        <p>strong</p>
                     </c>
                     <c>
                        <p>-</p>
                     </c>
                     <c>
                        <p>-</p>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
            <p>The analyses depicted in figure <figr fid="F3">3</figr> and table <tblr tid="T1">1</tblr> demonstrate that bootstrap support value mapping in general, and support value mapping using extended datasets in particular, are useful in screening for genes that were transferred between divergent organisms. Replacing the genome from a mesophilic archaeon with that from an extremely thermophilic one changes the topology of the subtree that has plurality support. This observation is in agreement with the hypothesis that genes are more frequently shared between organisms that live in similar environments <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>. However, given that the <it>Halobacterium </it>genome is renowned for its large number of genes with bacterial character <abbrgrp><abbr bid="B22">22</abbr><abbr bid="B23">23</abbr></abbrgrp>, the total number of genes identified in this study as putatively transferred between the mesophilic bacteria and the halobacteria is very small. There are several reasons for this observation. Useful phylogenetic information retained in molecular sequences is constantly overwritten by more recent substitution events. The more divergent the analyzed genomes are, the more QuartOPs will be undecided about the most supported topology. Furthermore, support value mapping can only identify gene transfers that resulted in orthologous replacement. Last but not least, the approach used to assemble QuartOPs is overly restrictive. Lineage-specific duplications result in two orthologs being present in a single genome. These genes are paralogs of one another, but both are orthologs to the gene present in the genomes that branch off before the lineage-specific duplication <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>. 
Despite these shortcomings, support value mapping, especially when using extended datasets, provides a quick method to appraise the extent of genomic mosaicism, to delineate preliminarily the major flows of genes in microbial evolution (plurality or majority consensus), and to find subsets of potentially transferred genes.</p>
         </sec>
         <sec>
            <st>
               <p>Screen for more recent interphylum transfers</p>
            </st>
            <p>To assess the utility of probability and bootstrap support value mapping for detecting more recent interphylum gene transfer events, we calculated extended datasets for the genome quartet of <it>Synechocystis </it>sp., <it>Chlorobium tepidum</it>, <it>Rhodobacter capsulatus </it>and <it>Rhodopseudomonas palustris </it>(see figure <figr fid="F4">4</figr> and <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>). This genome quartet has a strong phylogenetic signal grouping together the two alpha proteobacteria <it>R. capsulatus </it>and <it>R. palustris</it>. This example had previously been utilized to demonstrate the validity of ML mapping <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>, showing that the vast majority of QuartOPs group the proteins from the two more closely related organisms together. However, there are 14 QuartOPs that support the two alternative topologies with 99% posterior probability. These have to be regarded as candidates for horizontal gene transfer. Analysis of this genome quartet using extended datasets shows that some of these 14 QuartOPs are also supported by high bootstrap support values (above 90%). Figures <figr fid="F5">5</figr> through <figr fid="F8">8</figr> provide further analysis of the extended datasets for these QuartOPs. The cases of the cation-transporting ATPases (figure <figr fid="F5">5</figr>) and the hypothetical proteins depicted in figure <figr fid="F6">6</figr> probably represent unrecognized paralogies. The proteins from <it>R. palustris </it>and <it>R. capsulatus </it>each group together with homologs from other alpha proteobacteria, and in some instances a single genome encodes both paralogs (<it>Bradyrhizobium japonicum </it>in the case of the hypothetical protein family and <it>Sinorhizobium meliloti </it>in the case of the cation-transporting ATPases). It appears likely that <it>R. palustris </it>has lost one and <it>R. capsulatus </it>the other paralog. 
In these two instances the unexpected behavior of the QuartOPs is due to failure of the strategy to select orthologous genes. In contrast, the cases of the water channel protein family and the methionyl-tRNA synthetases are best explained by horizontal gene transfer. None of the reference genomes contains paralogs whose differential loss might explain the observed phylogenies (figures <figr fid="F7">7</figr> and <figr fid="F8">8</figr>).</p>
            <fig id="F4">
               <title>
                  <p>Figure 4</p>
               </title>
               <caption>
                  <p>Genome quartet of <it>Synechocystis </it>sp., <it>Chlorobium tepidum</it>, <it>Rhodobacter capsulatus </it>and <it>Rhodopseudomonas palustris</it></p>
               </caption>
               <text>
                  <p><b>Genome quartet of <it>Synechocystis </it>sp., <it>Chlorobium tepidum</it>, <it>Rhodobacter capsulatus </it>and <it>Rhodopseudomonas palustris</it>. </b><b>A) </b>Posterior probability map calculated using probability mapping as described in <abbrgrp><abbr bid="B4">4</abbr><abbr bid="B17">17</abbr></abbrgrp>. <b>B) </b>Bootstrap support map (see <abbrgrp><abbr bid="B4">4</abbr></abbrgrp> for methodology of bootstrap support map reconstruction). Only the four putatively orthologous sequences were utilized in the analyses. <b>C) </b>Bootstrap support map from extended datasets. For details on the figure notations see legends for figures <figr fid="F1">1</figr> and <figr fid="F3">3</figr>. The majority of QuartOPs support one tree topology grouping two alpha proteobacteria together. The QuartOPs located in the two other corners of the triangle are candidates for horizontal gene transfer.</p>
               </text>
               <graphic file="1471-2164-4-37-4" hint_layout="double"/>
            </fig>
            <fig id="F5">
               <title>
                  <p>Figure 5</p>
               </title>
               <caption>
                  <p>Phylogeny of homologs of cation-transporting ATPases</p>
               </caption>
               <text>
                  <p><b>Phylogeny of homologs of cation-transporting ATPases. </b>Members of the QuartOP are highlighted in red. The three support values indicated on a branch are bootstrap support values from Neighbor-Joining trees based on TREE-PUZZLE distances, bootstrap support values from parsimony analyses and posterior probabilities from Bayesian analyses respectively. See Materials and Methods for details on phylogenetic reconstruction. The finding of other homologs from alpha proteobacteria grouping with the <it>Rhodopseudomonas </it>and <it>Rhodobacter </it>sequences, and the finding of both homologs coexisting in the same genome (<it>Sinorhizobium</it>) suggests that this QuartOP represents a case of unrecognized paralogy with differential gene loss.</p>
               </text>
               <graphic file="1471-2164-4-37-5" hint_layout="double"/>
            </fig>
            <fig id="F6">
               <title>
                  <p>Figure 6</p>
               </title>
               <caption>
                  <p>Phylogeny of hypothetical protein homologs</p>
               </caption>
               <text>
                  <p><b>Phylogeny of hypothetical protein homologs. </b>See figure <figr fid="F5">5</figr> for notations and Materials and Methods for details on phylogenetic reconstruction. This QuartOP represents another likely example of unrecognized paralogy. See text and figure <figr fid="F5">5</figr> for discussion.</p>
               </text>
               <graphic file="1471-2164-4-37-6" hint_layout="double"/>
            </fig>
            <fig id="F7">
               <title>
                  <p>Figure 7</p>
               </title>
               <caption>
                  <p>Phylogeny of the water channel protein family</p>
               </caption>
               <text>
                  <p><b>Phylogeny of the water channel protein family. </b>See figure <figr fid="F5">5</figr> for notations and Materials and Methods for details on phylogenetic reconstruction. The two homologs from <it>Rhodobacter </it>and <it>Rhodopseudomonas </it>are separated by a well-supported branch. The homologs grouping with <it>Rhodobacter </it>do not have a paralog among the homologs grouping with <it>Rhodopseudomonas </it>and vice versa. This QuartOP remains a strong candidate for horizontal gene transfer.</p>
               </text>
               <graphic file="1471-2164-4-37-7" hint_layout="double"/>
            </fig>
            <fig id="F8">
               <title>
                  <p>Figure 8</p>
               </title>
               <caption>
                  <p>Phylogeny of methionyl-tRNA synthetases</p>
               </caption>
               <text>
                  <p><b>Phylogeny of methionyl-tRNA synthetases. </b>See figure <figr fid="F5">5</figr> for notations and Materials and Methods for details on phylogenetic reconstruction. No paralogs were detected in the reference genomes. This QuartOP remains a strong candidate for horizontal gene transfer.</p>
               </text>
               <graphic file="1471-2164-4-37-8" hint_layout="double"/>
            </fig>
            <p>The higher frequency of unrecognized paralogs among the putatively horizontally transferred genes is due to the much larger number of QuartOPs analyzed. The detected number of unrecognized paralogs corresponds to less than 1% of the QuartOPs that contain sufficient phylogenetic information to support a topology with more than 90% bootstrap support (figure <figr fid="F4">4C</figr>). Every instance of unrecognized paralogy will result in a QuartOP deviating from the majority consensus revealed in this analysis.</p>
         </sec>
         <sec>
            <st>
               <p>Loss of support strength: due to conservative measure or taxon sampling?</p>
            </st>
            <p>It is difficult to compare posterior probabilities of QuartOPs directly with bootstrap support values of much larger datasets. Empirical studies <abbrgrp><abbr bid="B4">4</abbr></abbrgrp> as well as simulation studies <abbrgrp><abbr bid="B25">25</abbr><abbr bid="B26">26</abbr></abbrgrp> indicate that bootstrap measures are much more conservative than Bayesian posterior probabilities. In the four-taxon cases analyzed in <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>, a posterior probability of 0.99 calculated according to Strimmer and von Haeseler <abbrgrp><abbr bid="B17">17</abbr></abbrgrp> was found to correspond to only 70% bootstrap support calculated from non-extended datasets.</p>
            <p>To demonstrate that the observed drop in support is due to the more conservative nature of bootstrapping and not due to increased OTU sampling, we re-analyzed the same genome quartets using ML mapping (figure <figr fid="F1">1</figr> and figure <figr fid="F4">4A</figr>), bootstrap support values calculated from only the four aligned QuartOPs (<supplr sid="S2">Additional file 2</supplr> and figure <figr fid="F4">4B</figr>), and bootstrap support values calculated from the extended datasets (figure <figr fid="F3">3</figr> and figure <figr fid="F4">4C</figr>) as described in figure <figr fid="F2">2</figr>. Note that in the case of ML mapping only posterior probabilities greater than 0.99 are counted as strong support (greater than 0.9 for moderate support), whereas in the case of the bootstrap support value maps a value greater than 0.9 is classified as strong support (greater than 0.7 for moderate support). Table <tblr tid="T2">2</tblr> summarizes the overall number of QuartOPs supporting different topologies with different measures of support. There is a dramatic drop in the number of QuartOPs with strong support from the 99% posterior probabilities to 90% bootstrap from non-extended datasets. However, the added accuracy obtained through increased OTU sampling does not change the support as radically as the shift from posterior probabilities to the bootstrap support measure. Apparently, the increased accuracy due to better OTU sampling on average increases the bootstrap support of a given subtree as often as it lowers the support value. The higher support values found for ML mapping are solely due to the less conservative nature of the calculated support measure.</p>
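            <p>The two sets of cut-offs described above can be written down as a small classification rule. The following Python sketch is purely illustrative and is not part of the software used in this study; the function name, the dictionary layout and the label "weak" for values below the moderate cut-off are assumptions introduced here.</p>

```python
# Illustrative sketch only; names and the "weak" label are hypothetical.
# Cut-offs follow the text: posterior probability above 0.99 (strong) or
# above 0.9 (moderate); bootstrap support above 0.9 (strong) or above 0.7
# (moderate).
THRESHOLDS = {
    "posterior": (0.99, 0.9),
    "bootstrap": (0.9, 0.7),
}


def classify_support(value, measure):
    """Classify a support value in [0, 1] for the given support measure."""
    strong_cutoff, moderate_cutoff = THRESHOLDS[measure]
    if value > strong_cutoff:
        return "strong"
    if value > moderate_cutoff:
        return "moderate"
    return "weak"
```

            <p>Under these assumptions a value of 0.95 counts as only moderate support when it is a posterior probability, but as strong support when it is a bootstrap proportion, which is why the two kinds of maps cannot be compared directly.</p>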
            <tbl id="T2" hint_layout="double">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p>Comparison of confidence levels for different types of mappings. Table entries give the numbers of QuartOPs in the indicated genome quartets that prefer one of the three tree topologies with the specified level of support.</p>
               </caption>
               <tblbdy cols="4">
                  <r>
                     <c ca="left">
                        <p>
                           <b>Genome Quartet</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>99% posterior probability</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>90% bootstrap support from non-extended datasets</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>90% bootstrap support from extended datasets</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="4">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Interdomain quartet with <it>Halobacterium </it>(see figure <figr fid="F1">1A</figr>, <figr fid="F3">3A</figr> and suppl. material)</p>
                     </c>
                     <c ca="center">
                        <p>95</p>
                     </c>
                     <c ca="center">
                        <p>42</p>
                     </c>
                     <c ca="center">
                        <p>33</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Interdomain quartet with <it>Archaeoglobus </it>(see figure <figr fid="F1">1B</figr>, <figr fid="F3">3B</figr> and suppl. material)</p>
                     </c>
                     <c ca="center">
                        <p>99</p>
                     </c>
                     <c ca="center">
                        <p>42</p>
                     </c>
                     <c ca="center">
                        <p>41</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Interphylum quartet (see figure <figr fid="F4">4</figr>)</p>
                     </c>
                     <c ca="center">
                        <p>327</p>
                     </c>
                     <c ca="center">
                        <p>291</p>
                     </c>
                     <c ca="center">
                        <p>319</p>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Conclusions</p>
         </st>
         <p>The original posterior probability mapping methods reported in <abbrgrp><abbr bid="B4">4</abbr></abbrgrp> return results similar to those obtained from the analyses of extended datasets. ML mapping is much faster than the bootstrap support value mapping of extended datasets reported here. In interpreting results, however, one needs to be aware of the non-conservative nature of the posterior probability mapping approach, and of the greater susceptibility of four-taxon analyses to the long branch attraction artifact. The faster ML mapping approach has utility as a quick estimation of the phylogenetic information content of four genomes. Even though ML mapping greatly overestimates reliability, our results illustrate the utility of ML mapping as a pre-screen for putative horizontal gene transfer events. The use of extended datasets combined with subtree analyses prevents the inconsistencies associated with four-taxon analyses, but retains the power of visual representation. However, even an increase in OTU sampling and the simultaneous use of a more conservative probability measure do not obviate the need to inspect the phylogenies of candidate genes to detect instances of unrecognized paralogy. Given the public availability of over 100 prokaryotic genomes, appropriate reference genomes can be selected in most instances to distinguish differential loss of paralogs from horizontal gene transfer events.</p>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <p>The methodology of obtaining QuartOPs for four genomes is described in <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>. For each sequence in a QuartOP we detect the top-scoring BLAST <abbrgrp><abbr bid="B27">27</abbr></abbrgrp> hit with an E-value below 10<sup>-8 </sup>in each of 60 completely sequenced archaeal and bacterial reference genomes (<it>Aeropyrum pernix</it>, <it>Archaeoglobus fulgidus, Anabaena </it>sp., <it>Aquifex aeolicus, Agrobacterium tumefaciens, Borrelia burgdorferi, Bradyrhizobium japonicum, Bifidobacterium longum, Bacillus subtilis, Brucella suis, Buchnera </it>sp., <it>Clostridium acetobutylicum, Caulobacter crescentus, Corynebacterium glutamicum, Campylobacter jejuni, Chlamydophila pneumoniae, Deinococcus radiodurans, Escherichia coli </it>K12, <it>Fusobacterium nucleatum, Halobacterium </it>sp., <it>Haemophilus influenzae, Helicobacter pylori, Leptospira interrogans, Lactococcus lactis, Listeria monocytogenes, Lactobacillus plantarum, Mycoplasma genitalium, Methanococcus jannaschii, Methanopyrus kandleri, Mesorhizobium loti, Methanosarcina mazei, Methanobacterium thermoautotrophicum, Mycobacterium tuberculosis, Neisseria meningitidis, Oceanobacillus iheyensis, Pseudomonas aeruginosa, Pyrobaculum aerophilum, Pyrococcus horikoshii, Pasteurella multocida, Rickettsia conorii, Ralstonia solanacearum, Staphylococcus aureus, Streptomyces coelicolor, Sinorhizobium meliloti, Shewanella oneidensis, Sulfolobus solfataricus, Salmonella typhi, Synechocystis </it>sp., <it>Thermoplasma acidophilum, Thermosynechococcus elongatus, Thermotoga maritima, Treponema pallidum, Thermoanaerobacter tengcongensis, Tropheryma whipplei, Ureaplasma urealyticum, Vibrio cholerae, Wigglesworthia brevipalpis, Xanthomonas campestris, Xylella fastidiosa, Yersinia pestis</it>). These genomes were downloaded from the NCBI web page <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>. The resulting sequences are added to the QuartOP dataset and duplicated sequences are eliminated. 
The datasets are aligned with ClustalW <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>, and 100 bootstrap samples are generated using the SEQBOOT program from the PHYLIP package version 3.6a2.1 <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>. The distances are generated using TREE-PUZZLE version 5.1 <abbrgrp><abbr bid="B31">31</abbr></abbrgrp> under the auto-detected substitution model. Neighbor-Joining trees are calculated from these distances using NEIGHBOR from the PHYLIP package version 3.6a2.1 <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>. The resulting trees are parsed with respect to which of the three four-taxa subtrees they contain (see figure <figr fid="F2">2</figr>) using an in-house Java program that utilizes PAL library classes <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>. The resulting bootstrap support vectors are plotted into barycentric coordinates using GNUPLOT version 3.7 <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>. Scripts for data manipulation were written in Perl and used many of the SEALS package subroutines <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>.</p>
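         <p>The last two steps of this procedure (scoring each bootstrap tree for the four-taxon subtree it contains, and placing the resulting support vector in a triangle) can be sketched as follows. This Python sketch is not the in-house Java program or the GNUPLOT script used here. It assumes that each bootstrap tree is already available as a collection of splits (sets of taxon names on one side of an internal branch), that all four QuartOP members are present in every tree, and that the triangle corners are placed at (0, 0), (1, 0) and (0.5, sqrt(3)/2); all function names are hypothetical.</p>

```python
import math


def quartet_topology(splits, a, b, c, d):
    """Return 0, 1 or 2 for the subtrees ab|cd, ac|bd, ad|bc.

    splits is an iterable of sets; each set holds the taxa on one side of
    an internal branch. None is returned if no split separates the quartet.
    """
    pairings = [({a, b}, {c, d}), ({a, c}, {b, d}), ({a, d}, {b, c})]
    for side in splits:
        for index, (left, right) in enumerate(pairings):
            left_in = left.issubset(side) and right.isdisjoint(side)
            right_in = right.issubset(side) and left.isdisjoint(side)
            if left_in or right_in:
                return index
    return None


def support_vector(bootstrap_trees, quartet):
    """Fraction of bootstrap trees containing each of the three subtrees."""
    counts = [0, 0, 0]
    for splits in bootstrap_trees:
        topology = quartet_topology(splits, *quartet)
        if topology is not None:
            counts[topology] += 1
    total = sum(counts)
    if total == 0:
        # Assumption: place a completely unresolved dataset at the centre.
        return (1 / 3, 1 / 3, 1 / 3)
    return tuple(count / total for count in counts)


def barycentric_to_xy(p1, p2, p3):
    """Map a support vector that sums to 1 onto an equilateral triangle."""
    x = p2 + 0.5 * p3
    y = (math.sqrt(3) / 2) * p3
    return x, y
```

         <p>In this sketch a dataset in which 75 of 100 bootstrap trees contain the first subtree and 25 contain the second yields the vector (0.75, 0.25, 0), which lies on the edge between the first two corners. Fully resolved Neighbor-Joining trees always contain exactly one of the three subtrees, so the three fractions sum to one.</p>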
         <p>The <it>Rhodobacter capsulatus </it>genome data were obtained from Integrated Genomics <abbrgrp><abbr bid="B35">35</abbr></abbrgrp>. Genome sequence for <it>Chlorobium tepidum </it>was downloaded from TIGR <abbrgrp><abbr bid="B36">36</abbr></abbrgrp>. The <it>Rhodopseudomonas palustris </it>genome was downloaded from JGI <abbrgrp><abbr bid="B37">37</abbr></abbrgrp>. Other genomes for the genome quartets were downloaded from NCBI <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>.</p>
         <p>The trees depicted in Figures <figr fid="F5">5</figr> through <figr fid="F8">8</figr> are neighbor-joining trees calculated using the NEIGHBOR program from PHYLIP version 3.6a2.1 <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>. The distances used in NEIGHBOR were calculated in TREE-PUZZLE version 5.1 <abbrgrp><abbr bid="B31">31</abbr></abbrgrp> with the option to correct for Among Site Rate Variation using a discrete approximation of a Gamma distribution with eight rate categories and estimating the shape parameter. The three indicated support values are bootstrap support values calculated from 100 bootstrap samples analyzed with NEIGHBOR from the distances calculated in TREE-PUZZLE, bootstrap support values calculated from 100 bootstrap samples analyzed with the PROTPARS program from PHYLIP version 3.6a2.1 <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>, and posterior probabilities as calculated with MrBayes version 3.0B4 <abbrgrp><abbr bid="B38">38</abbr></abbrgrp> (the analyses were performed independently three times, 200,000 generations each; the lowest posterior probability for the bipartition from the three runs is shown).</p>
         <p>For eighteen potential candidates for horizontal gene transfer between <it>Halobacterium </it>sp. and <it>Synechocystis </it>sp., or between <it>Aquifex aeolicus </it>and <it>Thermotoga maritima</it>, phylogenetic trees were calculated and inspected manually. The neighbor-joining trees were calculated using the NEIGHBOR program from PHYLIP version 3.6a2.1 <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>. The distances used in NEIGHBOR were calculated in TREE-PUZZLE version 5.1 <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>. The trees were evaluated for potential transfers between Bacteria and Archaea, and between <it>Thermotoga </it>and <it>Aquifex</it>. 100 bootstrap samples were analyzed to assess the reliability of the branches on the tree. The possibility of a transfer was considered "strong" if the bootstrap support was above 70%, "weak" if the bootstrap support was lower, and "none" if no indication of a transfer could be inferred from the phylogenetic tree. Compositional bias for <it>Halobacterium </it>sp. and its closest phylogenetic neighbor was evaluated using a chi-square test at a 5% significance level as implemented in TREE-PUZZLE version 5.1. If both sequences failed the test, this is indicated as "strong" in the table. The results of these analyses are summarized in Table <tblr tid="T1">1</tblr>. The phylogenetic trees are available as additional data (see <supplr sid="S1">1</supplr>).</p>
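         <p>The idea behind the compositional bias test can be illustrated with a generic chi-square comparison of the amino acid counts of one sequence against background frequencies. The Python sketch below is a simplified illustration and does not reproduce the TREE-PUZZLE implementation. It assumes that the background is estimated from all sequences of a non-empty dataset, that all 20 amino acids occur in the background (19 degrees of freedom, 5% critical value 30.144), and that gaps and ambiguous residues are ignored; all names are hypothetical.</p>

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Upper 5% point of the chi-square distribution with 19 degrees of freedom.
CHI2_CRITICAL_5PCT_DF19 = 30.144


def background_frequencies(sequences):
    """Amino acid frequencies pooled over all sequences of a dataset."""
    counts = Counter(
        residue
        for sequence in sequences
        for residue in sequence.upper()
        if residue in AMINO_ACIDS
    )
    total = sum(counts.values())
    return {residue: counts[residue] / total for residue in AMINO_ACIDS}


def composition_chi2(sequence, background):
    """Chi-square statistic of observed versus expected residue counts."""
    counts = Counter(r for r in sequence.upper() if r in AMINO_ACIDS)
    length = sum(counts.values())
    statistic = 0.0
    for residue in AMINO_ACIDS:
        expected = length * background.get(residue, 0.0)
        if expected > 0:
            statistic += (counts[residue] - expected) ** 2 / expected
    return statistic


def fails_composition_test(sequence, background,
                           critical=CHI2_CRITICAL_5PCT_DF19):
    """True if the sequence composition deviates at the 5% level."""
    return composition_chi2(sequence, background) > critical
```

         <p>In terms of this sketch, the "strong" entry in the compositional bias column corresponds to both the <it>Halobacterium </it>sequence and its closest neighbor exceeding the critical value.</p>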
         <suppl id="S1">
            <title>
               <p>Additional File 1</p>
            </title>
            <text>
               <p>Phylogenetic trees for the datasets presented in Table <tblr tid="T1">1</tblr>. The trees are in PDF format and can be viewed with Adobe Acrobat Reader <url>http://www.acrobat.com</url>. The eighteen trees are archived in one file, trees.zip, and the archive can be expanded using WinZip <url>http://www.winzip.com/</url> for Windows, StuffIt for Macintosh <url>http://www.stuffit.com/</url>, or the unzip utility for Unix. The tree files are named after their ID number listed in Table <tblr tid="T1">1</tblr>. The names of the OTUs in the trees are the first 10 symbols of the genus name (see Materials and Methods for the list of the full genome names).</p>
            </text>
            <file name="1471-2164-4-37-S1.zip">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S2">
            <title>
               <p>Additional File 2</p>
            </title>
            <text>
               <p>Bootstrap support maps for non-extended datasets for genome quartets presented in figures <figr fid="F1">1</figr> and <figr fid="F3">3</figr>. The maps are in the PDF format that can be viewed with Adobe Acrobat Reader <url>http://www.acrobat.com</url>.</p>
            </text>
            <file name="1471-2164-4-37-S2.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
      </sec>
      <sec>
         <st>
            <p>Abbreviations</p>
         </st>
         <p>LBA &#8211; Long Branch Attraction</p>
         <p>HGT &#8211; Horizontal Gene Transfer</p>
         <p>ML &#8211; Maximum Likelihood</p>
         <p>QuartOP &#8211; Quartet of Orthologous Proteins</p>
         <p>OTU &#8211; Operational Taxonomic Unit</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>This work was supported through the NASA Astrobiology Institute at Arizona State University, the NASA Exobiology Program, and in part through the NSF Microbial Genetics Program.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>The mosaic nature of the eukaryotic nucleus</p>
            </title>
            <aug>
               <au>
                  <snm>Ribeiro</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Golding</snm>
                  <fnm>GB</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>1998</pubdate>
            <volume>15</volume>
            <fpage>779</fpage>
            <lpage>788</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">9656480</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Whole-genome analysis of photosynthetic prokaryotes</p>
            </title>
            <aug>
               <au>
                  <snm>Raymond</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Zhaxybayeva</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Gogarten</snm>
                  <fnm>JP</fnm>
               </au>
               <au>
                  <snm>Gerdes</snm>
                  <fnm>SY</fnm>
               </au>
               <au>
                  <snm>Blankenship</snm>
                  <fnm>RE</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2002</pubdate>
            <volume>298</volume>
            <fpage>1616</fpage>
            <lpage>1620</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi" link="fulltext">10.1126/science.1075558</pubid>
                  <pubid idtype="pmpid" link="fulltext">12446909</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Evolution of photosynthetic prokaryotes: a maximum-likelihood mapping approach</p>
            </title>
            <aug>
               <au>
                  <snm>Raymond</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Zhaxybayeva</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Gogarten</snm>
                  <fnm>JP</fnm>
               </au>
               <au>
                  <snm>Blankenship</snm>
                  <fnm>RE</fnm>
               </au>
            </aug>
            <source>Philos Trans R Soc Lond B Biol Sci</source>
            <pubdate>2003</pubdate>
            <volume>358</volume>
            <fpage>223</fpage>
            <lpage>230</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi" link="fulltext">10.1098/rstb.2002.1181</pubid>
                  <pubid idtype="pmpid" link="fulltext">12594930</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Bootstrap, Bayesian probability and maximum likelihood mapping: Exploring new tools for comparative genome analyses</p>
            </title>
            <aug>
               <au>
                  <snm>Zhaxybayeva</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Gogarten</snm>
                  <fnm>JP</fnm>
               </au>
            </aug>
            <source>BMC Genomics</source>
            <pubdate>2002</pubdate>
            <volume>3</volume>
            <fpage>4</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid" link="fulltext">100357</pubid>
                  <pubid idtype="pmpid" link="fulltext">11918828</pubid>
                  <pubid idtype="doi" link="fulltext">10.1186/1471-2164-3-4</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Defining the core of nontransferable prokaryotic genes: the euryarchaeal core</p>
            </title>
            <aug>
               <au>
                  <snm>Nesbo</snm>
                  <fnm>CL</fnm>
               </au>
               <au>
                  <snm>Boucher</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Doolittle</snm>
                  <fnm>WF</fnm>
               </au>
            </aug>
            <source>J Mol Evol</source>
            <pubdate>2001</pubdate>
            <volume>53</volume>
            <fpage>340</fpage>
            <lpage>350</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi" link="fulltext">10.1007/s002390010224</pubid>
                  <pubid idtype="pmpid" link="fulltext">11675594</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Cases in which parsimony and compatibility methods will be positively misleading</p>
            </title>
            <aug>
               <au>
                  <snm>Felsenstein</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Syst Zool</source>
            <pubdate>1978</pubdate>
            <volume>27</volume>
            <fpage>401</fpage>
            <lpage>410</lpage>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Likelihood and Inconsistency</p>
            </title>
            <aug>
               <au>
                  <snm>Farris</snm>
                  <fnm>JS</fnm>
               </au>
            </aug>
            <source>Cladistics</source>
            <pubdate>1999</pubdate>
            <volume>15</volume>
            <fpage>199</fpage>
            <lpage>204</lpage>
            <xrefbib>
               <pubid idtype="doi" link="fulltext">10.1006/clad.1999.0104</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Success of Parsimony in the Four-Taxon Case: Long-Branch Repulsion by Likelihood in the Farris Zone</p>
            </title>
            <aug>
               <au>
                  <snm>Siddall</snm>
                  <fnm>ME</fnm>
               </au>
            </aug>
            <source>Cladistics</source>
            <pubdate>1998</pubdate>
            <volume>14</volume>
            <fpage>209</fpage>
            <lpage>220</lpage>
            <xrefbib>
               <pubid idtype="doi" link="fulltext">10.1006/clad.1998.0063</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Phylogenies from molecular sequences: inference and reliability</p>
            </title>
            <aug>
               <au>
                  <snm>Felsenstein</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Annu Rev Genet</source>
            <pubdate>1988</pubdate>
            <volume>22</volume>
            <fpage>521</fpage>
            <lpage>565</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi" link="fulltext">10.1146/annurev.ge.22.120188.002513</pubid>
                  <pubid idtype="pmpid" link="fulltext">3071258</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>The robustness of two phylogenetic methods: four-taxon simulations reveal a slight superiority of maximum likelihood over neighbor joining</p>
            </title>
            <aug>
               <au>
                  <snm>Huelsenbeck</snm>
                  <fnm>JP</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>1995</pubdate>
            <volume>12</volume>
            <fpage>843</fpage>
            <lpage>849</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">7476130</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Success of Phylogenetic Methods in the Four-Taxon Case</p>
            </title>
            <aug>
               <au>
                  <snm>Huelsenbeck</snm>
                  <fnm>JP</fnm>
               </au>
               <au>
                  <snm>Hillis</snm>
                  <fnm>DM</fnm>
               </au>
            </aug>
            <source>Syst Biol</source>
            <pubdate>1993</pubdate>
            <volume>42</volume>
            <fpage>247</fpage>
            <lpage>264</lpage>
         </bibl>
         <bibl id="B12">
            <title>
               <p>The pitfalls of molecular phylogeny based on four species as illustrated by the Cetacea/Artiodactyla relationships</p>
            </title>
            <aug>
               <au>
                  <snm>Philippe</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Douzery</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>J Mamm Evol</source>
            <pubdate>1994</pubdate>
            <volume>2</volume>
            <fpage>133</fpage>
            <lpage>152</lpage>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Instability of quartet analyses of molecular sequence data by the maximum likelihood method: the Cetacea/Artiodactyla relationships</p>
            </title>
            <aug>
               <au>
                  <snm>Adachi</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Hasegawa</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Mol Phylogenet Evol</source>
            <pubdate>1996</pubdate>
            <volume>6</volume>
            <fpage>72</fpage>
            <lpage>76</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi" link="fulltext">10.1006/mpev.1996.0059</pubid>
                  <pubid idtype="pmpid" link="fulltext">8812307</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Is it better to add taxa or characters to a difficult phylogenetic problem?</p>
            </title>
            <aug>
               <au>
                  <snm>Graybeal</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Syst Biol</source>
            <pubdate>1998</pubdate>
            <volume>47</volume>
            <fpage>9</fpage>
            <lpage>17</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi" link="fulltext">10.1080/106351598260996</pubid>
                  <pubid idtype="pmpid" link="fulltext">12064243</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Is sparse taxon sampling a problem for phylogenetic inference?</p>
            </title>
            <aug>
               <au>
                  <snm>Hillis</snm>
                  <fnm>DM</fnm>
               </au>
               <au>
                  <snm>Pollock</snm>
                  <fnm>DD</fnm>
               </au>
               <au>
                  <snm>McGuire</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Zwickl</snm>
                  <fnm>DJ</fnm>
               </au>
            </aug>
            <source>Syst Biol</source>
            <pubdate>2003</pubdate>
            <volume>52</volume>
            <fpage>124</fpage>
            <lpage>126</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi" link="fulltext">10.1080/10635150309356</pubid>
                  <pubid idtype="pmpid" link="fulltext">12554446</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Taxon sampling, bioinformatics, and phylogenomics</p>
            </title>
            <aug>
               <au>
                  <snm>Rosenberg</snm>
                  <fnm>MS</fnm>
               </au>
               <au>
                  <snm>Kumar</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Syst Biol</source>
            <pubdate>2003</pubdate>
            <volume>52</volume>
            <fpage>119</fpage>
            <lpage>124</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi" link="fulltext">10.1080/10635150309344</pubid>
                  <pubid idtype="pmpid" link="fulltext">12554445</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Likelihood-mapping: a simple method to visualize phylogenetic content of a sequence alignment</p>
            </title>
            <aug>
               <au>
                  <snm>Strimmer</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>von Haeseler</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci U S A</source>
            <pubdate>1997</pubdate>
            <volume>94</volume>
            <fpage>6815</fpage>
            <lpage>6819</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid" link="fulltext">21241</pubid>
                  <pubid idtype="pmpid" link="fulltext">9192648</pubid>
                  <pubid idtype="doi" link="fulltext">10.1073/pnas.94.13.6815</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Eric Weisstein's World of Mathematics</p>
            </title>
            <url>http://mathworld.wolfram.com/</url>
         </bibl>
         <bibl id="B19">
            <title>
               <p>Ancient Phylogenetic Relationships</p>
            </title>
            <aug>
               <au>
                  <snm>Gribaldo</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Philippe</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Theor Popul Biol</source>
            <pubdate>2002</pubdate>
            <volume>61</volume>
            <fpage>391</fpage>
            <lpage>408</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi" link="fulltext">10.1006/tpbi.2002.1593</pubid>
                  <pubid idtype="pmpid" link="fulltext">12167360</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>Monophyletic origins of the metazoa: an evolutionary link with fungi</p>
            </title>
            <aug>
               <au>
                  <snm>Wainright</snm>
                  <fnm>PO</fnm>
               </au>
               <au>
                  <snm>Hinkle</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Sogin</snm>
                  <fnm>ML</fnm>
               </au>
               <au>
                  <snm>Stickel</snm>
                  <fnm>SK</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1993</pubdate>
            <volume>260</volume>
            <fpage>340</fpage>
            <lpage>342</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">8469985</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Horizontal Gene Transfer Accelerates Genome Innovation and Evolution</p>
            </title>
            <aug>
               <au>
                  <snm>Jain</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Rivera</snm>
                  <fnm>MC</fnm>
               </au>
               <au>
                  <snm>Moore</snm>
                  <fnm>JE</fnm>
               </au>
               <au>
                  <snm>Lake</snm>
                  <fnm>JA</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>2003</pubdate>
         </bibl>
         <bibl id="B22">
            <title>
               <p>Horizontal gene transfer in prokaryotes: quantification and classification</p>
            </title>
            <aug>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
               <au>
                  <snm>Makarova</snm>
                  <fnm>KS</fnm>
               </au>
               <au>
                  <snm>Aravind</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>Annu Rev Microbiol</source>
            <pubdate>2001</pubdate>
            <volume>55</volume>
            <fpage>709</fpage>
            <lpage>742</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi" link="fulltext">10.1146/annurev.micro.55.1.709</pubid>
                  <pubid idtype="pmpid" link="fulltext">11544372</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>Genome sequence of Halobacterium species NRC-1</p>
            </title>
            <aug>
               <au>
                  <snm>Ng</snm>
                  <fnm>WV</fnm>
               </au>
               <au>
                  <snm>Kennedy</snm>
                  <fnm>SP</fnm>
               </au>
               <au>
                  <snm>Mahairas</snm>
                  <fnm>GG</fnm>
               </au>
               <au>
                  <snm>Berquist</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Pan</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Shukla</snm>
                  <fnm>HD</fnm>
               </au>
               <au>
                  <snm>Lasky</snm>
                  <fnm>SR</fnm>
               </au>
               <au>
                  <snm>Baliga</snm>
                  <fnm>NS</fnm>
               </au>
               <au>
                  <snm>Thorsson</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Sbrogna</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Swartzell</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Weir</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Hall</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Dahl</snm>
                  <fnm>TA</fnm>
               </au>
               <au>
                  <snm>Welti</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Goo</snm>
                  <fnm>YA</fnm>
               </au>
               <au>
                  <snm>Leithauser</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Keller</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Cruz</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Danson</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Hough</snm>
                  <fnm>DW</fnm>
               </au>
               <au>
                  <snm>Maddocks</snm>
                  <fnm>DG</fnm>
               </au>
               <au>
                  <snm>Jablonski</snm>
                  <fnm>PE</fnm>
               </au>
               <au>
                  <snm>Krebs</snm>
                  <fnm>MP</fnm>
               </au>
               <au>
                  <snm>Angevine</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Dale</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Isenbarger</snm>
                  <fnm>TA</fnm>
               </au>
               <au>
                  <snm>Peck</snm>
                  <fnm>RF</fnm>
               </au>
               <au>
                  <snm>Pohlschroder</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Spudich</snm>
                  <fnm>JL</fnm>
               </au>
               <au>
                  <snm>Jung</snm>
                  <fnm>KW</fnm>
               </au>
               <au>
                  <snm>Alam</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Freitas</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Hou</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Daniels</snm>
                  <fnm>CJ</fnm>
               </au>
               <au>
                  <snm>Dennis</snm>
                  <fnm>PP</fnm>
               </au>
               <au>
                  <snm>Omer</snm>
                  <fnm>AD</fnm>
               </au>
               <au>
                  <snm>Ebhardt</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Lowe</snm>
                  <fnm>TM</fnm>
               </au>
               <au>
                  <snm>Liang</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Riley</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Hood</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>DasSarma</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci U S A</source>
            <pubdate>2000</pubdate>
            <volume>97</volume>
            <fpage>12176</fpage>
            <lpage>12181</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid" link="fulltext">17314</pubid>
                  <pubid idtype="pmpid" link="fulltext">11016950</pubid>
                  <pubid idtype="doi" link="fulltext">10.1073/pnas.190337797</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>Homology: a personal view on some of the problems</p>
            </title>
            <aug>
               <au>
                  <snm>Fitch</snm>
                  <fnm>WM</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>2000</pubdate>
            <volume>16</volume>
            <fpage>227</fpage>
            <lpage>231</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi" link="fulltext">10.1016/S0168-9525(00)02005-9</pubid>
                  <pubid idtype="pmpid" link="fulltext">10782117</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Comparison of Bayesian and maximum likelihood bootstrap measures of phylogenetic reliability</p>
            </title>
            <aug>
               <au>
                  <snm>Douady</snm>
                  <fnm>CJ</fnm>
               </au>
               <au>
                  <snm>Delsuc</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Boucher</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Doolittle</snm>
                  <fnm>WF</fnm>
               </au>
               <au>
                  <snm>Douzery</snm>
                  <fnm>EJ</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>2003</pubdate>
            <volume>20</volume>
            <fpage>248</fpage>
            <lpage>254</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi" link="fulltext">10.1093/molbev/msg042</pubid>
                  <pubid idtype="pmpid" link="fulltext">12598692</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>Bayes or bootstrap? A simulation study comparing the performance of Bayesian Markov chain Monte Carlo sampling and bootstrapping in assessing phylogenetic confidence</p>
            </title>
            <aug>
               <au>
                  <snm>Alfaro</snm>
                  <fnm>ME</fnm>
               </au>
               <au>
                  <snm>Zoller</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Lutzoni</snm>
                  <fnm>F</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>2003</pubdate>
            <volume>20</volume>
            <fpage>255</fpage>
            <lpage>266</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi" link="fulltext">10.1093/molbev/msg028</pubid>
                  <pubid idtype="pmpid" link="fulltext">12598693</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>Gapped BLAST and PSI-BLAST: a new generation of protein database search programs</p>
            </title>
            <aug>
               <au>
                  <snm>Altschul</snm>
                  <fnm>SF</fnm>
               </au>
               <au>
                  <snm>Madden</snm>
                  <fnm>TL</fnm>
               </au>
               <au>
                  <snm>Schaffer</snm>
                  <fnm>AA</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Lipman</snm>
                  <fnm>DJ</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1997</pubdate>
            <volume>25</volume>
            <fpage>3389</fpage>
            <lpage>3402</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid" link="fulltext">146917</pubid>
                  <pubid idtype="pmpid" link="fulltext">9254694</pubid>
                  <pubid idtype="doi" link="fulltext">10.1093/nar/25.17.3389</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>National Center for Biotechnology Information</p>
            </title>
            <url>http://www.ncbi.nlm.nih.gov/</url>
         </bibl>
         <bibl id="B29">
            <title>
               <p>CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice</p>
            </title>
            <aug>
               <au>
                  <snm>Thompson</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Higgins</snm>
                  <fnm>DG</fnm>
               </au>
               <au>
                  <snm>Gibson</snm>
                  <fnm>TJ</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1994</pubdate>
            <volume>22</volume>
            <fpage>4673</fpage>
            <lpage>4680</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">7984417</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>PHYLIP (Phylogeny Inference Package) version 3.6</p>
            </title>
            <aug>
               <au>
                  <snm>Felsenstein</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Distributed by the author. Department of Genetics, University of Washington, Seattle</source>
            <pubdate>1993</pubdate>
         </bibl>
         <bibl id="B31">
            <title>
               <p>TREE-PUZZLE: maximum likelihood phylogenetic analysis using quartets and parallel computing</p>
            </title>
            <aug>
               <au>
                  <snm>Schmidt</snm>
                  <fnm>HA</fnm>
               </au>
               <au>
                  <snm>Strimmer</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Vingron</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>von Haeseler</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2002</pubdate>
            <volume>18</volume>
            <fpage>502</fpage>
            <lpage>504</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi" link="fulltext">10.1093/bioinformatics/18.3.502</pubid>
                  <pubid idtype="pmpid" link="fulltext">11934758</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>PAL: an object-oriented programming library for molecular evolution and phylogenetics</p>
            </title>
            <aug>
               <au>
                  <snm>Drummond</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Strimmer</snm>
                  <fnm>K</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2001</pubdate>
            <volume>17</volume>
            <fpage>662</fpage>
            <lpage>663</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi" link="fulltext">10.1093/bioinformatics/17.7.662</pubid>
                  <pubid idtype="pmpid" link="fulltext">11448888</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B33">
            <title>
               <p>GNUPLOT Central</p>
            </title>
            <url>http://www.gnuplot.info</url>
         </bibl>
         <bibl id="B34">
            <title>
               <p>SEALS: a system for easy analysis of lots of sequences</p>
            </title>
            <aug>
               <au>
                  <snm>Walker</snm>
                  <fnm>DR</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
            </aug>
            <source>ISMB</source>
            <pubdate>1997</pubdate>
            <volume>5</volume>
            <fpage>333</fpage>
            <lpage>339</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">9322058</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B35">
            <title>
               <p>Integrated Genomics</p>
            </title>
            <url>http://www.integratedgenomics.com/</url>
         </bibl>
         <bibl id="B36">
            <title>
               <p>The Institute for Genomic Research</p>
            </title>
            <url>http://www.tigr.org</url>
         </bibl>
         <bibl id="B37">
            <title>
               <p>DOE Joint Genome Institute</p>
            </title>
            <url>http://www.jgi.doe.gov/JGI_microbial/html/index.html</url>
         </bibl>
         <bibl id="B38">
            <title>
               <p>MRBAYES: Bayesian inference of phylogenetic trees</p>
            </title>
            <aug>
               <au>
                  <snm>Huelsenbeck</snm>
                  <fnm>JP</fnm>
               </au>
               <au>
                  <snm>Ronquist</snm>
                  <fnm>F</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2001</pubdate>
            <volume>17</volume>
            <fpage>754</fpage>
            <lpage>755</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi" link="fulltext">10.1093/bioinformatics/17.8.754</pubid>
                  <pubid idtype="pmpid" link="fulltext">11524383</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>
