<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>gb-2003-4-8-115</ui>
   <ji>GBJ</ji>
   <fm>
      <dochead>Opinion</dochead>
      <bibl>
         <title>
            <p>Comparative genomics of archaea: how much have we learned in six years, and what's next?</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Makarova</snm>
               <mi>S</mi>
               <fnm>Kira</fnm>
               <insr iid="I1"/>
            </au>
            <au id="A2" ca="yes">
               <snm>Koonin</snm>
               <mi>V</mi>
               <fnm>Eugene</fnm>
               <insr iid="I1"/>
               <email>koonin@ncbi.nlm.nih.gov</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA</p>
            </ins>
         </insg>
         <source>Genome Biology</source>
         <issn>1465-6906</issn>
         <pubdate>2003</pubdate>
         <volume>4</volume>
         <issue>8</issue>
         <fpage>115</fpage>
         <url>http://genomebiology.com/2003/4/8/115</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="doi">10.1186/gb-2003-4-8-115</pubid>
               <pubid idtype="pmpid">12914651</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <pub>
            <date>
               <day>16</day>
               <month>7</month>
               <year>2003</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2003</year>
         <collab>BioMed Central Ltd</collab>
      </cpyrt>
      <shorttitle>
         <p>Comparative genomics of archaea: how much have we learned in six years, and what's next?</p>
      </shorttitle>
      <shortabs>
         <p>With 16 complete archaeal genomes sequenced to date, comparative genomics has revealed a conserved core of 313 genes that are represented in all sequenced archaeal genomes, plus a variable 'shell' that is prone to lineage-specific gene loss and horizontal gene exchange.</p>
      </shortabs>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <p>Archaea comprise one of the three distinct domains of life (with bacteria and eukaryotes). With 16 complete archaeal genomes sequenced to date, comparative genomics has revealed a conserved core of 313 genes that are represented in all sequenced archaeal genomes, plus a variable 'shell' that is prone to lineage-specific gene loss and horizontal gene exchange. The majority of archaeal genes have not been experimentally characterized, but novel functional pathways have been predicted.</p>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="BMC" subtype="man_spc_id" id="30010014">Microbiology and parasitology</classification>
         <classification type="BMC" subtype="man_spc_id" id="30010008">Evolution</classification>
         <classification type="BMC" subtype="man_spc_id" id="30010010">Genome studies</classification>
         <classification type="BMC" subtype="man_spc_id" id="30010002">Bioinformatics</classification>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p/>
         </st>
         <p>"A phylogenetic analysis based upon ribosomal RNA sequence characterization reveals that living systems represent one of three aboriginal lines of descent: (i) the eubacteria, comprising all typical bacteria; (ii) the archaebacteria, containing methanogenic bacteria; and (iii) the urkaryotes, now represented in the cytoplasmic component of eukaryotic cells."</p>
         <p>
            <it>CR Woese and GE Fox, 1977 </it>
            <abbrgrp>
               <abbr bid="B1">1</abbr>
            </abbrgrp>
         </p>
      </sec>
      <sec>
         <st>
            <p>Archaea before and after genomes</p>
         </st>
         <p>The quotation above neatly summarizes what is arguably one of the most important scientific discoveries of the twentieth century (rather remarkably, this quote is the entire abstract of Woese and Fox's groundbreaking article <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>). So profound are its implications that the debate rages to this day: did Carl Woese and George Fox really discover a new domain of life, which is equal in status to bacteria and eukaryotes <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr></abbrgrp>, or is it 'merely' an unusual branch of bacteria <abbrgrp><abbr bid="B4">4</abbr><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr></abbrgrp>? This debate is reflected even in the different names that, 25 years after their description as a distinct, third line of the evolution of life, are still applied to this group of organisms: on the one hand, archaea, in adherence to the three-domain interpretation, and on the other, archaebacteria, emphasizing the purported affinity with bacteria. Of course, Woese and Fox did not actually discover these unusual organisms; some of the would-be archaea had been known for decades, and their unusual properties, such as extreme halophilic and extreme thermophilic phenotypes, had been described in considerable detail (see, for example, <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr></abbrgrp>). The revolutionary aspect of Woese and Fox's work was subtler and more profound: by comparing certain parts of the genomic sequences of various organisms, they came up with a three-domain classification of life, in which a group of prokaryotes they designated archaebacteria was accorded the status of a distinct domain (subsequently renamed archaea, to emphasize the fundamental separation from other domains), on an equal footing with bacteria and eukaryotes. Numerous microbiologists had seen archaea before, but without Woese and Fox's foray into genome analysis no one recognized these organisms for what they really were. Their way of comparing genome sequences was, by today's standards, extremely crude, as they analyzed not even sequences but oligonucleotide catalogues of rRNA genes. It is all the more astounding that the principal conclusion achieved with this 'primitive' approach stands to this day, 25 years and 16 complete (and several more nearly complete) archaeal genome sequences later (Table <tblr tid="T1">1</tblr>).</p>
         <tbl id="T1">
            <title>
               <p>Table 1</p>
            </title>
            <caption>
               <p>Completely sequenced archaeal genomes</p>
            </caption>
            <tblbdy cols="8">
               <r>
                  <c ca="left">
                     <p>
                        Species
                     </p>
                  </c>
                  <c ca="left">
                     <p>
                        Abbreviation
                     </p>
                  </c>
                  <c ca="left">
                     <p>
                        Optimal growth temperature (&#176;C)
                     </p>
                  </c>
                  <c ca="left">
                     <p>
                        Lifestyle and other features
                     </p>
                  </c>
                  <c ca="center">
                     <p>
                        Number of proteins*
                     </p>
                  </c>
                  <c ca="center">
                     <p>
                        Number (%) of proteins in COGs
                     </p>
                  </c>
                  <c ca="center">
                     <p>
                        Date of genome release
                     </p>
                  </c>
                  <c ca="center">
                     <p>
                        Reference
                     </p>
                  </c>
               </r>
               <r>
                  <c cspan="8">
                     <hr/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <b>Euryarchaeota</b>
                     </p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <it>Archaeoglobus fulgidus DSM</it>
                     </p>
                  </c>
                  <c ca="left">
                     <p>
                        <b>Afu</b>
                     </p>
                  </c>
                  <c ca="center">
                     <p>83</p>
                  </c>
                  <c ca="left">
                     <p>Anaerobic, sulfate-reducing chemolitho- or chemorgano-autotroph, motile</p>
                  </c>
                  <c ca="center">
                     <p>2,420</p>
                  </c>
                  <c ca="center">
                     <p>1,953 (81%)</p>
                  </c>
                  <c ca="center">
                     <p>1997</p>
                  </c>
                  <c ca="center">
                     <p>
                        [124]
                     </p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <it>Halobacterium sp. NRC-1</it>
                     </p>
                  </c>
                  <c ca="left">
                     <p>
                        <b>Hsp</b>
                     </p>
                  </c>
                  <c ca="center">
                     <p>37</p>
                  </c>
                  <c ca="left">
                     <p>Aerobic chemorganotroph, obligate halophile, with a cell envelope; motile; two extrachromosomal elements</p>
                  </c>
                  <c ca="center">
                     <p>2,622</p>
                  </c>
                  <c ca="center">
                     <p>1,809 (69%)</p>
                  </c>
                  <c ca="center">
                     <p>2000</p>
                  </c>
                  <c ca="center">
                     <p>
                        [125]
                     </p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <it>Methanocaldococcus jannaschii</it>
                     </p>
                  </c>
                  <c ca="left">
                     <p>
                        <b>Mja</b>
                     </p>
                  </c>
                  <c ca="center">
                     <p>85</p>
                  </c>
                  <c ca="left">
                     <p>Chemolithoautotroph, strict anaerobe, methanogen, motile; two extrachromosomal elements</p>
                  </c>
                  <c ca="center">
                     <p>1,758</p>
                  </c>
                  <c ca="center">
                     <p>1,448 (82%)</p>
                  </c>
                  <c ca="center">
                     <p>1996</p>
                  </c>
                  <c ca="center">
                     <p>
                        [27]
                     </p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <it>Methanopyrus kandleri AV19</it>
                     </p>
                  </c>
                  <c ca="left">
                     <p>
                        <b>Mka</b>
                     </p>
                  </c>
                  <c ca="center">
                     <p>110</p>
                  </c>
                  <c ca="left">
                     <p>Chemolithoautotroph, strict anaerobe, methanogen, with high cellular salt concentration</p>
                  </c>
                  <c ca="center">
                     <p>1,691</p>
                  </c>
                  <c ca="center">
                     <p>1,253 (74%)</p>
                  </c>
                  <c ca="center">
                     <p>2002</p>
                  </c>
                  <c ca="center">
                     <p>
                        [45]
                     </p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <it>Methanosarcina acetivorans C2A</it>
                     </p>
                  </c>
                  <c ca="left">
                     <p>
                        <b>Mac</b>
                     </p>
                  </c>
                  <c ca="center">
                     <p>37</p>
                  </c>
                  <c ca="left">
                     <p>Chemolithoautotroph, anaerobe possibly capable of aerobic growth; nitrogen-fixing, versatile methanogen; motile, and able to form multicellular structures</p>
                  </c>
                  <c ca="center">
                     <p>4,540</p>
                  </c>
                  <c ca="center">
                     <p>3,142 (69%)</p>
                  </c>
                  <c ca="center">
                     <p>2002</p>
                  </c>
                  <c ca="center">
                     <p>
                        [55]
                     </p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <it>Methanosarcina mazei Goe1</it>
                     </p>
                  </c>
                  <c ca="left">
                     <p>
                        <b>Mma</b>
                     </p>
                  </c>
                  <c ca="center">
                     <p>37</p>
                  </c>
                  <c ca="left">
                     <p>As for <b>Mac</b></p>
                  </c>
                  <c ca="center">
                     <p>3,371</p>
                  </c>
                  <c ca="center">
                     <p>N/A</p>
                  </c>
                  <c ca="center">
                     <p>2002</p>
                  </c>
                  <c ca="center">
                     <p>
                        [54]
                     </p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <it>Methanothermobacter thermoautotrophicus delta H</it>
                     </p>
                  </c>
                  <c ca="left">
                     <p>
                        <b>Mth</b>
                     </p>
                  </c>
                  <c ca="center">
                     <p>65</p>
                  </c>
                  <c ca="left">
                     <p>Chemolithoautotroph, strict anaerobe, nitrogen-fixing, methanogen</p>
                  </c>
                  <c ca="center">
                     <p>1,873</p>
                  </c>
                  <c ca="center">
                     <p>1,500 (80%)</p>
                  </c>
                  <c ca="center">
                     <p>1997</p>
                  </c>
                  <c ca="center">
                     <p>
                       [126]
                     </p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <it>Pyrococcus horikoshii</it>
                     </p>
                  </c>
                  <c ca="left">
                     <p>
                        <b>Pho</b>
                     </p>
                  </c>
                  <c ca="center">
                     <p>96</p>
                  </c>
                  <c ca="left">
                     <p>Anaerobic heterotroph, sulfur enhances growth; motile</p>
                  </c>
                  <c ca="center">
                     <p>1,801</p>
                  </c>
                  <c ca="center">
                     <p>1,425 (79%)</p>
                  </c>
                  <c ca="center">
                     <p>1998</p>
                  </c>
                  <c ca="center">
                     <p>
                        [127]
                     </p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <it>Pyrococcus abyssi</it>
                     </p>
                  </c>
                  <c ca="left">
                     <p>
                        <b>Pab</b>
                     </p>
                  </c>
                  <c ca="center">
                     <p>96</p>
                  </c>
                  <c ca="left">
                     <p>As for <b>Pho</b></p>
                  </c>
                  <c ca="center">
                     <p>1,769</p>
                  </c>
                  <c ca="center">
                     <p>1,506 (85%)</p>
                  </c>
                  <c ca="center">
                     <p>2001</p>
                  </c>
                  <c ca="center">
                     <p>
                        [128]
                     </p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <it>Pyrococcus furiosus DSM 3638</it>
                     </p>
                  </c>
                  <c ca="left">
                     <p>
                        <b>Pfu</b>
                     </p>
                  </c>
                  <c ca="center">
                     <p>96</p>
                  </c>
                  <c ca="left">
                     <p>As for <b>Pho</b></p>
                  </c>
                  <c ca="center">
                     <p>2,065</p>
                  </c>
                  <c ca="center">
                     <p>N/A</p>
                  </c>
                  <c ca="center">
                     <p>2001</p>
                  </c>
                  <c ca="center">
                     <p>
                        [129]
                     </p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <it>Thermoplasma acidophilum</it>
                     </p>
                  </c>
                  <c ca="left">
                     <p>
                        <b>Tac</b>
                     </p>
                  </c>
                  <c ca="center">
                     <p>59</p>
                  </c>
                  <c ca="left">
                     <p>Facultative anaerobe, chemorganotroph, thermoacidophilic, anaerobically able to metabolize sulfur; motile, with a plasma membrane</p>
                  </c>
                  <c ca="center">
                     <p>1,482</p>
                  </c>
                  <c ca="center">
                     <p>1,261 (85%)</p>
                  </c>
                  <c ca="center">
                     <p>2000</p>
                  </c>
                  <c ca="center">
                     <p>
                        [96]
                     </p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <it>Thermoplasma volcanium</it>
                     </p>
                  </c>
                  <c ca="left">
                     <p>
                        <b>Tvo</b>
                     </p>
                  </c>
                  <c ca="center">
                     <p>60</p>
                  </c>
                  <c ca="left">
                     <p>As for <b>Tac</b></p>
                  </c>
                  <c ca="center">
                     <p>1,499</p>
                  </c>
                  <c ca="center">
                     <p>1,277 (85%)</p>
                  </c>
                  <c ca="center">
                     <p>2000</p>
                  </c>
                  <c ca="center">
                     <p>
                        [130]
                     </p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <b>Crenarchaeota</b>
                     </p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <it>Pyrobaculum aerophilum</it>
                     </p>
                  </c>
                  <c ca="left">
                     <p>
                        <b>Pae</b>
                     </p>
                  </c>
                  <c ca="center">
                     <p>100</p>
                  </c>
                  <c ca="left">
                     <p>Facultative nitrate-reducing anaerobe</p>
                  </c>
                  <c ca="center">
                     <p>1,840</p>
                  </c>
                  <c ca="center">
                     <p>1,236 (67%)</p>
                  </c>
                  <c ca="center">
                     <p>2002</p>
                  </c>
                  <c ca="center">
                     <p>
                        [131]
                     </p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <it>Aeropyrum pernix</it>
                     </p>
                  </c>
                  <c ca="left">
                     <p>
                        <b>Ape</b>
                     </p>
                  </c>
                  <c ca="center">
                     <p>90</p>
                  </c>
                  <c ca="left">
                     <p>Aerobic chemorganotroph; sulfur enhances growth</p>
                  </c>
                  <c ca="center">
                     <p>2,605</p>
                  </c>
                  <c ca="center">
                     <p>1,529 (59%)</p>
                  </c>
                  <c ca="center">
                     <p>1999</p>
                  </c>
                  <c ca="center">
                     <p>
                        [132]
                     </p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <it>Sulfolobus solfataricus</it>
                     </p>
                  </c>
                  <c ca="left">
                     <p>
                        <b>Sso</b>
                     </p>
                  </c>
                  <c ca="center">
                     <p>80</p>
                  </c>
                  <c ca="left">
                     <p>Aerobe metabolizing sulfur; thermo-acidophilic chemorganotroph; motile</p>
                  </c>
                  <c ca="center">
                     <p>2,977</p>
                  </c>
                  <c ca="center">
                     <p>2,207 (74%)</p>
                  </c>
                  <c ca="center">
                     <p>2001</p>
                  </c>
                  <c ca="center">
                     <p>
                        [97]
                     </p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <it>Sulfolobus tokodaii</it>
                     </p>
                  </c>
                  <c ca="left">
                     <p>
                        <b>Sto</b>
                     </p>
                  </c>
                  <c ca="center">
                     <p>80</p>
                  </c>
                  <c ca="left">
                     <p>As for <b>Sso</b></p>
                  </c>
                  <c ca="center">
                     <p>2,826</p>
                  </c>
                  <c ca="center">
                     <p>N/A</p>
                  </c>
                  <c ca="center">
                     <p>2001</p>
                  </c>
                  <c ca="center">
                     <p>
                        [133]
                     </p>
                  </c>
               </r>
            </tblbdy>
            <tblfn>
               <p>*According to the original genome annotation</p>
            </tblfn>
         </tbl>
         <p>In the years following Woese and Fox's breakthrough <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>, many unique features of archaea have become apparent. To begin with, many of these organisms thrive under conditions that, by the usual standards of biology, seem unimaginable, such as in water near the hydrothermal vents called 'black smokers', which is heated above the boiling point and saturated with hydrogen sulfide, or in extreme salinity <abbrgrp><abbr bid="B11">11</abbr><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr></abbrgrp>. In the most extreme hyperthermophilic habitats, archaea are, in fact, the only detectable life forms. In more moderate environments, archaea coexist with bacteria and eukaryotes, and their ecological importance is being increasingly recognized <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>. The first molecular biological studies showed that archaea are highly unusual and clearly distinct from bacteria at the molecular level. In particular, the structure of the membrane glycerolipids in archaea is different from that of bacterial and eukaryal cells, and archaea do not contain murein, the predominant component of bacterial cell walls <abbrgrp><abbr bid="B15">15</abbr><abbr bid="B16">16</abbr></abbrgrp>.</p>
         <p>But the most striking differences between archaea and bacteria are seen in the organization of their information-processing systems. The structures of ribosomes and chromatin, the presence of histones, and sequence similarity between proteins involved in translation, transcription, replication and DNA repair all point to a closer relationship between archaea and eukaryotes than between either of these and bacteria <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr><abbr bid="B20">20</abbr><abbr bid="B21">21</abbr></abbrgrp>. Moreover, the key components of the DNA replication machinery - such as the polymerases involved in elongation and initiation and the replicative helicases - are not homologous, or at least not orthologous, in archaea and eukaryotes on the one hand, and bacteria on the other <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B22">22</abbr></abbrgrp>. This observation led to the hypothesis that replication of double-stranded DNA as the principal form of replication of the genetic material was 'invented' twice, independently: once in bacteria and once in the ancestor of archaea and eukaryotes <abbrgrp><abbr bid="B22">22</abbr><abbr bid="B23">23</abbr></abbrgrp>. In contrast, many - although not all - of the metabolic pathways of archaea resemble their bacterial counterparts more closely than their eukaryotic ones <abbrgrp><abbr bid="B24">24</abbr><abbr bid="B25">25</abbr><abbr bid="B26">26</abbr></abbrgrp>. These studies support the status of archaea as a distinct domain of life with specific connections to eukaryotes, and emphasize the unusual and unique nature of archaeal genomes.</p>
         <p>The new age of archaea began in 1996 with the whole-genome shotgun sequencing of the first archaeal genome, that of <it>Methanococcus </it>(now <it>Methanocaldococcus</it>) <it>jannaschii </it><abbrgrp><abbr bid="B27">27</abbr></abbrgrp>. The <it>Methanococcus </it>'genomescape' at first looked largely mysterious, with clear functional assignments produced for only 38% of the genes <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>. A more detailed computational analysis that pushed the methodology available at the time to its limits yielded general functional predictions for up to 70% of the genes, showing that a solid connection between the genomes of archaea and those of other, better known forms of life did exist <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>. Nevertheless, the fact remained that, more than anything, the first sequenced archaeal genome revealed the depth of our ignorance of the biology of this remarkable group of organisms. Subsequent genome sequencing, while certainly less extensive than the devoted 'archaeologists' would wish, produced a rich sampling of genomes of taxonomically diverse archaea (Table <tblr tid="T1">1</tblr>). This set of completely sequenced genomes includes multiple representatives of the two major divisions of the archaea established by phylogenetic analysis of rRNA, namely the Euryarchaeota and the Crenarchaeota <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>, and covers the principal ecological types of archaea: hyperthermophiles, moderate thermophiles and mesophiles, as well as halophiles and methanogens. Autotrophic and heterotrophic forms, and anaerobes and aerobes, are also represented by multiple species (Table <tblr tid="T1">1</tblr>).</p>
         <p>Some potentially important branches of archaea are still missing from sequence databases, however, such as the mysterious Korarchaeota, which might have branched off the trunk of the phylogenetic tree prior to the divergence of the remainder of the archaea <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>, and the equally intriguing Nanoarchaea that so far seem to have the smallest genomes of all known cellular life forms <abbrgrp><abbr bid="B29">29</abbr><abbr bid="B30">30</abbr></abbrgrp>. These lacunae notwithstanding, the available sampling of archaeal genomes is substantial and is complemented by an even greater diversity of bacterial and eukaryotic genomes that are available for comparative analysis. This article critically assesses the contribution of comparative genomics to our understanding of the functional systems of archaeal cells and their evolution. We pose the following question: what have we learned from comparisons of archaeal genomes that could not easily have been learned by other, more traditional approaches? We suggest some tentative answers, as we see them. What follows is a viewpoint from behind a computer terminal; we realize that, from the experimenter's bench, the perspective might be somewhat different.</p>
      </sec>
      <sec>
         <st>
            <p>Evolutionary archaeogenomics</p>
         </st>
         <p>From the beginning of comparative genomics, it has been obvious that genome comparisons will yield valuable functional and evolutionary information only within a framework of the rational classification of genes and proteins. In our view, perhaps the most natural form of such a classification is a system of orthologous gene sets, which allows a researcher to analyze the evolutionary fate of each individual gene <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>. Orthologs are homologous genes that evolved from a single ancestral gene in the last common ancestor of the compared genomes, whereas paralogs are genes related via duplication within a genome <abbrgrp><abbr bid="B32">32</abbr><abbr bid="B33">33</abbr><abbr bid="B34">34</abbr></abbrgrp>. When one or more duplications follow speciation, a family of paralogs in one species should be considered orthologous to the corresponding family in the other species <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>. Insofar as orthologous relationships are correctly defined, phyletic (or phylogenetic) patterns of orthologous gene sets help in the prediction of gene functions and provide clues to the prevailing trends in genome evolution (a phyletic pattern is defined, simply, as the pattern of representation of genomes in each orthologous set) <abbrgrp><abbr bid="B26">26</abbr><abbr bid="B31">31</abbr><abbr bid="B35">35</abbr><abbr bid="B36">36</abbr></abbrgrp>. These phyletic patterns are captured in the database of Clusters of Orthologous Groups of proteins (COGs) <abbrgrp><abbr bid="B37">37</abbr></abbrgrp>, and here we use COGs for a systematic survey of archaeal genomes (most of the phyletic pattern analyses can be done directly on the COG website by using the phyletic pattern search tool <abbrgrp><abbr bid="B38">38</abbr></abbrgrp>).</p>
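         <p>The definition of a phyletic pattern given above can be made concrete with a minimal sketch. It assumes each orthologous set is available as a plain set of genome abbreviations. The COG identifiers, the genome order and the membership data below are invented for illustration, and the function names are hypothetical rather than part of any COG software.</p>

```python
from typing import Dict, List, Set


def phyletic_pattern(members: Set[str], genomes: List[str]) -> str:
    """Encode presence ('1') or absence ('0') of a COG in each genome, in order."""
    return "".join("1" if genome in members else "0" for genome in genomes)


def group_by_pattern(cogs: Dict[str, Set[str]], genomes: List[str]) -> Dict[str, List[str]]:
    """Map every observed phyletic pattern to the sorted list of COGs that show it."""
    groups: Dict[str, List[str]] = {}
    for cog_id, members in sorted(cogs.items()):
        groups.setdefault(phyletic_pattern(members, genomes), []).append(cog_id)
    return groups


# Toy data (hypothetical): four genomes and four made-up COGs.
GENOMES = ["Afu", "Hsp", "Mja", "Sso"]
TOY_COGS = {
    "COG_A": {"Afu", "Hsp", "Mja", "Sso"},
    "COG_B": {"Afu", "Mja"},
    "COG_C": {"Afu", "Hsp", "Mja", "Sso"},
    "COG_D": {"Hsp"},
}

patterns = group_by_pattern(TOY_COGS, GENOMES)
for pattern, cog_ids in sorted(patterns.items(), reverse=True):
    print(pattern, len(cog_ids), cog_ids)
```

         <p>Ranking the groups by size gives the kind of 'most common patterns' summary that a table of phyletic patterns reports; the all-ones pattern corresponds to a conserved core.</p>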
         <p>The most common phyletic patterns found in archaea are shown in Table <tblr tid="T2">2</tblr>. Not unpredictably, the top pattern consists of the 313 COGs that are represented in all archaeal genomes sequenced so far. What is more remarkable is that this apparent conserved core of archaeal genomes has undergone only limited shrinkage since the time it was first defined by comparative analysis of four archaeal genomes <abbrgrp><abbr bid="B39">39</abbr></abbrgrp> (Figure <figr fid="F1">1</figr>). Extrapolating from the effect (or rather the near lack thereof) of the latest additions to the collection of archaeal genomes on the size of the conserved core of archaeal genes, we are compelled to conclude that around 300 genes are shared by all archaea, encode essential functions and have not been subject to non-orthologous gene displacement during archaeal evolution (non-orthologous gene displacement is a widespread phenomenon whereby a gene responsible for an essential function is displaced by an unrelated or distantly related gene responsible for the same function <abbrgrp><abbr bid="B40">40</abbr></abbrgrp>).</p>
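         <p>How a conserved core behaves as genomes accumulate can be sketched as a running set intersection. The example below is illustrative only: the COG labels and genome contents are invented, the order of genomes is a hypothetical one, and <it>core_sizes</it> is a name introduced here, not a function of any existing tool.</p>

```python
from typing import Dict, List, Set


def core_sizes(genome_cogs: Dict[str, Set[str]], order: List[str]) -> List[int]:
    """Return the number of COGs shared by all genomes after each one is added."""
    sizes: List[int] = []
    core: Set[str] = set()
    for index, genome in enumerate(order):
        if index == 0:
            core = set(genome_cogs[genome])  # the first genome defines the starting set
        else:
            core = core.intersection(genome_cogs[genome])
        sizes.append(len(core))
    return sizes


# Toy data (hypothetical): each genome is a set of made-up COG labels.
TOY_GENOME_COGS = {
    "Mja": {"c1", "c2", "c3", "c4", "c5"},
    "Mth": {"c1", "c2", "c3", "c4", "c6"},
    "Afu": {"c1", "c2", "c3", "c7"},
    "Ape": {"c1", "c2", "c3"},
}
ORDER = ["Mja", "Mth", "Afu", "Ape"]

sizes = core_sizes(TOY_GENOME_COGS, ORDER)
print(sizes)
```

         <p>In this toy case the shared set shrinks with the first additions and then stops changing, which is the qualitative behaviour that motivates extrapolating a stable core size.</p>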
         <tbl id="T2">
            <title>
               <p>Table 2</p>
            </title>
            <caption>
               <p>The top 15 phyletic patterns in proteins from archaea</p>
            </caption>
            <tblbdy cols="3">
               <r>
                  <c ca="left">
                     <p>
                        Pattern*
                     </p>
                     <p>
                        <graphic file="gb-2003-4-8-115-i1.gif"/>
                     </p>
                  </c>
                  <c ca="center">
                     <p>
                        Number of COGs (and of the complementary pattern, CP)
                     </p>
                  </c>
                  <c ca="left">
                     <p>
                        Comments and examples
                     </p>
                  </c>
               </r>
               <r>
                  <c cspan="3">
                     <hr/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <graphic file="gb-2003-4-8-115-i2.gif"/>
                     </p>
                  </c>
                  <c ca="center">
                     <p><b>313 </b>(0)</p>
                  </c>
                  <c ca="left">
                     <p>Archaeal core, including 200 COGs present in both <b>B</b><sup>&#8224; </sup>and <b>E</b>, 34 present in at least one <b>B</b>, 63 present in at least one <b>E</b>, 16 unique for <b>A</b></p>
                     <p><b>CP: </b>Only COG0564, pseudouridylate synthase, 23S RNA-specific pseudouridylate synthase present in all <b>E </b>(in which it has an apparently mitochondrial origin) and <b>B</b>, but not in <b>A</b>. In all <b>A </b>another specific pseudouridylate synthase is present (COG1258)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <graphic file="gb-2003-4-8-115-i3.gif"/>
                     </p>
                  </c>
                  <c ca="center">
                     <p><b>163 </b>(3)</p>
                  </c>
                  <c ca="left">
                     <p>This pattern reflects a large number of genes acquired via HGT<sup>&#8224; </sup>in <b>Mac </b>(see [55]), including F<sub>0</sub>F<sub>1</sub>-type ATP synthase and NADH<b>:</b>ubiquinone oxidoreductase, and a specific signal transduction system based on several apoptosis-related domains</p>
                     <p><b>CP: </b>The small number of such COGs indicates that the archaeal core is almost fully conserved in <b>Mac</b></p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <graphic file="gb-2003-4-8-115-i4.gif"/>
                     </p>
                  </c>
                  <c ca="center">
                     <p><b>79 </b>(14)</p>
                  </c>
                  <c ca="left">
                     <p>This pattern reflects a substantial amount of HGT in <b>Hsp</b>; see [125]</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <graphic file="gb-2003-4-8-115-i5.gif"/>
                     </p>
                  </c>
                  <c ca="center">
                     <p><b>47 </b>(7)</p>
                  </c>
                  <c ca="left">
                     <p>This pattern consists of COGs that include four methanogens and <b>Afu</b>; these organisms specifically share several metabolic pathways (see [45]). The set includes subunits of coenzyme F420-reducing hydrogenase, formylmethanofuran dehydrogenase, CO dehydrogenase/acetyl-CoA synthase and other enzymes of energy metabolism. These might have originally evolved in methanogens and subsequently been transferred to <b>Afu</b>.</p>
                     <p><b>CP: </b>Sugar ABC transporter and some fatty acid biosynthesis enzymes are missing from methanogens and <b>Afu</b></p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <graphic file="gb-2003-4-8-115-i6.gif"/>
                     </p>
                  </c>
                  <c ca="center">
                     <p><b>40 </b>(2)</p>
                  </c>
                  <c ca="left">
                     <p>This pattern is specific for four methanogens, including unique pathways for coenzyme M biosynthesis and reduction and 14 uncharacterized proteins, many of which are likely to be unique enzymes involved in biosynthesis of other specific coenzymes and their utilization</p>
                     <p><b>CP: </b>COG2096, cob(I)alamin adenosyltransferase and COG1058, predicted nucleotide-utilizing enzyme related to molybdopterin-biosynthesis enzyme MoeA, for which functional substitutes remain to be identified</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <graphic file="gb-2003-4-8-115-i7.gif"/>
                     </p>
                  </c>
                  <c ca="center">
                     <p><b>33 </b>(16)</p>
                  </c>
                  <c ca="left">
                     <p>A pattern specific for thermophilic methanogens (<b>Mth, Mja </b>and <b>Mka</b>), comprising mostly uncharacterized COGs; it includes a specific membrane complex EhaA-EhaP (approximately 18 components) involved in hydrogen production and possibly electron transfer [45,134]</p>
                     <p><b>CP: </b>Specific gene loss: peptide ABC-type transporter, NADH: ubiquinone oxidoreductase, malic enzyme (COG0281), and cysteinyl-tRNA synthetase (COG0215; see text)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <graphic file="gb-2003-4-8-115-i8.gif"/>
                     </p>
                  </c>
                  <c ca="center">
                     <p><b>28 </b>(6)</p>
                  </c>
                  <c ca="left">
                     <p>This pattern reflects a substantial amount of HGT in <b>Sso</b>, including several enzymes of carbohydrate metabolism (beta-glucosidase, alpha-L-fucosidase, and malto-oligosyl trehalose synthase) [97]</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <graphic file="gb-2003-4-8-115-i9.gif"/>
                     </p>
                  </c>
                  <c ca="center">
                     <p><b>27 </b>(1)</p>
                  </c>
                  <c ca="left">
                     <p>This reflects a substantial amount of HGT in <b>Afu</b></p>
                     <p><b>CP: </b>COG0449, glucosamine 6-phosphate synthetase, which catalyzes the first step in hexosamine metabolism. A functional substitute remains to be identified</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <graphic file="gb-2003-4-8-115-i10.gif"/>
                     </p>
                  </c>
                  <c ca="center">
                     <p><b>25 </b>(4)</p>
                  </c>
                  <c ca="left">
                     <p>A pattern specific for two mesophilic archaea, probably resulting from independent HGT</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <graphic file="gb-2003-4-8-115-i11.gif"/>
                     </p>
                  </c>
                  <c ca="center">
                     <p><b>23 </b>(7)</p>
                  </c>
                  <c ca="left">
                     <p>This pattern includes genes that might have been acquired via HGT in <b>Mja</b>, in particular three enzymes of biotin biosynthesis: pimeloyl-CoA synthetase (COG1424), dethiobiotin synthetase (COG0132), and adenosylmethionine-8-amino-7-oxononanoate aminotransferase (COG0161)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <graphic file="gb-2003-4-8-115-i12.gif"/>
                     </p>
                  </c>
                  <c ca="center">
                     <p><b>21 </b>(13)</p>
                  </c>
                  <c ca="left">
                     <p>A crenarchaea-specific pattern, including 11 COGs that do not have orthologs outside this lineage. Among genes shared with bacteria but not euryarchaeota are three subunits of aerobic-type CO dehydrogenase and CO dehydrogenase maturation factor. Genes specifically shared with eukaryotes are three ribosomal proteins (S30, S25 and L13E)</p>
                     <p><b>CP: </b>Euryarchaea-specific pattern, including two subunits of archaeal DNA polymerase II and ERCC4-like helicase, division GTPase FtsZ (COG0206) and ATP-dependent protease LonB (COG1067) plus six COGs that do not have orthologs outside this lineage</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <graphic file="gb-2003-4-8-115-i13.gif"/>
                     </p>
                  </c>
                  <c ca="center">
                     <p><b>20 </b>(0)</p>
                  </c>
                  <c ca="left">
                     <p>Apparent independent HGT to <b>Mac </b>and <b>Afu</b></p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <graphic file="gb-2003-4-8-115-i14.gif"/>
                     </p>
                  </c>
                  <c ca="center">
                     <p><b>19 </b>(16)</p>
                  </c>
                  <c ca="left">
                     <p>Apparent specific gene loss in the <it>Thermoplasma </it>lineage: two subunits of topoisomerase VI (COG1389, 1697), adenylate cyclase of class 2 (COG1437), and predicted exosome subunits (COG1325, COG1931)</p>
                     <p><b>CP: </b>genes apparently acquired via HGT in <it>Thermoplasma</it>, including bacterial nucleoid DNA-binding protein HU (COG0776). See also [96]</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <graphic file="gb-2003-4-8-115-i15.gif"/>
                     </p>
                  </c>
                  <c ca="center">
                     <p><b>18 </b>(6)</p>
                  </c>
                  <c ca="left">
                     <p>Apparent gene loss in <b>Ape</b>, including 9 enzymes of purine biosynthesis [135].</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <graphic file="gb-2003-4-8-115-i16.gif"/>
                     </p>
                  </c>
                  <c ca="center">
                     <p><b>17 </b>(11)</p>
                  </c>
                  <c ca="left">
                     <p>Apparent HGT in <it>Pyrococci</it>. Includes two subunits of allophanate hydrolase (COG1984, 2049), two enzymes of carbohydrate metabolism, &#946;-galactosidase (COG1874) and endoglucanase (COG2730)</p>
                     <p><b>CP: </b>Specific gene loss in the <it>Pyrococcus </it>lineage includes five enzymes of heme biosynthesis</p>
                  </c>
               </r>
            </tblbdy>
            <tblfn>
               <p>*The pattern of appearance within the 13 sequenced archaeal species currently available in the COG database. Species abbreviations are as given in Table <tblr tid="T1">1</tblr> and are written vertically. <sup>&#8224;</sup>Abbreviations: <b>A</b>, archaea; <b>B</b>, bacteria; <b>E</b>, eukaryotes; <b>CP</b>, complementary pattern; HGT, horizontal gene transfer.</p>
            </tblfn>
         </tbl>
         <fig id="F1">
            <title>
               <p>Figure 1</p>
            </title>
            <caption>
               <p>The archaeal gene core: changes resulting from the appearance of new genome sequences</p>
            </caption>
            <text>
               <p>The archaeal gene core: changes resulting from the appearance of new genome sequences. Black bars indicate the current set of pan-archaeal genes (313 COGs); gray indicates COGs that are not part of the current pan-archaeal core but are seen to be conserved after the addition of the given genome sequence. The genomes are listed from left to right in chronological order of release of the complete sequence; species name abbreviations are as in Table <tblr tid="T1">1</tblr>.</p>
            </text>
            <graphic file="gb-2003-4-8-115-1"/>
         </fig>
         <p>Of the COGs represented in all archaea, 16 so far have no members from other domains of life and comprise a unique archaeal genomic signature, whereas 61 are exclusively archaeo-eukaryotic. The majority of the pan-archaeal genes are known to be involved in, or are implicated in, information processing, particularly translation and RNA modification (Figure <figr fid="F2">2</figr>). Strikingly, among the 61 COGs that are uniquely shared by archaea and eukaryotes, only two do not, technically, belong to the information-processing machinery (COG1936, a nucleotide kinase, and COG3642, a protein kinase typically fused to a metalloprotease domain); the 10 uncharacterized COGs in this category consist of proteins whose predicted biochemical activity (GTPase, methyltransferase or RNA-binding protein) suggests a role in translation or RNA modification.</p>
         <fig id="F2">
            <title>
               <p>Figure 2</p>
            </title>
            <caption>
               <p>Functional breakdown of genes within the conserved archaeal core</p>
            </caption>
            <text>
               <p>Functional breakdown of genes within the conserved archaeal core. 'Universal' indicates genes with orthologs in both bacteria and eukaryotes; 'eukaryotic', genes with orthologs only in eukaryotes; 'bacterial', genes with orthologs only in bacteria; 'archaeal', genes without non-archaeal orthologs. The data on orthology and functional classification are derived from the COGs.</p>
            </text>
            <graphic file="gb-2003-4-8-115-2"/>
         </fig>
         <p>Thus, phyletic pattern analysis strongly supports the identity of archaea as a distinct group of organisms with a stable, conserved core of genes that primarily encodes proteins involved in the replication and expression of the genome. Furthermore, there is clearly a subset of genes, again primarily associated with information processing, that is shared by archaea and eukaryotes, to the exclusion of bacteria; this is compatible with the archaeo-eukaryotic affinity suggested by phylogenetic analyses of rRNA and proteins involved in translation, transcription and replication. The fact that this archaeo-eukaryotic component is quantitatively small, however, shows that the process of evolution has been more complex than simple vertical inheritance and has involved extensive horizontal gene transfer (HGT) between archaea and bacteria, at least outside the core gene set <abbrgrp><abbr bid="B24">24</abbr><abbr bid="B25">25</abbr><abbr bid="B41">41</abbr></abbrgrp>. An intensely mixing pool of genes coding for metabolic enzymes, structural components of the cell and other proteins outside the central information-processing machinery might have existed after the divergence of bacteria and archaea but prior to the separation of the major archaeal and bacterial lineages.</p>
         <p>More recent HGT, which has emerged as a major aspect of prokaryotic evolution in general <abbrgrp><abbr bid="B26">26</abbr><abbr bid="B42">42</abbr><abbr bid="B43">43</abbr><abbr bid="B44">44</abbr></abbrgrp>, was apparently prominent in all archaea, although gene exchange with bacteria seems to have been much less extensive in hyperthermophiles than in mesophiles such as <it>Methanosarcina </it>or even <it>Halobacterium </it><abbrgrp><abbr bid="B44">44</abbr><abbr bid="B45">45</abbr></abbrgrp>. Apparent preferential HGT has been noticed between archaea and hyperthermophilic bacteria, such as <it>Aquifex </it>and <it>Thermotoga</it>; when compared to bacterial mesophiles these bacteria have many more proteins with greater similarity to archaeal than to bacterial homologs <abbrgrp><abbr bid="B46">46</abbr><abbr bid="B47">47</abbr></abbrgrp>. With HGT, or more precisely the pivotal role of HGT in evolution, remaining a controversial subject <abbrgrp><abbr bid="B48">48</abbr></abbrgrp>, this conclusion has been disputed on the grounds that <it>Aquifex </it>and <it>Thermotoga </it>might be early-branching bacteria retaining ancestral features in many protein sequences <abbrgrp><abbr bid="B49">49</abbr></abbrgrp>. But this argument seems untenable simply because of the obvious split of the gene complements of these bacteria into 'garden variety' bacterial genes and 'archaeal' genes <abbrgrp><abbr bid="B50">50</abbr></abbrgrp>. The reality of horizontal gene flow from archaea to thermophilic bacteria becomes even more tangible upon examination of the proteins encoded in the genome of <it>Thermoanaerobacter tengcongensis </it><abbrgrp><abbr bid="B51">51</abbr><abbr bid="B52">52</abbr></abbrgrp>, which contains many more 'archaeal' genes than appear in other bacteria of the <it>Bacillus-Clostridium </it>group and to which the early-branching argument would not apply.</p>
         <p>Although archaeal hyperthermophiles do not appear to have many genes acquired via HGT from bacteria, at least after the divergence of the archaeal lineages, horizontal gene exchange between archaea themselves might have been extensive. Strikingly, even within the conserved core of archaeal genes, major diversity of phylogenetic tree topologies has been observed (<abbrgrp><abbr bid="B53">53</abbr></abbrgrp> and Y.I. Wolf and E.V.K., unpublished observations). As noted by Nesbo and coworkers <abbrgrp><abbr bid="B53">53</abbr></abbrgrp>, "the notion that there is a core of nontransferable genes...has not been proven and may be unprovable". These findings do not invalidate the notion of a core of indispensable genes that are conserved across archaea but suggest that xenologous gene displacement is widespread, whereby an essential gene is displaced by an ortholog from a distant lineage, typically via an intermediate stage of redundancy <abbrgrp><abbr bid="B44">44</abbr></abbrgrp>.</p>
         <p>Other phyletic patterns that are common among archaea seem primarily to reflect HGT or gene loss prevalent in individual archaeal lineages (Table <tblr tid="T2">2</tblr>). Thus, <it>Methanosarcina</it>, a mesophile with by far the largest genome among the sequenced archaeal genomes, is represented in numerous COGs that have no other archaeal members but are present in various groups of bacteria. This organism, which coexists with a diverse bacterial biota, appears to be a veritable sink for horizontally acquired bacterial genes <abbrgrp><abbr bid="B54">54</abbr><abbr bid="B55">55</abbr></abbrgrp>. Similar, if less dramatic, evidence of apparent horizontal gene transfer was seen in <it>Halobacterium</it>, <it>Sulfolobus</it>, and <it>A. fulgidus </it>(Table <tblr tid="T2">2</tblr>; <abbrgrp><abbr bid="B44">44</abbr></abbrgrp>). Of further note are the patterns of genes that are ubiquitous in one of the major branches of archaea, namely Euryarchaeota or Crenarchaeota, but are missing from the other branch. While quantitatively small, the set of euryarchaea-specific genes includes those for several crucial cellular functions, such as the two subunits of DNA polymerase II and the FtsZ GTPase that is required for cell division in Euryarchaeota and bacteria but missing from Crenarchaeota and eukaryotes.</p>
         <p>Phyletic patterns can be used for interesting and potentially useful forays into functional genomics - more specifically for the identification of the genomic cognates of particular phenotypes. The most dramatic phenotypic characteristic of archaea is hyperthermophily, and attempts have been made to use the phyletic pattern approach to identify a gene set typical of hyperthermophiles. Strikingly, there is only one COG that is represented in all hyperthermophiles (both bacteria and archaea) but not in any other sequenced genomes, the reverse gyrase (<abbrgrp><abbr bid="B56">56</abbr></abbrgrp>; COG1110). Reverse gyrase consists of a topoisomerase and a helicase domain and functions to introduce positive supercoiling into DNA; this activity is apparently required for DNA replication and gene expression at extremely high temperatures <abbrgrp><abbr bid="B57">57</abbr></abbrgrp>. But 'clean' phyletic patterns that have an unequivocal association with a given phenotype are an exception rather than the rule, so flexible pattern selection approaches have been employed. Our recent analysis of phyletic patterns enriched in archaeal and bacterial hyperthermophiles yielded around 60 COGs potentially related to this phenotype <abbrgrp><abbr bid="B58">58</abbr></abbrgrp>. About one quarter of these COGs encode parts of a predicted DNA repair system that is largely characteristic of thermophiles (<abbrgrp><abbr bid="B59">59</abbr></abbrgrp> and see below). The remaining COGs in this set suggest the existence of a transcriptional regulator that might be involved in adaptation to hyperthermal environments, and a distinct class of enzymes, the <it>S</it>-adenosyl methionine (SAM)-radical enzymes, whose chemistry is likely to be particularly efficient under these conditions <abbrgrp><abbr bid="B58">58</abbr></abbrgrp>. Finally, a substantial number of COGs are specific for methanogens or shared by the methanogens and <it>A. fulgidus </it>(Table <tblr tid="T2">2</tblr> and <abbrgrp><abbr bid="B45">45</abbr></abbrgrp>). Many of these include known or predicted enzymes involved in methanogenesis and associated metabolic pathways <abbrgrp><abbr bid="B45">45</abbr><abbr bid="B60">60</abbr></abbrgrp>; others remain to be characterized and are likely to encode additional components of these pathways.</p>
         <p>Further functional and evolutionary information can be extracted from complementary phyletic patterns, which are the signature of non-orthologous gene displacement <abbrgrp><abbr bid="B26">26</abbr><abbr bid="B61">61</abbr></abbrgrp>. Although the complementarity is, most often, only partial, owing to redundancy in some species, several cases of near-perfect complementarity among archaea are notable, such as the two classes of unrelated lysyl-tRNA synthetases <abbrgrp><abbr bid="B62">62</abbr><abbr bid="B63">63</abbr></abbrgrp>, and two forms of thymidylate synthase that are also unrelated to each other <abbrgrp><abbr bid="B61">61</abbr><abbr bid="B64">64</abbr></abbrgrp>. Below, when discussing functional genomics of the archaea, we return to the use of conserved and complementary phyletic patterns for functional prediction.</p>
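         <p>A search for perfectly complementary phyletic patterns can be sketched in a few lines, assuming that each COG has already been reduced to a presence/absence string over a fixed genome order. The identifiers and patterns below are invented, and the helper names are hypothetical. The sketch handles only exact complementarity; partial complementarity caused by redundancy in some species would need a tolerance for mismatches that is not shown here.</p>

```python
from typing import Dict, List, Tuple


def complement(pattern: str) -> str:
    """Flip every position of a presence/absence pattern."""
    return "".join("0" if state == "1" else "1" for state in pattern)


def complementary_pairs(patterns: Dict[str, str]) -> List[Tuple[str, str]]:
    """List pairs of COGs whose patterns are exact complements of each other."""
    by_pattern: Dict[str, List[str]] = {}
    for cog_id, pattern in patterns.items():
        by_pattern.setdefault(pattern, []).append(cog_id)
    pairs: List[Tuple[str, str]] = []
    for cog_id, pattern in sorted(patterns.items()):
        for other in sorted(by_pattern.get(complement(pattern), [])):
            if other > cog_id:  # report each unordered pair only once
                pairs.append((cog_id, other))
    return pairs


# Toy data (hypothetical): patterns over four genomes for made-up COGs.
TOY_PATTERNS = {
    "COG_W": "1010",
    "COG_X": "1100",
    "COG_Y": "0011",
    "COG_Z": "1111",
}

candidate_displacements = complementary_pairs(TOY_PATTERNS)
print(candidate_displacements)
```

         <p>Pairs found in this way are only candidates for non-orthologous gene displacement; whether two such COGs perform the same function has to be established from independent evidence.</p>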
      </sec>
      <sec>
         <st>
            <p>Genome-wide phylogeny of archaea and reconstruction of archaeal ancestors</p>
         </st>
         <p>Comparative genomics nowadays includes a new variety of phylogenetic analysis, which for short has been dubbed genome-tree construction. Under this approach, phylogenetic trees are built not from the sequences of a single gene (such as an rRNA) but from concatenated sequences of multiple genes (proteins), from other, integral measures of the evolutionary distance between genomes (for example, the median of the distribution of evolutionary rates between orthologs), or from non-sequence-based measures such as the similarity of gene repertoire and gene orders <abbrgrp><abbr bid="B65">65</abbr></abbrgrp>. Generally, it appears that trees produced from concatenated alignments of gene products that are not particularly prone to HGT yield the best resolution <abbrgrp><abbr bid="B66">66</abbr><abbr bid="B67">67</abbr><abbr bid="B68">68</abbr></abbrgrp>. All genome-tree analyses unequivocally supported the monophyly of archaea and the monophyly of Crenarchaeota. Beyond that, however, the genome-tree topology is not necessarily compatible with that of rRNA-based trees. Thus, genome-tree analysis cast doubt on the bifurcation of Euryarchaeota and Crenarchaeota being the first split in archaeal evolution; in some of these analyses, <it>Halobacterium </it>and <it>Thermoplasma </it>branch off first, suggesting that Crenarchaeota are a highly derived lineage that evolved from within Euryarchaeota <abbrgrp><abbr bid="B66">66</abbr></abbrgrp>. The same versions of genome-trees strongly suggest monophyly of methanogens, which is compatible with their distinct gene repertoire and life style <abbrgrp><abbr bid="B45">45</abbr></abbrgrp>; but alternative trees constructed from concatenated multiple alignments of a different assortment of translation machinery components support the original divergence of Crenarchaeota and Euryarchaeota but reject the monophyly of methanogens <abbrgrp><abbr bid="B21">21</abbr><abbr bid="B69">69</abbr></abbrgrp>. 
It appears that a robust phylogeny of archaea will require many additional genome sequences and perhaps also further refinement of phylogenetic methods dealing with long branches and with large amounts of data. The reconstruction of the best approximation of archaeal phylogeny is of interest not so much in and of itself, but more in terms of clarifying the tempo and mode of evolution of this remarkable group of organisms. A definitive tree topology will help answer fundamental questions, such as whether methanogenesis evolved only once or several times, whether the role of histones in chromatin formation is ancestral or derived in the archaeo-eukaryotic lineage, and even the exact evolutionary relationship between archaea and eukaryotes.</p>
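         <p>One of the non-sequence-based measures mentioned above, similarity of gene repertoire, can be illustrated with a simple distance between genomes. The sketch below uses the Jaccard distance between sets of COGs. This choice is an assumption made for illustration and is not necessarily the measure used in the genome-tree studies cited here. Genome names and COG labels are invented.</p>

```python
from itertools import combinations
from typing import Dict, Set, Tuple


def jaccard_distance(first: Set[str], second: Set[str]) -> float:
    """Return 1 minus the fraction of COGs shared out of all COGs in either genome."""
    union = first.union(second)
    if not union:
        return 0.0  # two empty repertoires are treated as identical
    return 1.0 - len(first.intersection(second)) / len(union)


def distance_table(genome_cogs: Dict[str, Set[str]]) -> Dict[Tuple[str, str], float]:
    """Compute the distance for every unordered pair of genomes."""
    return {
        (a, b): jaccard_distance(genome_cogs[a], genome_cogs[b])
        for a, b in combinations(sorted(genome_cogs), 2)
    }


# Toy data (hypothetical): three genomes with made-up COG labels.
TOY_REPERTOIRES = {
    "g1": {"c1", "c2", "c3", "c4"},
    "g2": {"c1", "c2", "c3", "c5"},
    "g3": {"c1", "c6"},
}

distances = distance_table(TOY_REPERTOIRES)
for pair, value in sorted(distances.items()):
    print(pair, round(value, 3))
```

         <p>A table of this kind could be passed to any distance-based tree-building method. Because gene repertoires are shaped by gene loss and HGT as well as by descent, such trees need not agree with trees built from concatenated alignments.</p>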
         <p>Phylogenetic trees can also be employed for reconstruction of the gene sets of ancestral life forms. Given a species tree topology and phyletic patterns of the maximum possible number of orthologous gene sets (or COGs), the most parsimonious evolutionary scenario, which includes the minimum possible number of elementary events, can be reconstructed using various parsimony algorithms <abbrgrp><abbr bid="B70">70</abbr><abbr bid="B71">71</abbr></abbrgrp>. The elementary events included in this type of analysis are gene gain and gene loss. Gene gain in a given lineage may occur either as emergence of new genes (COGs), primarily via duplication with subsequent radical divergence, or as HGT from other lineages. The relative likelihood of gene loss and gene gain (the gain penalty) substantially affects the reconstructed evolutionary scenario and the gene composition of the reconstructed ancestral genomes - but this parameter is a major unknown. Nevertheless, examination of the gene sets for the last universal common ancestor (LUCA) derived with different gain penalties showed, perhaps rather unexpectedly, that the assumption of equal probabilities of gains and losses (a gain penalty of 1) yields a reasonable reconstruction of the main functional systems of the cell <abbrgrp><abbr bid="B71">71</abbr></abbrgrp>.</p>
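         <p>The effect of the gain penalty can be illustrated for a single gene with a small dynamic-programming sketch in the style of Sankoff parsimony. It is not the algorithm of the cited studies. It assumes a rooted binary tree written as nested tuples, a loss cost of 1 and a gain cost equal to the gain penalty. It also assumes, purely for illustration, that a gene present in the ancestor at the root is charged one gain for its original emergence. The tree, leaf names and function names are hypothetical.</p>

```python
from typing import Set, Tuple, Union

Tree = Union[str, Tuple["Tree", "Tree"]]  # a leaf name or a pair of subtrees


def subtree_costs(tree: Tree, present: Set[str], gain_penalty: float) -> Tuple[float, float]:
    """Return (cost if the gene is absent at this node, cost if it is present)."""
    if isinstance(tree, str):
        infinity = float("inf")
        # A leaf state is observed, so the other state is ruled out.
        return (infinity, 0.0) if tree in present else (0.0, infinity)
    cost_absent = 0.0
    cost_present = 0.0
    for child in tree:
        child_absent, child_present = subtree_costs(child, present, gain_penalty)
        # Node lacks the gene: the child also lacks it, or gains it on the branch.
        cost_absent += min(child_absent, child_present + gain_penalty)
        # Node has the gene: the child keeps it, or loses it on the branch (cost 1).
        cost_present += min(child_present, child_absent + 1.0)
    return cost_absent, cost_present


def present_in_ancestor(tree: Tree, present: Set[str], gain_penalty: float = 1.0) -> bool:
    """Decide whether presence at the root is the strictly cheaper scenario."""
    cost_absent, cost_present = subtree_costs(tree, present, gain_penalty)
    # Assumption: presence at the root costs one extra gain (emergence of the gene).
    return cost_absent > cost_present + gain_penalty


# Toy tree (hypothetical) with four leaves.
TOY_TREE = (("A", "B"), ("C", "D"))

print(present_in_ancestor(TOY_TREE, {"A", "B", "C", "D"}))
print(present_in_ancestor(TOY_TREE, {"A", "C"}, gain_penalty=1.0))
print(present_in_ancestor(TOY_TREE, {"A", "C"}, gain_penalty=3.0))
```

         <p>In this toy case a gene found in one leaf of each major branch is assigned to the ancestor only when gains are made more expensive than losses, which shows why the reconstructed ancestral gene set depends so strongly on this parameter.</p>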
         <p>We therefore applied our version of the weighted parsimony algorithm <abbrgrp><abbr bid="B70">70</abbr></abbrgrp>, with that assumption, to the updated set of bacterial, archaeal and eukaryotic genomes (also assuming the dichotomy of Euryarchaeota and Crenarchaeota suggested by rRNA trees and some of the genome-trees) and the results are schematically shown in Figure <figr fid="F3">3</figr> (see also additional data file). This reconstruction suggests that the common ancestor of archaea could have had around 900 genes, with substantial gene gain but only minimal gene loss compared to the more ancient common ancestor of the archaeo-eukaryotic lineage. Obviously, the conserved core of the pan-archaeal genes is a subset of the reconstructed ancestral gene set, but it seems striking that approximately two thirds of the ancestral genes have been lost from at least one of the sequenced archaeal genomes (Figure <figr fid="F3">3</figr>).</p>
         <fig id="F3">
            <title>
               <p>Figure 3</p>
            </title>
            <caption>
               <p>The most parsimonious scenario for the evolution of the main lineages of life</p>
            </caption>
            <text>
               <p>The most parsimonious scenario for the evolution of the main lineages of life. The red numbers in ovals near the internal nodes show the size of the reconstructed gene sets of the respective ancestral forms. Green numbers show gene gains and brown numbers gene losses assigned to each of the branches in the tree. LUCA, last universal common ancestor.</p>
            </text>
            <graphic file="gb-2003-4-8-115-3"/>
         </fig>
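         <p>To make the role of the gain penalty concrete, the following is a minimal sketch of a generic Sankoff-style weighted parsimony calculation for presence or absence of a gene family on a fixed species tree. It is not the implementation used for Figure 3: the four-leaf topology, the leaf labels, the gene-family names (cogA, cogB, cogC) and their phyletic patterns are hypothetical, a loss is assumed to cost 1 and a gain to cost the gain penalty, and ties at the root are arbitrarily resolved as absence.</p>
         <p>
```python
from typing import Dict, Set, Tuple, Union

# A tree is a leaf name or a pair of subtrees (strictly binary, hypothetical topology).
Tree = Union[str, Tuple["Tree", "Tree"]]
INF = float("inf")


def node_costs(tree: Tree, pattern: Dict[str, bool], gain_penalty: float) -> Tuple[float, float]:
    """Return (cost if gene absent, cost if gene present) at the top node of `tree`.

    Assumption of this sketch: a loss costs 1 and a gain costs `gain_penalty`.
    """
    if isinstance(tree, str):
        # Leaf: the observed state is free, the other state is impossible.
        return (INF, 0.0) if pattern[tree] else (0.0, INF)
    cost_absent = 0.0
    cost_present = 0.0
    for child in tree:
        child_absent, child_present = node_costs(child, pattern, gain_penalty)
        # Parent lacks the gene: the child also lacks it, or gained it on this branch.
        cost_absent += min(child_absent, child_present + gain_penalty)
        # Parent has the gene: the child kept it, or lost it on this branch.
        cost_present += min(child_present, child_absent + 1.0)
    return cost_absent, cost_present


def root_gene_set(tree: Tree, patterns: Dict[str, Dict[str, bool]],
                  gain_penalty: float = 1.0) -> Set[str]:
    """Gene families inferred to be present at the root of the tree."""
    inferred = set()
    for cog, pattern in patterns.items():
        cost_absent, cost_present = node_costs(tree, pattern, gain_penalty)
        if cost_absent > cost_present:  # ties are resolved as "absent"
            inferred.add(cog)
    return inferred


# Hypothetical topology and phyletic patterns, for illustration only.
TREE = ((("Eury", "Cren"), "Euk"), "Bact")
PATTERNS = {
    "cogA": {"Eury": True, "Cren": True, "Euk": True, "Bact": True},
    "cogB": {"Eury": True, "Cren": True, "Euk": False, "Bact": False},
    "cogC": {"Eury": True, "Cren": False, "Euk": False, "Bact": True},
}

print(sorted(root_gene_set(TREE, PATTERNS, gain_penalty=1.0)))  # ['cogA']
print(sorted(root_gene_set(TREE, PATTERNS, gain_penalty=2.0)))  # ['cogA', 'cogC']
```
         </p>
         <p>In this toy example, raising the gain penalty from 1 to 2 moves the patchily distributed family cogC into the root gene set, which illustrates why the choice of this parameter affects the reconstructed ancestral genomes.</p>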
      </sec>
      <sec>
         <st>
            <p>From genome comparisons to functional and structural genomics of the archaea</p>
         </st>
         <p>In the era of comparative genomics, experimental studies on a genomic scale lag woefully behind computational studies. The great majority of the genes in most species will never be studied experimentally, and our understanding of the biochemistry and physiology of the respective organisms therefore depends on the transfer of information from functionally characterized orthologs <abbrgrp><abbr bid="B26">26</abbr><abbr bid="B72">72</abbr></abbrgrp>. For both bacteria and eukaryotes, such transfer is facilitated by the availability of a vast body of experimental data on model organisms, such as <it>Escherichia coli</it>, <it>Bacillus subtilis</it>, the yeast <it>Saccharomyces cerevisiae </it>or the fruit fly <it>Drosophila melanogaster</it>. The situation is quite different for archaea because, some genetic studies of mesophilic archaeal species notwithstanding <abbrgrp><abbr bid="B73">73</abbr></abbrgrp>, there is, so far, no satisfactory model system; this results primarily from the fact that most of these organisms grow slowly and are hard to cultivate. The functions of most of the archaeal genes have therefore been predicted by sequence analysis. Moreover, on many occasions the similarity between an archaeal protein and its functionally characterized homolog is so low that computational methods for sequence analysis have to be extended to the limit of their power.</p>
         <p>A substantial fraction of the functional predictions for archaeal proteins appear 'trivial' in the sense that the respective proteins are highly conserved orthologs of well-characterized proteins from model organisms and, for all practical purposes, the validity of the prediction is beyond reasonable doubt (which is not to say that there are no important details of the functions of these proteins that can be uncovered only by experiment). For many other proteins, however, the prediction remains only a pointer to the probable biochemical function while the biology remains a mystery. A rough breakdown of the state of functional characterization of several archaea with sequenced genomes is given in Figure <figr fid="F4">4</figr>. The substantial fraction of genes for which only a general, typically biochemical, prediction is available is testimony to the currently limited understanding of archaeal biology. Moreover, even some of the more definitive predictions only serve to emphasize the biological differences between archaea and the bacterial or eukaryotic models from which the predictions are inferred (Table <tblr tid="T3">3</tblr>). A good example is the archaeal ortholog of the bacterial DNA primase (DnaG), which is a highly conserved protein present in all archaea <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>. The discovery of a predicted bacterial-type primase in archaea was unexpected, given that the archaeal replication system is orthologous to that of eukaryotes and, in particular, archaea encode the two subunits of the eukaryotic-type primase (COG1467 and COG2219; it should be noted parenthetically that detection of the large primase subunit itself required extremely careful sequence analysis due to the low similarity to the eukaryotic ortholog <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>). 
Given that the niche of the replicative primase seems to be occupied by the eukaryotic-type enzyme <abbrgrp><abbr bid="B74">74</abbr><abbr bid="B75">75</abbr></abbrgrp>, the DnaG ortholog is likely to have a critical role in repair, but beyond this general idea its function has yet to be determined by direct experimentation; such experiments have the potential to reveal completely new repair systems and pathways. Other proteins implicated in repair as a result of exhaustive sequence analysis, such as the putative nucleases encoded by COG1833 and COG1628 (Table <tblr tid="T3">3</tblr>), illustrate the same point: the biochemical activities are predicted but the biology remains to be investigated experimentally.</p>
         <fig id="F4">
            <title>
               <p>Figure 4</p>
            </title>
            <caption>
               <p>Functional breakdown of genes in each of the sequenced archaeal genomes</p>
            </caption>
            <text>
               <p>Functional breakdown of genes in each of the sequenced archaeal genomes. The data are from COGs; species name abbreviations are as in Table <tblr tid="T1">1</tblr>.</p>
            </text>
            <graphic file="gb-2003-4-8-115-4"/>
         </fig>
         <tbl id="T3">
            <title>
               <p>Table 3</p>
            </title>
            <caption>
               <p>Examples of computational and experimental discovery of unexpected functions in archaea.</p>
            </caption>
            <tblbdy cols="3">
               <r>
                  <c ca="left">
                     <p>
                        COG numbers [37,38]
                        
                     </p>
                  </c>
                  <c ca="left">
                     <p>
                        Function and comments
                     </p>
                  </c>
                  <c ca="left">
                     <p>
                        References
                     </p>
                  </c>
               </r>
               <r>
                  <c cspan="3">
                     <hr/>
                  </c>
               </r>
               <r>
                  <c cspan="3" ca="left">
                     <p>
                        <b>Computational predictions</b>
                     </p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>0012, 1325, 1603, 1369, 0638, 1500, 1097, 0689, 2123, 1996, 2136, 2892, 0618, 1782, 1096, 3286, 1761 and more</p>
                  </c>
                  <c ca="left">
                     <p>Archaeal exosome. Orthologs of eukaryotic exosome subunits form the largest conserved superoperon in archaea, after the ribosomal superoperon, suggesting the existence of a physical complex</p>
                  </c>
                  <c ca="left">
                     <p>
                        [88]
                     </p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>1769, 1336, 3337, 1583, 1367, 1604, 1517, 1857, 1688, 1203, 1468, 1518, 2254, 1343, 1353, 1421, 1337, 1567, 1332, 4343</p>
                  </c>
                  <c ca="left">
                     <p>DNA repair system represented primarily in thermophiles</p>
                  </c>
                  <c ca="left">
                     <p>
                        [59]
                     </p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>0358</p>
                  </c>
                  <c ca="left">
                     <p>Bacterial-type DNA primase (DnaG orthologs)</p>
                  </c>
                  <c ca="left">
                     <p>
                        [24]
                     </p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>1311</p>
                  </c>
                  <c ca="left">
                     <p>Small subunit of euryarchaeal DNA polymerase II, predicted PHP family phosphohydrolase (probably phosphatase); eukaryotic homologs appear to be inactivated</p>
                  </c>
                  <c ca="left">
                     <p>
                        [123]
                     </p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>1833</p>
                  </c>
                  <c ca="left">
                     <p>Uri superfamily endonuclease</p>
                  </c>
                  <c ca="left">
                     <p>
                        [136]
                     </p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>1628</p>
                  </c>
                  <c ca="left">
                     <p>Endonuclease V homologs</p>
                  </c>
                  <c ca="left">
                     <p>K.S.M. and E.V.K., unpublished observations</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>1679, 1786</p>
                  </c>
                  <c ca="left">
                     <p>Aconitase catalytic core and an interacting 'swiveling domain'</p>
                  </c>
                  <c ca="left">
                     <p>K.S.M. and E.V.K., unpublished observations</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>1711</p>
                  </c>
                  <c ca="left">
                     <p>Possible subunit of the DNA replication machinery</p>
                  </c>
                  <c ca="left">
                     <p>K.S.M. and E.V.K., unpublished observations</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>1310</p>
                  </c>
                  <c ca="left">
                     <p>Zn<sup>2+</sup>-dependent hydrolase homologous to the eukaryotic ubiquitin isopeptidase contained in the proteasome and COP9 signalosome</p>
                  </c>
                  <c ca="left">
                     <p>
                        [137,138]
                     </p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
               </r>
               <r>
                  <c cspan="3" ca="left">
                     <p>
                        <b>Computational predictions validated by experiments</b>
                     </p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>1708</p>
                  </c>
                  <c ca="left">
                     <p>'Minimal' nucleotidyltransferases</p>
                  </c>
                  <c ca="left">
                     <p>
                        [100,139]
                     </p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>1830</p>
                  </c>
                  <c ca="left">
                     <p>Fructose-1,6-bisphosphate aldolases (DhnA family)</p>
                  </c>
                  <c ca="left">
                     <p>
                        [76,77]
                     </p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>1351</p>
                  </c>
                  <c ca="left">
                     <p>Thymidylate synthase</p>
                  </c>
                  <c ca="left">
                     <p>
                       [61,64]
                     </p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>1685</p>
                  </c>
                  <c ca="left">
                     <p>Shikimate kinase (predicted on the basis of operon organization)</p>
                  </c>
                  <c ca="left">
                     <p>
                        [140]
                     </p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>3635</p>
                  </c>
                  <c ca="left">
                     <p>Phosphoglycerate mutase</p>
                  </c>
                  <c ca="left">
                     <p>
                           [24,141]
                     </p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
               </r>
               <r>
                  <c cspan="3" ca="left">
                     <p>
                        <b>Experimental discovery of unexpected protein functions in archaea</b>
                     </p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>1384</p>
                  </c>
                  <c ca="left">
                     <p>Class I lysyl-tRNA synthetase</p>
                  </c>
                  <c ca="left">
                     <p>[62]
                     </p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>1933</p>
                  </c>
                  <c ca="left">
                     <p>DNA polymerase II</p>
                  </c>
                  <c ca="left">
                     <p>
                        [104]
                     </p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>1980</p>
                  </c>
                  <c ca="left">
                     <p>Fructose 1,6-bisphosphatase</p>
                  </c>
                  <c ca="left">
                     <p>
                        [142]
                     </p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>1630</p>
                  </c>
                  <c ca="left">
                     <p>NurA, a novel 5'-3' nuclease encoded next to Rad50 and Mre11 orthologs; present in all sequenced archaeal genomes and some bacteria</p>
                  </c>
                  <c ca="left">
                     <p>[143] and K.S.M. and E.V.K., unpublished observations</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>1812</p>
                  </c>
                  <c ca="left">
                     <p><it>S</it>-adenosylmethionine synthetase; identified by mass tags</p>
                  </c>
                  <c ca="left">
                     <p>
                        [144]
                     </p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>1591</p>
                  </c>
                  <c ca="left">
                     <p>Holliday junction resolvase</p>
                  </c>
                  <c ca="left">
                     <p>
                       [101]
                     </p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>1581</p>
                  </c>
                  <c ca="left">
                     <p>Alba, a major DNA-binding chromatin protein in Crenarchaeota</p>
                  </c>
                  <c ca="left">
                     <p>
                        [106]
                     </p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>1945</p>
                  </c>
                  <c ca="left">
                     <p>Pyruvoyl-dependent arginine decarboxylase (PvlArgDC), involved in polyamine biosynthesis</p>
                  </c>
                  <c ca="left">
                     <p>
                        [145]
                     </p>
                  </c>
               </r>
            </tblbdy>
         </tbl>
         <p>Some of the other functional predictions inferred from sequence analysis directly help to fill glaring gaps in otherwise well-characterized pathways of archaeal metabolism. A good example of such focused prediction is the identification of an archaeal fructose-1,6-bisphosphate aldolase, an indispensable glycolytic enzyme, which was first predicted computationally to be a member of the DhnA family of aldolases by our group <abbrgrp><abbr bid="B76">76</abbr></abbrgrp> and subsequently identified experimentally <abbrgrp><abbr bid="B77">77</abbr></abbrgrp>. In the same vein, during work for this article, we predicted the missing archaeal aconitase, an essential enzyme of the tricarboxylic acid cycle (Table <tblr tid="T3">3</tblr>; K.S.M. and E.V.K., unpublished observations).</p>
         <p>The identities of a considerable number of proteins responsible for essential functions in archaea remain a mystery. Perhaps the most notable case is the missing cysteinyl-tRNA synthetase of thermophilic methanogens. Cysteine is incorporated into the proteins of these organisms as readily as in any others, but they lack an ortholog of cysteinyl-tRNA synthetase. Two different solutions for this paradox have been proposed, one involving an uncharacterized protein suggested to represent a 'third class' of aminoacyl-tRNA synthetases <abbrgrp><abbr bid="B78">78</abbr></abbrgrp>, and the other based on the apparent ability of the archaeal prolyl-tRNA synthetase to couple tRNA<sup>Cys </sup>with cysteine <abbrgrp><abbr bid="B79">79</abbr></abbrgrp>. The first hypothesis, however, has been refuted by our group upon more detailed sequence analysis <abbrgrp><abbr bid="B80">80</abbr></abbrgrp>, and the second did not seem to be compatible with subsequent structural studies <abbrgrp><abbr bid="B81">81</abbr></abbrgrp>. The real cysteinyl-tRNA synthetase of methanogens seems still to be hiding among uncharacterized proteins. Gaping holes also remain in archaeal pathways of isoleucine biosynthesis <abbrgrp><abbr bid="B82">82</abbr></abbrgrp>, heme biosynthesis <abbrgrp><abbr bid="B83">83</abbr></abbrgrp>, biotin biosynthesis <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>, and several others.</p>
         <p>Beyond straightforward (even if highly sensitive) sequence analysis, a powerful approach to the prediction of functions involves analysis of various forms of genomic context, or establishing 'guilt by association' <abbrgrp><abbr bid="B26">26</abbr><abbr bid="B84">84</abbr><abbr bid="B85">85</abbr><abbr bid="B86">86</abbr><abbr bid="B87">87</abbr></abbrgrp>. The associations employed to infer gene functions may be manifest at different levels, including the phyletic patterns discussed above, juxtaposition of domains in multidomain proteins, clustering of genes in (predicted) operons, co-expression, and protein-protein interaction. The last two of these types of data, obtained through transcriptomic and proteomic efforts, are becoming increasingly important in the functional genomics of eukaryotes and, to a somewhat lesser extent, bacteria, but are so far unavailable for archaea. The main type of context information in archaea has therefore been obtained by analyzing conserved elements of gene order and multidomain proteins. Only a relatively small fraction (10-15%) of each archaeal genome is covered by evolutionarily conserved gene strings that can be predicted to form operons <abbrgrp><abbr bid="B87">87</abbr></abbrgrp>. Nevertheless, by comparing gene orders in multiple genomes, partially conserved gene neighborhoods can be reconstructed and examination of some of these leads to predictions of functional systems whose existence has not previously been suspected (Table <tblr tid="T3">3</tblr>).</p>
         <p>The most notable illustrations of this approach (both from our own group) are the prediction of the archaeal exosome <abbrgrp><abbr bid="B88">88</abbr></abbrgrp> and a potential new repair system typical of archaeal and bacterial thermophiles <abbrgrp><abbr bid="B59">59</abbr></abbrgrp>. The eukaryotic exosome is a multisubunit complex that consists of RNases, helicases and RNA-binding proteins and is involved in the exonucleolytic degradation of various classes of RNA <abbrgrp><abbr bid="B89">89</abbr><abbr bid="B90">90</abbr><abbr bid="B91">91</abbr></abbrgrp>. During comparative analysis of gene order in prokaryotic genomes, it was observed that a distinct set of genes, some of which encode orthologs of eukaryotic exosome components, forms a partially conserved predicted superoperon, which includes in total over 15 genes (although none of the archaeal genomes contains every one of these within the predicted superoperon). In addition to RNases and RNA-binding proteins (with an RNA helicase apparently encoded in a separate operon), the exosomal superoperon also encodes a proteasome subunit and a subunit of prefoldin, a co-translational molecular chaperone (<abbrgrp><abbr bid="B88">88</abbr></abbrgrp> and Figure <figr fid="F5">5a</figr>). Thus, these observations point to the existence of a multifunctional macromolecular complex that could couple post-translational protein folding with regulated, ATP-dependent degradation of RNA and proteins. This complex remains to be discovered experimentally, and the potential implications for new functional and physical interactions in eukaryotes are also open to experimental study.</p>
         <fig id="F5">
            <title>
               <p>Figure 5</p>
            </title>
            <caption>
               <p>Prediction of gene functions in archaea by genomic context analysis</p>
            </caption>
            <text>
               <p>Prediction of gene functions in archaea by genomic context analysis. <b>(a) </b>The superoperon coding for the predicted archaeal exosome (see [88]). <b>(b) </b>The partially conserved gene neighborhood coding for the predicted repair system found in archaeal and bacterial thermophiles (see [59] for details). <b>(c-e) </b>Predicted operons containing uncharacterized genes in the neighborhood of genes from the following COGs: COG1594, DNA-directed RNA polymerase, subunit M, and transcription elongation factor TFIIS (RPB9); COG0592, encoding a DNA polymerase sliding clamp subunit (PCNA ortholog); COG1631, ribosomal protein L44E; COG1095, DNA-directed RNA polymerase, subunit E' (RPB7); COG2093, DNA-directed RNA polymerase, subunit E" (RPE2); COG2004, ribosomal protein S24E; COG1709, transcriptional regulator; COG3425, 3-hydroxy-3-methylglutaryl CoA synthase (PksG); COG0183, acetyl-CoA acetyltransferase (Fad A/PaaJ orthologs). UC, uncharacterized, shown by white arrows. Species abbreviations are as in Table <tblr tid="T1">1</tblr>. Genes are not shown to scale and are denoted by their respective gene names (some are discussed further in the text); arrows indicate the direction of transcription. A solid line connects genes in a predicted operon. Species that have the same operon organization as the listed species are indicated in parentheses. Orthologous genes are aligned. Genes with similar general functions are shown by the same shading. Broken lines show that genes are in the same predicted operon but are not adjacent. Small arrows indicate the presence of additional functionally related genes in the same predicted operon; these genes are not shown for lack of space.</p>
            </text>
            <graphic file="gb-2003-4-8-115-5"/>
         </fig>
         <p>A more sophisticated comparison of gene orders, which required special algorithms for delineation of partially conserved genomic neighborhoods <abbrgrp><abbr bid="B92">92</abbr></abbrgrp>, led us to predict a distinct DNA repair system that is most prevalent in thermophiles and includes genes for a predicted novel DNA polymerase, a helicase, two nucleases and several uncharacterized genes, at least one of which could encode a novel nuclease (<abbrgrp><abbr bid="B59">59</abbr></abbrgrp> and Figure <figr fid="F5">5b</figr>). Furthermore, this neighborhood contains multiple, diverged versions of a gene coding for a protein with a probable structural role dubbed RAMP (repair-associated mysterious protein). The proliferation of RAMP genes (Figure <figr fid="F5">5b</figr>) is an example of a potentially adaptive lineage-specific expansion of a gene family; such expansions are discussed below in greater detail.</p>
         <p>Additional, simpler cases of functional prediction via 'guilt by association' are illustrated in Figure <figr fid="F5">5c,d,e</figr>. The gene for the uncharacterized protein represented by COG1711 (Figure <figr fid="F5">5c</figr>) forms an evolutionarily highly conserved gene pair with the gene for the clamp subunit of DNA polymerase (ortholog of the eukaryotic PCNA). The orthologs of COG1711 proteins are conserved in all eukaryotes, and this protein might be an essential but still uncharacterized component of the archaeo-eukaryotic DNA replication machinery (K.S.M. and E.V.K., unpublished observations). The gene represented by uncharacterized COG1909 is squeezed between genes for RNA polymerase subunits and that for a ribosomal protein (Figure <figr fid="F5">5d</figr>). Examination of the multiple alignments that lead to this COG shows conservation of polar residues compatible with an enzymatic function (K.S.M. and E.V.K., unpublished observations). There are no readily detectable eukaryotic orthologs for this protein, which is therefore likely to be an archaea-specific enzyme with a house-keeping function.</p>
         <p>Finally, uncharacterized COG1545 consists of genes encoding putative zinc-ribbon-containing proteins that form a stable gene pair with the gene for acetyl-CoA acetyltransferase, a central enzyme of fatty acid biosynthesis (Figure <figr fid="F5">5e</figr>). Both these genes show remarkable paralogous expansion in several archaea, probably as a result of a series of duplications of the gene doublet. It appears likely that proteins from COG1545 form a complex with acetyl-CoA acetyltransferase, with the zinc-ribbon protein regulating and/or stabilizing the enzyme. The predictions depicted in Figure <figr fid="F5">5c,d,e</figr> and other similar ones (<abbrgrp><abbr bid="B87">87</abbr></abbrgrp>; and K.S.M. and E.V.K., unpublished observations) are not particularly precise, even in terms of the biochemical activity of the respective proteins. Nevertheless, guilt by association implicates each of these proteins in specific biological functions, and the evolutionary conservation of both the proteins themselves and the gene order all but proves that their functions are essential. Thus, these proteins appear to be excellent targets for experimental studies, which have the potential to reveal new facets of central cellular processes in archaea.</p>
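         <p>The gene pairs discussed above are recognized because two genes remain adjacent in many genomes. A minimal sketch of that idea follows, assuming each genome is available as an ordered list of COG identifiers. The genome labels, the gene orders, the placeholder genes (other1 to other3) and the threshold are hypothetical and are not data from this analysis; strand, intergenic distance and operon prediction are ignored.</p>
         <p>
```python
from collections import Counter
from typing import Dict, List, Tuple


def conserved_pairs(genomes: Dict[str, List[str]],
                    min_genomes: int = 3) -> Dict[Tuple[str, str], int]:
    """Count unordered adjacent gene pairs and keep those seen in enough genomes."""
    counts: Counter = Counter()
    for order in genomes.values():
        # Use a set so that each pair is counted at most once per genome.
        pairs = {tuple(sorted(p)) for p in zip(order, order[1:]) if p[0] != p[1]}
        counts.update(pairs)
    return {pair: n for pair, n in counts.items() if n >= min_genomes}


# Hypothetical gene orders, for illustration only.
GENOMES = {
    "genome1": ["COG1711", "COG0592", "other1", "COG1545", "COG0183"],
    "genome2": ["other2", "COG0592", "COG1711", "COG0183", "COG1545"],
    "genome3": ["COG1711", "COG0592", "other3", "other1"],
}

print(conserved_pairs(GENOMES, min_genomes=3))
# {('COG0592', 'COG1711'): 3}
```
         </p>
         <p>Lowering min_genomes to 2 in this toy data set would also report the COG0183 and COG1545 pair; a real analysis would additionally need to account for the phylogenetic distance between the genomes compared.</p>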
         <p>Comparative-genomic analysis of prokaryotes and eukaryotes points to lineage-specific expansion (proliferation) of paralogous gene families as a major means by which organisms adapt to their specific environment and lifestyle <abbrgrp><abbr bid="B93">93</abbr><abbr bid="B94">94</abbr><abbr bid="B95">95</abbr></abbrgrp>. A number of such expansions are seen in archaea but in most cases we have, at best, only a vague understanding of the associated biology; several examples are given in Figure <figr fid="F6">6</figr>. The expansion of two groups of permeases in <it>Thermoplasma </it>and <it>Sulfolobus </it>(Figure <figr fid="F6">6a</figr>) clearly reflects the heterotrophic metabolism of the former <abbrgrp><abbr bid="B96">96</abbr></abbrgrp> and the chemo-organotrophic lifestyle of the latter <abbrgrp><abbr bid="B97">97</abbr></abbrgrp>. The specific proliferation of ferredoxins in methanogens (Figure <figr fid="F6">6b</figr>) is also easily explained by the role of these proteins in the oxidoreduction reactions of methanogenesis <abbrgrp><abbr bid="B98">98</abbr></abbrgrp>. The remaining two cases in Figure <figr fid="F6">6c,d</figr> are much more enigmatic. The congruent proliferation of the transcription-initiation factors TFIIB and TFIID in <it>Halobacterium </it>(Figure <figr fid="F6">6c</figr>) might point to unusual aspects of transcription regulation in this archaeon but the details remain obscure. The proliferation of two subunits of a predicted nucleotidyltransferase in several archaea <abbrgrp><abbr bid="B99">99</abbr><abbr bid="B100">100</abbr></abbrgrp> (Figure <figr fid="F6">6d</figr>) is of special interest and might be related to thermal adaptation, but the actual functions and even the substrates of these enzymes remain a mystery. 
Other lineage-specific expansions, such as that of distinct families of predicted ATPases in <it>Methanocaldococcus </it>and <it>Pyrococcus</it>, or a specific family of RadA(RecA)-like ATPases and the UspA-family of NTP-binding proteins in several archaeal species <abbrgrp><abbr bid="B39">39</abbr></abbrgrp>, suggest the existence of unusual pathways, perhaps involved in stress response and signal transduction, but the actual biology associated with these expansions can only be uncovered experimentally.</p>
         <fig id="F6">
            <title>
               <p>Figure 6</p>
            </title>
            <caption>
               <p>Lineage-specific expansions of paralogous gene families in archaea</p>
            </caption>
            <text>
               <p>Lineage-specific expansions of paralogous gene families in archaea. The vertical axis shows the number of members of the indicated COGs. <b>(a) </b>COG0477, permeases of the major facilitator superfamily; COG0531, amino-acid transporters. <b>(b) </b>COG1145, ferredoxin. <b>(c) </b>COG2101, TATA-box binding protein (TBP), a component of transcription initiation factors TFIID and TFIIIB; COG1405, Brf1 subunit of transcription-initiation factor TFIIIB and transcription-initiation factor TFIIB. <b>(d) </b>COG1708, 'minimal' nucleotidyltransferase catalytic subunit; COG2250, 'minimal' nucleotidyltransferase accessory subunit. Species abbreviations are as in Table <tblr tid="T1">1</tblr>.</p>
            </text>
            <graphic file="gb-2003-4-8-115-6"/>
         </fig>
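         <p>Candidate lineage-specific expansions of the kind plotted in Figure 6 can be flagged by comparing the number of members of a COG in one genome with the typical number in the other genomes. The sketch below is an illustrative heuristic only, not the procedure used to prepare the figure; the counts, the genome labels, the fold threshold and the minimum family size are all invented for the example.</p>
         <p>
```python
from statistics import median
from typing import Dict, List, Tuple


def expansions(counts: Dict[str, Dict[str, int]], fold: float = 3.0,
               min_members: int = 5) -> List[Tuple[str, str, int, float]]:
    """Flag (cog, genome, n, baseline) where n is large relative to other genomes.

    `counts` maps each COG to the number of its members in each genome.
    The baseline is the median count in the remaining genomes (at least 1).
    """
    flagged = []
    for cog, per_genome in counts.items():
        for genome, n in per_genome.items():
            others = [m for g, m in per_genome.items() if g != genome]
            if not others:
                continue  # nothing to compare against
            baseline = max(median(others), 1)
            if n >= min_members and n >= fold * baseline:
                flagged.append((cog, genome, n, baseline))
    return flagged


# Invented counts, for illustration only.
COUNTS = {
    "COG1145": {"genomeA": 20, "genomeB": 3, "genomeC": 2, "genomeD": 4},
    "COG0477": {"genomeA": 5, "genomeB": 6, "genomeC": 4, "genomeD": 5},
}

print(expansions(COUNTS))
# [('COG1145', 'genomeA', 20, 3)]
```
         </p>
         <p>A fold-over-median rule of this kind ignores genome size and phylogeny, so it can only nominate families for closer inspection; it says nothing about the biology behind an expansion.</p>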
         <p>Archaeal comparative genomics is a young field and so far, as we have seen, largely predictive. But a few experimental studies have already been instigated as a result of comparative-genomic predictions. The discovery of the archaeal fructose-1,6-bisphosphate aldolase mentioned above <abbrgrp><abbr bid="B76">76</abbr><abbr bid="B77">77</abbr></abbrgrp> is a case in point, and several other examples of experimental validation of predictions are given in Table <tblr tid="T3">3</tblr>. It is probably no coincidence that these examples all involve metabolic enzymes for which the specific reaction could be predicted precisely. Validation is likely to be much more difficult for proteins of other functional groups, such as putative repair enzymes, for which the actual substrates are harder to predict.</p>
         <p>For some conserved archaeal proteins, functions cannot be predicted computationally despite considerable effort. Several important discoveries have been made by experimental characterization of such mysterious proteins. The most notable cases include the archaeal Holliday-junction resolvase, which is not related to its functional analog in bacteria <abbrgrp><abbr bid="B101">101</abbr><abbr bid="B102">102</abbr><abbr bid="B103">103</abbr></abbrgrp>, and DNA polymerase II, a highly conserved euryarchaeal protein that is not found outside this lineage and shows no detectable sequence similarity to any other proteins <abbrgrp><abbr bid="B104">104</abbr><abbr bid="B105">105</abbr></abbrgrp>. Additional examples of direct experimental determination of the functions of archaeal proteins that could not be predicted by computational techniques (at least not before the experiment had been reported) are given in Table <tblr tid="T3">3</tblr>.</p>
         <p>Especially notable is the story of the Alba protein, a DNA-binding component of chromatin in Crenarchaeota <abbrgrp><abbr bid="B106">106</abbr><abbr bid="B107">107</abbr></abbrgrp>. As noted above, crenarchaea lack histones and in these organisms Alba appears to be the main chromatin protein, in a striking case of non-orthologous gene displacement. But orthologs of Alba are also present in thermophilic Euryarchaeota and in some eukaryotic lineages, where their functions remain to be elucidated. The most remarkable discovery regarding Alba is the regulation of its interaction with DNA and with the chromatin-associated protein deacetylase Sir2 via lysine acetylation and deacetylation <abbrgrp><abbr bid="B106">106</abbr><abbr bid="B108">108</abbr></abbrgrp>. In eukaryotes, regulation of chromatin dynamics via acetylation and deacetylation occurs through histone tails <abbrgrp><abbr bid="B109">109</abbr></abbrgrp>. Thus, a special case of non-orthologous gene displacement seems to have taken place whereby the regulation mechanism is conserved but the actual substrates are different in archaea and eukaryotes. To add an extra twist to the story, <it>Thermoplasma </it>lacks both histones and Alba but has the bacterial DNA-binding protein HU, pointing to three distinct solutions to the problem of chromatin organization in archaea <abbrgrp><abbr bid="B107">107</abbr></abbrgrp>.</p>
         <p>The last subject we have to briefly touch upon is structural genomics of the archaea. The ultimate goal of the structural genomics enterprise is determining the three-dimensional structure for all proteins, or at least for all sufficiently different proteins encoded in the genomes of diverse life forms <abbrgrp><abbr bid="B110">110</abbr></abbrgrp>. This goal is far from being reached, and targets for structural determination have been prioritized by different researchers on the basis of different principles, from nearly random choice to relatively elaborate strategies, including the use of the COG database <abbrgrp><abbr bid="B111">111</abbr><abbr bid="B112">112</abbr><abbr bid="B113">113</abbr><abbr bid="B114">114</abbr><abbr bid="B115">115</abbr></abbrgrp>. The development of structural genomics so far has been a mixture of success, when informative and interesting structures have been solved, and mild disappointment in cases when the structure determination did not seem to shed any light on a protein's function. Structural genomics could be particularly important in the case of archaea, for which a minuscule number of structures had been solved prior to the launch of structural genomic initiatives, and in which proteins often show low similarity to bacterial or eukaryotic homologs, making homology modeling difficult.</p>
         <p>Notable developments that illustrate both the benefits and the pitfalls of structural genomics are the concerted effort on 'structural proteomics' of <it>Methanothermobacter thermoautotrophicus </it><abbrgrp><abbr bid="B116">116</abbr></abbrgrp> and a similar project on <it>M. jannaschii </it><abbrgrp><abbr bid="B117">117</abbr></abbrgrp>. The elucidation of the structure of the <it>M. jannaschii </it>protein MJ0577 <abbrgrp><abbr bid="B117">117</abbr></abbrgrp> is an excellent case for the power of structural genomics. Analysis of this structure and accompanying biochemical experiments revealed a distinct nucleotide-binding domain that is distantly related to the catalytic domains of class I aminoacyl-tRNA synthetases and belongs to the so-called HUP fold of nucleotide-binding domains <abbrgrp><abbr bid="B118">118</abbr></abbrgrp>. Together with comprehensive sequence analysis, the determination of this structure provided the structural, functional and evolutionary context for the UspA protein family, which is specifically expanded in archaea <abbrgrp><abbr bid="B39">39</abbr></abbrgrp>. The exact function(s) of these proteins remain unknown but, in this case, structural genomics yielded substantial functional insight. On several other occasions, however, determination of the structures of archaeal proteins has failed to provide clear functional clues; these remain structures in search of a function.</p>
      </sec>
      <sec>
         <st>
            <p>What's around the corner?</p>
         </st>
         <p>The first sequenced archaeal genome was a veritable <it>terra incognita</it>. Six years after that sequence appeared, the archaeal genomescape looks quite different. The principal landmarks have been mapped and now, when a new archaeal genome is released, we largely know what to expect from it. Computational approaches to comparative genomics, combining in-depth sequence and structure comparison with genome context analysis, have led to the reconstruction of the central functional systems of archaeal cells. But these approaches have also produced numerous isolated predictions of biochemical activities of archaeal proteins that remain to be fitted into a general picture, and this can be done only through 'wet' experiments, although new genome sequences will substantially help by enriching the genomic context. A shrinking but still notable set of archaeal genes includes those that encode highly conserved proteins without any clue to function; solving these mysteries has the potential to bring out truly new biology. Furthermore, in this article we have not even touched upon important aspects of archaeal genomics, such as the in-depth studies of the translation system, which have revealed several highly unusual, remarkable mechanisms and enzymatic systems <abbrgrp><abbr bid="B63">63</abbr><abbr bid="B119">119</abbr></abbrgrp> or the identification of regulatory sites in DNA and patterns of transcription regulation <abbrgrp><abbr bid="B120">120</abbr><abbr bid="B121">121</abbr></abbrgrp>. The latter avenue of research is still in its infancy but will certainly grow in scale once more archaeal genomes, and in particular closely related ones, are sequenced.</p>
         <p>Because of the lack of established model systems for archaeal experimental biology and the resulting difficulty with large-scale experimentation, clues from genome comparison are even more crucial for archaeal functional genomics than they are in the case of bacteria or eukaryotes. So far, the input of comparative genomics into actual experiments has been less prominent than we would hope. Simply put, it is not often that experimenters rush to test predictions produced by <it>in silico </it>genome comparison and, furthermore, it is even rarer that targets for functional characterization are carefully prioritized on the basis of how unusual and fundamental the predictions are. As discussed above, however, the few cases when such tests have been performed are encouraging. It is our hope that the future belongs to a much tighter integration of comparative, structural and functional genomics.</p>
         <p>Beyond functional studies, archaeal genomics is fundamental to our understanding of two critical transitions in the evolution of life. The first is the primary split between the bacterial and archaeo-eukaryotic lineages, which might have involved the origin of the DNA-replication machinery and of the large, double-stranded DNA genomes themselves <abbrgrp><abbr bid="B22">22</abbr><abbr bid="B23">23</abbr></abbrgrp>, and the second is the origin of eukaryotes <abbrgrp><abbr bid="B122">122</abbr></abbrgrp>. With regard to the latter problem, archaea are a particularly valuable source of information because, on many occasions, they seem to have retained primitive traits while eukaryotes have undergone major changes. A characteristic example is the small DNA polymerase subunit, which has all the hallmarks of an active phosphatase in archaea, but not in eukaryotes, in which the phosphatase activity is predicted to be inactivated <abbrgrp><abbr bid="B123">123</abbr></abbrgrp>. Indubitably, archaea resemble the common ancestor of the archaeo-eukaryotic line of descent more closely than eukaryotes do, so archaeal genomics is our best chance to reconstruct this critical intermediate in the evolution of life. We are confident that comparative archaeogenomics has a bright future, with major progress in both the functional and the evolutionary avenues of research expected within the next few years.</p>
      </sec>
      <sec>
         <st>
            <p>Additional data file</p>
         </st>
         <p>The list of genes in the reconstructed gene set of the last common ancestor of archaea is available (additional data file <supplr sid="s1">1</supplr>).</p>
         <suppl id="s1">
            <title>
               <p>Additional data file 1</p>
            </title>
            <caption>
               <p>The list of genes in the reconstructed gene set of the last common ancestor of archaea</p>
            </caption>
            <text>
               <p>The list of genes in the reconstructed gene set of the last common ancestor of archaea</p>
            </text>
            <file name="gb-2003-4-8-115-s1.txt">
               <p>Click here for additional data file</p>
            </file>
         </suppl>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>We thank Boris Mirkin for producing the data used for Figure <figr fid="F3">3</figr> and Stephen Bell, Michael Galperin, Dieter S&#246;ll and Yuri Wolf for useful discussions.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Phylogenetic structure of the prokaryotic domain: the primary kingdoms.</p>
            </title>
            <aug>
               <au>
                  <snm>Woese</snm>
                  <fnm>CR</fnm>
               </au>
               <au>
                  <snm>Fox</snm>
                  <fnm>GE</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1977</pubdate>
            <volume>74</volume>
            <fpage>5088</fpage>
            <lpage>5090</lpage>
            <xrefbib>
               <pubid idtype="pmpid">270744</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>The phylogeny of prokaryotes.</p>
            </title>
            <aug>
               <au>
                  <snm>Fox</snm>
                  <fnm>GE</fnm>
               </au>
               <au>
                  <snm>Stackebrandt</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Hespell</snm>
                  <fnm>RB</fnm>
               </au>
               <au>
                  <snm>Gibson</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Maniloff</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Dyer</snm>
                  <fnm>TA</fnm>
               </au>
               <au>
                  <snm>Wolfe</snm>
                  <fnm>RS</fnm>
               </au>
               <au>
                  <snm>Balch</snm>
                  <fnm>WE</fnm>
               </au>
               <au>
                  <snm>Tanner</snm>
                  <fnm>RS</fnm>
               </au>
               <au>
                  <snm>Magrum</snm>
                  <fnm>LJ</fnm>
               </au>
               <etal/>
            </aug>
            <source>Science</source>
            <pubdate>1980</pubdate>
            <volume>209</volume>
            <fpage>457</fpage>
            <lpage>463</lpage>
            <xrefbib>
               <pubid idtype="pmpid">6771870</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Towards a natural system of organisms: proposal for the domains Archaea, Bacteria, and Eucarya.</p>
            </title>
            <aug>
               <au>
                  <snm>Woese</snm>
                  <fnm>CR</fnm>
               </au>
               <au>
                  <snm>Kandler</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Wheelis</snm>
                  <fnm>ML</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1990</pubdate>
            <volume>87</volume>
            <fpage>4576</fpage>
            <lpage>4579</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">2112744</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Are archaebacteria merely derived 'prokaryotes'?</p>
            </title>
            <aug>
               <au>
                  <snm>Woese</snm>
                  <fnm>CR</fnm>
               </au>
               <au>
                  <snm>Gupta</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>1981</pubdate>
            <volume>289</volume>
            <fpage>95</fpage>
            <lpage>96</lpage>
            <xrefbib>
               <pubid idtype="pmpid">6161309</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Two empires or three?</p>
            </title>
            <aug>
               <au>
                  <snm>Mayr</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1998</pubdate>
            <volume>95</volume>
            <fpage>9720</fpage>
            <lpage>9723</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1073/pnas.95.17.9720</pubid>
                  <pubid idtype="pmpid" link="fulltext">9707542</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Default taxonomy: Ernst Mayr's view of the microbial world.</p>
            </title>
            <aug>
               <au>
                  <snm>Woese</snm>
                  <fnm>CR</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1998</pubdate>
            <volume>95</volume>
            <fpage>11043</fpage>
            <lpage>11046</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1073/pnas.95.19.11043</pubid>
                  <pubid idtype="pmpid" link="fulltext">9736686</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Life's third domain (Archaea): an established fact or an endangered paradigm?</p>
            </title>
            <aug>
               <au>
                  <snm>Gupta</snm>
                  <fnm>RS</fnm>
               </au>
            </aug>
            <source>Theor Popul Biol</source>
            <pubdate>1998</pubdate>
            <volume>54</volume>
            <fpage>91</fpage>
            <lpage>104</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/tpbi.1998.1376</pubid>
                  <pubid idtype="pmpid" link="fulltext">9733652</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Lysis and dissolution of cells and envelopes of an extremely halophilic bacterium.</p>
            </title>
            <aug>
               <au>
                  <snm>Kushner</snm>
                  <fnm>DJ</fnm>
               </au>
            </aug>
            <source>J Bacteriol</source>
            <pubdate>1964</pubdate>
            <volume>87</volume>
            <fpage>1147</fpage>
            <lpage>1156</lpage>
            <xrefbib>
               <pubid idtype="pmpid">5874536</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Lipids of <it>Thermoplasma acidophilum</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Langworthy</snm>
                  <fnm>TA</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>PF</fnm>
               </au>
               <au>
                  <snm>Mayberry</snm>
                  <fnm>WR</fnm>
               </au>
            </aug>
            <source>J Bacteriol</source>
            <pubdate>1972</pubdate>
            <volume>112</volume>
            <fpage>1193</fpage>
            <lpage>1200</lpage>
            <xrefbib>
               <pubid idtype="pmpid">4344918</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p><it>Sulfolobus</it>: a new genus of sulfur-oxidizing bacteria living at low pH and high temperature.</p>
            </title>
            <aug>
               <au>
                  <snm>Brock</snm>
                  <fnm>TD</fnm>
               </au>
               <au>
                  <snm>Brock</snm>
                  <fnm>KM</fnm>
               </au>
               <au>
                  <snm>Belly</snm>
                  <fnm>RT</fnm>
               </au>
               <au>
                  <snm>Weiss</snm>
                  <fnm>RL</fnm>
               </au>
            </aug>
            <source>Arch Mikrobiol</source>
            <pubdate>1972</pubdate>
            <volume>84</volume>
            <fpage>54</fpage>
            <lpage>68</lpage>
            <xrefbib>
               <pubid idtype="pmpid">4559703</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Extremophiles and their adaptation to hot environments.</p>
            </title>
            <aug>
               <au>
                  <snm>Stetter</snm>
                  <fnm>KO</fnm>
               </au>
            </aug>
            <source>FEBS Lett</source>
            <pubdate>1999</pubdate>
            <volume>452</volume>
            <fpage>22</fpage>
            <lpage>25</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0014-5793(99)00663-8</pubid>
                  <pubid idtype="pmpid">10376671</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Life in hot springs and hydrothermal vents.</p>
            </title>
            <aug>
               <au>
                  <snm>Segerer</snm>
                  <fnm>AH</fnm>
               </au>
               <au>
                  <snm>Burggraf</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Fiala</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Huber</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Huber</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Pley</snm>
                  <fnm>U</fnm>
               </au>
               <au>
                  <snm>Stetter</snm>
                  <fnm>KO</fnm>
               </au>
            </aug>
            <source>Orig Life Evol Biosph</source>
            <pubdate>1993</pubdate>
            <volume>23</volume>
            <fpage>77</fpage>
            <lpage>90</lpage>
            <xrefbib>
               <pubid idtype="pmpid">11536528</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>A phylogenetic perspective on hyperthermophilic microorganisms.</p>
            </title>
            <aug>
               <au>
                  <snm>DeLong</snm>
                  <fnm>EF</fnm>
               </au>
            </aug>
            <source>Methods Enzymol</source>
            <pubdate>2001</pubdate>
            <volume>330</volume>
            <fpage>3</fpage>
            <lpage>11</lpage>
            <xrefbib>
               <pubid idtype="pmpid">11210508</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Environmental diversity of bacteria and archaea.</p>
            </title>
            <aug>
               <au>
                  <snm>DeLong</snm>
                  <fnm>EF</fnm>
               </au>
               <au>
                  <snm>Pace</snm>
                  <fnm>NR</fnm>
               </au>
            </aug>
            <source>Syst Biol</source>
            <pubdate>2001</pubdate>
            <volume>50</volume>
            <fpage>470</fpage>
            <lpage>478</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1080/106351501750435040</pubid>
                  <pubid idtype="pmpid">12116647</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Archaeal tetraether lipids: unique structures and applications.</p>
            </title>
            <aug>
               <au>
                  <snm>Hanford</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Peeples</snm>
                  <fnm>TL</fnm>
               </au>
            </aug>
            <source>Appl Biochem Biotechnol</source>
            <pubdate>2002</pubdate>
            <volume>97</volume>
            <fpage>45</fpage>
            <lpage>62</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1385/ABAB:97:1:45</pubid>
                  <pubid idtype="pmpid" link="fulltext">11900115</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Structural research on surface layers: a focus on stability, surface layer homology domains, and surface layer-cell wall interactions.</p>
            </title>
            <aug>
               <au>
                  <snm>Engelhardt</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Peters</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>J Struct Biol</source>
            <pubdate>1998</pubdate>
            <volume>124</volume>
            <fpage>276</fpage>
            <lpage>302</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/jsbi.1998.4070</pubid>
                  <pubid idtype="pmpid" link="fulltext">10049812</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Archaea and the origin(s) of DNA replication proteins.</p>
            </title>
            <aug>
               <au>
                  <snm>Edgell</snm>
                  <fnm>DR</fnm>
               </au>
               <au>
                  <snm>Doolittle</snm>
                  <fnm>WF</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>1997</pubdate>
            <volume>89</volume>
            <fpage>995</fpage>
            <lpage>998</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">9215620</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Diversity of prokaryotic chromosomal proteins and the origin of the nucleosome.</p>
            </title>
            <aug>
               <au>
                  <snm>Sandman</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Pereira</snm>
                  <fnm>SL</fnm>
               </au>
               <au>
                  <snm>Reeve</snm>
                  <fnm>JN</fnm>
               </au>
            </aug>
            <source>Cell Mol Life Sci</source>
            <pubdate>1998</pubdate>
            <volume>54</volume>
            <fpage>1350</fpage>
            <lpage>1364</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1007/s000180050259</pubid>
                  <pubid idtype="pmpid">9893710</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>Archaeal histones and nucleosomes.</p>
            </title>
            <aug>
               <au>
                  <snm>Sandman</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Bailey</snm>
                  <fnm>KA</fnm>
               </au>
               <au>
                  <snm>Pereira</snm>
                  <fnm>SL</fnm>
               </au>
               <au>
                  <snm>Soares</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>WT</fnm>
               </au>
               <au>
                  <snm>Reeve</snm>
                  <fnm>JN</fnm>
               </au>
            </aug>
            <source>Methods Enzymol</source>
            <pubdate>2001</pubdate>
            <volume>334</volume>
            <fpage>116</fpage>
            <lpage>129</lpage>
            <xrefbib>
               <pubid idtype="pmpid">11398455</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>Comparative analysis of ribosomal proteins in complete genomes: an example of reductive evolution at the domain scale.</p>
            </title>
            <aug>
               <au>
                  <snm>Lecompte</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Ripp</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Thierry</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Moras</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Poch</snm>
                  <fnm>O</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2002</pubdate>
            <volume>30</volume>
            <fpage>5382</fpage>
            <lpage>5390</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/nar/gkf693</pubid>
                  <pubid idtype="pmpid" link="fulltext">12490706</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Evolution of the Archaea.</p>
            </title>
            <aug>
               <au>
                  <snm>Forterre</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Brochier</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Philippe</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Theor Popul Biol</source>
            <pubdate>2002</pubdate>
            <volume>61</volume>
            <fpage>409</fpage>
            <lpage>422</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/tpbi.2002.1592</pubid>
                  <pubid idtype="pmpid" link="fulltext">12167361</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>Did DNA replication evolve twice independently?</p>
            </title>
            <aug>
               <au>
                  <snm>Leipe</snm>
                  <fnm>DD</fnm>
               </au>
               <au>
                  <snm>Aravind</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1999</pubdate>
            <volume>27</volume>
            <fpage>3389</fpage>
            <lpage>3401</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/nar/27.17.3389</pubid>
                  <pubid idtype="pmpid" link="fulltext">10446225</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>The origin of DNA genomes and DNA replication proteins.</p>
            </title>
            <aug>
               <au>
                  <snm>Forterre</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Curr Opin Microbiol</source>
            <pubdate>2002</pubdate>
            <volume>5</volume>
            <fpage>525</fpage>
            <lpage>532</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S1369-5274(02)00360-0</pubid>
                  <pubid idtype="pmpid" link="fulltext">12354562</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>Comparison of archaeal and bacterial genomes: computer analysis of protein sequences predicts novel functions and suggests a chimeric origin for the archaea.</p>
            </title>
            <aug>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
               <au>
                  <snm>Mushegian</snm>
                  <fnm>AR</fnm>
               </au>
               <au>
                  <snm>Galperin</snm>
                  <fnm>MY</fnm>
               </au>
               <au>
                  <snm>Walker</snm>
                  <fnm>DR</fnm>
               </au>
            </aug>
            <source>Mol Microbiol</source>
            <pubdate>1997</pubdate>
            <volume>25</volume>
            <fpage>619</fpage>
            <lpage>637</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1046/j.1365-2958.1997.4821861.x</pubid>
                  <pubid idtype="pmpid">9379893</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Horizontal gene transfer among genomes: the complexity hypothesis.</p>
            </title>
            <aug>
               <au>
                  <snm>Jain</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Rivera</snm>
                  <fnm>MC</fnm>
               </au>
               <au>
                  <snm>Lake</snm>
                  <fnm>JA</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1999</pubdate>
            <volume>96</volume>
            <fpage>3801</fpage>
            <lpage>3806</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1073/pnas.96.7.3801</pubid>
                  <pubid idtype="pmpid" link="fulltext">10097118</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <aug>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
               <au>
                  <snm>Galperin</snm>
                  <fnm>MY</fnm>
               </au>
            </aug>
            <source>Sequence - Evolution - Function. Computational Approaches in Comparative Genomics</source>
            <publisher>New York: Kluwer Academic</publisher>
            <pubdate>2002</pubdate>
         </bibl>
         <bibl id="B27">
            <title>
               <p>Complete genome sequence of the methanogenic archaeon, <it>Methanococcus jannaschii</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Bult</snm>
                  <fnm>CJ</fnm>
               </au>
               <au>
                  <snm>White</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Olsen</snm>
                  <fnm>GJ</fnm>
               </au>
               <au>
                  <snm>Zhou</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Fleischmann</snm>
                  <fnm>RD</fnm>
               </au>
               <au>
                  <snm>Sutton</snm>
                  <fnm>GG</fnm>
               </au>
               <au>
                  <snm>Blake</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>FitzGerald</snm>
                  <fnm>LM</fnm>
               </au>
               <au>
                  <snm>Clayton</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Gocayne</snm>
                  <fnm>JD</fnm>
               </au>
               <etal/>
            </aug>
            <source>Science</source>
            <pubdate>1996</pubdate>
            <volume>273</volume>
            <fpage>1058</fpage>
            <lpage>1073</lpage>
            <xrefbib>
               <pubid idtype="pmpid">8688087</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>A molecular view of microbial diversity and the biosphere.</p>
            </title>
            <aug>
               <au>
                  <snm>Pace</snm>
                  <fnm>NR</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1997</pubdate>
            <volume>276</volume>
            <fpage>734</fpage>
            <lpage>740</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.276.5313.734</pubid>
                  <pubid idtype="pmpid" link="fulltext">9115194</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>A new phylum of Archaea represented by a nanosized hyperthermophilic symbiont.</p>
            </title>
            <aug>
               <au>
                  <snm>Huber</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Hohn</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Rachel</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Fuchs</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Wimmer</snm>
                  <fnm>VC</fnm>
               </au>
               <au>
                  <snm>Stetter</snm>
                  <fnm>KO</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2002</pubdate>
            <volume>417</volume>
            <fpage>63</fpage>
            <lpage>67</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/417063a</pubid>
                  <pubid idtype="pmpid" link="fulltext">11986665</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>The phylum Nanoarchaeota: present knowledge and future perspectives of a unique form of life.</p>
            </title>
            <aug>
               <au>
                  <snm>Huber</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Hohn</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Stetter</snm>
                  <fnm>KO</fnm>
               </au>
               <au>
                  <snm>Rachel</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Res Microbiol</source>
            <pubdate>2003</pubdate>
            <volume>154</volume>
            <fpage>165</fpage>
            <lpage>171</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0923-2508(03)00035-4</pubid>
                  <pubid idtype="pmpid" link="fulltext">12706504</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>A genomic perspective on protein families.</p>
            </title>
            <aug>
               <au>
                  <snm>Tatusov</snm>
                  <fnm>RL</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
               <au>
                  <snm>Lipman</snm>
                  <fnm>DJ</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1997</pubdate>
            <volume>278</volume>
            <fpage>631</fpage>
            <lpage>637</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.278.5338.631</pubid>
                  <pubid idtype="pmpid" link="fulltext">9381173</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>Distinguishing homologous from analogous proteins.</p>
            </title>
            <aug>
               <au>
                  <snm>Fitch</snm>
                  <fnm>WM</fnm>
               </au>
            </aug>
            <source>Syst Zool</source>
            <pubdate>1970</pubdate>
            <volume>19</volume>
            <fpage>99</fpage>
            <lpage>113</lpage>
            <xrefbib>
               <pubid idtype="pmpid">5449325</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B33">
            <title>
               <p>Homology: a personal view on some of the problems.</p>
            </title>
            <aug>
               <au>
                  <snm>Fitch</snm>
                  <fnm>WM</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>2000</pubdate>
            <volume>16</volume>
            <fpage>227</fpage>
            <lpage>231</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0168-9525(00)02005-9</pubid>
                  <pubid idtype="pmpid" link="fulltext">10782117</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B34">
            <title>
               <p>Orthology, paralogy and proposed classification for paralog subtypes.</p>
            </title>
            <aug>
               <au>
                  <snm>Sonnhammer</snm>
                  <fnm>EL</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>2002</pubdate>
            <volume>18</volume>
            <fpage>619</fpage>
            <lpage>620</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0168-9525(02)02793-2</pubid>
                  <pubid idtype="pmpid" link="fulltext">12446146</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B35">
            <title>
               <p>Microbial genescapes: phyletic and functional patterns of ORF distribution among prokaryotes.</p>
            </title>
            <aug>
               <au>
                  <snm>Gaasterland</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Ragan</snm>
                  <fnm>MA</fnm>
               </au>
            </aug>
            <source>Microb Comp Genomics</source>
            <pubdate>1998</pubdate>
            <volume>3</volume>
            <fpage>199</fpage>
            <lpage>217</lpage>
            <xrefbib>
               <pubid idtype="pmpid">10027190</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B36">
            <title>
               <p>Assigning protein functions by comparative genome analysis: protein phylogenetic profiles.</p>
            </title>
            <aug>
               <au>
                  <snm>Pellegrini</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Marcotte</snm>
                  <fnm>EM</fnm>
               </au>
               <au>
                  <snm>Thompson</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Eisenberg</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Yeates</snm>
                  <fnm>TO</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1999</pubdate>
            <volume>96</volume>
            <fpage>4285</fpage>
            <lpage>4288</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1073/pnas.96.8.4285</pubid>
                  <pubid idtype="pmpid" link="fulltext">10200254</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B37">
            <title>
               <p>The COG database: new developments in phylogenetic classification of proteins from complete genomes.</p>
            </title>
            <aug>
               <au>
                  <snm>Tatusov</snm>
                  <fnm>RL</fnm>
               </au>
               <au>
                  <snm>Natale</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Garkavtsev</snm>
                  <fnm>IV</fnm>
               </au>
               <au>
                  <snm>Tatusova</snm>
                  <fnm>TA</fnm>
               </au>
               <au>
                  <snm>Shankavaram</snm>
                  <fnm>UT</fnm>
               </au>
               <au>
                  <snm>Rao</snm>
                  <fnm>BS</fnm>
               </au>
               <au>
                  <snm>Kiryutin</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Galperin</snm>
                  <fnm>MY</fnm>
               </au>
               <au>
                  <snm>Fedorova</snm>
                  <fnm>ND</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2001</pubdate>
            <volume>29</volume>
            <fpage>22</fpage>
            <lpage>28</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/nar/29.1.22</pubid>
                  <pubid idtype="pmpid" link="fulltext">11125040</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B38">
            <title>
               <p>Prokaryotic COGs project phyletic pattern search</p>
            </title>
            <url>http://www.ncbi.nlm.nih.gov/COG/new/release/phylox.cgi</url>
         </bibl>
         <bibl id="B39">
            <title>
               <p>Comparative genomics of the Archaea (Euryarchaeota): evolution of conserved protein families, the stable core, and the variable shell.</p>
            </title>
            <aug>
               <au>
                  <snm>Makarova</snm>
                  <fnm>KS</fnm>
               </au>
               <au>
                  <snm>Aravind</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Galperin</snm>
                  <fnm>MY</fnm>
               </au>
               <au>
                  <snm>Grishin</snm>
                  <fnm>NV</fnm>
               </au>
               <au>
                  <snm>Tatusov</snm>
                  <fnm>RL</fnm>
               </au>
               <au>
                  <snm>Wolf</snm>
                  <fnm>YI</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>1999</pubdate>
            <volume>9</volume>
            <fpage>608</fpage>
            <lpage>628</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10413400</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B40">
            <title>
               <p>Non-orthologous gene displacement.</p>
            </title>
            <aug>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
               <au>
                  <snm>Mushegian</snm>
                  <fnm>AR</fnm>
               </au>
               <au>
                  <snm>Bork</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>1996</pubdate>
            <volume>12</volume>
            <fpage>334</fpage>
            <lpage>336</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0168-9525(96)20010-1</pubid>
                  <pubid idtype="pmpid" link="fulltext">8855656</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B41">
            <title>
               <p>Archaeal genomics: do archaea have a mixed heritage?</p>
            </title>
            <aug>
               <au>
                  <snm>Doolittle</snm>
                  <fnm>WF</fnm>
               </au>
               <au>
                  <snm>Logsdon</snm>
                  <fnm>JM</fnm>
                  <suf>Jr</suf>
               </au>
            </aug>
            <source>Curr Biol</source>
            <pubdate>1998</pubdate>
            <volume>8</volume>
            <fpage>R209</fpage>
            <lpage>R211</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">9512414</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B42">
            <title>
               <p>Phylogenetic classification and the universal tree.</p>
            </title>
            <aug>
               <au>
                  <snm>Doolittle</snm>
                  <fnm>WF</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1999</pubdate>
            <volume>284</volume>
            <fpage>2124</fpage>
            <lpage>2129</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.284.5423.2124</pubid>
                  <pubid idtype="pmpid" link="fulltext">10381871</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B43">
            <title>
               <p>Lateral genomics.</p>
            </title>
            <aug>
               <au>
                  <snm>Doolittle</snm>
                  <fnm>WF</fnm>
               </au>
            </aug>
            <source>Trends Cell Biol</source>
            <pubdate>1999</pubdate>
            <volume>9</volume>
            <fpage>M5</fpage>
            <lpage>M8</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0962-8924(99)01664-5</pubid>
                  <pubid idtype="pmpid" link="fulltext">10611671</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B44">
            <title>
               <p>Horizontal gene transfer in prokaryotes - quantification and classification.</p>
            </title>
            <aug>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
               <au>
                  <snm>Makarova</snm>
                  <fnm>KS</fnm>
               </au>
               <au>
                  <snm>Aravind</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>Annu Rev Microbiol</source>
            <pubdate>2001</pubdate>
            <volume>55</volume>
            <fpage>709</fpage>
            <lpage>742</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1146/annurev.micro.55.1.709</pubid>
                  <pubid idtype="pmpid" link="fulltext">11544372</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B45">
            <title>
               <p>The complete genome of hyperthermophile <it>Methanopyrus kandleri</it> AV19 and monophyly of archaeal methanogens.</p>
            </title>
            <aug>
               <au>
                  <snm>Slesarev</snm>
                  <fnm>AI</fnm>
               </au>
               <au>
                  <snm>Mezhevaya</snm>
                  <fnm>KV</fnm>
               </au>
               <au>
                  <snm>Makarova</snm>
                  <fnm>KS</fnm>
               </au>
               <au>
                  <snm>Polushin</snm>
                  <fnm>NN</fnm>
               </au>
               <au>
                  <snm>Shcherbinina</snm>
                  <fnm>OV</fnm>
               </au>
               <au>
                  <snm>Shakhova</snm>
                  <fnm>VV</fnm>
               </au>
               <au>
                  <snm>Belova</snm>
                  <fnm>GI</fnm>
               </au>
               <au>
                  <snm>Aravind</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Natale</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Rogozin</snm>
                  <fnm>IB</fnm>
               </au>
               <etal/>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2002</pubdate>
            <volume>99</volume>
            <fpage>4644</fpage>
            <lpage>4649</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1073/pnas.032671499</pubid>
                  <pubid idtype="pmpid" link="fulltext">11930014</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B46">
            <title>
               <p>Evidence for massive gene exchange between archaeal and bacterial hyperthermophiles.</p>
            </title>
            <aug>
               <au>
                  <snm>Aravind</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Tatusov</snm>
                  <fnm>RL</fnm>
               </au>
               <au>
                  <snm>Wolf</snm>
                  <fnm>YI</fnm>
               </au>
               <au>
                  <snm>Walker</snm>
                  <fnm>DR</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>1998</pubdate>
            <volume>14</volume>
            <fpage>442</fpage>
            <lpage>444</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0168-9525(98)01553-4</pubid>
                  <pubid idtype="pmpid" link="fulltext">9825671</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B47">
            <title>
               <p>Evidence for lateral gene transfer between Archaea and bacteria from genome sequence of <it>Thermotoga maritima</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Nelson</snm>
                  <fnm>KE</fnm>
               </au>
               <au>
                  <snm>Clayton</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Gill</snm>
                  <fnm>SR</fnm>
               </au>
               <au>
                  <snm>Gwinn</snm>
                  <fnm>ML</fnm>
               </au>
               <au>
                  <snm>Dodson</snm>
                  <fnm>RJ</fnm>
               </au>
               <au>
                  <snm>Haft</snm>
                  <fnm>DH</fnm>
               </au>
               <au>
                  <snm>Hickey</snm>
                  <fnm>EK</fnm>
               </au>
               <au>
                  <snm>Peterson</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Nelson</snm>
                  <fnm>WC</fnm>
               </au>
               <au>
                  <snm>Ketchum</snm>
                  <fnm>KA</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nature</source>
            <pubdate>1999</pubdate>
            <volume>399</volume>
            <fpage>323</fpage>
            <lpage>329</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/20601</pubid>
                  <pubid idtype="pmpid" link="fulltext">10360571</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B48">
            <title>
               <p>Ancient horizontal gene transfer.</p>
            </title>
            <aug>
               <au>
                  <snm>Brown</snm>
                  <fnm>JR</fnm>
               </au>
            </aug>
            <source>Nat Rev Genet</source>
            <pubdate>2003</pubdate>
            <volume>4</volume>
            <fpage>121</fpage>
            <lpage>132</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nrg1000</pubid>
                  <pubid idtype="pmpid" link="fulltext">12560809</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B49">
            <title>
               <p>Archaeal and bacterial hyperthermophiles: horizontal gene exchange or common ancestry?</p>
            </title>
            <aug>
               <au>
                  <snm>Kyrpides</snm>
                  <fnm>NC</fnm>
               </au>
               <au>
                  <snm>Olsen</snm>
                  <fnm>GJ</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>1999</pubdate>
            <volume>15</volume>
            <fpage>298</fpage>
            <lpage>299</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0168-9525(99)01811-9</pubid>
                  <pubid idtype="pmpid" link="fulltext">10431189</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B50">
            <title>
               <p>Reply. Archaeal and bacterial hyperthermophiles: horizontal gene exchange or common ancestry?</p>
            </title>
            <aug>
               <au>
                  <snm>Aravind</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Tatusov</snm>
                  <fnm>RL</fnm>
               </au>
               <au>
                  <snm>Wolf</snm>
                  <fnm>YI</fnm>
               </au>
               <au>
                  <snm>Walker</snm>
                  <fnm>DR</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>1999</pubdate>
            <volume>15</volume>
            <fpage>299</fpage>
            <lpage>300</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0168-9525(99)01786-2</pubid>
                  <pubid idtype="pmpid" link="fulltext">10431190</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B51">
            <title>
               <p>A complete sequence of the <it>T. tengcongensis</it> genome.</p>
            </title>
            <aug>
               <au>
                  <snm>Bao</snm>
                  <fnm>Q</fnm>
               </au>
               <au>
                  <snm>Tian</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Xu</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Xuan</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Hu</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Dong</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Yang</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Xue</snm>
                  <fnm>Y</fnm>
               </au>
               <etal/>
            </aug>
            <source>Genome Res</source>
            <pubdate>2002</pubdate>
            <volume>12</volume>
            <fpage>689</fpage>
            <lpage>700</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gr.219302</pubid>
                  <pubid idtype="pmpid" link="fulltext">11997336</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B52">
            <title>
               <p>
                  <it>Thermoanaerobacter tengcongensis</it>
               </p>
            </title>
            <url>http://www.ncbi.nlm.nih.gov/sutils/taxik.cgi?gi=237</url>
         </bibl>
         <bibl id="B53">
            <title>
               <p>Defining the core of nontransferable prokaryotic genes: the euryarchaeal core.</p>
            </title>
            <aug>
               <au>
                  <snm>Nesbo</snm>
                  <fnm>CL</fnm>
               </au>
               <au>
                  <snm>Boucher</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Doolittle</snm>
                  <fnm>WF</fnm>
               </au>
            </aug>
            <source>J Mol Evol</source>
            <pubdate>2001</pubdate>
            <volume>53</volume>
            <fpage>340</fpage>
            <lpage>350</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1007/s002390010224</pubid>
                  <pubid idtype="pmpid" link="fulltext">11675594</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B54">
            <title>
               <p>The genome of <it>Methanosarcina mazei</it>: evidence for lateral gene transfer between bacteria and archaea.</p>
            </title>
            <aug>
               <au>
                  <snm>Deppenmeier</snm>
                  <fnm>U</fnm>
               </au>
               <au>
                  <snm>Johann</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Hartsch</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Merkl</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Schmitz</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Martinez-Arias</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Henne</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Wiezer</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Baumer</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Jacobi</snm>
                  <fnm>C</fnm>
               </au>
               <etal/>
            </aug>
            <source>J Mol Microbiol Biotechnol</source>
            <pubdate>2002</pubdate>
            <volume>4</volume>
            <fpage>453</fpage>
            <lpage>461</lpage>
            <xrefbib>
               <pubid idtype="pmpid">12125824</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B55">
            <title>
               <p>The genome of <it>M. acetivorans</it> reveals extensive metabolic and physiological diversity.</p>
            </title>
            <aug>
               <au>
                  <snm>Galagan</snm>
                  <fnm>JE</fnm>
               </au>
               <au>
                  <snm>Nusbaum</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Roy</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Endrizzi</snm>
                  <fnm>MG</fnm>
               </au>
               <au>
                  <snm>Macdonald</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>FitzHugh</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Calvo</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Engels</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Smirnov</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Atnoor</snm>
                  <fnm>D</fnm>
               </au>
               <etal/>
            </aug>
            <source>Genome Res</source>
            <pubdate>2002</pubdate>
            <volume>12</volume>
            <fpage>532</fpage>
            <lpage>542</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gr.223902</pubid>
                  <pubid idtype="pmpid" link="fulltext">11932238</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B56">
            <title>
               <p>A hot story from comparative genomics: reverse gyrase is the only hyperthermophile-specific protein.</p>
            </title>
            <aug>
               <au>
                  <snm>Forterre</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>2002</pubdate>
            <volume>18</volume>
            <fpage>236</fpage>
            <lpage>237</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0168-9525(02)02650-1</pubid>
                  <pubid idtype="pmpid" link="fulltext">12047940</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B57">
            <title>
               <p>The unique DNA topology and DNA topoisomerases of hyperthermophilic archaea.</p>
            </title>
            <aug>
               <au>
                  <snm>Forterre</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Bergerat</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Lopez-Garcia</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>FEMS Microbiol Rev</source>
            <pubdate>1996</pubdate>
            <volume>18</volume>
            <fpage>237</fpage>
            <lpage>248</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0168-6445(96)00015-0</pubid>
                  <pubid idtype="pmpid">8639331</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B58">
            <title>
               <p>Potential genomic determinants of hyperthermophily.</p>
            </title>
            <aug>
               <au>
                  <snm>Makarova</snm>
                  <fnm>KS</fnm>
               </au>
               <au>
                  <snm>Wolf</snm>
                  <fnm>YI</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <fpage>172</fpage>
            <lpage>176</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0168-9525(03)00047-7</pubid>
                  <pubid idtype="pmpid" link="fulltext">12683966</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B59">
            <title>
               <p>A DNA repair system specific for thermophilic archaea and bacteria predicted by genomic context analysis.</p>
            </title>
            <aug>
               <au>
                  <snm>Makarova</snm>
                  <fnm>KS</fnm>
               </au>
               <au>
                  <snm>Aravind</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Grishin</snm>
                  <fnm>NV</fnm>
               </au>
               <au>
                  <snm>Rogozin</snm>
                  <fnm>IB</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2002</pubdate>
            <volume>30</volume>
            <fpage>482</fpage>
            <lpage>496</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/nar/30.2.482</pubid>
                  <pubid idtype="pmpid" link="fulltext">11788711</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B60">
            <title>
               <p>Biosynthesis of the methanogenic cofactors.</p>
            </title>
            <aug>
               <au>
                  <snm>White</snm>
                  <fnm>RH</fnm>
               </au>
            </aug>
            <source>Vitam Horm</source>
            <pubdate>2001</pubdate>
            <volume>61</volume>
            <fpage>299</fpage>
            <lpage>337</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1159/000055337</pubid>
                  <pubid idtype="pmpid">11153270</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B61">
            <title>
               <p>Who's your neighbor? New computational approaches for functional genomics.</p>
            </title>
            <aug>
               <au>
                  <snm>Galperin</snm>
                  <fnm>MY</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
            </aug>
            <source>Nat Biotechnol</source>
            <pubdate>2000</pubdate>
            <volume>18</volume>
            <fpage>609</fpage>
            <lpage>613</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/76443</pubid>
                  <pubid idtype="pmpid" link="fulltext">10835597</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B62">
            <title>
               <p>A euryarchaeal lysyl-tRNA synthetase: resemblance to class I synthetases.</p>
            </title>
            <aug>
               <au>
                  <snm>Ibba</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Morgan</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Curnow</snm>
                  <fnm>AW</fnm>
               </au>
               <au>
                  <snm>Pridmore</snm>
                  <fnm>DR</fnm>
               </au>
               <au>
                  <snm>Vothknecht</snm>
                  <fnm>UC</fnm>
               </au>
               <au>
                  <snm>Gardner</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Lin</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Woese</snm>
                  <fnm>CR</fnm>
               </au>
               <au>
                  <snm>Soll</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1997</pubdate>
            <volume>278</volume>
            <fpage>1119</fpage>
            <lpage>1122</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.278.5340.1119</pubid>
                  <pubid idtype="pmpid" link="fulltext">9353192</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B63">
            <title>
               <p>Aminoacyl-tRNA synthesis in archaea: different but not unique.</p>
            </title>
            <aug>
               <au>
                  <snm>Praetorius-Ibba</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Ibba</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Mol Microbiol</source>
            <pubdate>2003</pubdate>
            <volume>48</volume>
            <fpage>631</fpage>
            <lpage>637</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12694610</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B64">
            <title>
               <p>An alternative flavin-dependent mechanism for thymidylate synthesis.</p>
            </title>
            <aug>
               <au>
                  <snm>Myllykallio</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Lipowski</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Leduc</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Filee</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Forterre</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Liebl</snm>
                  <fnm>U</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2002</pubdate>
            <volume>297</volume>
            <fpage>105</fpage>
            <lpage>107</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1072113</pubid>
                  <pubid idtype="pmpid" link="fulltext">12029065</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B65">
            <title>
               <p>Genome trees and the tree of life.</p>
            </title>
            <aug>
               <au>
                  <snm>Wolf</snm>
                  <fnm>YI</fnm>
               </au>
               <au>
                  <snm>Rogozin</snm>
                  <fnm>IB</fnm>
               </au>
               <au>
                  <snm>Grishin</snm>
                  <fnm>NV</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>2002</pubdate>
            <volume>18</volume>
            <fpage>472</fpage>
            <lpage>479</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0168-9525(02)02744-0</pubid>
                  <pubid idtype="pmpid" link="fulltext">12175808</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B66">
            <title>
               <p>Genome trees constructed using five different approaches suggest new major bacterial clades.</p>
            </title>
            <aug>
               <au>
                  <snm>Wolf</snm>
                  <fnm>YI</fnm>
               </au>
               <au>
                  <snm>Rogozin</snm>
                  <fnm>IB</fnm>
               </au>
               <au>
                  <snm>Grishin</snm>
                  <fnm>NV</fnm>
               </au>
               <au>
                  <snm>Tatusov</snm>
                  <fnm>RL</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
            </aug>
            <source>BMC Evol Biol</source>
            <pubdate>2001</pubdate>
            <volume>1</volume>
            <fpage>8</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1186/1471-2148-1-8</pubid>
                  <pubid idtype="pmpid" link="fulltext">11734060</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B67">
            <title>
               <p>Inferring genome trees by using a filter to eliminate phylogenetically discordant sequences and a distance matrix based on mean normalized BLASTP scores.</p>
            </title>
            <aug>
               <au>
                  <snm>Clarke</snm>
                  <fnm>GD</fnm>
               </au>
               <au>
                  <snm>Beiko</snm>
                  <fnm>RG</fnm>
               </au>
               <au>
                  <snm>Ragan</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Charlebois</snm>
                  <fnm>RL</fnm>
               </au>
            </aug>
            <source>J Bacteriol</source>
            <pubdate>2002</pubdate>
            <volume>184</volume>
            <fpage>2072</fpage>
            <lpage>2080</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1128/JB.184.8.2072-2080.2002</pubid>
                  <pubid idtype="pmpid" link="fulltext">11914337</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B68">
            <title>
               <p>SHOT: a web server for the construction of genome phylogenies.</p>
            </title>
            <aug>
               <au>
                  <snm>Korbel</snm>
                  <fnm>JO</fnm>
               </au>
               <au>
                  <snm>Snel</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Huynen</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Bork</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>2002</pubdate>
            <volume>18</volume>
            <fpage>158</fpage>
            <lpage>162</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0168-9525(01)02597-5</pubid>
                  <pubid idtype="pmpid" link="fulltext">11858840</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B69">
            <title>
               <p>Archaeal phylogeny based on ribosomal proteins.</p>
            </title>
            <aug>
               <au>
                  <snm>Matte-Tailliez</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Brochier</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Forterre</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Philippe</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>2002</pubdate>
            <volume>19</volume>
            <fpage>631</fpage>
            <lpage>639</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11961097</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B70">
            <title>
               <p>Genomes in flux: the evolution of archaeal and proteobacterial gene content.</p>
            </title>
            <aug>
               <au>
                  <snm>Snel</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Bork</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Huynen</snm>
                  <fnm>MA</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2002</pubdate>
            <volume>12</volume>
            <fpage>17</fpage>
            <lpage>25</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gr.176501</pubid>
                  <pubid idtype="pmpid" link="fulltext">11779827</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B71">
            <title>
               <p>Algorithms for computing parsimonious evolutionary scenarios for genome evolution, the last universal common ancestor and dominance of horizontal gene transfer in the evolution of prokaryotes.</p>
            </title>
            <aug>
               <au>
                  <snm>Mirkin</snm>
                  <fnm>BG</fnm>
               </au>
               <au>
                  <snm>Fenner</snm>
                  <fnm>TI</fnm>
               </au>
               <au>
                  <snm>Galperin</snm>
                  <fnm>MY</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
            </aug>
            <source>BMC Evol Biol</source>
            <pubdate>2003</pubdate>
            <volume>3</volume>
            <fpage>2</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1186/1471-2148-3-2</pubid>
                  <pubid idtype="pmpid" link="fulltext">12515582</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B72">
            <title>
               <p>Assessing annotation transfer for genomics: quantifying the relations between protein sequence, structure and function through traditional and probabilistic scores.</p>
            </title>
            <aug>
               <au>
                  <snm>Wilson</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Kreychman</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Gerstein</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>2000</pubdate>
            <volume>297</volume>
            <fpage>233</fpage>
            <lpage>249</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/jmbi.2000.3550</pubid>
                  <pubid idtype="pmpid" link="fulltext">10704319</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B73">
            <title>
               <p>Gene transfer systems and their applications in Archaea.</p>
            </title>
            <aug>
               <au>
                  <snm>Luo</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Wasserfallen</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Syst Appl Microbiol</source>
            <pubdate>2001</pubdate>
            <volume>24</volume>
            <fpage>15</fpage>
            <lpage>25</lpage>
            <xrefbib>
               <pubid idtype="pmpid">11403394</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B74">
            <title>
               <p>The archaeal DNA primase: biochemical characterization of the p41-p46 complex from <it>Pyrococcus furiosus</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Liu</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Komori</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Ishino</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Bocquier</snm>
                  <fnm>AA</fnm>
               </au>
               <au>
                  <snm>Cann</snm>
                  <fnm>IK</fnm>
               </au>
               <au>
                  <snm>Kohda</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Ishino</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>2001</pubdate>
            <volume>276</volume>
            <fpage>45484</fpage>
            <lpage>45490</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1074/jbc.M106391200</pubid>
                  <pubid idtype="pmpid" link="fulltext">11584001</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B75">
            <title>
               <p>Archaeal primase: bridging the gap between RNA and DNA polymerases.</p>
            </title>
            <aug>
               <au>
                  <snm>Bocquier</snm>
                  <fnm>AA</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Cann</snm>
                  <fnm>IK</fnm>
               </au>
               <au>
                  <snm>Komori</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Kohda</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Ishino</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>Curr Biol</source>
            <pubdate>2001</pubdate>
            <volume>11</volume>
            <fpage>452</fpage>
            <lpage>456</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0960-9822(01)00119-1</pubid>
                  <pubid idtype="pmpid" link="fulltext">11301257</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B76">
            <title>
               <p>Aldolases of the DhnA family: a possible solution to the problem of pentose and hexose biosynthesis in archaea.</p>
            </title>
            <aug>
               <au>
                  <snm>Galperin</snm>
                  <fnm>MY</fnm>
               </au>
               <au>
                  <snm>Aravind</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
            </aug>
            <source>FEMS Microbiol Lett</source>
            <pubdate>2000</pubdate>
            <volume>183</volume>
            <fpage>259</fpage>
            <lpage>264</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0378-1097(99)00612-6</pubid>
                  <pubid idtype="pmpid" link="fulltext">10675594</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B77">
            <title>
               <p>Archaeal fructose-1,6-bisphosphate aldolases constitute a new family of archaeal type class I aldolase.</p>
            </title>
            <aug>
               <au>
                  <snm>Siebers</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Brinkmann</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Dorr</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Tjaden</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Lilie</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>van der Oost</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Verhees</snm>
                  <fnm>CH</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>2001</pubdate>
            <volume>276</volume>
            <fpage>28710</fpage>
            <lpage>28718</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1074/jbc.M103447200</pubid>
                  <pubid idtype="pmpid" link="fulltext">11387336</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B78">
            <title>
               <p>An aminoacyl tRNA synthetase whose sequence fits into neither of the two known classes.</p>
            </title>
            <aug>
               <au>
                  <snm>Fabrega</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Farrow</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Mukhopadhyay</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>de Crecy-Lagard</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Ortiz</snm>
                  <fnm>AR</fnm>
               </au>
               <au>
                  <snm>Schimmel</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2001</pubdate>
            <volume>411</volume>
            <fpage>110</fpage>
            <lpage>114</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/35075121</pubid>
                  <pubid idtype="pmpid" link="fulltext">11333988</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B79">
            <title>
               <p>One polypeptide with two aminoacyl-tRNA synthetase activities.</p>
            </title>
            <aug>
               <au>
                  <snm>Stathopoulos</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Longman</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Vothknecht</snm>
                  <fnm>UC</fnm>
               </au>
               <au>
                  <snm>Becker</snm>
                  <fnm>HD</fnm>
               </au>
               <au>
                  <snm>Ibba</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Soll</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2000</pubdate>
            <volume>287</volume>
            <fpage>479</fpage>
            <lpage>482</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.287.5452.479</pubid>
                  <pubid idtype="pmpid" link="fulltext">10642548</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B80">
            <title>
               <p><it>Quod erat demonstrandum</it>? The mystery of experimental validation of apparently erroneous computational analyses of protein sequences.</p>
            </title>
            <aug>
               <au>
                  <snm>Iyer</snm>
                  <fnm>LM</fnm>
               </au>
               <au>
                  <snm>Aravind</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Bork</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Hofmann</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Mushegian</snm>
                  <fnm>AR</fnm>
               </au>
               <au>
                  <snm>Zhulin</snm>
                  <fnm>IB</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2001</pubdate>
            <volume>2</volume>
            <fpage>research0051.1</fpage>
            <lpage>0051.11</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1186/gb-2001-2-12-research0051</pubid>
                  <pubid idtype="pmpid" link="fulltext">11790254</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B81">
            <title>
               <p>The structural basis of cysteine aminoacylation of tRNAPro by prolyl-tRNA synthetases.</p>
            </title>
            <aug>
               <au>
                  <snm>Kamtekar</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kennedy</snm>
                  <fnm>WD</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Stathopoulos</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Soll</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Steitz</snm>
                  <fnm>TA</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2003</pubdate>
            <volume>100</volume>
            <fpage>1673</fpage>
            <lpage>1678</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1073/pnas.0437911100</pubid>
                  <pubid idtype="pmpid" link="fulltext">12578991</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B82">
            <title>
               <p>Significance of two distinct types of tryptophan synthase beta chain in Bacteria, Archaea and higher plants.</p>
            </title>
            <aug>
               <au>
                  <snm>Xie</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Forst</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Bonner</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Jensen</snm>
                  <fnm>RA</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2002</pubdate>
            <volume>3</volume>
            <fpage>research0004.1</fpage>
            <lpage>0004.13</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1186/gb-2001-3-1-research0004</pubid>
                  <pubid idtype="pmpid" link="fulltext">11806827</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B83">
            <title>
               <p>A whole genome view of prokaryotic haem biosynthesis.</p>
            </title>
            <aug>
               <au>
                  <snm>Panek</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>O'Brian</snm>
                  <fnm>MR</fnm>
               </au>
            </aug>
            <source>Microbiology</source>
            <pubdate>2002</pubdate>
            <volume>148</volume>
            <fpage>2273</fpage>
            <lpage>2282</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12177321</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B84">
            <title>
               <p>Exploitation of gene context.</p>
            </title>
            <aug>
               <au>
                  <snm>Huynen</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Snel</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Lathe</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Bork</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Curr Opin Struct Biol</source>
            <pubdate>2000</pubdate>
            <volume>10</volume>
            <fpage>366</fpage>
            <lpage>370</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0959-440X(00)00098-1</pubid>
                  <pubid idtype="pmpid" link="fulltext">10851194</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B85">
            <title>
               <p>Predicting protein function by genomic context: quantitative evaluation and qualitative inferences.</p>
            </title>
            <aug>
               <au>
                  <snm>Huynen</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Snel</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Lathe</snm>
                  <fnm>W</fnm>
                  <suf>3rd</suf>
               </au>
               <au>
                  <snm>Bork</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2000</pubdate>
            <volume>10</volume>
            <fpage>1204</fpage>
            <lpage>1210</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gr.10.8.1204</pubid>
                  <pubid idtype="pmpid" link="fulltext">10958638</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B86">
            <title>
               <p>Guilt by association: contextual information in genome analysis.</p>
            </title>
            <aug>
               <au>
                  <snm>Aravind</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2000</pubdate>
            <volume>10</volume>
            <fpage>1074</fpage>
            <lpage>1077</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gr.10.8.1074</pubid>
                  <pubid idtype="pmpid" link="fulltext">10958625</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B87">
            <title>
               <p>Genome alignment, evolution of prokaryotic genome organization, and prediction of gene function using genomic context.</p>
            </title>
            <aug>
               <au>
                  <snm>Wolf</snm>
                  <fnm>YI</fnm>
               </au>
               <au>
                  <snm>Rogozin</snm>
                  <fnm>IB</fnm>
               </au>
               <au>
                  <snm>Kondrashov</snm>
                  <fnm>AS</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2001</pubdate>
            <volume>11</volume>
            <fpage>356</fpage>
            <lpage>372</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gr.GR-1619R</pubid>
                  <pubid idtype="pmpid" link="fulltext">11230160</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B88">
            <title>
               <p>Prediction of the archaeal exosome and its connections with the proteasome and the translation and transcription machineries by a comparative-genomic approach.</p>
            </title>
            <aug>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
               <au>
                  <snm>Wolf</snm>
                  <fnm>YI</fnm>
               </au>
               <au>
                  <snm>Aravind</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2001</pubdate>
            <volume>11</volume>
            <fpage>240</fpage>
            <lpage>252</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gr.162001</pubid>
                  <pubid idtype="pmpid" link="fulltext">11157787</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B89">
            <title>
               <p>The exosome: a versatile RNA processing machine.</p>
            </title>
            <aug>
               <au>
                  <snm>Decker</snm>
                  <fnm>CJ</fnm>
               </au>
            </aug>
            <source>Curr Biol</source>
            <pubdate>1998</pubdate>
            <volume>8</volume>
            <fpage>R238</fpage>
            <lpage>R240</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">9583939</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B90">
            <title>
               <p>The exosome: a proteasome for RNA?</p>
            </title>
            <aug>
               <au>
                  <snm>van Hoof</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Parker</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>1999</pubdate>
            <volume>99</volume>
            <fpage>347</fpage>
            <lpage>350</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10571176</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B91">
            <title>
               <p>The exosome: a conserved eukaryotic RNA processing complex containing multiple 3'&#8594;5' exoribonucleases.</p>
            </title>
            <aug>
               <au>
                  <snm>Mitchell</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Petfalski</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Shevchenko</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Mann</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Tollervey</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>1997</pubdate>
            <volume>91</volume>
            <fpage>457</fpage>
            <lpage>466</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">9390555</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B92">
            <title>
               <p>Connected gene neighborhoods in prokaryotic genomes.</p>
            </title>
            <aug>
               <au>
                  <snm>Rogozin</snm>
                  <fnm>IB</fnm>
               </au>
               <au>
                  <snm>Makarova</snm>
                  <fnm>KS</fnm>
               </au>
               <au>
                  <snm>Murvai</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Czabarka</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Wolf</snm>
                  <fnm>YI</fnm>
               </au>
               <au>
                  <snm>Tatusov</snm>
                  <fnm>RL</fnm>
               </au>
               <au>
                  <snm>Szekely</snm>
                  <fnm>LA</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2002</pubdate>
            <volume>30</volume>
            <fpage>2212</fpage>
            <lpage>2223</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/nar/30.10.2212</pubid>
                  <pubid idtype="pmpid" link="fulltext">12000841</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B93">
            <title>
               <p>The role of lineage-specific gene family expansion in the evolution of eukaryotes.</p>
            </title>
            <aug>
               <au>
                  <snm>Lespinet</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Wolf</snm>
                  <fnm>YI</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
               <au>
                  <snm>Aravind</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2002</pubdate>
            <volume>12</volume>
            <fpage>1048</fpage>
            <lpage>1059</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gr.174302</pubid>
                  <pubid idtype="pmpid" link="fulltext">12097341</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B94">
            <title>
               <p>Lineage-specific gene expansions in bacterial and archaeal genomes.</p>
            </title>
            <aug>
               <au>
                  <snm>Jordan</snm>
                  <fnm>IK</fnm>
               </au>
               <au>
                  <snm>Makarova</snm>
                  <fnm>KS</fnm>
               </au>
               <au>
                  <snm>Spouge</snm>
                  <fnm>JL</fnm>
               </au>
               <au>
                  <snm>Wolf</snm>
                  <fnm>YI</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2001</pubdate>
            <volume>11</volume>
            <fpage>555</fpage>
            <lpage>565</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gr.GR-1660R</pubid>
                  <pubid idtype="pmpid" link="fulltext">11282971</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B95">
            <title>
               <p>Automatic clustering of orthologs and in-paralogs from pairwise species comparisons.</p>
            </title>
            <aug>
               <au>
                  <snm>Remm</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Storm</snm>
                  <fnm>CE</fnm>
               </au>
               <au>
                  <snm>Sonnhammer</snm>
                  <fnm>EL</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>2001</pubdate>
            <volume>314</volume>
            <fpage>1041</fpage>
            <lpage>1052</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/jmbi.2000.5197</pubid>
                  <pubid idtype="pmpid" link="fulltext">11743721</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B96">
            <title>
               <p>The genome sequence of the thermoacidophilic scavenger <it>Thermoplasma acidophilum</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Ruepp</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Graml</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Santos-Martinez</snm>
                  <fnm>ML</fnm>
               </au>
               <au>
                  <snm>Koretke</snm>
                  <fnm>KK</fnm>
               </au>
               <au>
                  <snm>Volker</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Mewes</snm>
                  <fnm>HW</fnm>
               </au>
               <au>
                  <snm>Frishman</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Stocker</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Lupas</snm>
                  <fnm>AN</fnm>
               </au>
               <au>
                  <snm>Baumeister</snm>
                  <fnm>W</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2000</pubdate>
            <volume>407</volume>
            <fpage>508</fpage>
            <lpage>513</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/35035069</pubid>
                  <pubid idtype="pmpid" link="fulltext">11029001</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B97">
            <title>
               <p>The complete genome of the crenarchaeon <it>Sulfolobus solfataricus</it> P2.</p>
            </title>
            <aug>
               <au>
                  <snm>She</snm>
                  <fnm>Q</fnm>
               </au>
               <au>
                  <snm>Singh</snm>
                  <fnm>RK</fnm>
               </au>
               <au>
                  <snm>Confalonieri</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Zivanovic</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Allard</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Awayez</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Chan-Weiher</snm>
                  <fnm>CC</fnm>
               </au>
               <au>
                  <snm>Clausen</snm>
                  <fnm>IG</fnm>
               </au>
               <au>
                  <snm>Curtis</snm>
                  <fnm>BA</fnm>
               </au>
               <au>
                  <snm>De Moors</snm>
                  <fnm>A</fnm>
               </au>
               <etal/>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2001</pubdate>
            <volume>98</volume>
            <fpage>7835</fpage>
            <lpage>7840</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1073/pnas.141222098</pubid>
                  <pubid idtype="pmpid" link="fulltext">11427726</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B98">
            <title>
               <p>The unique biochemistry of methanogenesis.</p>
            </title>
            <aug>
               <au>
                  <snm>Deppenmeier</snm>
                  <fnm>U</fnm>
               </au>
            </aug>
            <source>Prog Nucleic Acid Res Mol Biol</source>
            <pubdate>2002</pubdate>
            <volume>71</volume>
            <fpage>223</fpage>
            <lpage>283</lpage>
            <xrefbib>
               <pubid idtype="pmpid">12102556</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B99">
            <title>
               <p>DNA polymerase beta-like nucleotidyltransferase superfamily: identification of three new families, classification and evolutionary history.</p>
            </title>
            <aug>
               <au>
                  <snm>Aravind</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1999</pubdate>
            <volume>27</volume>
            <fpage>1609</fpage>
            <lpage>1618</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/nar/27.7.1609</pubid>
                  <pubid idtype="pmpid">10075991</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B100">
            <title>
               <p>The HI0073/HI0074 protein pair from <it>Haemophilus influenzae</it> is a member of a new nucleotidyltransferase family: structure, sequence analyses, and solution studies.</p>
            </title>
            <aug>
               <au>
                  <snm>Lehmann</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Lim</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Chalamasetty</snm>
                  <fnm>VR</fnm>
               </au>
               <au>
                  <snm>Krajewski</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Melamud</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Galkin</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Howard</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Kelman</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Reddy</snm>
                  <fnm>PT</fnm>
               </au>
               <au>
                  <snm>Murzin</snm>
                  <fnm>AG</fnm>
               </au>
               <au>
                  <snm>Herzberg</snm>
                  <fnm>O</fnm>
               </au>
            </aug>
            <source>Proteins</source>
            <pubdate>2003</pubdate>
            <volume>50</volume>
            <fpage>249</fpage>
            <lpage>260</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/prot.10260</pubid>
                  <pubid idtype="pmpid" link="fulltext">12486719</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B101">
            <title>
               <p>A Holliday junction resolvase from <it>Pyrococcus furiosus</it>: functional similarity to <it>Escherichia coli</it> RuvC provides evidence for conserved mechanism of homologous recombination in Bacteria, Eukarya, and Archaea.</p>
            </title>
            <aug>
               <au>
                  <snm>Komori</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Sakae</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Shinagawa</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Morikawa</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Ishino</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1999</pubdate>
            <volume>96</volume>
            <fpage>8873</fpage>
            <lpage>8878</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1073/pnas.96.16.8873</pubid>
                  <pubid idtype="pmpid" link="fulltext">10430863</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B102">
            <title>
               <p>Survey and summary: Holliday junction resolvases and related nucleases: identification of new families, phyletic distribution and evolutionary trajectories.</p>
            </title>
            <aug>
               <au>
                  <snm>Aravind</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Makarova</snm>
                  <fnm>KS</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2000</pubdate>
            <volume>28</volume>
            <fpage>3417</fpage>
            <lpage>3432</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/nar/28.18.3417</pubid>
                  <pubid idtype="pmpid" link="fulltext">10982859</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B103">
            <title>
               <p>Hjc resolvase is a distantly related member of the type II restriction endonuclease family.</p>
            </title>
            <aug>
               <au>
                  <snm>Daiyasu</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Komori</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Sakae</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Ishino</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Toh</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2000</pubdate>
            <volume>28</volume>
            <fpage>4540</fpage>
            <lpage>4543</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/nar/28.22.4540</pubid>
                  <pubid idtype="pmpid" link="fulltext">11071943</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B104">
            <title>
               <p>A novel DNA polymerase family found in Archaea.</p>
            </title>
            <aug>
               <au>
                  <snm>Ishino</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Komori</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Cann</snm>
                  <fnm>IK</fnm>
               </au>
               <au>
                  <snm>Koga</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>J Bacteriol</source>
            <pubdate>1998</pubdate>
            <volume>180</volume>
            <fpage>2232</fpage>
            <lpage>2236</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">9555910</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B105">
            <title>
               <p>DNA polymerases from euryarchaeota.</p>
            </title>
            <aug>
               <au>
                  <snm>Ishino</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Ishino</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Methods Enzymol</source>
            <pubdate>2001</pubdate>
            <volume>334</volume>
            <fpage>249</fpage>
            <lpage>260</lpage>
            <xrefbib>
               <pubid idtype="pmpid">11398467</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B106">
            <title>
               <p>The interaction of Alba, a conserved archaeal chromatin protein, with Sir2 and its regulation by acetylation.</p>
            </title>
            <aug>
               <au>
                  <snm>Bell</snm>
                  <fnm>SD</fnm>
               </au>
               <au>
                  <snm>Botting</snm>
                  <fnm>CH</fnm>
               </au>
               <au>
                  <snm>Wardleworth</snm>
                  <fnm>BN</fnm>
               </au>
               <au>
                  <snm>Jackson</snm>
                  <fnm>SP</fnm>
               </au>
               <au>
                  <snm>White</snm>
                  <fnm>MF</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2002</pubdate>
            <volume>296</volume>
            <fpage>148</fpage>
            <lpage>151</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1070506</pubid>
                  <pubid idtype="pmpid" link="fulltext">11935028</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B107">
            <title>
               <p>Holding it together: chromatin in the Archaea.</p>
            </title>
            <aug>
               <au>
                  <snm>White</snm>
                  <fnm>MF</fnm>
               </au>
               <au>
                  <snm>Bell</snm>
                  <fnm>SD</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>2002</pubdate>
            <volume>18</volume>
            <fpage>621</fpage>
            <lpage>626</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0168-9525(02)02808-1</pubid>
                  <pubid idtype="pmpid" link="fulltext">12446147</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B108">
            <title>
               <p>Structure of Alba: an archaeal chromatin protein modulated by acetylation.</p>
            </title>
            <aug>
               <au>
                  <snm>Wardleworth</snm>
                  <fnm>BN</fnm>
               </au>
               <au>
                  <snm>Russell</snm>
                  <fnm>RJ</fnm>
               </au>
               <au>
                  <snm>Bell</snm>
                  <fnm>SD</fnm>
               </au>
               <au>
                  <snm>Taylor</snm>
                  <fnm>GL</fnm>
               </au>
               <au>
                  <snm>White</snm>
                  <fnm>MF</fnm>
               </au>
            </aug>
            <source>EMBO J</source>
            <pubdate>2002</pubdate>
            <volume>21</volume>
            <fpage>4654</fpage>
            <lpage>4662</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/emboj/cdf465</pubid>
                  <pubid idtype="pmpid" link="fulltext">12198167</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B109">
            <title>
               <p>Histone acetylation and deacetylation in yeast.</p>
            </title>
            <aug>
               <au>
                  <snm>Kurdistani</snm>
                  <fnm>SK</fnm>
               </au>
               <au>
                  <snm>Grunstein</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Nat Rev Mol Cell Biol</source>
            <pubdate>2003</pubdate>
            <volume>4</volume>
            <fpage>276</fpage>
            <lpage>284</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nrm1075</pubid>
                  <pubid idtype="pmpid" link="fulltext">12671650</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B110">
            <title>
               <p>Completeness in structural genomics.</p>
            </title>
            <aug>
               <au>
                  <snm>Vitkup</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Melamud</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Moult</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Sander</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>Nat Struct Biol</source>
            <pubdate>2001</pubdate>
            <volume>8</volume>
            <fpage>559</fpage>
            <lpage>566</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/88640</pubid>
                  <pubid idtype="pmpid" link="fulltext">11373627</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B111">
            <title>
               <p>A comparison of sequence and structure protein domain families as a basis for structural genomics.</p>
            </title>
            <aug>
               <au>
                  <snm>Elofsson</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Sonnhammer</snm>
                  <fnm>EL</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>1999</pubdate>
            <volume>15</volume>
            <fpage>480</fpage>
            <lpage>500</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/15.6.480</pubid>
                  <pubid idtype="pmpid" link="fulltext">10383473</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B112">
            <title>
               <p>Structural genomics: bioinformatics in the driver's seat.</p>
            </title>
            <aug>
               <au>
                  <snm>Gaasterland</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Nat Biotechnol</source>
            <pubdate>1998</pubdate>
            <volume>16</volume>
            <fpage>625</fpage>
            <lpage>627</lpage>
            <xrefbib>
               <pubid idtype="pmpid">9661193</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B113">
            <title>
               <p>Target selection for structural genomics.</p>
            </title>
            <aug>
               <au>
                  <snm>Brenner</snm>
                  <fnm>SE</fnm>
               </au>
            </aug>
            <source>Nat Struct Biol</source>
            <pubdate>2000</pubdate>
            <volume>7 Suppl</volume>
            <fpage>967</fpage>
            <lpage>969</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/80747</pubid>
                  <pubid idtype="pmpid" link="fulltext">11104002</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B114">
            <title>
               <p>Integrative database analysis in structural genomics.</p>
            </title>
            <aug>
               <au>
                  <snm>Gerstein</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Nat Struct Biol</source>
            <pubdate>2000</pubdate>
            <volume>7 Suppl</volume>
            <fpage>960</fpage>
            <lpage>963</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/80739</pubid>
                  <pubid idtype="pmpid" link="fulltext">11104000</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B115">
            <title>
               <p>Protein fold recognition using sequence profiles and its application in structural genomics.</p>
            </title>
            <aug>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
               <au>
                  <snm>Wolf</snm>
                  <fnm>YI</fnm>
               </au>
               <au>
                  <snm>Aravind</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>Adv Protein Chem</source>
            <pubdate>2000</pubdate>
            <volume>54</volume>
            <fpage>245</fpage>
            <lpage>275</lpage>
            <xrefbib>
               <pubid idtype="pmpid">10829230</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B116">
            <title>
               <p>Structural proteomics of an archaeon.</p>
            </title>
            <aug>
               <au>
                  <snm>Christendat</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Yee</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Dharamsi</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Kluger</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Savchenko</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Cort</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Booth</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Mackereth</snm>
                  <fnm>CD</fnm>
               </au>
               <au>
                  <snm>Saridakis</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Ekiel</snm>
                  <fnm>I</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nat Struct Biol</source>
            <pubdate>2000</pubdate>
            <volume>7</volume>
            <fpage>903</fpage>
            <lpage>909</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/82823</pubid>
                  <pubid idtype="pmpid" link="fulltext">11017201</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B117">
            <title>
               <p>Structure-based assignment of the biochemical function of a hypothetical protein: a test case of structural genomics.</p>
            </title>
            <aug>
               <au>
                  <snm>Zarembinski</snm>
                  <fnm>TI</fnm>
               </au>
               <au>
                  <snm>Hung</snm>
                  <fnm>LW</fnm>
               </au>
               <au>
                  <snm>Mueller-Dieckmann</snm>
                  <fnm>HJ</fnm>
               </au>
               <au>
                  <snm>Kim</snm>
                  <fnm>KK</fnm>
               </au>
               <au>
                  <snm>Yokota</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Kim</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Kim</snm>
                  <fnm>SH</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1998</pubdate>
            <volume>95</volume>
            <fpage>15189</fpage>
            <lpage>15193</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1073/pnas.95.26.15189</pubid>
                  <pubid idtype="pmpid" link="fulltext">9860944</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B118">
            <title>
               <p>Monophyly of class I aminoacyl tRNA synthetase, USPA, ETFP, photolyase, and PP-ATPase nucleotide-binding domains: implications for protein evolution in the RNA world.</p>
            </title>
            <aug>
               <au>
                  <snm>Aravind</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Anantharaman</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
            </aug>
            <source>Proteins</source>
            <pubdate>2002</pubdate>
            <volume>48</volume>
            <fpage>1</fpage>
            <lpage>14</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/prot.10064</pubid>
                  <pubid idtype="pmpid" link="fulltext">12012333</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B119">
            <title>
               <p>Translation: in retrospect and prospect.</p>
            </title>
            <aug>
               <au>
                  <snm>Woese</snm>
                  <fnm>CR</fnm>
               </au>
            </aug>
            <source>RNA</source>
            <pubdate>2001</pubdate>
            <volume>7</volume>
            <fpage>1055</fpage>
            <lpage>1067</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1017/S1355838201010615</pubid>
                  <pubid idtype="pmpid" link="fulltext">11497425</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B120">
            <title>
               <p>Prediction of transcription regulatory sites in Archaea by a comparative genomic approach.</p>
            </title>
            <aug>
               <au>
                  <snm>Gelfand</snm>
                  <fnm>MS</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
               <au>
                  <snm>Mironov</snm>
                  <fnm>AA</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2000</pubdate>
            <volume>28</volume>
            <fpage>695</fpage>
            <lpage>705</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/nar/28.3.695</pubid>
                  <pubid idtype="pmpid" link="fulltext">10637320</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B121">
            <title>
               <p>Conservation of the biotin regulon and the BirA regulatory signal in eubacteria and archaea.</p>
            </title>
            <aug>
               <au>
                  <snm>Rodionov</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Mironov</snm>
                  <fnm>AA</fnm>
               </au>
               <au>
                  <snm>Gelfand</snm>
                  <fnm>MS</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2002</pubdate>
            <volume>12</volume>
            <fpage>1507</fpage>
            <lpage>1516</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gr.314502</pubid>
                  <pubid idtype="pmpid" link="fulltext">12368242</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B122">
            <title>
               <p>Reconstructing/deconstructing the earliest eukaryotes: how comparative genomics can help.</p>
            </title>
            <aug>
               <au>
                  <snm>Dacks</snm>
                  <fnm>JB</fnm>
               </au>
               <au>
                  <snm>Doolittle</snm>
                  <fnm>WF</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>2001</pubdate>
            <volume>107</volume>
            <fpage>419</fpage>
            <lpage>425</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11719183</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B123">
            <title>
               <p>Phosphoesterase domains associated with DNA polymerases of diverse origins.</p>
            </title>
            <aug>
               <au>
                  <snm>Aravind</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1998</pubdate>
            <volume>26</volume>
            <fpage>3746</fpage>
            <lpage>3752</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/nar/26.16.3746</pubid>
                  <pubid idtype="pmpid" link="fulltext">9685491</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B124">
            <title>
               <p>The complete genome sequence of the hyperthermophilic, sulphate-reducing archaeon <it>Archaeoglobus fulgidus</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Klenk</snm>
                  <fnm>HP</fnm>
               </au>
               <au>
                  <snm>Clayton</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Tomb</snm>
                  <fnm>JF</fnm>
               </au>
               <au>
                  <snm>White</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Nelson</snm>
                  <fnm>KE</fnm>
               </au>
               <au>
                  <snm>Ketchum</snm>
                  <fnm>KA</fnm>
               </au>
               <au>
                  <snm>Dodson</snm>
                  <fnm>RJ</fnm>
               </au>
               <au>
                  <snm>Gwinn</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Hickey</snm>
                  <fnm>EK</fnm>
               </au>
               <au>
                  <snm>Peterson</snm>
                  <fnm>JD</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nature</source>
            <pubdate>1997</pubdate>
            <volume>390</volume>
            <fpage>364</fpage>
            <lpage>370</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/37052</pubid>
                  <pubid idtype="pmpid" link="fulltext">9389475</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B125">
            <title>
               <p>Genome sequence of <it>Halobacterium </it>species NRC-1.</p>
            </title>
            <aug>
               <au>
                  <snm>Ng</snm>
                  <fnm>WV</fnm>
               </au>
               <au>
                  <snm>Kennedy</snm>
                  <fnm>SP</fnm>
               </au>
               <au>
                  <snm>Mahairas</snm>
                  <fnm>GG</fnm>
               </au>
               <au>
                  <snm>Berquist</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Pan</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Shukla</snm>
                  <fnm>HD</fnm>
               </au>
               <au>
                  <snm>Lasky</snm>
                  <fnm>SR</fnm>
               </au>
               <au>
                  <snm>Baliga</snm>
                  <fnm>NS</fnm>
               </au>
               <au>
                  <snm>Thorsson</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Sbrogna</snm>
                  <fnm>J</fnm>
               </au>
               <etal/>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2000</pubdate>
            <volume>97</volume>
            <fpage>12176</fpage>
            <lpage>12181</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1073/pnas.190337797</pubid>
                  <pubid idtype="pmpid" link="fulltext">11016950</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B126">
            <title>
               <p>Complete genome sequence of <it>Methanobacterium thermoautotrophicum </it>deltaH: functional analysis and comparative genomics.</p>
            </title>
            <aug>
               <au>
                  <snm>Smith</snm>
                  <fnm>DR</fnm>
               </au>
               <au>
                  <snm>Doucette-Stamm</snm>
                  <fnm>LA</fnm>
               </au>
               <au>
                  <snm>Deloughery</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Lee</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Dubois</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Aldredge</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Bashirzadeh</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Blakely</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Cook</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Gilbert</snm>
                  <fnm>K</fnm>
               </au>
               <etal/>
            </aug>
            <source>J Bacteriol</source>
            <pubdate>1997</pubdate>
            <volume>179</volume>
            <fpage>7135</fpage>
            <lpage>7155</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">9371463</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B127">
            <title>
               <p>Complete sequence and gene organization of the genome of a hyper-thermophilic archaebacterium, <it>Pyrococcus horikoshii </it>OT3.</p>
            </title>
            <aug>
               <au>
                  <snm>Kawarabayasi</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Sawada</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Horikawa</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Haikawa</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Hino</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Yamamoto</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Sekine</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Baba</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kosugi</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Hosoyama</snm>
                  <fnm>A</fnm>
               </au>
               <etal/>
            </aug>
            <source>DNA Res</source>
            <pubdate>1998</pubdate>
            <volume>5</volume>
            <fpage>55</fpage>
            <lpage>76</lpage>
            <xrefbib>
               <pubid idtype="pmpid">9679194</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B128">
            <title>
               <p>An integrated analysis of the genome of the hyperthermophilic archaeon <it>Pyrococcus abyssi</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Cohen</snm>
                  <fnm>GN</fnm>
               </au>
               <au>
                  <snm>Barbe</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Flament</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Galperin</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Heilig</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Lecompte</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Poch</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Prieur</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Querellou</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Ripp</snm>
                  <fnm>R</fnm>
               </au>
               <etal/>
            </aug>
            <source>Mol Microbiol</source>
            <pubdate>2003</pubdate>
            <volume>47</volume>
            <fpage>1495</fpage>
            <lpage>1512</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1046/j.1365-2958.2003.03381.x</pubid>
                  <pubid idtype="pmpid" link="fulltext">12622808</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B129">
            <title>
               <p>Genomic sequence of hyperthermophile, <it>Pyrococcus furiosus</it>: implications for physiology and enzymology.</p>
            </title>
            <aug>
               <au>
                  <snm>Robb</snm>
                  <fnm>FT</fnm>
               </au>
               <au>
                  <snm>Maeder</snm>
                  <fnm>DL</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>DiRuggiero</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Stump</snm>
                  <fnm>MD</fnm>
               </au>
               <au>
                  <snm>Yeh</snm>
                  <fnm>RK</fnm>
               </au>
               <au>
                  <snm>Weiss</snm>
                  <fnm>RB</fnm>
               </au>
               <au>
                  <snm>Dunn</snm>
                  <fnm>DM</fnm>
               </au>
            </aug>
            <source>Methods Enzymol</source>
            <pubdate>2001</pubdate>
            <volume>330</volume>
            <fpage>134</fpage>
            <lpage>157</lpage>
            <xrefbib>
               <pubid idtype="pmpid">11210495</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B130">
            <title>
               <p>Archaeal adaptation to higher temperatures revealed by genomic sequence of <it>Thermoplasma volcanium</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Kawashima</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Amano</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Koike</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Makino</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Higuchi</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kawashima-Ohya</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Watanabe</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Yamazaki</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Kanehori</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Kawamoto</snm>
                  <fnm>T</fnm>
               </au>
               <etal/>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2000</pubdate>
            <volume>97</volume>
            <fpage>14257</fpage>
            <lpage>14262</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1073/pnas.97.26.14257</pubid>
                  <pubid idtype="pmpid" link="fulltext">11121031</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B131">
            <title>
               <p>Genome sequence of the hyperthermophilic crenarchaeon <it>Pyrobaculum aerophilum</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Fitz-Gibbon</snm>
                  <fnm>ST</fnm>
               </au>
               <au>
                  <snm>Ladner</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Kim</snm>
                  <fnm>UJ</fnm>
               </au>
               <au>
                  <snm>Stetter</snm>
                  <fnm>KO</fnm>
               </au>
               <au>
                  <snm>Simon</snm>
                  <fnm>MI</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>JH</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2002</pubdate>
            <volume>99</volume>
            <fpage>984</fpage>
            <lpage>989</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1073/pnas.241636498</pubid>
                  <pubid idtype="pmpid" link="fulltext">11792869</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B132">
            <title>
               <p>Complete genome sequence of an aerobic hyper-thermophilic crenarchaeon, <it>Aeropyrum pernix </it>K1.</p>
            </title>
            <aug>
               <au>
                  <snm>Kawarabayasi</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Hino</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Horikawa</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Yamazaki</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Haikawa</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Jin-no</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Takahashi</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Sekine</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Baba</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Ankai</snm>
                  <fnm>A</fnm>
               </au>
               <etal/>
            </aug>
            <source>DNA Res</source>
            <pubdate>1999</pubdate>
            <volume>6</volume>
            <fpage>83</fpage>
            <lpage>101</lpage>
            <xrefbib>
               <pubid idtype="pmpid">10382966</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B133">
            <title>
               <p>Complete genome sequence of an aerobic thermoacidophilic crenarchaeon, <it>Sulfolobus tokodaii </it>strain7.</p>
            </title>
            <aug>
               <au>
                  <snm>Kawarabayasi</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Hino</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Horikawa</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Jin-no</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Takahashi</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Sekine</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Baba</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Ankai</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Kosugi</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Hosoyama</snm>
                  <fnm>A</fnm>
               </au>
               <etal/>
            </aug>
            <source>DNA Res</source>
            <pubdate>2001</pubdate>
            <volume>8</volume>
            <fpage>123</fpage>
            <lpage>140</lpage>
            <xrefbib>
               <pubid idtype="pmpid">11572479</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B134">
            <title>
               <p><it>Methanobacterium thermoautotrophicum </it>encodes two multisubunit membrane-bound [NiFe] hydrogenases. Transcription of the operons and sequence analysis of the deduced proteins.</p>
            </title>
            <aug>
               <au>
                  <snm>Tersteegen</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Hedderich</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Eur J Biochem</source>
            <pubdate>1999</pubdate>
            <volume>264</volume>
            <fpage>930</fpage>
            <lpage>943</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1046/j.1432-1327.1999.00692.x</pubid>
                  <pubid idtype="pmpid" link="fulltext">10491142</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B135">
            <title>
               <p>Towards understanding the first genome sequence of a crenarchaeon by genome annotation using clusters of orthologous groups of proteins (COGs).</p>
            </title>
            <aug>
               <au>
                  <snm>Natale</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Shankavaram</snm>
                  <fnm>UT</fnm>
               </au>
               <au>
                  <snm>Galperin</snm>
                  <fnm>MY</fnm>
               </au>
               <au>
                  <snm>Wolf</snm>
                  <fnm>YI</fnm>
               </au>
               <au>
                  <snm>Aravind</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2000</pubdate>
            <volume>1</volume>
            <fpage>research0009.1</fpage>
            <lpage>0009.19</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1186/gb-2000-1-5-research0009</pubid>
                  <pubid idtype="pmpid" link="fulltext">11178258</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B136">
            <title>
               <p>Conserved domains in DNA repair proteins and evolution of repair systems.</p>
            </title>
            <aug>
               <au>
                  <snm>Aravind</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Walker</snm>
                  <fnm>DR</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1999</pubdate>
            <volume>27</volume>
            <fpage>1223</fpage>
            <lpage>1242</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/nar/27.5.1223</pubid>
                  <pubid idtype="pmpid" link="fulltext">9973609</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B137">
            <title>
               <p>Role of predicted metalloprotease motif of Jab1/Csn5 in cleavage of Nedd8 from Cul1.</p>
            </title>
            <aug>
               <au>
                  <snm>Cope</snm>
                  <fnm>GA</fnm>
               </au>
               <au>
                  <snm>Suh</snm>
                  <fnm>GS</fnm>
               </au>
               <au>
                  <snm>Aravind</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Schwarz</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Zipursky</snm>
                  <fnm>SL</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
               <au>
                  <snm>Deshaies</snm>
                  <fnm>RJ</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2002</pubdate>
            <volume>298</volume>
            <fpage>608</fpage>
            <lpage>611</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1075901</pubid>
                  <pubid idtype="pmpid" link="fulltext">12183637</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B138">
            <title>
               <p>Role of Rpn11 metalloprotease in deubiquitination and degradation by the 26S proteasome.</p>
            </title>
            <aug>
               <au>
                  <snm>Verma</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Aravind</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Oania</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>McDonald</snm>
                  <fnm>WH</fnm>
               </au>
               <au>
                  <snm>Yates</snm>
                  <fnm>JR</fnm>
                  <suf>3rd</suf>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
               <au>
                  <snm>Deshaies</snm>
                  <fnm>RJ</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2002</pubdate>
            <volume>298</volume>
            <fpage>611</fpage>
            <lpage>615</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1075898</pubid>
                  <pubid idtype="pmpid" link="fulltext">12183636</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B139">
            <title>
               <p>DNA polymerase beta-like nucleotidyltransferase superfamily: identification of three new families, classification and evolutionary history.</p>
            </title>
            <aug>
               <au>
                  <snm>Aravind</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1999</pubdate>
            <volume>27</volume>
            <fpage>1609</fpage>
            <lpage>1618</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/nar/27.7.1609</pubid>
                  <pubid idtype="pmpid" link="fulltext">10075991</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B140">
            <title>
               <p>Archaeal shikimate kinase, a new member of the GHMP-kinase family.</p>
            </title>
            <aug>
               <au>
                  <snm>Daugherty</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Vonstein</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Overbeek</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Osterman</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>J Bacteriol</source>
            <pubdate>2001</pubdate>
            <volume>183</volume>
            <fpage>292</fpage>
            <lpage>300</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1128/JB.183.1.292-300.2001</pubid>
                  <pubid idtype="pmpid" link="fulltext">11114929</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B141">
            <title>
               <p>Molecular characterization of phosphoglycerate mutase in archaea.</p>
            </title>
            <aug>
               <au>
                  <snm>van der Oost</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Huynen</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Verhees</snm>
                  <fnm>CH</fnm>
               </au>
            </aug>
            <source>FEMS Microbiol Lett</source>
            <pubdate>2002</pubdate>
            <volume>212</volume>
            <fpage>111</fpage>
            <lpage>120</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0378-1097(02)00720-6</pubid>
                  <pubid idtype="pmpid" link="fulltext">12076796</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B142">
            <title>
               <p>A novel candidate for the true fructose-1,6-bisphosphatase in archaea.</p>
            </title>
            <aug>
               <au>
                  <snm>Rashid</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Imanaka</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Kanai</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Fukui</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Atomi</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Imanaka</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>2002</pubdate>
            <volume>277</volume>
            <fpage>30649</fpage>
            <lpage>30655</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1074/jbc.M202868200</pubid>
                  <pubid idtype="pmpid" link="fulltext">12065581</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B143">
            <title>
               <p>NurA, a novel 5'-3' nuclease gene linked to rad50 and mre11 homologs of thermophilic Archaea.</p>
            </title>
            <aug>
               <au>
                  <snm>Constantinesco</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Forterre</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Elie</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>EMBO Rep</source>
            <pubdate>2002</pubdate>
            <volume>3</volume>
            <fpage>537</fpage>
            <lpage>542</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/embo-reports/kvf112</pubid>
                  <pubid idtype="pmpid" link="fulltext">12052775</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B144">
            <title>
               <p>Identification of a highly diverged class of S-adenosylmethionine synthetases in the archaea.</p>
            </title>
            <aug>
               <au>
                  <snm>Graham</snm>
                  <fnm>DE</fnm>
               </au>
               <au>
                  <snm>Bock</snm>
                  <fnm>CL</fnm>
               </au>
               <au>
                  <snm>Schalk-Hihi</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Lu</snm>
                  <fnm>ZJ</fnm>
               </au>
               <au>
                  <snm>Markham</snm>
                  <fnm>GD</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>2000</pubdate>
            <volume>275</volume>
            <fpage>4055</fpage>
            <lpage>4059</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1074/jbc.275.6.4055</pubid>
                  <pubid idtype="pmpid" link="fulltext">10660563</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B145">
            <title>
               <p><it>Methanococcus jannaschii</it> uses a pyruvoyl-dependent arginine decarboxylase in polyamine biosynthesis.</p>
            </title>
            <aug>
               <au>
                  <snm>Graham</snm>
                  <fnm>DE</fnm>
               </au>
               <au>
                  <snm>Xu</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>White</snm>
                  <fnm>RH</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>2002</pubdate>
            <volume>277</volume>
            <fpage>23500</fpage>
            <lpage>23507</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1074/jbc.M203467200</pubid>
                  <pubid idtype="pmpid" link="fulltext">11980912</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>
