<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2164-6-112</ui>
   <ji>1471-2164</ji>
   <fm>
      <dochead>Research article</dochead>
      <bibl>
         <title>
            <p>GH97 is a new family of glycoside hydrolases, which is related to the &#945;-galactosidase superfamily</p>
         </title>
         <aug>
            <au id="A1" ca="yes">
               <snm>Naumoff</snm>
               <mi>G</mi>
               <fnm>Daniil</fnm>
               <insr iid="I1"/>
               <email>daniil_naumoff@yahoo.com</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Laboratory of Bioinformatics, State Institute for Genetics and Selection of Industrial Microorganisms, I-Dorozhny proezd, 1, Moscow 117545, Russia</p>
            </ins>
         </insg>
         <source>BMC Genomics</source>
         <issn>1471-2164</issn>
         <pubdate>2005</pubdate>
         <volume>6</volume>
         <issue>1</issue>
         <fpage>112</fpage>
         <url>http://www.biomedcentral.com/1471-2164/6/112</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">16131397</pubid>
               <pubid idtype="doi">10.1186/1471-2164-6-112</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>21</day>
               <month>3</month>
               <year>2005</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>30</day>
               <month>8</month>
               <year>2005</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>30</day>
               <month>8</month>
               <year>2005</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2005</year>
         <collab>Naumoff; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>As a rule, about 1% of genes in a given genome encode glycoside hydrolases and their homologues. On the basis of sequence similarity, they have been grouped into more than ninety GH families during the last 15 years. The GH97 family was established very recently and initially included only 18 bacterial proteins. However, the evolutionary relationships of the genes encoding proteins of this family, as well as their distribution among the main groups of living organisms, remain unclear.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>An extensive search of the current databases allowed us to double the number of known GH97 family proteins. Five subfamilies were distinguished on the basis of pairwise sequence comparison and phylogenetic analysis. Iterative sequence analysis revealed a relationship of the GH97 family with the GH27, GH31, and GH36 families of glycosidases, which belong to the &#945;-galactosidase superfamily, as well as a more distant relationship with some other glycosidase families (GH13 and GH20).</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>The results of this study show an unexpected sequence similarity of GH97 family proteins with glycoside hydrolases from several other families that have a (&#946;/&#945;)<sub>8</sub>-barrel fold of the catalytic domain and a retaining mechanism of glycosidic bond hydrolysis. These data suggest a common evolutionary origin of glycosidases representing different families and clans.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>On the basis of sequence similarity, glycoside hydrolases (or glycosidases, EC 3.2.1.-) have been grouped into 96 families (GH1-GH100, except GH21, GH40, GH41, and GH60) in the Carbohydrate-Active Enzymes (CAZy) classification <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr></abbrgrp>. In the case of multi-domain proteins, each catalytic domain is considered separately. A family was initially defined as a group of at least two sequences displaying significant amino acid similarity to each other and no significant similarity to other families <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. Later, some related families of glycosidases were combined into clans <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr></abbrgrp>. By definition, a clan is a group of families that are thought to have a common ancestry and are recognized by significant similarities in tertiary structure together with conservation of the catalytic residues and the catalytic mechanism <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>. Glycosidases catalyze hydrolysis of the glycosidic bond of their substrates via two general mechanisms, leading to either inversion or overall retention of the anomeric configuration at the cleavage point <abbrgrp><abbr bid="B4">4</abbr><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr></abbrgrp>. Currently, 14 clans (GH-A-GH-N) are described, and in total they contain 46 families <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>. Families of four clans (GH-A, GH-D, GH-H, and GH-K), as well as several other families that have not been assigned to any clan, contain proteins with a similar (&#946;/&#945;)<sub>8</sub>-barrel fold of the catalytic domain <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>. Several glycosidases that do not have any known homologues are included in a group of non-classified glycoside hydrolases <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr></abbrgrp>. 
In several instances, proteins from this group have been reclassified into new families when their homologues were found <abbrgrp><abbr bid="B7">7</abbr></abbrgrp>.</p>
         <p>Two different clans have never been merged in the CAZy classification <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>, even after significant similarity between them has been established. Instead, it was proposed that related clans (and families) whose proteins show statistically significant sequence similarity be grouped into superfamilies at a higher hierarchical level. For example, we have described the furanosidase (&#946;-fructosidase) superfamily, which includes clans GH-F (inverting glycosidases) and GH-J (retaining glycosidases), as well as the GHLP (COG2152) family of enzymatically uncharacterized proteins <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr></abbrgrp>.</p>
         <p>Some families are now very large. For example, the GH13 family (clan GH-H) includes more than 2,000 representatives <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>. This large and poly-specific group of enzymes has been studied by many authors <abbrgrp><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr><abbr bid="B14">14</abbr><abbr bid="B15">15</abbr><abbr bid="B16">16</abbr><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr></abbrgrp>. In particular, it was shown that splitting this family into smaller subfamilies made it possible to clarify the relationships among its members <abbrgrp><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr></abbrgrp>.</p>
         <p>The majority of known glycosidases with &#945;-galactosidase activity [EC 3.2.1.22] belong to families GH27 and GH36, which form clan GH-D <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B20">20</abbr></abbrgrp>. This clan and family GH31 compose the &#945;-galactosidase superfamily <abbrgrp><abbr bid="B21">21</abbr><abbr bid="B22">22</abbr><abbr bid="B23">23</abbr><abbr bid="B24">24</abbr></abbrgrp>. This superfamily has a distant relationship with clan GH-H <abbrgrp><abbr bid="B25">25</abbr><abbr bid="B26">26</abbr></abbrgrp>, which we have proposed to name the &#945;-glucosidase superfamily <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>. Proteins of both superfamilies share the same enzymatic mechanism (retention) and a similar (&#946;/&#945;)<sub>8</sub>-barrel fold of the catalytic domain <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>, and use only substrates with the axial orientation of the glycosidic bond <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>.</p>
         <p>The Gram-negative obligate anaerobe <it>Bacteroides thetaiotaomicron </it>ATCC29148 is a commensal bacterium found in the human colon, where it ferments a wide variety of polysaccharides <abbrgrp><abbr bid="B27">27</abbr><abbr bid="B28">28</abbr></abbrgrp>. Its <ul>s</ul>tarch <ul>u</ul>tilization <ul>s</ul>ystem (sus) has been studied in detail <abbrgrp><abbr bid="B29">29</abbr><abbr bid="B30">30</abbr><abbr bid="B31">31</abbr><abbr bid="B32">32</abbr><abbr bid="B33">33</abbr><abbr bid="B34">34</abbr><abbr bid="B35">35</abbr></abbrgrp>. One of the corresponding loci (Figure <figr fid="F1">1</figr>) includes the divergently oriented regulatory gene <it>susR </it>and seven structural genes <it>susA-susG </it><abbrgrp><abbr bid="B30">30</abbr><abbr bid="B31">31</abbr><abbr bid="B32">32</abbr><abbr bid="B33">33</abbr><abbr bid="B34">34</abbr></abbrgrp>. Genes <it>susC-susF </it>encode outer membrane proteins involved in starch binding. Glycosidases SusA (a neopullulanase, EC 3.2.1.135) and SusG (an &#945;-amylase, EC 3.2.1.1) are members of family GH13 <abbrgrp><abbr bid="B29">29</abbr><abbr bid="B30">30</abbr><abbr bid="B31">31</abbr><abbr bid="B32">32</abbr></abbrgrp>. SusB is an unusual &#945;-glucosidase [EC 3.2.1.20] that for a long time was considered a unique glycosidase with no homologues <abbrgrp><abbr bid="B29">29</abbr><abbr bid="B30">30</abbr></abbrgrp>. Therefore, it was included in the group of non-classified glycoside hydrolases <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>. We have found a group of its homologues among hypothetical proteins encoded by open reading frames (ORFs) that were recently sequenced within several prokaryotic genome projects. We referred to this group of proteins as the GHX family <abbrgrp><abbr bid="B23">23</abbr><abbr bid="B24">24</abbr></abbrgrp>. In June 2004, 18 members of this family were recognized in the CAZy classification as the GH97 family of glycoside hydrolases. 
Currently (June 2005), family GH97 includes two SusB &#945;-glucosidases from the closely related bacteria <it>B. thetaiotaomicron </it>ATCC29148 and <it>Tannerella forsythensis </it>(<it>Bacteroides forsythus</it>) ATCC43037, as well as 22 hypothetical proteins encoded by ORFs <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>.</p>
         <fig id="F1">
            <title>
               <p>Figure 1</p>
            </title>
            <caption>
               <p>Structure of <it>Bacteroides thetaiotaomicron </it>ATCC29148 genome fragment containing gene clusters for starch and hemicellulose utilization</p>
            </caption>
            <text>
               <p><b>Structure of <it>Bacteroides thetaiotaomicron </it>ATCC29148 genome fragment containing gene clusters for starch and hemicellulose utilization</b>. Arrows indicate the direction of gene transcription. Red arrows correspond to glycosidase (GH) and glycosyltransferase (GT) genes; the family to which each belongs is indicated. Yellow arrows correspond to genes encoding outer membrane proteins involved in starch binding (<it>susC-susF</it>) and their homologues. Green arrows correspond to genes of the transcriptional activator SusR and predicted transcriptional regulators homologous to AraC.</p>
            </text>
            <graphic file="1471-2164-6-112-1"/>
         </fig>
         <p>In this work, we updated the GH97 family of glycosidases, performed a phylogenetic analysis of it, and established its evolutionary relationship with several other glycosidase families.</p>
      </sec>
      <sec>
         <st>
            <p>Results and discussion</p>
         </st>
         <sec>
            <st>
               <p>Collecting sequences of family GH97</p>
            </st>
            <p>A PSI-BLAST search of the non-redundant database with the <it>Bacteroides thetaiotaomicron </it>&#945;-glucosidase SusB (97A1_BACTH, see Table <tblr tid="T1">I</tblr>) as a query sequence yielded 32 protein sequences in the first round, the worst (largest) <it>E</it>-value being 2 &#215; 10<sup>-20</sup>. They comprise 10 paralogous proteins from <it>B. thetaiotaomicron </it>ATCC29148 and 22 homologues from other species. These 32 proteins include all 24 members of the GH97 family listed on the CAZy server <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>. Genomic BLAST revealed 13 additional homologous sequences. Based on the sequence similarity, we propose to enlarge the GH97 family by including all known homologues of SusB. As a result, this family currently includes 45 proteins. The majority of them represent Eubacteria (16 different species). Three other sequences correspond to Archaea (<it>Haloarcula marismortui</it>) and two uncultured bacteria. Four sequences are annotated in the NCBI database as eukaryotic (<it>Anopheles gambiae</it>) genome fragments. Only five of the 45 protein sequences (from <it>Anopheles </it>and an uncultured bacterium) are short fragments (Table <tblr tid="T1">I</tblr>).</p>
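            <p>The sequence counts given in the preceding paragraph can be cross-checked with a short tally. The following is only an illustrative sketch in Python and is not part of the original analysis: the variable names are assumptions introduced here, the input numbers are those stated in the text, and the number of proteins not yet listed by CAZy is a value derived from them.</p>
            <p>
```python
# Tally of GH97 homologues, using only the counts stated in the text.
# Variable names are illustrative assumptions, not taken from the study.

first_round_paralogues = 10  # B. thetaiotaomicron ATCC29148 proteins
first_round_homologues = 22  # first-round homologues from other species
genomic_blast_extra = 13     # additional sequences found by genomic BLAST
cazy_listed = 24             # GH97 members listed on the CAZy server

# PSI-BLAST first-round hits (paralogues plus homologues)
first_round_hits = first_round_paralogues + first_round_homologues

# Proposed size of the enlarged GH97 family
total_family = first_round_hits + genomic_blast_extra

# Derived value: proteins in the enlarged family not yet listed by CAZy
not_in_cazy = total_family - cazy_listed

print(first_round_hits, total_family, not_in_cazy)  # prints: 32 45 21
```
            </p>
            <p>The totals of 32 first-round hits and 45 family members agree with the figures reported above.</p>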
            <tbl id="T1">
               <title>
                  <p>Table I</p>
               </title>
               <caption>
                  <p>Glycoside hydrolases analyzed in the work</p>
               </caption>
               <tblbdy cols="6">
                  <r>
                     <c ca="left">
                        <p>Name</p>
                     </c>
                     <c ca="left">
                        <p>Family, subfamily</p>
                     </c>
                     <c ca="left">
                        <p>Organism</p>
                     </c>
                     <c ca="left">
                        <p>Accession number<sup>a</sup></p>
                     </c>
                     <c ca="left">
                        <p>Protein function (annotation)</p>
                     </c>
                     <c ca="left">
                        <p>Length<sup>b</sup></p>
                     </c>
                  </r>
                  <r>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97A1_BACTH</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97a</p>
                     </c>
                     <c ca="left">
                        <p><it>Bacteroides thetaiotaomicron </it>VPI-5482 = ATCC29148</p>
                     </c>
                     <c ca="left">
                        <p>
                          AAC44671
                        </p>
                     </c>
                     <c ca="left">
                        <p>alpha-glucosidase SusB</p>
                     </c>
                     <c ca="left">
                        <p>738</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97A2_BACTH</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97a</p>
                     </c>
                     <c ca="left">
                        <p><it>Bacteroides thetaiotaomicron </it>VPI-5482 = ATCC29148</p>
                     </c>
                     <c ca="left">
                        <p>
                           AAO79686
                        </p>
                     </c>
                     <c ca="left">
                        <p>ORF: alpha-glucosidase</p>
                     </c>
                     <c ca="left">
                        <p>719</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97A3_BACTH</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97a</p>
                     </c>
                     <c ca="left">
                        <p><it>Bacteroides thetaiotaomicron </it>VPI-5482 = ATCC29148</p>
                     </c>
                     <c ca="left">
                        <p>
                           AAO75790
                        </p>
                     </c>
                     <c ca="left">
                        <p>ORF: alpha-glucosidase</p>
                     </c>
                     <c ca="left">
                        <p>671</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97B1_BACTH</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97b</p>
                     </c>
                     <c ca="left">
                        <p><it>Bacteroides thetaiotaomicron </it>VPI-5482 = ATCC29148</p>
                     </c>
                     <c ca="left">
                        <p>
                          AAO76978
                        </p>
                     </c>
                     <c ca="left">
                        <p>ORF: putative alpha-glucosidase</p>
                     </c>
                     <c ca="left">
                        <p>662</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97B2_BACTH</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97b</p>
                     </c>
                     <c ca="left">
                        <p><it>Bacteroides thetaiotaomicron </it>VPI-5482 = ATCC29148</p>
                     </c>
                     <c ca="left">
                        <p>
                           AAO78400
                        </p>
                     </c>
                     <c ca="left">
                        <p>ORF: putative alpha-glucosidase</p>
                     </c>
                     <c ca="left">
                        <p>650</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97B3_BACTH</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97b</p>
                     </c>
                     <c ca="left">
                        <p><it>Bacteroides thetaiotaomicron </it>VPI-5482 = ATCC29148</p>
                     </c>
                     <c ca="left">
                        <p>
                           AAO77727
                        </p>
                     </c>
                     <c ca="left">
                        <p>ORF: alpha-glucosidase</p>
                     </c>
                     <c ca="left">
                        <p>649</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97B4_BACTH</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97b</p>
                     </c>
                     <c ca="left">
                        <p><it>Bacteroides thetaiotaomicron </it>VPI-5482 = ATCC29148</p>
                     </c>
                     <c ca="left">
                        <p>
                          AAO78269
                        </p>
                     </c>
                     <c ca="left">
                        <p>ORF: putative alpha-glucosidase</p>
                     </c>
                     <c ca="left">
                        <p>674</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97C1_BACTH</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97c</p>
                     </c>
                     <c ca="left">
                        <p><it>Bacteroides thetaiotaomicron </it>VPI-5482 = ATCC29148</p>
                     </c>
                     <c ca="left">
                        <p>
                           AAO78766
                        </p>
                     </c>
                     <c ca="left">
                        <p>ORF: alpha-glucosidase</p>
                     </c>
                     <c ca="left">
                        <p>647</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97C2_BACTH</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97c</p>
                     </c>
                     <c ca="left">
                        <p><it>Bacteroides thetaiotaomicron </it>VPI-5482 = ATCC29148</p>
                     </c>
                     <c ca="left">
                        <p>
                           AAO78769
                        </p>
                     </c>
                     <c ca="left">
                        <p>ORF: putative alpha-glucosidase</p>
                     </c>
                     <c ca="left">
                        <p>638</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97E1_BACTH</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97e</p>
                     </c>
                     <c ca="left">
                        <p><it>Bacteroides thetaiotaomicron </it>VPI-5482 = ATCC29148</p>
                     </c>
                     <c ca="left">
                        <p>
                           AAO75239
                        </p>
                     </c>
                     <c ca="left">
                        <p>ORF: putative alpha-glucosidase</p>
                     </c>
                     <c ca="left">
                        <p>644</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97A1_BACFR</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97a</p>
                     </c>
                     <c ca="left">
                        <p><it>Bacteroides fragilis </it>YCH46</p>
                     </c>
                     <c ca="left">
                        <p>
                           BAD47941
                        </p>
                     </c>
                     <c ca="left">
                        <p>ORF: alpha-glucosidase</p>
                     </c>
                     <c ca="left">
                        <p>719</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97A2_BACFR</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97a</p>
                     </c>
                     <c ca="left">
                        <p><it>Bacteroides fragilis </it>YCH46</p>
                     </c>
                     <c ca="left">
                        <p>
                           BAD48072
                        </p>
                     </c>
                     <c ca="left">
                        <p>ORF: alpha-glucosidase</p>
                     </c>
                     <c ca="left">
                        <p>671</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97B1_BACFR</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97b</p>
                     </c>
                     <c ca="left">
                        <p><it>Bacteroides fragilis </it>YCH46</p>
                     </c>
                     <c ca="left">
                        <p>
                           BAD50730
                        </p>
                     </c>
                     <c ca="left">
                        <p>ORF: putative alpha-glucosidase</p>
                     </c>
                     <c ca="left">
                        <p>649</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97B2_BACFR</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97b</p>
                     </c>
                     <c ca="left">
                        <p><it>Bacteroides fragilis </it>YCH46</p>
                     </c>
                     <c ca="left">
                        <p>
                           BAD50235
                        </p>
                     </c>
                     <c ca="left">
                        <p>ORF: putative alpha-glucosidase</p>
                     </c>
                     <c ca="left">
                        <p>649</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97A1_TANFO</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97a</p>
                     </c>
                     <c ca="left">
                        <p><it>Tannerella forsythensis </it>(<it>Bacteroides forsythus</it>) ATCC43037</p>
                     </c>
                     <c ca="left">
                        <p>AAO33827</p>
                     </c>
                     <c ca="left">
                        <p>alpha-D-glucosidase SusB</p>
                     </c>
                     <c ca="left">
                        <p>708</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97A1_PREIN</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97a</p>
                     </c>
                     <c ca="left">
                        <p><it>Prevotella intermedia </it>17</p>
                     </c>
                     <c ca="left">
                        <p>(TIGR_246198)</p>
                     </c>
                     <c ca="left">
                        <p>ORF</p>
                     </c>
                     <c ca="left">
                        <p>733</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97A1_PRERU</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97a</p>
                     </c>
                     <c ca="left">
                        <p><it>Prevotella ruminicola </it>23</p>
                     </c>
                     <c ca="left">
                        <p>(TIGR_264731)</p>
                     </c>
                     <c ca="left">
                        <p>ORF</p>
                     </c>
                     <c ca="left">
                        <p>737</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97B1_PRERU</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97b</p>
                     </c>
                     <c ca="left">
                        <p><it>Prevotella ruminicola </it>23</p>
                     </c>
                     <c ca="left">
                        <p>(TIGR_264731)</p>
                     </c>
                     <c ca="left">
                        <p>ORF</p>
                     </c>
                     <c ca="left">
                        <p>645</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97B2_PRERU</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97b</p>
                     </c>
                     <c ca="left">
                        <p><it>Prevotella ruminicola </it>23</p>
                     </c>
                     <c ca="left">
                        <p>(TIGR_264731)</p>
                     </c>
                     <c ca="left">
                        <p>ORF</p>
                     </c>
                     <c ca="left">
                        <p>658</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97C1_PRERU</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97c</p>
                     </c>
                     <c ca="left">
                        <p><it>Prevotella ruminicola </it>23</p>
                     </c>
                     <c ca="left">
                        <p>(TIGR_264731)</p>
                     </c>
                     <c ca="left">
                        <p>ORF</p>
                     </c>
                     <c ca="left">
                        <p>621</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97C2_PRERU</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97c</p>
                     </c>
                     <c ca="left">
                        <p><it>Prevotella ruminicola </it>23</p>
                     </c>
                     <c ca="left">
                        <p>(TIGR_264731)</p>
                     </c>
                     <c ca="left">
                        <p>ORF</p>
                     </c>
                     <c ca="left">
                        <p>639</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97C3_PRERU</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97c</p>
                     </c>
                     <c ca="left">
                        <p><it>Prevotella ruminicola </it>23</p>
                     </c>
                     <c ca="left">
                        <p>(TIGR_264731)</p>
                     </c>
                     <c ca="left">
                        <p>ORF</p>
                     </c>
                     <c ca="left">
                        <p>645</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97A1_SALRU</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97a</p>
                     </c>
                     <c ca="left">
                        <p><it>Salinibacter ruber </it>DSM13855</p>
                     </c>
                     <c ca="left">
                        <p>(NC_006812)</p>
                     </c>
                     <c ca="left">
                        <p>ORF</p>
                     </c>
                     <c ca="left">
                        <p>708</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97A1_AZOVI</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97a</p>
                     </c>
                     <c ca="left">
                        <p><it>Azotobacter vinelandii </it>AvOP</p>
                     </c>
                     <c ca="left">
                        <p>
                           EAM07225
                        </p>
                     </c>
                     <c ca="left">
                        <p>ORF: alpha-glucosidase</p>
                     </c>
                     <c ca="left">
                        <p>673</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97A1_XANAX</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97a</p>
                     </c>
                     <c ca="left">
                        <p><it>Xanthomonas axonopodis </it>pv. citri 306</p>
                     </c>
                     <c ca="left">
                        <p>
                           AAM37448
                        </p>
                     </c>
                     <c ca="left">
                        <p>ORF: alpha-glucosidase</p>
                     </c>
                     <c ca="left">
                        <p>693</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97D1_XANAX</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97d</p>
                     </c>
                     <c ca="left">
                        <p><it>Xanthomonas axonopodis </it>pv. citri 306</p>
                     </c>
                     <c ca="left">
                        <p>
                           AAM38156
                        </p>
                     </c>
                     <c ca="left">
                        <p>ORF: alpha-glucosidase</p>
                     </c>
                     <c ca="left">
                        <p>654</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97A1_XANCA</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97a</p>
                     </c>
                     <c ca="left">
                        <p><it>Xanthomonas campestris </it>pv. campestris ATCC33913</p>
                     </c>
                     <c ca="left">
                        <p>
                           AAM41744
                        </p>
                     </c>
                     <c ca="left">
                        <p>ORF: alpha-glucosidase</p>
                     </c>
                     <c ca="left">
                        <p>692</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97D1_XANCA</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97d</p>
                     </c>
                     <c ca="left">
                        <p><it>Xanthomonas campestris </it>pv. campestris ATCC33913</p>
                     </c>
                     <c ca="left">
                        <p>
                           AAM42433
                        </p>
                     </c>
                     <c ca="left">
                        <p>ORF: alpha-glucosidase</p>
                     </c>
                     <c ca="left">
                        <p>654</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97A1_MICDE</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97a</p>
                     </c>
                     <c ca="left">
                        <p><it>Microbulbifer </it>(<it>Saccharophagus</it>) <it>degradans </it>2&#8211;40</p>
                     </c>
                     <c ca="left">
                        <p>
                           ZP_00315606
                        </p>
                     </c>
                     <c ca="left">
                        <p>ORF: hypothetical protein</p>
                     </c>
                     <c ca="left">
                        <p>684</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97B1_MICDE</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97b</p>
                     </c>
                     <c ca="left">
                        <p><it>Microbulbifer </it>(<it>Saccharophagus</it>) <it>degradans </it>2&#8211;40</p>
                     </c>
                     <c ca="left">
                        <p>
                           ZP_00317369
                        </p>
                     </c>
                     <c ca="left">
                        <p>ORF: hypothetical protein</p>
                     </c>
                     <c ca="left">
                        <p>679</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97C1_MICDE</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97c</p>
                     </c>
                     <c ca="left">
                        <p><it>Microbulbifer </it>(<it>Saccharophagus</it>) <it>degradans </it>2&#8211;40</p>
                     </c>
                     <c ca="left">
                        <p>
                           ZP_00317507
                        </p>
                     </c>
                     <c ca="left">
                        <p>ORF: hypothetical protein</p>
                     </c>
                     <c ca="left">
                        <p>674</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97C2_MICDE</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97c</p>
                     </c>
                     <c ca="left">
                        <p><it>Microbulbifer </it>(<it>Saccharophagus</it>) <it>degradans </it>2&#8211;40</p>
                     </c>
                     <c ca="left">
                        <p>
                           ZP_00315142
                        </p>
                     </c>
                     <c ca="left">
                        <p>ORF: hypothetical protein</p>
                     </c>
                     <c ca="left">
                        <p>661</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97A1_SHEON</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97a</p>
                     </c>
                     <c ca="left">
                        <p><it>Shewanella oneidensis </it>MR-1</p>
                     </c>
                     <c ca="left">
                        <p>
                           AAN55484
                        </p>
                     </c>
                     <c ca="left">
                        <p>ORF: alpha-glucosidase</p>
                     </c>
                     <c ca="left">
                        <p>699</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97A1_SHEBA</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97a</p>
                     </c>
                     <c ca="left">
                        <p><it>Shewanella baltica </it>OS155</p>
                     </c>
                     <c ca="left">
                        <p>
                           EAN43632
                        </p>
                     </c>
                     <c ca="left">
                        <p>ORF: alpha-glucosidase</p>
                     </c>
                     <c ca="left">
                        <p>710</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97A1_SHEFR</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97a</p>
                     </c>
                     <c ca="left">
                        <p><it>Shewanella frigidimarina </it>NCIMB400</p>
                     </c>
                     <c ca="left">
                        <p>
                           EAN73178
                        </p>
                     </c>
                     <c ca="left">
                        <p>ORF: alpha-glucosidase</p>
                     </c>
                     <c ca="left">
                        <p>697</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97A1_SHEDE</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97a</p>
                     </c>
                     <c ca="left">
                        <p><it>Shewanella denitrificans </it>OS-217</p>
                     </c>
                     <c ca="left">
                        <p>
                           EAN70289
                        </p>
                     </c>
                     <c ca="left">
                        <p>ORF: alpha-glucosidase</p>
                     </c>
                     <c ca="left">
                        <p>727</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97A1_SHEAM</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97a</p>
                     </c>
                     <c ca="left">
                        <p><it>Shewanella amazonensis </it>SB2B</p>
                     </c>
                     <c ca="left">
                        <p>
                           EAN38820
                       </p>
                     </c>
                     <c ca="left">
                        <p>ORF: alpha-glucosidase</p>
                     </c>
                     <c ca="left">
                        <p>676</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97A1_NOVAR</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97a</p>
                     </c>
                     <c ca="left">
                        <p><it>Novosphingobium aromaticivorans </it>DSM12444</p>
                     </c>
                     <c ca="left">
                        <p>
                           ZP_00303588
                        </p>
                     </c>
                     <c ca="left">
                        <p>ORF: transketolase</p>
                     </c>
                     <c ca="left">
                        <p>682</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97A1_SPHAL</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97a</p>
                     </c>
                     <c ca="left">
                        <p><it>Sphingopyxis alaskensis </it>RB2256</p>
                     </c>
                     <c ca="left">
                        <p>
                           EAN45679
                        </p>
                     </c>
                     <c ca="left">
                        <p>ORF: alpha-glucosidase</p>
                     </c>
                     <c ca="left">
                        <p>680</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97D1_CAUCR</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97d</p>
                     </c>
                     <c ca="left">
                        <p><it>Caulobacter crescentus </it>CB15</p>
                     </c>
                     <c ca="left">
                        <p>
                           AAK22781
                        </p>
                     </c>
                     <c ca="left">
                        <p>ORF: putative alpha-glucosidase</p>
                     </c>
                     <c ca="left">
                        <p>670</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97A1_ERYLI</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97a</p>
                     </c>
                     <c ca="left">
                        <p><it>Erythrobacter litoralis </it>HTCC2594</p>
                     </c>
                     <c ca="left">
                        <p>
                           EAL74063
                        </p>
                     </c>
                     <c ca="left">
                        <p>ORF: alpha-glucosidase</p>
                     </c>
                     <c ca="left">
                        <p>681</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97E1_RHOBA</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97e</p>
                     </c>
                     <c ca="left">
                        <p><it>Rhodopirellula baltica </it>SH1 (<it>Pirellula </it>sp. 1)</p>
                     </c>
                     <c ca="left">
                        <p>
                           CAD78916
                        </p>
                     </c>
                     <c ca="left">
                        <p>ORF: alpha-glucosidase</p>
                     </c>
                     <c ca="left">
                        <p>645</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97C1_LEIXY</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97c</p>
                     </c>
                     <c ca="left">
                        <p><it>Leifsonia xyli </it>subsp. xyli CTCB07</p>
                     </c>
                     <c ca="left">
                        <p>(NC_006087)*</p>
                     </c>
                     <c ca="left">
                        <p>ORF: similar to alpha-glucosidase</p>
                     </c>
                     <c ca="left">
                        <p>775*</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97X1_SOLUS</p>
                     </c>
                     <c ca="left">
                        <p>GH97</p>
                     </c>
                     <c ca="left">
                        <p><it>Solibacter usitatus </it>Ellin6076</p>
                     </c>
                     <c ca="left">
                        <p>
                           EAM58489
                        </p>
                     </c>
                     <c ca="left">
                        <p>ORF: hypothetical protein</p>
                     </c>
                     <c ca="left">
                        <p>619</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97A1_HALMA</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97a</p>
                     </c>
                     <c ca="left">
                        <p><it>Haloarcula marismortui </it>ATCC43049</p>
                     </c>
                     <c ca="left">
                        <p>AAV45265</p>
                     </c>
                     <c ca="left">
                        <p>ORF: alpha-glucosidase</p>
                     </c>
                     <c ca="left">
                        <p>1144</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97A1_ANOGA</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97a</p>
                     </c>
                     <c ca="left">
                        <p><it>Anopheles gambiae </it>str. PEST (African malaria mosquito)</p>
                     </c>
                     <c ca="left">
                        <p>(AAAB01006165)</p>
                     </c>
                     <c ca="left">
                        <p>ORF</p>
                     </c>
                     <c ca="left">
                        <p>380*</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97A2_ANOGA</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97a</p>
                     </c>
                     <c ca="left">
                        <p><it>Anopheles gambiae </it>str. PEST (African malaria mosquito)</p>
                     </c>
                     <c ca="left">
                        <p>(AAAB01064948)</p>
                     </c>
                     <c ca="left">
                        <p>ORF</p>
                     </c>
                     <c ca="left">
                        <p>209*</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97A3_ANOGA</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97a</p>
                     </c>
                     <c ca="left">
                        <p><it>Anopheles gambiae </it>str. PEST (African malaria mosquito)</p>
                     </c>
                     <c ca="left">
                        <p>(AAAB01020110)</p>
                     </c>
                     <c ca="left">
                        <p>ORF</p>
                     </c>
                     <c ca="left">
                        <p>231*</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97A4_ANOGA</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97a</p>
                     </c>
                     <c ca="left">
                        <p><it>Anopheles gambiae </it>str. PEST (African malaria mosquito)</p>
                     </c>
                     <c ca="left">
                        <p>(AAAB01068263)</p>
                     </c>
                     <c ca="left">
                        <p>ORF</p>
                     </c>
                     <c ca="left">
                        <p>229*</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97A1_UNBAC</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97a</p>
                     </c>
                     <c ca="left">
                        <p>uncultured murine large bowel bacterium BAC31B</p>
                     </c>
                     <c ca="left">
                        <p>
                           AAX16382
                        </p>
                     </c>
                     <c ca="left">
                        <p>ORF: alpha-glucosidase</p>
                     </c>
                     <c ca="left">
                        <p>720</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97A2_UNBAC</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97a</p>
                     </c>
                     <c ca="left">
                        <p>uncultured bacterium</p>
                     </c>
                     <c ca="left">
                        <p>(AY350337)</p>
                     </c>
                     <c ca="left">
                        <p>ORF</p>
                     </c>
                     <c ca="left">
                        <p>106*</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97A1_ENSEQ</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97a</p>
                     </c>
                     <c ca="left">
                        <p>environmental sequence (cf. <it>Shewanella </it>SAR-1)</p>
                     </c>
                     <c ca="left">
                        <p>EAJ06144*</p>
                     </c>
                     <c ca="left">
                        <p>ORF: unknown</p>
                     </c>
                     <c ca="left">
                        <p>703</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97A2_ENSEQ</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97a</p>
                     </c>
                     <c ca="left">
                        <p>environmental sequence (cf. <it>Shewanella </it>SAR-2)</p>
                     </c>
                     <c ca="left">
                        <p>
                           EAI69763
                        </p>
                     </c>
                     <c ca="left">
                        <p>ORF: unknown</p>
                     </c>
                     <c ca="left">
                        <p>699</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97A3_ENSEQ</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97a</p>
                     </c>
                     <c ca="left">
                        <p>environmental sequence</p>
                     </c>
                     <c ca="left">
                        <p>
                           EAJ75652
                        </p>
                     </c>
                     <c ca="left">
                        <p>ORF: unknown</p>
                     </c>
                     <c ca="left">
                        <p>714</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97A4_ENSEQ</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97a</p>
                     </c>
                     <c ca="left">
                        <p>environmental sequence</p>
                     </c>
                     <c ca="left">
                        <p>
                           EAI51202
                        </p>
                     </c>
                     <c ca="left">
                        <p>ORF: unknown</p>
                     </c>
                     <c ca="left">
                        <p>713</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97A5_ENSEQ</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97a</p>
                     </c>
                     <c ca="left">
                        <p>environmental sequence</p>
                     </c>
                     <c ca="left">
                        <p>
                           EAI80962
                        </p>
                     </c>
                     <c ca="left">
                        <p>ORF: unknown</p>
                     </c>
                     <c ca="left">
                        <p>702*</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97A6_ENSEQ</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97a</p>
                     </c>
                     <c ca="left">
                        <p>environmental sequence</p>
                     </c>
                     <c ca="left">
                        <p>EAH92811, EAI03708, EAD44407, EAG79875, EAH92819, EAI36772</p>
                     </c>
                     <c ca="left">
                        <p>ORF: unknown</p>
                     </c>
                     <c ca="left">
                        <p>711</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97A7_ENSEQ</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97a</p>
                     </c>
                     <c ca="left">
                        <p>environmental sequence</p>
                     </c>
                     <c ca="left">
                        <p>EAJ99185, EAD99255, EAH48404, EAH57728, EAD83763, EAH04981, EAC91563, EAH85977, EAD11728</p>
                     </c>
                     <c ca="left">
                        <p>ORF: unknown</p>
                     </c>
                     <c ca="left">
                        <p>710</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97A8_ENSEQ</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97a</p>
                     </c>
                     <c ca="left">
                        <p>environmental sequence</p>
                     </c>
                     <c ca="left">
                        <p>EAJ85380, EAH86891</p>
                     </c>
                     <c ca="left">
                        <p>ORF: unknown</p>
                     </c>
                     <c ca="left">
                        <p>669*</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>97C1_ENSEQ</p>
                     </c>
                     <c ca="left">
                        <p>GH97, 97c</p>
                     </c>
                     <c ca="left">
                        <p>environmental sequence</p>
                     </c>
                     <c ca="left">
                        <p>EAD85224*</p>
                     </c>
                     <c ca="left">
                        <p>ORF: unknown</p>
                     </c>
                     <c ca="left">
                        <p>218*</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>GH27_ORYSA</p>
                     </c>
                     <c ca="left">
                        <p>GH27, 27a</p>
                     </c>
                     <c ca="left">
                        <p><it>Oryza sativa </it>japonica cultivar Nipponbare (rice)</p>
                     </c>
                     <c ca="left">
                        <p>
                           BAB12570
                        </p>
                     </c>
                     <c ca="left">
                        <p>alpha-galactosidase</p>
                     </c>
                     <c ca="left">
                        <p>417</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>GH36_LACPL</p>
                     </c>
                     <c ca="left">
                        <p>GH36, 36A</p>
                     </c>
                     <c ca="left">
                        <p><it>Lactobacillus plantarum </it>ATCC8014</p>
                     </c>
                     <c ca="left">
                        <p>
                           AAF02774
                        </p>
                     </c>
                     <c ca="left">
                        <p>alpha-galactosidase MelA</p>
                     </c>
                     <c ca="left">
                        <p>738</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>GH31_ECOLI</p>
                     </c>
                     <c ca="left">
                        <p>GH31</p>
                     </c>
                     <c ca="left">
                        <p><it>Escherichia coli </it>K12</p>
                     </c>
                     <c ca="left">
                        <p>
                           AAC76680
                        </p>
                     </c>
                     <c ca="left">
                        <p>alpha-xylosidase YicI</p>
                     </c>
                     <c ca="left">
                        <p>772</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p><sup>a</sup>Accession numbers of protein sequences are given according to the NCBI database [72]. Accession numbers of nucleotide sequences are given (in parentheses) if the corresponding protein sequences have not been deposited. In some cases (asterisked), protein sequences were edited by changing the start codon.</p>
                  <p><sup>b</sup>Protein length was established as the number of amino acids in the corresponding precursor. Incomplete sequences (protein fragments) are asterisked.</p>
               </tblfn>
            </tbl>
            <p>PSI-BLAST searches using a few randomly selected divergent representatives of the GH97 family as first-round query sequences always yielded the same 32 protein sequences as the search with 97A1_BACTH. Analysis of the order in which sequences appeared during the first round of PSI-BLAST searches, depending on the query, allowed us to distinguish five subfamilies (97a&#8211;97e) in the GH97 family, each with at least two known members (Table <tblr tid="T1">I</tblr>). The pairwise alignments obtained were used to generate the multiple sequence alignment of family GH97. The most conserved parts of the alignment are shown in Figure <figr fid="F2">2</figr>.</p>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>Portion of the multiple sequence alignment of the sequences analyzed</p>
               </caption>
               <text>
                  <p><b>Portion of the multiple sequence alignment of the sequences analyzed</b>. The ten-letter name of each sequence is indicated in the leftmost column (for the origin of the sequences see Table I). The alignment continuously spans three panels. Distances to the N- and C-termini and the lengths of omitted fragments are indicated. Highly conserved residues are highlighted in the sequences. Amino acid positions that are highly conserved within several subfamilies but differ between subfamilies are coloured. Subfamily membership of the sequences (for family GH97) is indicated in the rightmost column. Amino acid residues interacting with the substrate in the active center of GH27 and GH31 family glycosidases are indicated by arrows at the bottom [50-54]. The arrow on the gray background corresponds to the Asp residue that plays the role of the nucleophile in glycosidases of families GH27 and GH31. Red asterisks over and under the alignment indicate three conserved positions (in red) probably corresponding to the nucleophile and proton donor in the glycosidases of family GH97 (see text). The alignment of GH27_ORYSA and GH31_ECOLI is structure-based. At the bottom of the figure, &#946;-strands and &#945;-helices of the (&#946;/&#945;)<sub>8</sub>-barrel are indicated. The first part of the barrel (&#946;1&#8211;&#946;4) is shown according to the known structures of GH27 and GH31 family members [51, 54]. The second part of the barrel (&#945;4&#8211;&#945;8) is based on a generalization of predictions for several GH97 family proteins by the 3D-PSSM, GOR IV, and nnpredict programs.</p>
               </text>
               <graphic file="1471-2164-6-112-2"/>
            </fig>
            <p>The fragment of the <it>Leifsonia xyli </it>CTCB07 genome [GenBank: NC_006087] revealed by Genomic BLAST has 2 stop codons in the region homologous to genes of GH97 family proteins. An analysis of the nucleic acid sequence allowed us to detect a frame shift (data not shown). The corrected ORF encodes a protein sequence (97C1_LEIXY) showing significant sequence similarity with the other members of family GH97 along its whole length (Figure <figr fid="F2">2</figr>). However, it was impossible to determine the very beginning of the protein sequence, including the start codon. This protein is a divergent representative of the GH97 family and it could not be classified into any subfamily on the basis of pairwise sequence comparison. 97C1_LEIXY and its closest homologue 97D1_CAUCR (<it>E</it>-value = 2 &#215; 10<sup>-54</sup>) share only 30% sequence identity.</p>
            <p>A short gene fragment [GenBank: AY350337] from an uncultured bacterium was revealed by Genomic BLAST. It had been obtained and sequenced during PCR screening of human gut microflora <abbrgrp><abbr bid="B36">36</abbr></abbrgrp>. The deduced protein sequence (97A2_UNBAC) corresponds to the C-terminal part of the other GH97 family proteins and has the highest similarity with 97A1_BACTH (63% sequence identity) and 97A1_TANFO (60%). This allows us to include this protein fragment in subfamily 97a (Table <tblr tid="T1">I</tblr>).</p>
            <p>A PSI-BLAST search of the non-redundant protein database yielded a unique eukaryotic protein fragment [GenPept: EAL42226] homologous to GH97 family proteins. Screening of the database of eukaryotic nucleic acid sequences uncovered the corresponding DNA sequence [GenBank: AAAB01006165], as well as three other short sequences [GenBank: AAAB01064948, AAAB01020110, and AAAB01068263]. All of them had been sequenced during the mosquito <it>Anopheles gambiae </it>genome project <abbrgrp><abbr bid="B37">37</abbr></abbrgrp>. These 4 sequences were aligned to identify overlapping regions. The AAAB01064948 sequence is homologous to the central part of the AAAB01006165 sequence, with 54% identity at the protein level. The two ends of the AAAB01020110 sequence are homologous to one end of the AAAB01006165 and AAAB01068263 sequences, respectively (65% and 69% sequence identity at the protein level). Thus, these 4 sequences correspond to at least two different genes. In total, they cover a complete bacterial gene encoding a protein of family GH97. Taking into account i) the high similarity of the 4 deduced protein sequences with bacterial proteins (50&#8211;71% identity with 97A1_BACFR, 97A2_BACTH, 97A1_TANFO, and 97A1_BACTH), ii) the intron-free gene structure, iii) the inability to map the genes on the mosquito chromosomes, and iv) the absence of GH97 family proteins in any other eukaryotic organism, we suggest a bacterial origin for these four gene fragments. They could have resulted from contamination of the <it>Anopheles gambiae </it>tissue used for preparing the genome library by mosquito <it>Bacteroides</it>-like gut microflora. Evidence for this kind of contamination was obtained when 35,575 clones from an <it>A. gambiae </it>cDNA library were tested <abbrgrp><abbr bid="B38">38</abbr></abbrgrp>. It was found that at least 808 sequences appeared to be bacterial contaminants.</p>
            <p>To enlarge the database of family GH97, we screened the so-called "Environmental Samples data" <abbrgrp><abbr bid="B39">39</abbr></abbrgrp>. This revealed 60 nucleic acid sequences from the Sargasso Sea that are homologous to genes of GH97 family proteins. However, the majority of them encode only short protein fragments and many of them have a very high level of sequence similarity. Among them we found only 5 full-size or almost complete genes (each encoding a protein of more than 650 amino acid residues). Three additional "gene" sequences were obtained by combining overlapping gene fragments with almost identical sequences (at least 95% sequence identity at the protein level). The hypothetical proteins (97A1_ENSEQ-97A8_ENSEQ) encoded by these 8 genes should be placed in the 97a subfamily on the basis of sequence similarity (Table <tblr tid="T1">I</tblr>). Moreover, the majority of the incomplete genes encode protein fragments belonging to the same subfamily. Only four [GenPept: EAE76000, EAE67019, EAH16525, and EAH96685] and two [GenPept: EAE21375 and EAG68085] protein fragments correspond to subfamilies 97b and 97c, respectively. One short fragment (137 amino acids; [GenPept: EAD85224]) cannot be unambiguously classified into any subfamily of the GH97 family. An analysis of the nucleic acid sequence encoding the latter protein fragment [GenBank: <ext-link ext-link-type="gen" ext-link-id="AACY01501371">AACY01501371</ext-link>] allowed us to extend the protein fragment by using another start codon. The resulting protein sequence (97C1_ENSEQ; 218 amino acids) shows similarity with the sequences of the other members of family GH97 along its whole length. However, it was still impossible to assign this protein fragment to any subfamily on the basis of pairwise sequence comparison.</p>
         </sec>
         <sec>
            <st>
               <p>Phylogenetic analysis of family GH97</p>
            </st>
            <p>To check the actual relationships of proteins within the GH97 family, we performed a phylogenetic analysis using the obtained multiple sequence alignment. It is well known that phylogeny is the best basis for verifying the subfamily structure of a protein family. In many studies in which the composition of a glycosidase family was analyzed, monophyletic status was used as the main argument for describing a subfamily. Among others <abbrgrp><abbr bid="B40">40</abbr><abbr bid="B41">41</abbr><abbr bid="B42">42</abbr><abbr bid="B43">43</abbr><abbr bid="B44">44</abbr></abbrgrp>, this method has been applied to the GH13 <abbrgrp><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr></abbrgrp>, GH27 <abbrgrp><abbr bid="B23">23</abbr><abbr bid="B24">24</abbr></abbrgrp>, and GH36 <abbrgrp><abbr bid="B24">24</abbr></abbrgrp> families of glycoside hydrolases.</p>
            <p>To verify our subdivision of the GH97 family into subfamilies, we checked the clustering of the family members in the phylogenetic tree. The maximum parsimony (MP; Figure <figr fid="F3">3A</figr>) and the neighbor-joining (NJ; Figure <figr fid="F3">3B</figr>) trees have very similar topologies, suggesting that the evolutionary events have been interpreted correctly. When any subfamily of the GH97 family was considered as an outgroup, both the MP and NJ trees showed that all other subfamilies form monophyletic groups with high bootstrap values (at least 95.4% support in both trees). It should be noted that no pair of subfamilies forms neighboring clusters in both trees with significant bootstrap support. This suggests approximately the same evolutionary distance between each pair of subfamilies.</p>
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>Phylogenetic trees of family GH97</p>
               </caption>
               <text>
                  <p><b>Phylogenetic trees of family GH97</b>. The trees were reconstructed with the PHYLIP package. Each node was tested using the bootstrap approach, and the number of supporting pseudoreplicates (out of 1000) is indicated for each internal node. Subfamily membership of the sequences is indicated; the bootstrap support value for each subfamily is highlighted in yellow. Red arrows indicate the enzymatically-characterized proteins 97A1_BACTH and 97A1_TANFO (see text). The origin of the sequences is given in Table <tblr tid="T1">I</tblr>. (A) The maximum parsimony phylogenetic tree. The bootstrap values were determined using the maximum parsimony (PROTPARS) method. (B) The neighbor-joining phylogenetic tree. The number of amino acid substitutions per site is taken as a measure of branch length.</p>
               </text>
               <graphic file="1471-2164-6-112-3"/>
            </fig>
            <p>The archaeal protein 97A1_HALMA is a clear outlier in the cluster of subfamily 97a in the MP and NJ trees (Figure <figr fid="F3">3</figr>). The other members of this subfamily form several subclusters that include representatives of either the Bacteroidetes or the Proteobacteria phylum.</p>
            <p>The unclassified protein 97C1_LEIXY is the closest neighbor of the subfamily 97c cluster in the MP and NJ trees (Figure <figr fid="F3">3</figr>) and therefore it can be considered a divergent representative of this subfamily (Table <tblr tid="T1">I</tblr>). Phylogenetic analysis of the 97C1_ENSEQ protein fragment (data not shown) allowed us to place it into the same subfamily 97c.</p>
            <p>An analysis of the GH97 family multiple sequence alignment revealed a number of amino acid positions that are highly conserved within several subfamilies but differ between subfamilies (Figure <figr fid="F2">2</figr>). Taken together, these signature sequence positions make it possible to predict the subfamily membership of a protein sequence.</p>
         </sec>
         <sec>
            <st>
               <p>Relationship of family GH97 with some other glycosidase families</p>
            </st>
            <p>Depending on the GH97 query and the statistical significance threshold of the <it>E</it>-value, we detected, as a rule, statistically significant similarities with &#945;-galactosidases during the second or third PSI-BLAST iteration. They represent families GH27 and GH36 of clan GH-D (the &#945;-galactosidase superfamily). More distant similarities were found with glycosidases of family GH31 (the &#945;-galactosidase superfamily) and in some cases with enzymatically-uncharacterized proteins from COG0535. COG0535 has been annotated as a family of predicted Fe-S oxidoreductases, like the closest COG0641 <abbrgrp><abbr bid="B45">45</abbr></abbrgrp>. Our BLAST searches show that both COG families are related to the radical SAM superfamily of Fe-S enzymes <abbrgrp><abbr bid="B46">46</abbr></abbrgrp>, which have a (&#946;/&#945;)<sub>8</sub>-barrel fold [PDB: <ext-link ext-link-type="pdb" ext-link-id="1R30">1R30</ext-link>].</p>
            <p>When we used some representatives of subfamily 97a (for example, 97A1_BACTH) as a query and an <it>E</it>-value cut-off of 0.01, it was possible to reveal statistically significant similarity with glycosidases of family GH20 (clan GH-K). A similarity with proteins of this family was detected after the second PSI-BLAST iteration, while the next one or two iterations revealed a distant relationship with members of COG0296 (family GH13 of clan GH-H). It should be noted that glycosidases from the clans GH-D, GH-H, and GH-K have a similar (&#946;/&#945;)<sub>8</sub>-barrel fold of their catalytic domain and the same molecular mechanism of the hydrolysis reaction <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>. Thus, our results agree with the data of several authors <abbrgrp><abbr bid="B20">20</abbr><abbr bid="B25">25</abbr><abbr bid="B47">47</abbr><abbr bid="B48">48</abbr><abbr bid="B49">49</abbr></abbrgrp> showing the relationship of glycosidases from the GH13, GH27, GH31, and GH36 families. A more detailed analysis of these families and their relationship was done by Rigden <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>.</p>
            <p>Using the &#945;-galactosidases from rice (GH27_ORYSA, family GH27) and <it>Lactobacillus plantarum </it>(GH36_LACPL, family GH36) as a query sequence for PSI-BLAST searches we found their homology with some representatives of the GH97 family (for example, 97B1_BACFR and 97B2_BACTH) after two or three iterations. However, a statistically significant sequence similarity of GH97 family proteins with &#945;-galactosidases is restricted to a fragment of about 100&#8211;150 amino acid residues (Figure <figr fid="F2">2</figr>). This fragment corresponds to the N-terminal half of the catalytic (&#946;/&#945;)<sub>8</sub>-barrel domain of glycosidases from the &#945;-galactosidase superfamily <abbrgrp><abbr bid="B50">50</abbr><abbr bid="B51">51</abbr><abbr bid="B52">52</abbr><abbr bid="B53">53</abbr><abbr bid="B54">54</abbr></abbrgrp>. This half of the domain is known to be more conserved than the C-terminal half <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>. Therefore, we can assume that the catalytic domain of the GH97 family proteins also has a similar (&#946;/&#945;)<sub>8</sub>-barrel fold.</p>
            <p>In order to check whether the whole (&#946;/&#945;)<sub>8</sub>-barrel domain is present in GH97 family proteins, we tried to reconstruct their secondary and tertiary structure. The SWISS-MODEL program failed to unambiguously predict the type of the tertiary structure. The 3D-PSSM, GOR IV, and nnpredict programs were used for prediction of the protein secondary structure. The results obtained suggest that the central part of the GH97 family protein sequences represents a typical and complete (&#946;/&#945;)<sub>8</sub>-barrel domain (Figure <figr fid="F2">2</figr>). The N- and C-terminal parts of the sequences, mainly consisting of &#946;-strands, most probably form two additional non-catalytic domains with an unknown function. However, different programs produce contradictory results regarding the number and exact location of the &#946;-strands (data not shown). The non-catalytic domains of glycosidases from the &#945;-galactosidase and &#945;-glucosidase superfamilies are also predominantly composed of &#946;-strands. At least some of these domains are involved in oligomerization and carbohydrate binding <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B54">54</abbr></abbrgrp>.</p>
            <p>3D-PSSM searches of the PDB database with several GH97 family proteins used as the query sequence yielded the highest level of similarity with the GH27 family glycosidases [PDB: <ext-link ext-link-type="pdb" ext-link-id="1KTB">1KTB</ext-link>, <ext-link ext-link-type="pdb" ext-link-id="1BFM">1BFM</ext-link>, <ext-link ext-link-type="pdb" ext-link-id="1R46">1R46</ext-link>, and <ext-link ext-link-type="pdb" ext-link-id="1UAS">1UAS</ext-link>]. Among the other best hits we found representatives of several other (&#946;/&#945;)<sub>8</sub>-barrel fold glycoside hydrolase families: GH2 (clan GH-A), GH5 (GH-A), GH13 (GH-H), GH17 (GH-A), GH18 (GH-K), and GH20 (GH-K), as well as some other enzymes with the (&#946;/&#945;)<sub>8</sub>-barrel fold, for example the <it>Bacillus subtilis </it>inositol utilization protein IolI [PDB: <ext-link ext-link-type="pdb" ext-link-id="1I6N">1I6N</ext-link>]. These results agree with the hypothesis of a common origin of all (&#946;/&#945;)<sub>8</sub>-barrel protein domains, which evolved from an ancestral (&#946;/&#945;)<sub>4 </sub>half-barrel by tandem gene duplication followed by fusion <abbrgrp><abbr bid="B55">55</abbr><abbr bid="B56">56</abbr><abbr bid="B57">57</abbr><abbr bid="B58">58</abbr><abbr bid="B59">59</abbr><abbr bid="B60">60</abbr></abbrgrp>.</p>
            <p>In all known glycosidases with the (&#946;/&#945;)<sub>8</sub>-barrel fold, the amino acid residues involved in the active center are located at the C-termini of the &#946;-strands <abbrgrp><abbr bid="B61">61</abbr></abbrgrp>; a similar location of the active site was found in many other (&#946;/&#945;)<sub>8</sub>-barrel fold enzymes <abbrgrp><abbr bid="B60">60</abbr></abbrgrp>. It is well known that two acidic groups (Asp and/or Glu) are almost always involved in the glycosidase active center, playing the roles of nucleophile and proton donor <abbrgrp><abbr bid="B4">4</abbr><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr></abbrgrp>. Their sequence location has been determined for several representatives of the GH27 and GH31 families <abbrgrp><abbr bid="B54">54</abbr><abbr bid="B62">62</abbr><abbr bid="B63">63</abbr><abbr bid="B64">64</abbr><abbr bid="B65">65</abbr><abbr bid="B66">66</abbr><abbr bid="B67">67</abbr><abbr bid="B68">68</abbr><abbr bid="B69">69</abbr></abbrgrp>.</p>
            <p>The Asp residue that plays the role of nucleophile is located at the C-terminus of the fourth &#946;-strand of the barrel. This residue is highly conserved among proteins of the &#945;-galactosidase superfamily <abbrgrp><abbr bid="B23">23</abbr><abbr bid="B26">26</abbr></abbrgrp>. The homologous residue in the GH97 family proteins is more variable, being Asp in all members of three subfamilies (97b, 97c, and 97d) and Gly in the other proteins (subfamilies 97a and 97e), including 97A1_BACTH and 97A1_TANFO (Figure <figr fid="F2">2</figr>). Since these two proteins display &#945;-glucosidase activity <abbrgrp><abbr bid="B29">29</abbr><abbr bid="B30">30</abbr><abbr bid="B70">70</abbr></abbrgrp>, we can conclude that a residue located at another site plays the role of nucleophile in at least some proteins of the GH97 family. It should be noted that we have found a residue at the C-terminus of the fifth &#946;-strand in GH97 family sequences that is Gly in the 97b, 97c, and 97d subfamilies, but Glu and Asp in subfamilies 97a and 97e, respectively (Figure <figr fid="F2">2</figr>). Therefore, this residue can be suggested as a possible nucleophile in glycosidases of the 97a and 97e subfamilies. As a rule, the catalytically essential residues are highly conserved among enzymatically active members of a glycoside hydrolase family, being either Asp or Glu. The distance between the carboxylic groups of the nucleophile and the proton donor should be similar in order to preserve the catalytic machinery. Thus, the difference in the predicted nucleophile residue between the 97a and 97e subfamilies is unexpected. However, this does not exclude the existence of glycosidase activity in proteins with an Asp residue at the fifth &#946;-strand (subfamily 97e). For example, in the GH32 family the Asp residue was experimentally shown to be the nucleophile, while several proteins of this family have a Glu residue at the homologous position and at least some of them are catalytically active <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr></abbrgrp>.</p>
            <p>The proton donor of families GH27 and GH31 is located at the C-terminus of the sixth &#946;-strand of the (&#946;/&#945;)<sub>8</sub>-barrel domain. It lies outside the N-terminal half of the barrel, which can be unambiguously aligned with proteins of the GH97 family. However, at the C-terminus of the sixth &#946;-strand of the predicted (&#946;/&#945;)<sub>8</sub>-barrel of the GH97 family there is an Asp residue that is highly conserved in all subfamilies of the family (Figure <figr fid="F2">2</figr>). We suggest this residue as a possible proton donor. Given the different structure of the active center and the fact that significant sequence similarity covers only half of the catalytic domain, the current data do not support the inclusion of the GH97 family in the &#945;-galactosidase superfamily.</p>
            <p>As far as we know, 97A1_BACTH and 97A1_TANFO are the only enzymatically-characterized proteins in the GH97 family <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>. All other members of this family have been found recently during genome projects and are encoded by ORFs. Genes of this family are represented only in a limited number of Eubacteria from the phyla Actinobacteria (1 genus), Bacteroidetes (4 genera), Planctomycetes (1 genus), and Proteobacteria (3 and 4 genera from the &#945;- and &#947;-classes, respectively), as well as in a single archaeon (<it>Haloarcula marismortui</it>). However, many of these bacteria have several paralogous genes. The most interesting case is that of <it>B. thetaiotaomicron </it>ATCC29148, which has the &#945;-glucosidase SusB (97A1_BACTH) and 9 putative paralogues representing four GH97 subfamilies (Table <tblr tid="T1">I</tblr>); at least two of the paralogues (97C1_BACTH and 97C2_BACTH) are also expressed <it>in vivo </it><abbrgrp><abbr bid="B28">28</abbr></abbrgrp>. This human commensal microorganism is known as the bacterium with the highest number of glycosidase and glycosyltransferase genes <abbrgrp><abbr bid="B27">27</abbr><abbr bid="B71">71</abbr></abbrgrp>. Taken together, these facts suggest that the evolution of GH97 family proteins has been associated with multiple duplications, gene elimination, and horizontal transfer.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>The results of the sequence analysis allow us to distinguish five subfamilies in the GH97 family of glycoside hydrolases. Experimental data on enzymatic activity are available for only two representatives of the GH97 family: the &#945;-glucosidases 97A1_BACTH and 97A1_TANFO <abbrgrp><abbr bid="B29">29</abbr><abbr bid="B30">30</abbr><abbr bid="B70">70</abbr></abbrgrp>. However, we suppose that the other members of this family may also possess some glycosidase activities. Our data suggest that proteins of this family have a common evolutionary origin with glycosidases of the &#945;-galactosidase superfamily. Many genes encoding proteins of the GH97 family are located in clusters with genes of glycoside hydrolases and other carbohydrate-active enzymes. For example, 97C1_BACTH and 97C2_BACTH (subfamily 97c) are encoded by genes of <it>B. thetaiotaomicron </it>located at a hemicellulose utilization locus together with eight other glycosidase genes (Figure <figr fid="F1">1</figr>). Taken together, these data support a recent suggestion to consider family GH97 (or GHX) as a new family of glycoside hydrolases <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B24">24</abbr></abbrgrp>. The evolutionary relationship of GH97 proteins with glycosidases of the GH-D, GH-H, and GH-K (and probably GH-A) clans allows us to extrapolate their most important common characteristics to glycoside hydrolases of the GH97 family. We can predict a similar (&#946;/&#945;)<sub>8</sub>-barrel fold of the catalytic domain and a retaining mechanism of glycosidic bond hydrolysis for glycosidases of the GH97 family.</p>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <p>Protein and nucleic acid sequences were retrieved from the NCBI database <abbrgrp><abbr bid="B72">72</abbr></abbrgrp>. All proteins analyzed in this work were designated by a ten-letter name (see Table <tblr tid="T1">I</tblr>). The search for homologous proteins was done using PSI-BLAST <abbrgrp><abbr bid="B73">73</abbr></abbrgrp> and Genomic BLAST at the NCBI server. The statistical significance threshold (<it>E</it>-value) for including a sequence in the model used by PSI-BLAST in the next iteration was either 10<sup>-2 </sup>or 10<sup>-3</sup>; BLOSUM45 was used as the substitution matrix. The multiple sequence alignment was prepared manually using the program BioEdit <abbrgrp><abbr bid="B74">74</abbr></abbrgrp> on the basis of BLAST pairwise alignments.</p>
         <p>The multiple sequence alignment was used as input for classical phylogenetic inference programs, using either maximum parsimony or distance methods. The programs PROTPARS and NEIGHBOR from the PHYLIP package (version 3.6; <abbrgrp><abbr bid="B75">75</abbr></abbrgrp>) were used. In addition, the programs SEQBOOT, PROTPARS, and CONSENSE and the programs SEQBOOT, PROTDIST, NEIGHBOR, and CONSENSE were used successively to derive confidence limits, estimated from 1000 bootstrap replicates, for each node in the maximum parsimony and distance trees, respectively. The program TreeView Win32 (version 1.6.6; <abbrgrp><abbr bid="B76">76</abbr></abbrgrp>) was used for drawing the trees.</p>
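         <p>The resampling step behind the bootstrap estimates can be illustrated with a minimal Python sketch. It is not the PHYLIP implementation: the function names, the toy alignment, and the fixed random seed are assumptions made purely for illustration. Each pseudoreplicate is built by drawing alignment columns with replacement until the original alignment length is reached, which is the general idea of non-parametric bootstrapping of an alignment.</p>

```python
import random


def bootstrap_alignment(aligned, rng):
    """Return one pseudoreplicate: alignment columns drawn with replacement."""
    names = list(aligned)
    length = len(aligned[names[0]])
    if any(len(aligned[name]) != length for name in names):
        raise ValueError("all aligned sequences must have the same length")
    # Draw as many column indices as the alignment has columns.
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(aligned[name][c] for c in columns) for name in names}


def bootstrap_replicates(aligned, n_replicates=1000, seed=0):
    """Generate n_replicates pseudoreplicate alignments (seed is illustrative)."""
    rng = random.Random(seed)
    return [bootstrap_alignment(aligned, rng) for _ in range(n_replicates)]


# Hypothetical toy alignment; names and residues are made up.
toy_alignment = {
    "seq1": "ACDEFG",
    "seq2": "ACDQFG",
    "seq3": "TCDEF-",
}
replicates = bootstrap_replicates(toy_alignment, n_replicates=1000, seed=1)
```

         <p>In a full analysis, a tree would be inferred from every pseudoreplicate and the support for a node would be the number of replicate trees that contain it.</p>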
         <p>An analysis of the order in which sequences are displayed during PSI-BLAST searches <abbrgrp><abbr bid="B73">73</abbr></abbrgrp> was used for a preliminary division of a family into subfamilies. A subfamily was defined as a group of proteins that are displayed at the top of the list in PSI-BLAST query results. Depending on the particular criteria of protein similarity used, the algorithm can split a family into a larger or smaller number of groups of proteins. As in some of our previous works <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B23">23</abbr><abbr bid="B24">24</abbr><abbr bid="B77">77</abbr></abbrgrp>, in this study we define a subfamily as a group of proteins that have at least 30% sequence identity. Phylogenetic analysis was used to verify the obtained subfamilies and to clarify their boundaries. Monophyletic status was used as the criterion for the final definition of a subfamily.</p>
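         <p>The 30% identity criterion can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the procedure actually used here: identity is computed over ungapped columns of an existing alignment, and proteins are grouped by single linkage, so a protein joins a group if it reaches the threshold with at least one member. The function names and toy sequences are hypothetical.</p>

```python
from itertools import combinations


def percent_identity(a, b):
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def group_by_identity(aligned, threshold=30.0):
    """Single-linkage grouping: join sequences whose identity >= threshold."""
    parent = {name: name for name in aligned}

    def find(name):
        # Follow parent links to the group representative (with path halving).
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for n1, n2 in combinations(aligned, 2):
        if percent_identity(aligned[n1], aligned[n2]) >= threshold:
            parent[find(n1)] = find(n2)

    groups = {}
    for name in aligned:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical toy alignment; names and residues are made up.
toy_alignment = {
    "seqA": "ACDEFGHIKL",
    "seqB": "ACDEFGHIKV",
    "seqC": "WWWWWWWWWW",
}
groups = group_by_identity(toy_alignment, threshold=30.0)
```

         <p>Groups obtained this way are only preliminary; as described above, the final subfamily boundaries rest on monophyly in the phylogenetic trees.</p>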
         <p>The SWISS-MODEL modeling server <abbrgrp><abbr bid="B78">78</abbr></abbrgrp> was used to predict the tertiary structure of proteins from their amino acid sequences. The 3D-PSSM <abbrgrp><abbr bid="B79">79</abbr></abbrgrp>, GOR IV <abbrgrp><abbr bid="B80">80</abbr></abbrgrp>, and nnpredict <abbrgrp><abbr bid="B81">81</abbr></abbrgrp> programs were used to predict protein secondary structure. The 3D-PSSM program was also used to search the PDB database.</p>
      </sec>
      <sec>
         <st>
            <p>Added in proof</p>
         </st>
         <p>After submission of the manuscript, six new sequences of GH97 family proteins have been deposited at the NCBI database. Five of them (97A1_SHEBA, 97A1_SHEFR, 97A1_SHEDE, 97A1_SHEAM, and 97A1_SPHAL) belong to subfamily 97a (Table <tblr tid="T1">I</tblr>). The sixth protein 97X1_SOLUS cannot be unambiguously classified into any subfamily of the GH97 family on the basis of pairwise sequence comparison, composition of the signature sequence positions, and phylogenetic analysis. Most probably it corresponds to a new subfamily.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>I am grateful to Dr. Bernard Labedan (Universit&#233; de Paris-Sud, France) for critical reading of an earlier version of the manuscript and a helpful discussion of the problem.</p>
            <p>This work was supported by grants of the Russian President for young scientists (MK-118.2003.04 and MK-1461.2005.4).</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>A classification of glycosyl hydrolases based on amino acid sequence similarities</p>
            </title>
            <aug>
               <au>
                  <snm>Henrissat</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Biochem J</source>
            <pubdate>1991</pubdate>
            <volume>280</volume>
            <fpage>309</fpage>
            <lpage>316</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1130547</pubid>
                  <pubid idtype="pmpid">1747104</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Carbohydrate-Active Enzymes server</p>
            </title>
            <aug>
               <au>
                  <snm>Coutinho</snm>
                  <fnm>PM</fnm>
               </au>
               <au>
                  <snm>Henrissat</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <url>http://afmb.cnrs-mrs.fr/CAZY/</url>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Updating the sequence-based classification of glycosyl hydrolases</p>
            </title>
            <aug>
               <au>
                  <snm>Henrissat</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Bairoch</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Biochem J</source>
            <pubdate>1996</pubdate>
            <volume>316</volume>
            <fpage>695</fpage>
            <lpage>696</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1217404</pubid>
                  <pubid idtype="pmpid" link="fulltext">8687420</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Structural and sequence-based classification of glycoside hydrolases</p>
            </title>
            <aug>
               <au>
                  <snm>Henrissat</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Davies</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>Curr Opin Struct Biol</source>
            <pubdate>1997</pubdate>
            <volume>7</volume>
            <fpage>637</fpage>
            <lpage>644</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0959-440X(97)80072-3</pubid>
                  <pubid idtype="pmpid" link="fulltext">9345621</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Mechanisms of enzymatic glycoside hydrolysis</p>
            </title>
            <aug>
               <au>
                  <snm>McCarter</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Withers</snm>
                  <fnm>SG</fnm>
               </au>
            </aug>
            <source>Curr Opin Struct Biol</source>
            <pubdate>1994</pubdate>
            <volume>4</volume>
            <fpage>885</fpage>
            <lpage>892</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0959-440X(94)90271-2</pubid>
                  <pubid idtype="pmpid">7712292</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Structures and mechanisms of glycosyl hydrolases</p>
            </title>
            <aug>
               <au>
                  <snm>Davies</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Henrissat</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Structure</source>
            <pubdate>1995</pubdate>
            <volume>3</volume>
            <fpage>853</fpage>
            <lpage>859</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0969-2126(01)00220-9</pubid>
                  <pubid idtype="pmpid">8535779</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>New families in the classification of glycosyl hydrolases based on amino acid sequence similarities</p>
            </title>
            <aug>
               <au>
                  <snm>Henrissat</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Bairoch</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Biochem J</source>
            <pubdate>1993</pubdate>
            <volume>293</volume>
            <fpage>781</fpage>
            <lpage>788</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1134435</pubid>
                  <pubid idtype="pmpid">8352747</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>&#946;-Fructosidases: a new superfamily of glycosyl hydrolases</p>
            </title>
            <aug>
               <au>
                  <snm>Naumov</snm>
                  <fnm>DG</fnm>
               </au>
               <au>
                  <snm>Doroshenko</snm>
                  <fnm>VG</fnm>
               </au>
            </aug>
            <source>Mol Biol (Engl Tr)</source>
            <pubdate>1998</pubdate>
            <volume>32</volume>
            <fpage>761</fpage>
            <lpage>766</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid">9914979</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Conserved sequence motifs in levansucrases and bifunctional &#946;-xylosidases and &#945;-L-arabinases</p>
            </title>
            <aug>
               <au>
                  <snm>Naumoff</snm>
                  <fnm>DG</fnm>
               </au>
            </aug>
            <source>FEBS Lett</source>
            <pubdate>1999</pubdate>
            <volume>448</volume>
            <fpage>177</fpage>
            <lpage>179</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0014-5793(99)00369-5</pubid>
                  <pubid idtype="pmpid" link="fulltext">10217435</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>&#946;-Fructosidase superfamily: homology with some &#945;-L-arabinases and &#946;-D-xylosidases</p>
            </title>
            <aug>
               <au>
                  <snm>Naumoff</snm>
                  <fnm>DG</fnm>
               </au>
            </aug>
            <source>Proteins</source>
            <pubdate>2001</pubdate>
            <volume>42</volume>
            <fpage>66</fpage>
            <lpage>76</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/1097-0134(20010101)42:1&lt;66::AID-PROT70>3.0.CO;2-4</pubid>
                  <pubid idtype="pmpid" link="fulltext">11093261</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Three acidic residues at the active site in the &#946;-propeller architecture for the glycoside hydrolase families 32, 43, 62, and 68</p>
            </title>
            <aug>
               <au>
                  <snm>Pons</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Naumoff</snm>
                  <fnm>DG</fnm>
               </au>
               <au>
                  <snm>Mart&#237;nez-Fleites</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Hern&#225;ndez</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>Proteins</source>
            <pubdate>2004</pubdate>
            <volume>54</volume>
            <fpage>424</fpage>
            <lpage>432</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/prot.10604</pubid>
                  <pubid idtype="pmpid" link="fulltext">14747991</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Oligo-1,6-glucosidase and neopullulanase enzyme subfamilies from the &#945;-amylase family defined by the fifth conserved sequence region</p>
            </title>
            <aug>
               <au>
                  <snm>Oslancov&#225;</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Jane&#269;ek</snm>
                  <fnm>&#352;</fnm>
               </au>
            </aug>
            <source>Cell Mol Life Sci</source>
            <pubdate>2002</pubdate>
            <volume>59</volume>
            <fpage>1945</fpage>
            <lpage>1959</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1007/PL00012517</pubid>
                  <pubid idtype="pmpid" link="fulltext">12530525</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Relation between domain evolution, specificity, and taxonomy of the &#945;-amylase family members containing a C-terminal starch-binding domain</p>
            </title>
            <aug>
               <au>
                  <snm>Jane&#269;ek</snm>
                  <fnm>&#352;</fnm>
               </au>
               <au>
                  <snm>Svensson</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>MacGregor</snm>
                  <fnm>EA</fnm>
               </au>
            </aug>
            <source>Eur J Biochem</source>
            <pubdate>2003</pubdate>
            <volume>270</volume>
            <fpage>635</fpage>
            <lpage>645</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1046/j.1432-1033.2003.03404.x</pubid>
                  <pubid idtype="pmpid" link="fulltext">12581203</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Identification of key amino acid residues in <it>Neisseria polysaccharea </it>amylosucrase</p>
            </title>
            <aug>
               <au>
                  <snm>Sar&#231;abal</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Remaud-Simeon</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Willemot</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Potocki de Montalk</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Svensson</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Monsan</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>FEBS Lett</source>
            <pubdate>2000</pubdate>
            <volume>474</volume>
            <fpage>33</fpage>
            <lpage>37</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0014-5793(00)01567-2</pubid>
                  <pubid idtype="pmpid" link="fulltext">10828446</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>A cluster of <it>Thermotoga neapolitana </it>genes involved in the degradation of starch and maltodextrins: the molecular structure of the locus</p>
            </title>
            <aug>
               <au>
                  <snm>Berezina</snm>
                  <fnm>OV</fnm>
               </au>
               <au>
                  <snm>Lunina</snm>
                  <fnm>NA</fnm>
               </au>
               <au>
                  <snm>Zverlov</snm>
                  <fnm>VV</fnm>
               </au>
               <au>
                  <snm>Naumoff</snm>
                  <fnm>DG</fnm>
               </au>
               <au>
                  <snm>Liebl</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Velikodvorskaia</snm>
                  <fnm>GA</fnm>
               </au>
            </aug>
            <source>Mol Biol (Engl Tr)</source>
            <pubdate>2003</pubdate>
            <volume>37</volume>
            <fpage>801</fpage>
            <lpage>809</lpage>
            <xrefbib>
               <pubid idtype="pmpid">14593916</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>The concept of the &#945;-amylase family: structural similarity and common catalytic mechanism</p>
            </title>
            <aug>
               <au>
                  <snm>Kuriki</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Imanaka</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>J Biosci Bioeng</source>
            <pubdate>1999</pubdate>
            <volume>87</volume>
            <fpage>557</fpage>
            <lpage>565</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">16232518</pubid>
                  <pubid idtype="doi">10.1016/S1389-1723(99)80114-5</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Protein engineering in the &#945;-amylase family: catalytic mechanism, substrate specificity, and stability</p>
            </title>
            <aug>
               <au>
                  <snm>Svensson</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Plant Mol Biol</source>
            <pubdate>1994</pubdate>
            <volume>25</volume>
            <fpage>141</fpage>
            <lpage>157</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1007/BF00023233</pubid>
                  <pubid idtype="pmpid">8018865</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Close evolutionary relatedness of &#945;-amylases from Archaea and plants</p>
            </title>
            <aug>
               <au>
                  <snm>Jane&#269;ek</snm>
                  <fnm>&#352;</fnm>
               </au>
               <au>
                  <snm>L&#233;v&#234;que</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Belarbi</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Haye</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>J Mol Evol</source>
            <pubdate>1999</pubdate>
            <volume>48</volume>
            <fpage>421</fpage>
            <lpage>426</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1007/PL00006486</pubid>
                  <pubid idtype="pmpid" link="fulltext">10079280</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>Engineering of cyclodextrin glycosyltransferase reaction and product specificity</p>
            </title>
            <aug>
               <au>
                  <snm>van der Veen</snm>
                  <fnm>BA</fnm>
               </au>
               <au>
                  <snm>Uitdehaag</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Dijkstra</snm>
                  <fnm>BW</fnm>
               </au>
               <au>
                  <snm>Dijkhuizen</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>Biochim Biophys Acta</source>
            <pubdate>2000</pubdate>
            <volume>1543</volume>
            <fpage>336</fpage>
            <lpage>360</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11150613</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>The DAG family of glycosyl hydrolases combines two previously identified protein families</p>
            </title>
            <aug>
               <au>
                  <snm>Dagnall</snm>
                  <fnm>BH</fnm>
               </au>
               <au>
                  <snm>Paulsen</snm>
                  <fnm>IT</fnm>
               </au>
               <au>
                  <snm>Saier</snm>
                  <fnm>MH Jr</fnm>
               </au>
            </aug>
            <source>Biochem J</source>
            <pubdate>1995</pubdate>
            <volume>311</volume>
            <fpage>349</fpage>
            <lpage>350</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1136158</pubid>
                  <pubid idtype="pmpid">7575475</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Sequence analysis and classification of &#945;-galactosidases</p>
            </title>
            <aug>
               <au>
                  <snm>Naumoff</snm>
                  <fnm>DG</fnm>
               </au>
            </aug>
            <source>International Summer School "From Genome to Life: Structural, Functional and Evolutionary Approaches"</source>
            <publisher>Carg&#232;se, Corsica, France</publisher>
            <fpage>40</fpage>
            <url>http://www-archbac.u-psud.fr/Meetings/cargese2002/abstracts/NAUMOFF.html</url>
            <note>July 15&#8211;27, 2002</note>
         </bibl>
         <bibl id="B22">
            <title>
               <p>&#945;-Galactosidase superfamily: phylogenetic analysis and homology with some &#945;-glucosidases</p>
            </title>
            <aug>
               <au>
                  <snm>Naumoff</snm>
                  <fnm>DG</fnm>
               </au>
            </aug>
            <source>5th Carbohydrate Bioengineering Meeting, University Hospital Groningen</source>
            <publisher>Groningen, The Netherlands</publisher>
            <fpage>32</fpage>
            <note>April 6&#8211;9, 2003</note>
         </bibl>
         <bibl id="B23">
            <title>
               <p>Phylogenetic analysis of &#945;-galactosidases from GH27 family</p>
            </title>
            <aug>
               <au>
                  <snm>Naumoff</snm>
                  <fnm>DG</fnm>
               </au>
            </aug>
            <source>Mol Biol (Engl Tr)</source>
            <pubdate>2004</pubdate>
            <volume>38</volume>
            <fpage>388</fpage>
            <lpage>399</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">15285616</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>The &#945;-galactosidase superfamily: sequence based classification of &#945;-galactosidases and related glycosidases</p>
            </title>
            <aug>
               <au>
                  <snm>Naumoff</snm>
                  <fnm>DG</fnm>
               </au>
            </aug>
            <source>Proceedings of The Fourth International Conference on Bioinformatics of Genome Regulation and Structure, July 25&#8211;30, 2004, Novosibirsk, Russia</source>
            <volume>1</volume>
            <fpage>315</fpage>
            <lpage>318</lpage>
            <url>http://www.bionet.nsc.ru/meeting/bgrs2004/tom1.pdf</url>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Glycosidase families</p>
            </title>
            <aug>
               <au>
                  <snm>Henrissat</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Biochem Soc Trans</source>
            <pubdate>1998</pubdate>
            <volume>26</volume>
            <fpage>153</fpage>
            <lpage>156</lpage>
            <xrefbib>
               <pubid idtype="pmpid">9649738</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>Iterative database searches demonstrate that glycoside hydrolase families 27, 31, 36 and 66 share a common evolutionary origin with family 13</p>
            </title>
            <aug>
               <au>
                  <snm>Rigden</snm>
                  <fnm>DJ</fnm>
               </au>
            </aug>
            <source>FEBS Lett</source>
            <pubdate>2002</pubdate>
            <volume>523</volume>
            <fpage>17</fpage>
            <lpage>22</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0014-5793(02)02879-X</pubid>
                  <pubid idtype="pmpid" link="fulltext">12123797</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>A genomic view of the human-<it>Bacteroides thetaiotaomicron </it>symbiosis</p>
            </title>
            <aug>
               <au>
                  <snm>Xu</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Bjursell</snm>
                  <fnm>MK</fnm>
               </au>
               <au>
                  <snm>Himrod</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Deng</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Carmichael</snm>
                  <fnm>LK</fnm>
               </au>
               <au>
                  <snm>Chiang</snm>
                  <fnm>HC</fnm>
               </au>
               <au>
                  <snm>Hooper</snm>
                  <fnm>LV</fnm>
               </au>
               <au>
                  <snm>Gordon</snm>
                  <fnm>JI</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2003</pubdate>
            <volume>299</volume>
            <fpage>2074</fpage>
            <lpage>2076</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1080029</pubid>
                  <pubid idtype="pmpid" link="fulltext">12663928</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>Glycan foraging in vivo by an intestine-adapted bacterial symbiont</p>
            </title>
            <aug>
               <au>
                  <snm>Sonnenburg</snm>
                  <fnm>JL</fnm>
               </au>
               <au>
                  <snm>Xu</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Leip</snm>
                  <fnm>DD</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>C-H</fnm>
               </au>
               <au>
                  <snm>Westover</snm>
                  <fnm>BP</fnm>
               </au>
               <au>
                  <snm>Weatherford</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Buhler</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Gordon</snm>
                  <fnm>JI</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2005</pubdate>
            <volume>307</volume>
            <fpage>1955</fpage>
            <lpage>1959</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1109051</pubid>
                  <pubid idtype="pmpid" link="fulltext">15790854</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>Characterization of a neopullulanase and an &#945;-glucosidase from <it>Bacteroides thetaiotaomicron </it>95-1</p>
            </title>
            <aug>
               <au>
                  <snm>Smith</snm>
                  <fnm>KA</fnm>
               </au>
               <au>
                  <snm>Salyers</snm>
                  <fnm>AA</fnm>
               </au>
            </aug>
            <source>J Bacteriol</source>
            <pubdate>1991</pubdate>
            <volume>173</volume>
            <fpage>2962</fpage>
            <lpage>2968</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">207879</pubid>
                  <pubid idtype="pmpid" link="fulltext">1708385</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>Contribution of a neopullulanase, a pullulanase, and an &#945;-glucosidase to growth of <it>Bacteroides thetaiotaomicron </it>on starch</p>
            </title>
            <aug>
               <au>
                  <snm>D'Elia</snm>
                  <fnm>JN</fnm>
               </au>
               <au>
                  <snm>Salyers</snm>
                  <fnm>AA</fnm>
               </au>
            </aug>
            <source>J Bacteriol</source>
            <pubdate>1996</pubdate>
            <volume>178</volume>
            <fpage>7173</fpage>
            <lpage>7179</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">178630</pubid>
                  <pubid idtype="pmpid" link="fulltext">8955399</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>Characterization of four outer membrane proteins that play a role in utilization of starch by <it>Bacteroides thetaiotaomicron</it></p>
            </title>
            <aug>
               <au>
                  <snm>Reeves</snm>
                  <fnm>AR</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>GR</fnm>
               </au>
               <au>
                  <snm>Salyers</snm>
                  <fnm>AA</fnm>
               </au>
            </aug>
            <source>J Bacteriol</source>
            <pubdate>1997</pubdate>
            <volume>179</volume>
            <fpage>643</fpage>
            <lpage>649</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">178742</pubid>
                  <pubid idtype="pmpid" link="fulltext">9006015</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>Physiological characterization of SusG, an outer membrane protein essential for starch utilization by <it>Bacteroides thetaiotaomicron</it></p>
            </title>
            <aug>
               <au>
                  <snm>Shipman</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Cho</snm>
                  <fnm>KH</fnm>
               </au>
               <au>
                  <snm>Siegel</snm>
                  <fnm>HA</fnm>
               </au>
               <au>
                  <snm>Salyers</snm>
                  <fnm>AA</fnm>
               </au>
            </aug>
            <source>J Bacteriol</source>
            <pubdate>1999</pubdate>
            <volume>181</volume>
            <fpage>7206</fpage>
            <lpage>7211</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">103681</pubid>
                  <pubid idtype="pmpid" link="fulltext">10572122</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B33">
            <title>
               <p>Biochemical analysis of interactions between outer membrane proteins that contribute to starch utilization by <it>Bacteroides thetaiotaomicron</it></p>
            </title>
            <aug>
               <au>
                  <snm>Cho</snm>
                  <fnm>KH</fnm>
               </au>
               <au>
                  <snm>Salyers</snm>
                  <fnm>AA</fnm>
               </au>
            </aug>
            <source>J Bacteriol</source>
            <pubdate>2001</pubdate>
            <volume>183</volume>
            <fpage>7224</fpage>
            <lpage>7230</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">95572</pubid>
                  <pubid idtype="pmpid" link="fulltext">11717282</pubid>
                  <pubid idtype="doi">10.1128/JB.183.24.7224-7230.2001</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B34">
            <title>
               <p>Effect of regulatory protein levels on utilization of starch by <it>Bacteroides thetaiotaomicron</it></p>
            </title>
            <aug>
               <au>
                  <snm>D'Elia</snm>
                  <fnm>JN</fnm>
               </au>
               <au>
                  <snm>Salyers</snm>
                  <fnm>AA</fnm>
               </au>
            </aug>
            <source>J Bacteriol</source>
            <pubdate>1996</pubdate>
            <volume>178</volume>
            <fpage>7180</fpage>
            <lpage>7186</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">178631</pubid>
                  <pubid idtype="pmpid" link="fulltext">8955400</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B35">
            <title>
               <p>New regulatory gene that contributes to control of <it>Bacteroides thetaiotaomicron </it>starch utilization genes</p>
            </title>
            <aug>
               <au>
                  <snm>Cho</snm>
                  <fnm>KH</fnm>
               </au>
               <au>
                  <snm>Cho</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>GR</fnm>
               </au>
               <au>
                  <snm>Salyers</snm>
                  <fnm>AA</fnm>
               </au>
            </aug>
            <source>J Bacteriol</source>
            <pubdate>2001</pubdate>
            <volume>183</volume>
            <fpage>7198</fpage>
            <lpage>7205</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">95569</pubid>
                  <pubid idtype="pmpid" link="fulltext">11717279</pubid>
                  <pubid idtype="doi">10.1128/JB.183.24.7198-7205.2001</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B36">
            <title>
               <p>ERIC-PCR fingerprinting-based community DNA hybridization to pinpoint genome-specific fragments as molecular markers to identify and track populations common to healthy human guts</p>
            </title>
            <aug>
               <au>
                  <snm>Wei</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Pan</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Du</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Zhao</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>J Microbiol Methods</source>
            <pubdate>2004</pubdate>
            <volume>59</volume>
            <fpage>91</fpage>
            <lpage>108</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.mimet.2004.06.007</pubid>
                  <pubid idtype="pmpid" link="fulltext">15325756</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B37">
            <title>
               <p>The genome sequence of the malaria mosquito <it>Anopheles gambiae</it></p>
            </title>
            <aug>
               <au>
                  <snm>Holt</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Subramanian</snm>
                  <fnm>GM</fnm>
               </au>
               <au>
                  <snm>Halpern</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Sutton</snm>
                  <fnm>GG</fnm>
               </au>
               <au>
                  <snm>Charlab</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Nusskern</snm>
                  <fnm>DR</fnm>
               </au>
               <au>
                  <snm>Wincker</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Clark</snm>
                  <fnm>AG</fnm>
               </au>
               <au>
                  <snm>Ribeiro</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Wides</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Salzberg</snm>
                  <fnm>SL</fnm>
               </au>
               <au>
                  <snm>Loftus</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Yandell</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Majoros</snm>
                  <fnm>WH</fnm>
               </au>
               <au>
                  <snm>Rusch</snm>
                  <fnm>DB</fnm>
               </au>
               <au>
                  <snm>Lai</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Kraft</snm>
                  <fnm>CL</fnm>
               </au>
               <au>
                  <snm>Abril</snm>
                  <fnm>JF</fnm>
               </au>
               <au>
                  <snm>Anthouard</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Arensburger</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Atkinson</snm>
                  <fnm>PW</fnm>
               </au>
               <au>
                  <snm>Baden</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>de Berardinis</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Baldwin</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Benes</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Biedler</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Blass</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Bolanos</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Boscus</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Barnstead</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Cai</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Center</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Chaturverdi</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Christophides</snm>
                  <fnm>GK</fnm>
               </au>
               <au>
                  <snm>Chrystal</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Clamp</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Cravchik</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Curwen</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Dana</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Delcher</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Dew</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Evans</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Flanigan</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Grundschober-Freimoser</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Friedli</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Gu</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Guan</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Guigo</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Hillenmeyer</snm>
                  <fnm>ME</fnm>
               </au>
               <au>
                  <snm>Hladun</snm>
                  <fnm>SL</fnm>
               </au>
               <au>
                  <snm>Hogan</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Hong</snm>
                  <fnm>YS</fnm>
               </au>
               <au>
                  <snm>Hoover</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Jaillon</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Ke</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Kodira</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Kokoza</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Koutsos</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Letunic</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Levitsky</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Liang</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Lin</snm>
                  <fnm>JJ</fnm>
               </au>
               <au>
                  <snm>Lobo</snm>
                  <fnm>NF</fnm>
               </au>
               <au>
                  <snm>Lopez</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Malek</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>McIntosh</snm>
                  <fnm>TC</fnm>
               </au>
               <au>
                  <snm>Meister</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Mobarry</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Mongin</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Murphy</snm>
                  <fnm>SD</fnm>
               </au>
               <au>
                  <snm>O'Brochta</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Pfannkoch</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Qi</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Regier</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Remington</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Shao</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Sharakhova</snm>
                  <fnm>MV</fnm>
               </au>
               <au>
                  <snm>Sitter</snm>
                  <fnm>CD</fnm>
               </au>
               <au>
                  <snm>Shetty</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>TJ</fnm>
               </au>
               <au>
                  <snm>Strong</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Sun</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Thomasova</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Ton</snm>
                  <fnm>LQ</fnm>
               </au>
               <au>
                  <snm>Topalis</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Tu</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Unger</snm>
                  <fnm>MF</fnm>
               </au>
               <au>
                  <snm>Walenz</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Woodford</snm>
                  <fnm>KJ</fnm>
               </au>
               <au>
                  <snm>Wortman</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Wu</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Yao</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Zdobnov</snm>
                  <fnm>EM</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Zhao</snm>
                  <fnm>Q</fnm>
               </au>
               <au>
                  <snm>Zhao</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Zhu</snm>
                  <fnm>SC</fnm>
               </au>
               <au>
                  <snm>Zhimulev</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Coluzzi</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>della Torre</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Roth</snm>
                  <fnm>CW</fnm>
               </au>
               <au>
                  <snm>Louis</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Kalush</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Mural</snm>
                  <fnm>RJ</fnm>
               </au>
               <au>
                  <snm>Myers</snm>
                  <fnm>EW</fnm>
               </au>
               <au>
                  <snm>Adams</snm>
                  <fnm>MD</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>HO</fnm>
               </au>
               <au>
                  <snm>Broder</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Gardner</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Fraser</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Birney</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Bork</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Brey</snm>
                  <fnm>PT</fnm>
               </au>
               <au>
                  <snm>Venter</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Weissenbach</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Kafatos</snm>
                  <fnm>FC</fnm>
               </au>
               <au>
                  <snm>Collins</snm>
                  <fnm>FH</fnm>
               </au>
               <au>
                  <snm>Hoffman</snm>
                  <fnm>SL</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2002</pubdate>
            <volume>298</volume>
            <fpage>129</fpage>
            <lpage>149</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1076181</pubid>
                  <pubid idtype="pmpid" link="fulltext">12364791</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B38">
            <title>
               <p>Pilot <it>Anopheles gambiae </it>full-length cDNA study: sequencing and initial characterization of 35,575 clones</p>
            </title>
            <aug>
               <au>
                  <snm>Gomez</snm>
                  <fnm>SM</fnm>
               </au>
               <au>
                  <snm>Eiglmeier</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Segurens</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Dehoux</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Couloux</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Scarpelli</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Wincker</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Weissenbach</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Brey</snm>
                  <fnm>PT</fnm>
               </au>
               <au>
                  <snm>Roth</snm>
                  <fnm>CW</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2005</pubdate>
            <volume>6</volume>
            <fpage>R39</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1088967</pubid>
                  <pubid idtype="pmpid" link="fulltext">15833126</pubid>
                  <pubid idtype="doi">10.1186/gb-2005-6-4-r39</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B39">
            <title>
               <p>Environmental genome shotgun sequencing of the Sargasso Sea</p>
            </title>
            <aug>
               <au>
                  <snm>Venter</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Remington</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Heidelberg</snm>
                  <fnm>JF</fnm>
               </au>
               <au>
                  <snm>Halpern</snm>
                  <fnm>AL</fnm>
               </au>
               <au>
                  <snm>Rusch</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Eisen</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Wu</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Paulsen</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Nelson</snm>
                  <fnm>KE</fnm>
               </au>
               <au>
                  <snm>Nelson</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Fouts</snm>
                  <fnm>DE</fnm>
               </au>
               <au>
                  <snm>Levy</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Knap</snm>
                  <fnm>AH</fnm>
               </au>
               <au>
                  <snm>Lomas</snm>
                  <fnm>MW</fnm>
               </au>
               <au>
                  <snm>Nealson</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>White</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Peterson</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Hoffman</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Parsons</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Baden-Tillson</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Pfannkoch</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Rogers</snm>
                  <fnm>YH</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>HO</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2004</pubdate>
            <volume>304</volume>
            <fpage>66</fpage>
            <lpage>74</lpage>
            <url>http://www.ncbi.nlm.nih.gov/BLAST/Genome/EnvirSamplesBlast.html</url>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1093857</pubid>
                  <pubid idtype="pmpid" link="fulltext">15001713</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B40">
            <title>
               <p>Bioinformatics of the glycoside hydrolase family 57 and identification of catalytic residues in amylopullulanase from <it>Thermococcus hydrothermalis</it></p>
            </title>
            <aug>
               <au>
                  <snm>Zona</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Chang-Pi-Hin</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>O'Donohue</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Jane&#269;ek</snm>
                  <fnm>&#352;</fnm>
               </au>
            </aug>
            <source>Eur J Biochem</source>
            <pubdate>2004</pubdate>
            <volume>271</volume>
            <fpage>2863</fpage>
            <lpage>2872</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1111/j.1432-1033.2004.04144.x</pubid>
                  <pubid idtype="pmpid" link="fulltext">15233783</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B41">
            <title>
               <p>Emergence of a subfamily of xylanase inhibitors within glycoside hydrolase family 18</p>
            </title>
            <aug>
               <au>
                  <snm>Durand</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Hughes</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Roussel</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Flatman</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Henrissat</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Juge</snm>
                  <fnm>N</fnm>
               </au>
            </aug>
            <source>FEBS Journal</source>
            <pubdate>2005</pubdate>
            <volume>272</volume>
            <fpage>1745</fpage>
            <lpage>1755</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1111/j.1742-4658.2005.04606.x</pubid>
                  <pubid idtype="pmpid" link="fulltext">15794761</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B42">
            <title>
               <p>Furcatin hydrolase from <it>Viburnum furcatum </it>Blume is a novel disaccharide-specific acuminosidase in glycosyl hydrolase family 1</p>
            </title>
            <aug>
               <au>
                  <snm>Ahn</snm>
                  <fnm>YO</fnm>
               </au>
               <au>
                  <snm>Mizutani</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Saino</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Sakata</snm>
                  <fnm>K</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>2004</pubdate>
            <volume>279</volume>
            <fpage>23405</fpage>
            <lpage>23414</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1074/jbc.M311379200</pubid>
                  <pubid idtype="pmpid" link="fulltext">14976214</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B43">
            <title>
               <p>Effect of dimer dissociation on activity and thermostability of the &#945;-glucuronidase from <it>Geobacillus stearothermophilus</it>: dissecting the different oligomeric forms of family 67 glycoside hydrolases</p>
            </title>
            <aug>
               <au>
                  <snm>Shallom</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Golan</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Shoham</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Shoham</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>J Bacteriol</source>
            <pubdate>2004</pubdate>
            <volume>186</volume>
            <fpage>6928</fpage>
            <lpage>6937</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">522207</pubid>
                  <pubid idtype="pmpid" link="fulltext">15466046</pubid>
                  <pubid idtype="doi">10.1128/JB.186.20.6928-6937.2004</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B44">
            <title>
               <p>The third chitinase gene (<it>chiC</it>) of <it>Serratia marcescens </it>2170 and the relationship of its product to other bacterial chitinases</p>
            </title>
            <aug>
               <au>
                  <snm>Suzuki</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Taiyoji</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Sugawara</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Nikaidou</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Henrissat</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Watanabe</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Biochem J</source>
            <pubdate>1999</pubdate>
            <volume>343</volume>
            <fpage>587</fpage>
            <lpage>596</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1042/0264-6021:3430587</pubid>
                  <pubid idtype="pmpid" link="fulltext">10527937</pubid>
                  <pubid idtype="pmcid">1220590</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B45">
            <title>
               <p>A genomic perspective on protein families</p>
            </title>
            <aug>
               <au>
                  <snm>Tatusov</snm>
                  <fnm>RL</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
               <au>
                  <snm>Lipman</snm>
                  <fnm>DJ</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1997</pubdate>
            <volume>278</volume>
            <fpage>631</fpage>
            <lpage>637</lpage>
            <url>http://www.ncbi.nlm.nih.gov/COG/</url>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.278.5338.631</pubid>
                  <pubid idtype="pmpid" link="fulltext">9381173</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B46">
            <title>
               <p>Radical SAM, a novel protein superfamily linking unresolved steps in familiar biosynthetic pathways with radical mechanisms: functional characterization using new analysis and information visualization methods</p>
            </title>
            <aug>
               <au>
                  <snm>Sofia</snm>
                  <fnm>HJ</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Hetzler</snm>
                  <fnm>BG</fnm>
               </au>
               <au>
                  <snm>Reyes-Spindola</snm>
                  <fnm>JF</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>NE</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2001</pubdate>
            <volume>29</volume>
            <fpage>1097</fpage>
            <lpage>1106</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">29726</pubid>
                  <pubid idtype="pmpid" link="fulltext">11222759</pubid>
                  <pubid idtype="doi">10.1093/nar/29.5.1097</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B47">
            <title>
               <p>Families, superfamilies and subfamilies of glycosyl hydrolases</p>
            </title>
            <aug>
               <au>
                  <snm>Henrissat</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Romeu</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Biochem J</source>
            <pubdate>1995</pubdate>
            <volume>311</volume>
            <fpage>350</fpage>
            <lpage>351</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1136159</pubid>
                  <pubid idtype="pmpid">7575477</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B48">
            <title>
               <p>Three &#945;-galactosidase genes of <it>Trichoderma reesei </it>cloned by expression in yeast</p>
            </title>
            <aug>
               <au>
                  <snm>Margolles-Clark</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Tenkanen</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Luonteri</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Penttil&#228;</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Eur J Biochem</source>
            <pubdate>1996</pubdate>
            <volume>240</volume>
            <fpage>104</fpage>
            <lpage>111</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1111/j.1432-1033.1996.0104h.x</pubid>
                  <pubid idtype="pmpid" link="fulltext">8797842</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B49">
            <title>
               <p>Sequence analysis of glycosylhydrolases: &#946;-fructosidase and &#945;-galactosidase superfamilies</p>
            </title>
            <aug>
               <au>
                  <snm>Naumoff</snm>
                  <fnm>DG</fnm>
               </au>
            </aug>
            <source>Glycoconj J</source>
            <pubdate>2001</pubdate>
            <volume>18</volume>
            <fpage>109</fpage>
         </bibl>
         <bibl id="B50">
            <title>
               <p>The 1.9 &#197; structure of &#945;-N-acetylgalactosaminidase: molecular basis of glycosidase deficiency diseases</p>
            </title>
            <aug>
               <au>
                  <snm>Garman</snm>
                  <fnm>SC</fnm>
               </au>
               <au>
                  <snm>Hannick</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Zhu</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Garboczi</snm>
                  <fnm>DN</fnm>
               </au>
            </aug>
            <source>Structure</source>
            <pubdate>2002</pubdate>
            <volume>10</volume>
            <fpage>425</fpage>
            <lpage>434</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0969-2126(02)00726-8</pubid>
                  <pubid idtype="pmpid">12005440</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B51">
            <title>
               <p>Crystal structure of rice &#945;-galactosidase complexed with D-galactose</p>
            </title>
            <aug>
               <au>
                  <snm>Fujimoto</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Kaneko</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Momma</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Kobayashi</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Mizuno</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>2003</pubdate>
            <volume>278</volume>
            <fpage>20313</fpage>
            <lpage>20318</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1074/jbc.M302292200</pubid>
                  <pubid idtype="pmpid" link="fulltext">12657636</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B52">
            <title>
               <p>The molecular defect leading to Fabry disease: structure of human &#945;-galactosidase</p>
            </title>
            <aug>
               <au>
                  <snm>Garman</snm>
                  <fnm>SC</fnm>
               </au>
               <au>
                  <snm>Garboczi</snm>
                  <fnm>DN</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>2004</pubdate>
            <volume>337</volume>
            <fpage>319</fpage>
            <lpage>335</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.jmb.2004.01.035</pubid>
                  <pubid idtype="pmpid" link="fulltext">15003450</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B53">
            <title>
               <p>Crystal structure of &#945;-galactosidase from <it>Trichoderma reesei </it>and its complex with galactose: implications for catalytic mechanism</p>
            </title>
            <aug>
               <au>
                  <snm>Golubev</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Nagem</snm>
                  <fnm>RAP</fnm>
               </au>
               <au>
                  <snm>Brand&#227;o Neto</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Neustroev</snm>
                  <fnm>KN</fnm>
               </au>
               <au>
                  <snm>Eneyskaya</snm>
                  <fnm>EV</fnm>
               </au>
               <au>
                  <snm>Kulminskaya</snm>
                  <fnm>AA</fnm>
               </au>
               <au>
                  <snm>Shabalin</snm>
                  <fnm>KA</fnm>
               </au>
               <au>
                  <snm>Savel'ev</snm>
                  <fnm>AN</fnm>
               </au>
               <au>
                  <snm>Polikarpov</snm>
                  <fnm>I</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>2004</pubdate>
            <volume>339</volume>
            <fpage>413</fpage>
            <lpage>422</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.jmb.2004.03.062</pubid>
                  <pubid idtype="pmpid" link="fulltext">15136043</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B54">
            <title>
               <p>Mechanistic and structural analysis of a family 31 &#945;-glycosidase and its glycosyl-enzyme intermediate</p>
            </title>
            <aug>
               <au>
                  <snm>Lovering</snm>
                  <fnm>AL</fnm>
               </au>
               <au>
                  <snm>Lee</snm>
                  <fnm>SS</fnm>
               </au>
               <au>
                  <snm>Kim</snm>
                  <fnm>Y-W</fnm>
               </au>
               <au>
                  <snm>Withers</snm>
                  <fnm>SG</fnm>
               </au>
               <au>
                  <snm>Strynadka</snm>
                  <fnm>NCJ</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>2005</pubdate>
            <volume>280</volume>
            <fpage>2105</fpage>
            <lpage>2115</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1074/jbc.M410468200</pubid>
                  <pubid idtype="pmpid" link="fulltext">15501829</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B55">
            <title>
               <p>Stability, catalytic versatility and evolution of the (&#946;&#945;)<sub>8</sub>-barrel fold</p>
            </title>
            <aug>
               <au>
                  <snm>H&#246;cker</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>J&#252;rgens</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Wilmanns</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Sterner</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Curr Opin Biotechnol</source>
            <pubdate>2001</pubdate>
            <volume>12</volume>
            <fpage>376</fpage>
            <lpage>381</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0958-1669(00)00230-5</pubid>
                  <pubid idtype="pmpid" link="fulltext">11551466</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B56">
            <title>
               <p>Dissection of a (&#946;&#945;)<sub>8</sub>-barrel enzyme into two folded halves</p>
            </title>
            <aug>
               <au>
                  <snm>H&#246;cker</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Beismann-Driemeyer</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Hettwer</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Lustig</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Sterner</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Nat Struct Biol</source>
            <pubdate>2001</pubdate>
            <volume>8</volume>
            <fpage>32</fpage>
            <lpage>36</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/83021</pubid>
                  <pubid idtype="pmpid" link="fulltext">11135667</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B57">
            <title>
               <p>Barrels in pieces?</p>
            </title>
            <aug>
               <au>
                  <snm>Gerlt</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Babbitt</snm>
                  <fnm>PC</fnm>
               </au>
            </aug>
            <source>Nat Struct Biol</source>
            <pubdate>2001</pubdate>
            <volume>8</volume>
            <fpage>5</fpage>
            <lpage>7</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/83048</pubid>
                  <pubid idtype="pmpid" link="fulltext">11135656</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B58">
            <title>
               <p>Structural evidence for evolution of the &#946;/&#945; barrel scaffold by gene duplication and fusion</p>
            </title>
            <aug>
               <au>
                  <snm>Lang</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Thoma</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Henn-Sax</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Sterner</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Wilmanns</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2000</pubdate>
            <volume>289</volume>
            <fpage>1546</fpage>
            <lpage>1550</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.289.5484.1546</pubid>
                  <pubid idtype="pmpid" link="fulltext">10968789</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B59">
            <title>
               <p>The evolution of &#945;/&#946; barrel enzymes</p>
            </title>
            <aug>
               <au>
                  <snm>Farber</snm>
                  <fnm>GK</fnm>
               </au>
               <au>
                  <snm>Petsko</snm>
                  <fnm>GA</fnm>
               </au>
            </aug>
            <source>Trends Biochem Sci</source>
            <pubdate>1990</pubdate>
            <volume>15</volume>
            <fpage>228</fpage>
            <lpage>234</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0968-0004(90)90035-A</pubid>
                  <pubid idtype="pmpid">2200166</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B60">
            <title>
               <p>The TIM-barrel fold: a versatile framework for efficient enzymes</p>
            </title>
            <aug>
               <au>
                  <snm>Wierenga</snm>
                  <fnm>RK</fnm>
               </au>
            </aug>
            <source>FEBS Lett</source>
            <pubdate>2001</pubdate>
            <volume>492</volume>
            <fpage>193</fpage>
            <lpage>198</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0014-5793(01)02236-0</pubid>
                  <pubid idtype="pmpid" link="fulltext">11257493</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B61">
            <title>
               <p>The (&#946;/&#945;)<sub>8</sub> glycosidases: sequence and structure analyses suggest distant evolutionary relationships</p>
            </title>
            <aug>
               <au>
                  <snm>Nagano</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Porter</snm>
                  <fnm>CT</fnm>
               </au>
               <au>
                  <snm>Thornton</snm>
                  <fnm>JM</fnm>
               </au>
            </aug>
            <source>Protein Eng</source>
            <pubdate>2001</pubdate>
            <volume>14</volume>
            <fpage>845</fpage>
            <lpage>855</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/protein/14.11.845</pubid>
                  <pubid idtype="pmpid" link="fulltext">11742103</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B62">
            <title>
               <p>Partial amino acid sequences around the essential carboxylate in the active sites of the intestinal sucrase-isomaltase complex</p>
            </title>
            <aug>
               <au>
                  <snm>Quaroni</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Semenza</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>1976</pubdate>
            <volume>251</volume>
            <fpage>3250</fpage>
            <lpage>3253</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">776963</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B63">
            <title>
               <p>Human lysosomal &#945;-glucosidase. Characterization of the catalytic site</p>
            </title>
            <aug>
               <au>
                  <snm>Hermans</snm>
                  <fnm>MMP</fnm>
               </au>
               <au>
                  <snm>Kroos</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>van Beeumen</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Oostra</snm>
                  <fnm>BA</fnm>
               </au>
               <au>
                  <snm>Reuser</snm>
                  <fnm>AJJ</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>1991</pubdate>
            <volume>266</volume>
            <fpage>13507</fpage>
            <lpage>13512</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">1856189</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B64">
            <title>
               <p>Chemical modification and amino acid sequence of active site in sugar beet &#945;-glucosidase</p>
            </title>
            <aug>
               <au>
                  <snm>Iwanami</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Matsui</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Kimura</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Ito</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Mori</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Honma</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Chiba</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Biosci Biotechnol Biochem</source>
            <pubdate>1995</pubdate>
            <volume>59</volume>
            <fpage>459</fpage>
            <lpage>463</lpage>
            <xrefbib>
               <pubid idtype="pmpid">7766184</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B65">
            <title>
               <p>A catalytic amino acid and primary structure of active site in <it>Aspergillus niger </it>&#945;-glucosidase</p>
            </title>
            <aug>
               <au>
                  <snm>Kimura</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Takata</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Fukushi</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Mori</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Matsui</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Chiba</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Biosci Biotechnol Biochem</source>
            <pubdate>1997</pubdate>
            <volume>61</volume>
            <fpage>1091</fpage>
            <lpage>1098</lpage>
            <xrefbib>
               <pubid idtype="pmpid">9255970</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B66">
            <title>
               <p>Identification of Asp-130 as the catalytic nucleophile in the main &#945;-galactosidase from <it>Phanerochaete chrysosporium</it>, a family 27 glycosyl hydrolase</p>
            </title>
            <aug>
               <au>
                  <snm>Hart</snm>
                  <fnm>DO</fnm>
               </au>
               <au>
                  <snm>He</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Chany</snm>
                  <fnm>CJ</fnm>
               </au>
               <au>
                  <snm>Withers</snm>
                  <fnm>SG</fnm>
               </au>
               <au>
                  <snm>Sims</snm>
                  <fnm>PF</fnm>
               </au>
               <au>
                  <snm>Sinnott</snm>
                  <fnm>ML</fnm>
               </au>
               <au>
                  <snm>Brumer</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Biochemistry</source>
            <pubdate>2000</pubdate>
            <volume>39</volume>
            <fpage>9826</fpage>
            <lpage>9836</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1021/bi0008074</pubid>
                  <pubid idtype="pmpid" link="fulltext">10933800</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B67">
            <title>
               <p>The synthesis, testing and use of 5-fluoro-&#945;-D-galactosyl fluoride to trap an intermediate on green coffee bean &#945;-galactosidase and identify the catalytic nucleophile</p>
            </title>
            <aug>
               <au>
                  <snm>Ly</snm>
                  <fnm>HD</fnm>
               </au>
               <au>
                  <snm>Howard</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Shum</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>He</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Zhu</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Withers</snm>
                  <fnm>SG</fnm>
               </au>
            </aug>
            <source>Carbohydr Res</source>
            <pubdate>2000</pubdate>
            <volume>329</volume>
            <fpage>539</fpage>
            <lpage>547</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0008-6215(00)00214-7</pubid>
                  <pubid idtype="pmpid" link="fulltext">11128583</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B68">
            <title>
               <p>Carboxyl group of residue Asp647 as possible proton donor in catalytic reaction of &#945;-glucosidase from <it>Schizosaccharomyces pombe</it></p>
            </title>
            <aug>
               <au>
                  <snm>Okuyama</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Okuno</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Shimizu</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Mori</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Kimura</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Chiba</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Eur J Biochem</source>
            <pubdate>2001</pubdate>
            <volume>268</volume>
            <fpage>2270</fpage>
            <lpage>2280</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1046/j.1432-1327.2001.02104.x</pubid>
                  <pubid idtype="pmpid" link="fulltext">11298744</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B69">
            <title>
               <p>The primary structure of the subunit in <it>Bacillus thermoamyloliquefaciens </it>KP1071 molecular weight 540,000 homohexameric &#945;-glucosidase II belonging to the glycosyl hydrolase family 31</p>
            </title>
            <aug>
               <au>
                  <snm>Kashiwabara</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Azuma</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Tsuduki</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Suzuki</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>Biosci Biotechnol Biochem</source>
            <pubdate>2000</pubdate>
            <volume>64</volume>
            <fpage>1379</fpage>
            <lpage>1393</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1271/bbb.64.1379</pubid>
                  <pubid idtype="pmpid" link="fulltext">10945254</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B70">
            <title>
               <p>Cloning and expression of &#945;-D-glucosidase and N-acetyl-&#946;-glucosaminidase from the periodontal pathogen, <it>Tannerella forsythensis </it>(<it>Bacteroides forsythus</it>)</p>
            </title>
            <aug>
               <au>
                  <snm>Hughes</snm>
                  <fnm>CV</fnm>
               </au>
               <au>
                  <snm>Malki</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Loo</snm>
                  <fnm>CY</fnm>
               </au>
               <au>
                  <snm>Tanner</snm>
                  <fnm>ACR</fnm>
               </au>
               <au>
                  <snm>Ganeshkumar</snm>
                  <fnm>N</fnm>
               </au>
            </aug>
            <source>Oral Microbiol Immunol</source>
            <pubdate>2003</pubdate>
            <volume>18</volume>
            <fpage>309</fpage>
            <lpage>312</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1034/j.1399-302X.2003.00091.x</pubid>
                  <pubid idtype="pmpid" link="fulltext">12930523</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B71">
            <title>
               <p>Why are there so many carbohydrate-active enzyme-related genes in plants?</p>
            </title>
            <aug>
               <au>
                  <snm>Coutinho</snm>
                  <fnm>PM</fnm>
               </au>
               <au>
                  <snm>Stam</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Blanc</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Henrissat</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Trends Plant Sci</source>
            <pubdate>2003</pubdate>
            <volume>8</volume>
            <fpage>563</fpage>
            <lpage>565</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.tplants.2003.10.002</pubid>
                  <pubid idtype="pmpid" link="fulltext">14659702</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B72">
            <title>
               <p>Database resources of the National Center for Biotechnology Information</p>
            </title>
            <aug>
               <au>
                  <snm>Wheeler</snm>
                  <fnm>DL</fnm>
               </au>
               <au>
                  <snm>Barrett</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Benson</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Bryant</snm>
                  <fnm>SH</fnm>
               </au>
               <au>
                  <snm>Canese</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Church</snm>
                  <fnm>DM</fnm>
               </au>
               <au>
                  <snm>DiCuccio</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Edgar</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Federhen</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Helmberg</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Kenton</snm>
                  <fnm>DL</fnm>
               </au>
               <au>
                  <snm>Khovayko</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Lipman</snm>
                  <fnm>DJ</fnm>
               </au>
               <au>
                  <snm>Madden</snm>
                  <fnm>TL</fnm>
               </au>
               <au>
                  <snm>Maglott</snm>
                  <fnm>DR</fnm>
               </au>
               <au>
                  <snm>Ostell</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Pontius</snm>
                  <fnm>JU</fnm>
               </au>
               <au>
                  <snm>Pruitt</snm>
                  <fnm>KD</fnm>
               </au>
               <au>
                  <snm>Schuler</snm>
                  <fnm>GD</fnm>
               </au>
               <au>
                  <snm>Schriml</snm>
                  <fnm>LM</fnm>
               </au>
               <au>
                  <snm>Sequeira</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Sherry</snm>
                  <fnm>ST</fnm>
               </au>
               <au>
                  <snm>Sirotkin</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Starchenko</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Suzek</snm>
                  <fnm>TO</fnm>
               </au>
               <au>
                  <snm>Tatusov</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Tatusova</snm>
                  <fnm>TA</fnm>
               </au>
               <au>
                  <snm>Wagner</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Yaschenko</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2005</pubdate>
            <volume>33</volume>
            <fpage>D39</fpage>
            <lpage>D45</lpage>
            <url>http://www.ncbi.nlm.nih.gov/</url>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">540016</pubid>
                  <pubid idtype="pmpid" link="fulltext">15608222</pubid>
                  <pubid idtype="doi">10.1093/nar/gki062</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B73">
            <title>
               <p>Gapped BLAST and PSI-BLAST: a new generation of protein database search programs</p>
            </title>
            <aug>
               <au>
                  <snm>Altschul</snm>
                  <fnm>SF</fnm>
               </au>
               <au>
                  <snm>Madden</snm>
                  <fnm>TL</fnm>
               </au>
               <au>
                  <snm>Sch&#228;ffer</snm>
                  <fnm>AA</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Lipman</snm>
                  <fnm>DJ</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1997</pubdate>
            <volume>25</volume>
            <fpage>3389</fpage>
            <lpage>3402</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">146917</pubid>
                  <pubid idtype="pmpid" link="fulltext">9254694</pubid>
                  <pubid idtype="doi">10.1093/nar/25.17.3389</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B74">
            <title>
               <p>BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT</p>
            </title>
            <aug>
               <au>
                  <snm>Hall</snm>
                  <fnm>TA</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Symp Ser</source>
            <pubdate>1999</pubdate>
            <volume>41</volume>
            <fpage>95</fpage>
            <lpage>98</lpage>
            <url>http://www.mbio.ncsu.edu/BioEdit/bioedit.html</url>
         </bibl>
         <bibl id="B75">
            <title>
               <p>PHYLIP &#8211; Phylogeny Inference Package (Version 3.2)</p>
            </title>
            <aug>
               <au>
                  <snm>Felsenstein</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Cladistics</source>
            <pubdate>1989</pubdate>
            <volume>5</volume>
            <fpage>164</fpage>
            <lpage>166</lpage>
            <url>http://evolution.gs.washington.edu/phylip.html</url>
         </bibl>
         <bibl id="B76">
            <title>
               <p>TREEVIEW: An application to display phylogenetic trees on personal computers</p>
            </title>
            <aug>
               <au>
                  <snm>Page</snm>
                  <fnm>RDM</fnm>
               </au>
            </aug>
            <source>Comput Appl Biosci</source>
            <pubdate>1996</pubdate>
            <volume>12</volume>
            <fpage>357</fpage>
            <lpage>358</lpage>
            <url>http://taxonomy.zoology.gla.ac.uk/rod/treeview.html</url>
            <xrefbib>
               <pubid idtype="pmpid">8902363</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B77">
            <title>
               <p>Molecular structure of the <it>Lactobacillus plantarum </it>sucrose utilization locus: comparison with <it>Pediococcus pentosaceus</it></p>
            </title>
            <aug>
               <au>
                  <snm>Naumoff</snm>
                  <fnm>DG</fnm>
               </au>
               <au>
                  <snm>Livshits</snm>
                  <fnm>VA</fnm>
               </au>
            </aug>
            <source>Mol Biol (Engl Tr)</source>
            <pubdate>2001</pubdate>
            <volume>35</volume>
            <fpage>15</fpage>
            <lpage>22</lpage>
            <xrefbib>
               <pubid idtype="pmpid">11234378</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B78">
            <title>
               <p>ProMod and Swiss-Model: Internet-based tools for automated comparative protein modelling</p>
            </title>
            <aug>
               <au>
                  <snm>Peitsch</snm>
                  <fnm>MC</fnm>
               </au>
            </aug>
            <source>Biochem Soc Trans</source>
            <pubdate>1996</pubdate>
            <volume>24</volume>
            <fpage>274</fpage>
            <lpage>279</lpage>
            <url>http://swissmodel.expasy.org/</url>
            <xrefbib>
               <pubid idtype="pmpid">8674685</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B79">
            <title>
               <p>Enhanced genome annotation using structural profiles in the program 3D-PSSM</p>
            </title>
            <aug>
               <au>
                  <snm>Kelley</snm>
                  <fnm>LA</fnm>
               </au>
               <au>
                  <snm>MacCallum</snm>
                  <fnm>RM</fnm>
               </au>
               <au>
                  <snm>Sternberg</snm>
                  <fnm>MJ</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>2000</pubdate>
            <volume>299</volume>
            <fpage>499</fpage>
            <lpage>520</lpage>
            <url>http://www.sbg.bio.ic.ac.uk/~3dpssm/</url>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/jmbi.2000.3741</pubid>
                  <pubid idtype="pmpid" link="fulltext">10860755</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B80">
            <title>
               <p>GOR method for predicting protein secondary structure from amino acid sequence</p>
            </title>
            <aug>
               <au>
                  <snm>Garnier</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Gibrat</snm>
                  <fnm>JF</fnm>
               </au>
               <au>
                  <snm>Robson</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Methods Enzymol</source>
            <pubdate>1996</pubdate>
            <volume>266</volume>
            <fpage>540</fpage>
            <lpage>553</lpage>
            <url>http://npsa-pbil.ibcp.fr/cgi-bin/npsa_automat.pl?page=npsa_gor4.html</url>
            <xrefbib>
               <pubid idtype="pmpid">8743705</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B81">
            <title>
               <p>Improvements in protein secondary structure prediction by an enhanced neural network</p>
            </title>
            <aug>
               <au>
                  <snm>Kneller</snm>
                  <fnm>DG</fnm>
               </au>
               <au>
                  <snm>Cohen</snm>
                  <fnm>FE</fnm>
               </au>
               <au>
                  <snm>Langridge</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1990</pubdate>
            <volume>214</volume>
            <fpage>171</fpage>
            <lpage>182</lpage>
            <url>http://www.cmpharm.ucsf.edu/~nomi/nnpredict-instrucs.html</url>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0022-2836(90)90154-E</pubid>
                  <pubid idtype="pmpid" link="fulltext">2370661</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>
