<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2148-3-1</ui>
   <ji>1471-2148</ji>
   <fm>
      <dochead>Research article</dochead>
      <bibl>
         <title>
            <p>No simple dependence between protein evolution rate and the number of protein-protein interactions: only the most prolific interactors tend to evolve slowly</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Jordan</snm>
               <fnm>I King</fnm>
               <insr iid="I1"/>
               <email>jordan@ncbi.nlm.nih.gov</email>
            </au>
            <au id="A2">
               <snm>Wolf</snm>
               <mi>I</mi>
               <fnm>Yuri</fnm>
               <insr iid="I1"/>
               <email>wolf@ncbi.nlm.nih.gov</email>
            </au>
            <au id="A3" ca="yes">
               <snm>Koonin</snm>
               <mi>V</mi>
               <fnm>Eugene</fnm>
               <insr iid="I1"/>
               <email>koonin@ncbi.nlm.nih.gov</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894</p>
            </ins>
         </insg>
         <source>BMC Evolutionary Biology</source>
         <issn>1471-2148</issn>
         <pubdate>2003</pubdate>
         <volume>3</volume>
         <issue>1</issue>
         <fpage>1</fpage>
         <url>http://www.biomedcentral.com/1471-2148/3/1</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="doi">10.1186/1471-2148-3-1</pubid>
               <pubid idtype="pmpid">12515583</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>10</day>
               <month>11</month>
               <year>2002</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>6</day>
               <month>1</month>
               <year>2003</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>6</day>
               <month>1</month>
               <year>2003</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2003</year>
         <collab>Jordan et al; licensee BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.</collab>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>It has been suggested that rates of protein evolution are influenced, to a great extent, by the proportion of amino acid residues that are directly involved in protein function. In agreement with this hypothesis, recent work has shown a negative correlation between evolutionary rates and the number of protein-protein interactions. However, the extent to which the number of protein-protein interactions influences evolutionary rates remains unclear. Here, we address this question at several different levels of evolutionary relatedness.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>Manually curated data on the number of protein-protein interactions among <it>Saccharomyces cerevisiae </it>proteins were examined for possible correlation with evolutionary rates between <it>S. cerevisiae </it>and <it>Schizosaccharomyces pombe </it>orthologs. Only a very weak negative correlation between the number of interactions and evolutionary rate of a protein was observed. Furthermore, no relationship was found between a more general measure of the evolutionary conservation of <it>S. cerevisiae </it>proteins, based on the taxonomic distribution of their homologs, and the number of protein-protein interactions. However, when the proteins from yeast were assorted into discrete bins according to the number of interactions, it turned out that 6.5% of the proteins with the greatest number of interactions evolved, on average, significantly slower than the rest of the proteins. Comparisons were also performed using protein-protein interaction data obtained with high-throughput analysis of <it>Helicobacter pylori </it>proteins. No convincing relationship between the number of protein-protein interactions and evolutionary rates was detected, either for comparisons of orthologs from two completely sequenced <it>H. pylori </it>strains or for comparisons of <it>H. pylori </it>and <it>Campylobacter jejuni </it>orthologs, even when the proteins were classified into bins by the number of interactions.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>The currently available comparative-genomic data do not support the hypothesis that the evolutionary rates of the majority of proteins substantially depend on the number of protein-protein interactions they are involved in. However, a small fraction of yeast proteins with the largest number of interactions (the hubs of the interaction network) tend to evolve slower than the bulk of the proteins.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>Rates of protein evolution vary greatly and may be influenced by a variety of factors. Recently, it has been demonstrated that the magnitude of the fitness effects associated with deleterious mutations in protein-coding genes (i.e. proteins' dispensability) correlates with rates of protein evolution <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr></abbrgrp>. Essential proteins or those that are less dispensable to an organism tend to evolve slower than those that are more dispensable. It has also been suggested that proteins' evolutionary rates are determined by the proportion of amino acids that are critical to their function <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>. According to this intuitively plausible notion, proteins with a greater fraction of amino acid residues that play an essential role in the protein's function are predicted to evolve slower than those with a smaller fraction of such crucial residues. Consistent with this prediction, a negative correlation has been reported between protein evolutionary rates, which were determined from evolutionary distances between orthologous proteins from the yeast <it>Saccharomyces cerevisiae </it>and the nematode <it>Caenorhabditis elegans</it>, and the number of protein-protein interactions (i.e., physical interactions determined, primarily, using the yeast two-hybrid system) proteins are involved in <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>. Yeast proteins that have a large number of interacting partners were found to have evolved slower, on average, than those with fewer interacting partners, and this was presumed to be due to the fact that proteins with more interacting partners have a greater fraction of residues directly involved in function. 
However, these same data indicate that less than 6% of the variance in evolutionary rates is explained by the variance in the number of protein-protein interactions, suggesting that the influence of the number of interacting partners on protein evolutionary rates might not be substantial. We sought to further investigate this phenomenon by examining the relationship between the number of protein-protein interacting partners and protein evolutionary rates for the yeasts <it>S. cerevisiae </it>and <it>Schizosaccharomyces pombe </it>as well as for the proteobacteria <it>Helicobacter pylori </it>and <it>Campylobacter jejuni</it>.</p>
      </sec>
      <sec>
         <st>
            <p>Results and Discussion</p>
         </st>
         <sec>
            <st>
               <p>Evolutionary rates and protein-protein interactions: yeast</p>
            </st>
            <p>A total of 1,879 pairs of orthologous proteins, one from <it>S. cerevisiae </it>and one from <it>S. pombe</it>, were identified (see Methods), and for 1,004 of these, there were data on protein-protein interactions of the <it>S. cerevisiae </it>member in the MIPS database <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>. For these 1,004 orthologous pairs, the number of protein-protein interactions detected for the <it>S. cerevisiae </it>protein was plotted against the calculated substitution rates between orthologs (Figure <figr fid="F1">1a</figr>). As with a previous survey that compared conserved <it>S. cerevisiae </it>and <it>C. elegans </it>orthologs <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>, there is a negative correlation between the number of protein-protein interactions and the evolutionary rates. However, although this correlation is statistically significant (Table <tblr tid="T1">1</tblr>), the slope of the linear trend line (y = -0.012) fit to the data by least squares regression as well as the small r<sup>2 </sup>value (r<sup>2 </sup>= 0.0065) suggest that the influence of the number of interacting partners on rates of evolution is minor at best. Specifically, the r<sup>2 </sup>value indicates that less than 1% of the variation in substitution rates between orthologous proteins is explained by the variation in the number of protein-protein interactions. Furthermore, when only the most conserved (&#8805; 40% sequence identity), and thus most reliably identified, pairs of orthologous proteins were considered, the slope of the linear trend line as well as the r<sup>2 </sup>value decreased and the statistical significance disappeared (Figure <figr fid="F1">1b</figr> and Table <tblr tid="T1">1</tblr>). 
To account for the possibility that linear regression does not adequately reflect the structure of the data and the observed low correlation is due to a non-linear relationship between the number of interactions and evolutionary rate of a protein, we also calculated the rank correlation coefficients for these quantities. Under this approach, no statistically significant correlation was observed for either of the two analyzed data sets (Table <tblr tid="T1">1</tblr>).</p>
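            <p>The two measures compared above, the linear correlation coefficient and the rank correlation coefficient, can be illustrated with a minimal Python sketch. The function names, the random seed and the synthetic data are assumptions made for illustration only; they are not the data or the software used in this study. The last line squares the linear coefficient reported in Table 1 for all orthologs (r = -0.081), which gives roughly 0.0066, in line with the reported r2 of 0.0065 and with the statement that less than 1% of the variance is explained.</p>
            <p>
```python
import math
import random


def pearson_r(xs, ys):
    """Pearson linear correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start != len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 != len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman_r(xs, ys):
    """Rank correlation: Pearson r computed on the average ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Synthetic stand-in data (NOT the data analysed in the article).
rng = random.Random(0)
interactions = [rng.randint(1, 60) for _ in range(1000)]
rates = [rng.expovariate(1.0) for _ in interactions]

r = pearson_r(interactions, rates)
print("r =", round(r, 3), "r^2 =", round(r * r, 4))
print("rank correlation =", round(spearman_r(interactions, rates), 3))

# Linear coefficient reported in Table 1 for all orthologs, squared.
r_squared_reported = (-0.081) ** 2
```
            </p>
            <p>Because the rank correlation depends only on the ordering of the values, it is insensitive to any monotonic non-linear transformation of either variable, which is why it serves as a check on the linear fit.</p>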
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>The relationship between the number of protein-protein interactions for <it>S. cerevisiae </it>proteins and the evolutionary rates between <it>S. cerevisiae </it>and <it>S. pombe </it>orthologs</p>
               </caption>
               <text>
                  <p><b>The relationship between the number of protein-protein interactions for <it>S. cerevisiae </it>proteins and the evolutionary rates between <it>S. cerevisiae </it>and <it>S. pombe </it>orthologs. </b>Shown for each plot is the equation that describes the linear trend line, the r<sup>2 </sup>value that describes the fraction of the variability in the evolutionary rates that is accounted for by the variability in the number of protein-protein interactions and the p value, which is the probability that the correlation between the number of protein-protein interactions and evolutionary rates could be due to chance. (a) All 1,004 observations. (b) 465 observations that correspond to orthologous protein pairs with &#8805; 40% amino acid sequence identity.</p>
               </text>
               <graphic file="1471-2148-3-1-1"/>
            </fig>
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>Correlation between the number of protein-protein interactions and the evolutionary rate</p>
               </caption>
               <tblbdy cols="3">
                  <r>
                     <c ca="left">
                        <p>Data set</p>
                     </c>
                     <c ca="left">
                        <p>Linear correlation coefficient (r)/ P-value</p>
                     </c>
                     <c ca="left">
                        <p>Rank correlation coefficient (R)/P-value</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="3">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p><it>S. cerevisiae </it>&#8211; <it>S. pombe </it>(all orthologs, N = 1004)</p>
                     </c>
                     <c ca="left">
                        <p>-0.081/0.009</p>
                     </c>
                     <c ca="left">
                        <p>-0.029/0.352</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p><it>S. cerevisiae </it>&#8211; <it>S. pombe </it>(only orthologs with &#8805; 40% identity, N = 465)</p>
                     </c>
                     <c ca="left">
                        <p>-0.018/0.697</p>
                     </c>
                     <c ca="left">
                        <p>0.074/0.111</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p><it>H. pylori </it>J99 &#8211; <it>H. pylori </it>26695 (N = 672)</p>
                     </c>
                     <c ca="left">
                        <p>-0.039/0.310</p>
                     </c>
                     <c ca="left">
                        <p>0.020/0.610</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p><it>H. pylori </it>&#8211; <it>C. jejuni </it>(N = 458)</p>
                     </c>
                     <c ca="left">
                        <p>-0.013/0.787</p>
                     </c>
                     <c ca="left">
                        <p>0.015/0.747</p>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
            <p>It is tempting to speculate that the difference between the results obtained here and those reported previously <abbrgrp><abbr bid="B4">4</abbr></abbrgrp> can be attributed to the difference in the evolutionary relationships between the pairs of species compared in the two studies. The species compared here, <it>S. cerevisiae </it>and <it>S. pombe</it>, are much more closely related than <it>S. cerevisiae </it>and <it>C. elegans</it>, and orthologous proteins are likely to be more reliably inferred between the closely related genomes. However, we also performed comparisons for pairs of orthologous proteins identified between the more distantly related <it>S. cerevisiae </it>and <it>C. elegans </it><abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and no significant relationship between evolutionary rates and protein-protein interactions was observed (data not shown).</p>
         </sec>
         <sec>
            <st>
               <p>Long-term evolutionary conservation and protein-protein interactions: yeast</p>
            </st>
            <p>To examine the relationship between protein-protein interactions and evolutionary conservation of proteins over longer periods of time, the numbers of interactions for <it>S. cerevisiae </it>proteins were assessed against the taxonomic distribution of their homologs, which were detected using BLAST searches of the GenBank non-redundant protein database with expect value &#8804; 10<sup>-3</sup>. Five distinct levels of taxonomic distribution categories, each including taxa that are successively more distant from <it>S. cerevisiae</it>, were considered: 1 &#8211; hits only to ascomycetes, 2 &#8211; hits to non-ascomycete fungi, 3 &#8211; hits to metazoa and plants, 4 &#8211; hits to non-crown-group eukaryotes, 5 &#8211; hits to archaea and/or bacteria. The broader the taxonomic distribution of homologs of an <it>S. cerevisiae </it>protein, the more evolutionarily conserved it is considered to be. Each <it>S. cerevisiae </it>protein was assigned a taxonomic distribution category, and this value was compared to the number of protein-protein interactions reported for the given protein. Correlation between these two features of <it>S. cerevisiae </it>proteins was not statistically significant (r<sup>2 </sup>= 0.007, p = 0.39). Thus, as with the comparison between evolutionary rates and the number of interactions, no substantial relationship between long-term evolutionary conservation of <it>S. cerevisiae </it>proteins and the number of interactions was found.</p>
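            <p>One possible reading of the category assignment described above is sketched below in Python. The mapping keys, the function name and, in particular, the rule that a protein receives the most distant level at which a homolog was detected are assumptions made for illustration; the exact assignment rule is not spelled out in the text.</p>
            <p>
```python
# Category numbers follow the five levels listed in the text.
CATEGORY_BY_GROUP = {
    "ascomycetes": 1,
    "non-ascomycete fungi": 2,
    "metazoa and plants": 3,
    "non-crown-group eukaryotes": 4,
    "archaea and/or bacteria": 5,
}


def taxonomic_category(hit_groups):
    """Return the most distant category among the groups with BLAST hits.

    Assumption: a protein is assigned the highest-numbered level at which
    a homolog was detected; this rule is hypothetical.
    """
    levels = [CATEGORY_BY_GROUP[group] for group in hit_groups]
    if not levels:
        return None  # no homologs passed the expect-value cut-off
    return max(levels)


# Hypothetical protein with hits in ascomycetes and in metazoa or plants.
example_category = taxonomic_category(["ascomycetes", "metazoa and plants"])
print(example_category)
```
            </p>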
         </sec>
         <sec>
            <st>
               <p>Evolutionary rates and protein-protein interactions: bacteria</p>
            </st>
            <p>High-throughput analysis of protein-protein interactions has also been conducted <abbrgrp><abbr bid="B7">7</abbr></abbrgrp> on the proteobacterium <it>H. pylori </it>(the causative agent of gastric ulcers), for which complete genome sequences of two strains are available <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr></abbrgrp>. Thus it is possible to assess the effect of protein-protein interactions on the rates of evolution over much shorter periods of time (within species) compared to the analysis of the yeast proteins described above. Towards this end, orthologs between the two completely sequenced <it>H. pylori </it>strains were identified and the substitution rates between pairs of orthologous proteins were calculated (see Methods). The number of protein-protein interactions was plotted against the amino acid substitution rates and no significant relationship between the two was detected (Figure <figr fid="F2">2a</figr> and Table <tblr tid="T1">1</tblr>). The same conclusion was reached when the rank correlation coefficient was determined (Table <tblr tid="T1">1</tblr>). In this case, the lack of correlation between evolutionary rates and the number of interacting partners might simply be due to the small amount of evolutionary diversification that has occurred since the two <it>H. pylori </it>strains separated from their common ancestor. To evaluate this possibility, orthologous protein pairs were identified between <it>H. pylori </it>and a more distantly related bacterium, <it>C. jejuni </it><abbrgrp><abbr bid="B10">10</abbr></abbrgrp>. These two species are close enough (both belong to the epsilon subdivision of proteobacteria) to ensure accurate identification of orthologs, but distant enough for substantial sequence divergence to have accumulated between orthologs. 
Nevertheless, comparison between these two bacteria showed no discernible correlation between the number of protein-protein interactions and the rates of substitution between orthologs, measured either directly or using the rank correlation approach (Figure <figr fid="F2">2b</figr> and Table <tblr tid="T1">1</tblr>).</p>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>The relationship between the number of protein-protein interactions for <it>H. pylori </it>and the evolutionary rates between (a) <it>H. pylori </it>strain 26695 and <it>H. pylori </it>strain J99 orthologs and (b) <it>H. pylori </it>strain 26695 and <it>C. jejuni </it>orthologs</p>
               </caption>
               <text>
                  <p><b>The relationship between the number of protein-protein interactions for <it>H. pylori </it>and the evolutionary rates between (a) <it>H. pylori </it>strain 26695 and <it>H. pylori </it>strain J99 orthologs and (b) <it>H. pylori </it>strain 26695 and <it>C. jejuni </it>orthologs. </b>The values shown in each plot are the same as in Figure <figr fid="F1">1</figr>.</p>
               </text>
               <graphic file="1471-2148-3-1-2"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>Yeast proteins with the greatest number of interactions appear to evolve slowly</p>
            </st>
            <p>The observations described above seem to indicate that the number of interaction partners a given protein has does not make an important contribution to the evolutionary rate. One could speculate, however, that whatever minor correlation is seen (Fig. <figr fid="F1">1a</figr>, <figr fid="F2">2a</figr>) is not spread evenly, as a minuscule difference in the evolutionary rates, among all proteins, but rather reflects a substantial slowdown of evolution among a small fraction of proteins that have the greatest number of interactions. To test this hypothesis, we grouped proteins from <it>S. cerevisiae </it>and <it>H. pylori </it>into separate bins, with each bin containing proteins whose number of interactions fell within a given range. Comparison of the evolutionary rates for proteins in different bins showed that yeast proteins in the bins with the greatest number of interactions, on average, evolved slower than the bulk of the proteins (Fig. <figr fid="F3">3a</figr>). The difference was less than twofold even for the top bin, but was statistically significant for each of the top three bins or their combination (Table <tblr tid="T2">2</tblr>). The proteins with a large number of interactions placed in the top bins comprise only 6.5% of the yeast proteins. In contrast, for the bulk of the proteins, which have a small to moderate number of interactions, there did not seem to be any dependence at all between the number of interactions and the evolutionary rates (Fig. <figr fid="F3">3a</figr>). <it>H. pylori </it>proteins with the greatest number of interactions also appear to have evolved slower on average between strains than the majority of the proteins. However, the difference was not significant and this effect was not seen in the comparison of <it>H. pylori </it>and <it>C. jejuni </it>orthologs (Table <tblr tid="T2">2</tblr> and Fig. <figr fid="F3">3b,3c</figr>).</p>
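            <p>The bin comparison can be sketched in Python as follows, assuming a two-sample Student's t statistic with pooled variance. The function names and the toy values are hypothetical and serve only to show the mechanics; the threshold of 41 interactions mirrors the first yeast comparison (41 to 60 versus 1 to 40). Obtaining a P-value additionally requires the t distribution with the returned degrees of freedom, which is left to a statistics library and is not shown here.</p>
            <p>
```python
import math
import statistics


def split_by_threshold(pairs, threshold):
    """Split (interaction_count, rate) pairs into top-bin and remaining rates."""
    top, rest = [], []
    for count, rate in pairs:
        (top if count >= threshold else rest).append(rate)
    return top, rest


def student_t(sample_a, sample_b):
    """Two-sample Student's t statistic (pooled variance) and degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    dof = n_a + n_b - 2
    pooled = ((n_a - 1) * var_a + (n_b - 1) * var_b) / dof
    std_err = math.sqrt(pooled * (1 / n_a + 1 / n_b))
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / std_err
    return t, dof


# Made-up (interaction_count, rate) pairs, for illustration only.
toy_pairs = [(45, 0.4), (52, 0.5), (41, 0.6), (3, 1.0), (7, 1.2), (12, 0.9), (1, 1.1)]
top_rates, rest_rates = split_by_threshold(toy_pairs, 41)
t_value, dof = student_t(top_rates, rest_rates)
print("t =", round(t_value, 2), "with", dof, "degrees of freedom")
```
            </p>
            <p>A negative t value in this sketch means that the top bin has the lower mean rate, that is, its proteins evolve more slowly on average.</p>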
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>Mean evolutionary rates for bins of proteins with different number of interactions</p>
               </caption>
               <text>
                  <p><b>Mean evolutionary rates for bins of proteins with different number of interactions. </b>Shown for each graph is the range of the number of protein-protein interactions for each bin (x-axis) and the mean evolutionary rate (substitutions per site) for each bin (y-axis). (a) <it>S. cerevisiae </it>and <it>S. pombe </it>orthologs. (b) <it>H. pylori </it>strain 26695 and <it>H. pylori </it>strain J99 orthologs. (c) <it>H. pylori </it>strain 26695 and <it>C. jejuni </it>orthologs.</p>
               </text>
               <graphic file="1471-2148-3-1-3"/>
            </fig>
            <tbl id="T2">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p>Statistical significance of the differences in evolutionary rates between groups of proteins with different numbers of interactions.</p>
               </caption>
               <tblbdy cols="2">
                  <r>
                     <c ca="left">
                        <p>
                           <it>Bin (# interactions) comparisons<sup>a</sup></it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>P<sup><it>b</it></sup></p>
                     </c>
                  </r>
                  <r>
                     <c cspan="2">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c cspan="2" ca="center">
                        <p><it>S. cerevisiae </it>&#8211; <it>S. pombe</it></p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>41 &#8211; 60 vs. 1 &#8211; 40</p>
                     </c>
                     <c ca="center">
                        <p>8.3 &#215; 10<sup>-4</sup></p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>31 &#8211; 60 vs. 1 &#8211; 30</p>
                     </c>
                     <c ca="center">
                        <p>2.4 &#215; 10<sup>-2</sup></p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>21 &#8211; 60 vs. 1 &#8211; 20</p>
                     </c>
                     <c ca="center">
                        <p>1.7 &#215; 10<sup>-4</sup></p>
                     </c>
                  </r>
                  <r>
                     <c cspan="2" ca="center">
                        <p><it>H. pylori 26695 </it>&#8211; <it>H. pylori J99</it></p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>21 &#8211; 55 vs. 1 &#8211; 20</p>
                     </c>
                     <c ca="center">
                        <p>1.5 &#215; 10<sup>-1</sup></p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>15 &#8211; 55 vs. 1 &#8211; 14</p>
                     </c>
                     <c ca="center">
                        <p>1.8 &#215; 10<sup>-1</sup></p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>11 &#8211; 55 vs. 1 &#8211; 10</p>
                     </c>
                     <c ca="center">
                        <p>3.2 &#215; 10<sup>-1</sup></p>
                     </c>
                  </r>
                  <r>
                     <c cspan="2" ca="center">
                        <p><it>H. pylori 26695 </it>&#8211; <it>C. jejuni</it></p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>21 &#8211; 47 vs. 1 &#8211; 20</p>
                     </c>
                     <c ca="center">
                        <p>9.8 &#215; 10<sup>-1</sup></p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>11 &#8211; 47 vs. 1 &#8211; 10</p>
                     </c>
                     <c ca="center">
                        <p>5.1 &#215; 10<sup>-1</sup></p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p><sup>a </sup>Orthologous pairs of proteins were placed into bins based on the number of protein-protein interactions (Figure <figr fid="F3">3</figr>). <sup>b </sup><it>P</it>-value for the Student's t-test comparing the mean evolutionary rates between orthologs for bins with distinct ranges in the number of protein-protein interactions.</p>
               </tblfn>
            </tbl>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Discussion and conclusions</p>
         </st>
         <p>The hypothesis that a protein's rate of evolution is determined by the fraction of residues that are critical to its function, and this, in turn, is likely to be proportional to the number of interactions a protein is involved in, seems to make perfectly good sense. Indeed, a recent report is consistent with this idea in suggesting that the number of protein-protein interactions significantly affects rates of evolution <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>. However, upon investigation of this relationship at multiple levels of evolutionary relatedness, we found that there was only a slight correlation, at best, between evolutionary rates and the number of protein-protein interactions. In fact, examination of the actual data presented in support of the previous claim of a connection between the number of interactions and evolutionary rates <abbrgrp><abbr bid="B4">4</abbr></abbrgrp> also shows a weak correlation, albeit greater than the one observed in this study. Thus, differences in the number of interaction partners seem to explain, at best, only a small part of the great variation of the evolutionary rates of proteins encoded in each genome <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>.</p>
         <p>Why does the number of interaction partners apparently have only a slight effect on the evolutionary rate? The first and most obvious possibility to consider would be that the low quality of protein-protein interaction data might obscure the signal. Indeed, a recent comparison of protein-protein interaction data sets from high-throughput studies suggested that more than half of all interactions determined by large-scale experiments are likely to be false positives <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>. However, at least for the yeast data, we relied on manually curated protein-protein interaction data from the MIPS database, which are expected to have a substantially lower error rate. Second, one could speculate that, even if the majority of the analyzed interactions actually do occur, they are selectively (nearly) neutral; the number of such real but functionally irrelevant interactions would not affect the rate of evolution. Third, the possibility exists that, even if many of the observed interactions are functionally important and, by inference, the respective binding sites are subject to purifying selection, the binding sites for different partners tend to overlap such that the number of amino acid residues in these sites increases only slowly with the increase in the numbers of interactions.</p>
         <p>The latter two possibilities are not incompatible with each other or with the other aspect of the observations reported here. We found that the small fraction of yeast proteins that have the greatest number of interaction partners do, on average, evolve slower than the bulk of the proteins, which are involved in a moderate or small number of interactions. This effect was less pronounced, if observed at all, for <it>H. pylori</it>, but it should be noted that the top bins of the <it>H. pylori </it>interaction data included proteins with fewer interactions than the respective bins in the yeast data (compare Fig. <figr fid="F3">3b,3c</figr> and <figr fid="F3">3a</figr>). Protein-protein interactions form scale-free networks, which show the characteristic power-law distribution of the node degrees; simply put, there is a small number of highly connected proteins (hubs), whereas the majority have a small number of partners (the most abundant class are proteins that are involved in just one interaction) <abbrgrp><abbr bid="B13">13</abbr><abbr bid="B14">14</abbr></abbrgrp>. Scale-free networks are highly tolerant to error (elimination of nodes at random) but are vulnerable to attack, i.e. elimination of the hubs <abbrgrp><abbr bid="B15">15</abbr></abbrgrp> and, indeed, it has been found that the most highly connected proteins in yeast interaction networks tend to be essential <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>. This might explain the present findings, namely that a small number of yeast protein-protein interaction hubs evolve slowly due to strong purifying selection, whereas, for the great majority of the proteins, there is no discernible connection between the number of interactions and evolutionary rates.</p>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <sec>
            <st>
               <p>Comparison of evolutionary rates and protein-protein interactions</p>
            </st>
            <p>Sets of protein sequences encoded by the complete genome sequences of the yeasts <it>S. cerevisiae </it><abbrgrp><abbr bid="B16">16</abbr></abbrgrp> and <it>S. pombe </it><abbrgrp><abbr bid="B17">17</abbr></abbrgrp>, the nematode <it>C. elegans </it><abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and the proteobacteria <it>H. pylori </it>strain 26695 <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>, <it>H. pylori </it>strain J99 <abbrgrp><abbr bid="B8">8</abbr></abbrgrp> and <it>C. jejuni </it><abbrgrp><abbr bid="B10">10</abbr></abbrgrp> were downloaded from the National Center for Biotechnology Information's GenBank FTP site <url>ftp://ftp.ncbi.nlm.nih.gov/genomes/</url>. Protein sets (proteomes) from the following pairs of complete genome sequences were compared in order to identify orthologous sequences: <it>S. cerevisiae </it>&#8211; <it>S. pombe</it>, <it>S. cerevisiae </it>&#8211; <it>C. elegans</it>, <it>H. pylori </it>strain 26695 &#8211; <it>H. pylori </it>strain J99, <it>H. pylori </it>strain 26695 &#8211; <it>C. jejuni</it>. Pairs of proteomes were compared using the BLASTP program <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>, and the results were post-processed using the SEALS package <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>. For each proteome, individual proteins were used as queries in BLASTP searches against the entire proteome of the other analyzed species (or strain). Symmetrical best hits in these BLAST searches (expectation value &#8804; 10<sup>-3</sup>) were taken to be orthologs <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>. Pairs of orthologous proteins were aligned using the ClustalW program <abbrgrp><abbr bid="B21">21</abbr></abbrgrp> and their substitution (evolutionary) rates were calculated using the gamma distance correction <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>. The data on protein-protein interactions for the <it>S. cerevisiae </it>proteome were obtained from the Munich Information Center for Protein Sequences (MIPS) <abbrgrp><abbr bid="B5">5</abbr></abbrgrp> Comprehensive Yeast Genome Database <url>http://mips.gsf.de/proj/yeast/CYGD/db/index.html</url>. This database includes a manually curated catalogue of binary protein-protein interactions that is considered to be a reliable reference set <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>. Protein-protein interactions for the <it>H. pylori </it>proteome <abbrgrp><abbr bid="B7">7</abbr></abbrgrp> were taken from the PIMRider functional proteomics software platform <url>http://pim.hybrigenics.fr/pimrider/pimriderlobby/PimRiderLobby.jsp</url>.</p>
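            <p>For illustration only, the identification of symmetrical best hits and the gamma distance calculation described above can be sketched in Python. This sketch is not the SEALS-based pipeline used in this study: the hit tuple layout (query, subject, expectation value, bit score), the use of the bit score to rank hits, the function names and the gamma shape parameter alpha = 2.0 are all assumptions made for the example.</p>

```python
import math

# Hypothetical BLASTP hit record: (query_id, subject_id, e_value, bit_score).
# The tuple layout and ranking by bit score are assumptions for this sketch.
E_VALUE_CUTOFF = 1e-3  # expectation value threshold stated in the text


def best_hits(hits, cutoff=E_VALUE_CUTOFF):
    """Map each query to its highest-scoring subject among hits passing the cutoff."""
    best = {}
    for query, subject, e_value, bit_score in hits:
        if e_value > cutoff:
            continue  # discard hits weaker than the threshold
        current = best.get(query)
        if current is None or bit_score > current[1]:
            best[query] = (subject, bit_score)
    return {query: subject for query, (subject, _) in best.items()}


def symmetrical_best_hits(hits_a_vs_b, hits_b_vs_a, cutoff=E_VALUE_CUTOFF):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    a_to_b = best_hits(hits_a_vs_b, cutoff)
    b_to_a = best_hits(hits_b_vs_a, cutoff)
    return sorted((a, b) for a, b in a_to_b.items() if b_to_a.get(b) == a)


def gamma_distance(seq_a, seq_b, alpha=2.0):
    """Gamma-corrected distance d = alpha * ((1 - p) ** (-1 / alpha) - 1).

    p is the fraction of differing residues over ungapped alignment columns.
    alpha (the gamma shape parameter) is an assumed value, not taken from the text.
    """
    columns = [(x, y) for x, y in zip(seq_a, seq_b) if x != "-" and y != "-"]
    if not columns:
        raise ValueError("no ungapped alignment columns")
    p = sum(1 for x, y in columns if x != y) / len(columns)
    if p >= 1.0:
        return math.inf  # fully divergent pair: the correction is undefined
    return alpha * ((1.0 - p) ** (-1.0 / alpha) - 1.0)


# Toy data (invented for illustration, not results from this study).
hits_ab = [("a1", "b1", 1e-50, 200.0), ("a1", "b2", 1e-10, 80.0), ("a2", "b2", 1e-2, 30.0)]
hits_ba = [("b1", "a1", 1e-48, 198.0), ("b2", "a1", 1e-9, 75.0)]
orthologs = symmetrical_best_hits(hits_ab, hits_ba)
distance = gamma_distance("MKTAYIAK", "MKSAYLAK")
```

            <p>In this toy input, the hit of a2 fails the expectation value threshold and b2 is not the best hit of a1, so only the pair (a1, b1) is retained. In practice, the aligned sequences passed to the distance function would come from a pairwise alignment program such as ClustalW.</p>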
         </sec>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>IKJ performed the comparisons between evolutionary rates and the number of protein-protein interactions and drafted the manuscript. YIW determined the evolutionary conservation levels for <it>S. cerevisiae </it>proteins and contributed to the statistical analysis. EVK helped to conceive the study, participated in its design and coordination, and revised the manuscript. All authors read and approved the final manuscript.</p>
      </sec>
   </bdy>
   <bm>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Protein dispensability and rate of evolution.</p>
            </title>
            <aug>
               <au>
                  <snm>Hirsh</snm>
                  <fnm>AE</fnm>
               </au>
               <au>
                  <snm>Fraser</snm>
                  <fnm>HB</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2001</pubdate>
            <volume>411</volume>
            <fpage>1046</fpage>
            <lpage>1049</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/35082561</pubid>
                  <pubid idtype="pmpid" link="fulltext">11429604</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Essential genes are more evolutionarily conserved than are nonessential genes in bacteria.</p>
            </title>
            <aug>
               <au>
                  <snm>Jordan</snm>
                  <fnm>IK</fnm>
               </au>
               <au>
                  <snm>Rogozin</snm>
                  <fnm>IB</fnm>
               </au>
               <au>
                  <snm>Wolf</snm>
                  <fnm>YI</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2002</pubdate>
            <volume>12</volume>
            <fpage>962</fpage>
            <lpage>968</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gr.87702</pubid>
                  <pubid idtype="pmpid" link="fulltext">12045149</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>What determines the rate of sequence evolution?</p>
            </title>
            <aug>
               <au>
                  <snm>Brookfield</snm>
                  <fnm>JF</fnm>
               </au>
            </aug>
            <source>Curr Biol</source>
            <pubdate>2000</pubdate>
            <volume>10</volume>
            <fpage>R410</fpage>
            <lpage>R411</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0960-9822(00)00506-6</pubid>
                  <pubid idtype="pmpid" link="fulltext">10837241</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Evolutionary rate in the protein interaction network.</p>
            </title>
            <aug>
               <au>
                  <snm>Fraser</snm>
                  <fnm>HB</fnm>
               </au>
               <au>
                  <snm>Hirsh</snm>
                  <fnm>AE</fnm>
               </au>
               <au>
                  <snm>Steinmetz</snm>
                  <fnm>LM</fnm>
               </au>
               <au>
                  <snm>Scharfe</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Feldman</snm>
                  <fnm>MW</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2002</pubdate>
            <volume>296</volume>
            <fpage>750</fpage>
            <lpage>752</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1068696</pubid>
                  <pubid idtype="pmpid" link="fulltext">11976460</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>MIPS: a database for genomes and protein sequences.</p>
            </title>
            <aug>
               <au>
                  <snm>Mewes</snm>
                  <fnm>HW</fnm>
               </au>
               <au>
                  <snm>Frishman</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Guldener</snm>
                  <fnm>U</fnm>
               </au>
               <au>
                  <snm>Mannhaupt</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Mayer</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Mokrejs</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Morgenstern</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Munsterkotter</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Rudd</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Weil</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2002</pubdate>
            <volume>30</volume>
            <fpage>31</fpage>
            <lpage>34</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">99165</pubid>
                  <pubid idtype="pmpid" link="fulltext">11752246</pubid>
                  <pubid idtype="doi">10.1093/nar/30.1.31</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Genome sequence of the nematode <it>C. elegans</it>: a platform for investigating biology.</p>
            </title>
            <aug>
               <au>
                  <cnm>The C. elegans Sequencing Consortium</cnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1998</pubdate>
            <volume>282</volume>
            <fpage>2012</fpage>
            <lpage>2018</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.282.5396.2012</pubid>
                  <pubid idtype="pmpid" link="fulltext">9851916</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>The protein-protein interaction map of <it>Helicobacter pylori</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Rain</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Selig</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>De Reuse</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Battaglia</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Reverdy</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Simon</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Lenzen</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Petel</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Wojcik</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Schachter</snm>
                  <fnm>V</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nature</source>
            <pubdate>2001</pubdate>
            <volume>409</volume>
            <fpage>211</fpage>
            <lpage>215</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/35051615</pubid>
                  <pubid idtype="pmpid" link="fulltext">11196647</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Genomic-sequence comparison of two unrelated isolates of the human gastric pathogen <it>Helicobacter pylori</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Alm</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Ling</snm>
                  <fnm>LS</fnm>
               </au>
               <au>
                  <snm>Moir</snm>
                  <fnm>DT</fnm>
               </au>
               <au>
                  <snm>King</snm>
                  <fnm>BL</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>ED</fnm>
               </au>
               <au>
                  <snm>Doig</snm>
                  <fnm>PC</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>DR</fnm>
               </au>
               <au>
                  <snm>Noonan</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Guild</snm>
                  <fnm>BC</fnm>
               </au>
               <au>
                  <snm>deJonge</snm>
                  <fnm>BL</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nature</source>
            <pubdate>1999</pubdate>
            <volume>397</volume>
            <fpage>176</fpage>
            <lpage>180</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/16495</pubid>
                  <pubid idtype="pmpid" link="fulltext">9923682</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>The complete genome sequence of the gastric pathogen <it>Helicobacter pylori</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Tomb</snm>
                  <fnm>JF</fnm>
               </au>
               <au>
                  <snm>White</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Kerlavage</snm>
                  <fnm>AR</fnm>
               </au>
               <au>
                  <snm>Clayton</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Sutton</snm>
                  <fnm>GG</fnm>
               </au>
               <au>
                  <snm>Fleischmann</snm>
                  <fnm>RD</fnm>
               </au>
               <au>
                  <snm>Ketchum</snm>
                  <fnm>KA</fnm>
               </au>
               <au>
                  <snm>Klenk</snm>
                  <fnm>HP</fnm>
               </au>
               <au>
                  <snm>Gill</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Dougherty</snm>
                  <fnm>BA</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nature</source>
            <pubdate>1997</pubdate>
            <volume>388</volume>
            <fpage>539</fpage>
            <lpage>547</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/41483</pubid>
                  <pubid idtype="pmpid" link="fulltext">9252185</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>The genome sequence of the food-borne pathogen <it>Campylobacter jejuni </it>reveals hypervariable sequences.</p>
            </title>
            <aug>
               <au>
                  <snm>Parkhill</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Wren</snm>
                  <fnm>BW</fnm>
               </au>
               <au>
                  <snm>Mungall</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Ketley</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Churcher</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Basham</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Chillingworth</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Davies</snm>
                  <fnm>RM</fnm>
               </au>
               <au>
                  <snm>Feltwell</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Holroyd</snm>
                  <fnm>S</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nature</source>
            <pubdate>2000</pubdate>
            <volume>403</volume>
            <fpage>665</fpage>
            <lpage>668</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/35001088</pubid>
                  <pubid idtype="pmpid" link="fulltext">10688204</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>From complete genomes to measures of substitution rate variability within and between proteins.</p>
            </title>
            <aug>
               <au>
                  <snm>Grishin</snm>
                  <fnm>NV</fnm>
               </au>
               <au>
                  <snm>Wolf</snm>
                  <fnm>YI</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2000</pubdate>
            <volume>10</volume>
            <fpage>991</fpage>
            <lpage>1000</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gr.10.7.991</pubid>
                  <pubid idtype="pmpid" link="fulltext">10899148</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Comparative assessment of large-scale data sets of protein-protein interactions.</p>
            </title>
            <aug>
               <au>
                  <snm>von Mering</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Krause</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Snel</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Cornell</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Oliver</snm>
                  <fnm>SG</fnm>
               </au>
               <au>
                  <snm>Fields</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Bork</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2002</pubdate>
            <volume>417</volume>
            <fpage>399</fpage>
            <lpage>403</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature750</pubid>
                  <pubid idtype="pmpid" link="fulltext">12000970</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Lethality and centrality in protein networks.</p>
            </title>
            <aug>
               <au>
                  <snm>Jeong</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Mason</snm>
                  <fnm>SP</fnm>
               </au>
               <au>
                  <snm>Barabasi</snm>
                  <fnm>AL</fnm>
               </au>
               <au>
                  <snm>Oltvai</snm>
                  <fnm>ZN</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2001</pubdate>
            <volume>411</volume>
            <fpage>41</fpage>
            <lpage>42</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/35075138</pubid>
                  <pubid idtype="pmpid" link="fulltext">11333967</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Generating protein interaction maps from incomplete data: application to fold assignment.</p>
            </title>
            <aug>
               <au>
                  <snm>Lappe</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Park</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Niggemann</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Holm</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2001</pubdate>
            <volume>17</volume>
            <issue>Suppl 1</issue>
            <fpage>S149</fpage>
            <lpage>S156</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11473004</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Error and attack tolerance of complex networks.</p>
            </title>
            <aug>
               <au>
                  <snm>Albert</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Jeong</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Barabasi</snm>
                  <fnm>AL</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2000</pubdate>
            <volume>406</volume>
            <fpage>378</fpage>
            <lpage>382</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/35019019</pubid>
                  <pubid idtype="pmpid" link="fulltext">10935628</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Life with 6000 genes.</p>
            </title>
            <aug>
               <au>
                  <snm>Goffeau</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Barrell</snm>
                  <fnm>BG</fnm>
               </au>
               <au>
                  <snm>Bussey</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Davis</snm>
                  <fnm>RW</fnm>
               </au>
               <au>
                  <snm>Dujon</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Feldmann</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Galibert</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Hoheisel</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Jacq</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Johnston</snm>
                  <fnm>M</fnm>
               </au>
               <etal/>
            </aug>
            <source>Science</source>
            <pubdate>1996</pubdate>
            <volume>274</volume>
            <fpage>546</fpage>
            <lpage>567</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1126/science.274.5287.546</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>The genome sequence of <it>Schizosaccharomyces pombe</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Wood</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Gwilliam</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Rajandream</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Lyne</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Lyne</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Stewart</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Sgouros</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Peat</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Hayles</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Baker</snm>
                  <fnm>S</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nature</source>
            <pubdate>2002</pubdate>
            <volume>415</volume>
            <fpage>871</fpage>
            <lpage>880</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature724</pubid>
                  <pubid idtype="pmpid" link="fulltext">11859360</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.</p>
            </title>
            <aug>
               <au>
                  <snm>Altschul</snm>
                  <fnm>SF</fnm>
               </au>
               <au>
                  <snm>Madden</snm>
                  <fnm>TL</fnm>
               </au>
               <au>
                  <snm>Schaffer</snm>
                  <fnm>AA</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Lipman</snm>
                  <fnm>DJ</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1997</pubdate>
            <volume>25</volume>
            <fpage>3389</fpage>
            <lpage>3402</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/nar/25.17.3389</pubid>
                  <pubid idtype="pmpid" link="fulltext">9254694</pubid>
                  <pubid idtype="pmcid">146917</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>A genomic perspective on protein families.</p>
            </title>
            <aug>
               <au>
                  <snm>Tatusov</snm>
                  <fnm>RL</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
               <au>
                  <snm>Lipman</snm>
                  <fnm>DJ</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1997</pubdate>
            <volume>278</volume>
            <fpage>631</fpage>
            <lpage>637</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.278.5338.631</pubid>
                  <pubid idtype="pmpid" link="fulltext">9381173</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>SEALS: a system for easy analysis of lots of sequences.</p>
            </title>
            <aug>
               <au>
                  <snm>Walker</snm>
                  <fnm>DR</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
            </aug>
            <source>Proc Int Conf Intell Syst Mol Biol</source>
            <pubdate>1997</pubdate>
            <volume>5</volume>
            <fpage>333</fpage>
            <lpage>339</lpage>
            <xrefbib>
               <pubid idtype="pmpid">9322058</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Using CLUSTAL for multiple sequence alignments.</p>
            </title>
            <aug>
               <au>
                  <snm>Higgins</snm>
                  <fnm>DG</fnm>
               </au>
               <au>
                  <snm>Thompson</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Gibson</snm>
                  <fnm>TJ</fnm>
               </au>
            </aug>
            <source>Methods Enzymol</source>
            <pubdate>1996</pubdate>
            <volume>266</volume>
            <fpage>383</fpage>
            <lpage>402</lpage>
            <xrefbib>
               <pubid idtype="pmpid">8743695</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>Estimation of the number of amino acid substitutions per site when the substitution rate varies among sites.</p>
            </title>
            <aug>
               <au>
                  <snm>Ota</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Nei</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>J Mol Evol</source>
            <pubdate>1994</pubdate>
            <volume>38</volume>
            <fpage>642</fpage>
            <lpage>643</lpage>
         </bibl>
      </refgrp>
   </bm>
</art>
