<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2148-3-21</ui>
   <ji>1471-2148</ji>
   <fm>
      <dochead>Research article</dochead>
      <bibl>
         <title>
            <p>Apparent dependence of protein evolutionary rate on number of interactions is linked to biases in protein&#8211;protein interactions data sets</p>
         </title>
         <aug>
            <au id="A1" ca="yes">
               <snm>Bloom</snm>
               <mi>D</mi>
               <fnm>Jesse</fnm>
               <insr iid="I1"/>
               <email>bloom@caltech.edu</email>
            </au>
            <au id="A2">
               <snm>Adami</snm>
               <fnm>Christoph</fnm>
               <insr iid="I2"/>
               <email>adami@caltech.edu</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Department of Chemistry and Digital Life Laboratory, 210-41, California Institute of Technology, Pasadena, CA 91125, USA</p>
            </ins>
            <ins id="I2">
               <p>Digital Life Laboratory and Jet Propulsion Laboratory, 136-93, California Institute of Technology, Pasadena, CA 91125, USA</p>
            </ins>
         </insg>
         <source>BMC Evolutionary Biology</source>
         <issn>1471-2148</issn>
         <pubdate>2003</pubdate>
         <volume>3</volume>
         <issue>1</issue>
         <fpage>21</fpage>
         <url>http://www.biomedcentral.com/1471-2148/3/21</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">14525624</pubid>
               <pubid idtype="doi">10.1186/1471-2148-3-21</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>04</day>
               <month>9</month>
               <year>2003</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>02</day>
               <month>10</month>
               <year>2003</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>02</day>
               <month>10</month>
               <year>2003</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2003</year>
         <collab>Bloom and Adami; licensee BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.</collab>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Several studies have suggested that proteins that interact with more partners evolve more slowly. The strength and validity of this association have been called into question. Here we investigate how biases in high-throughput protein&#8211;protein interaction studies could lead to a spurious correlation.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>We examined the correlation between evolutionary rate and the number of protein&#8211;protein interactions for sets of interactions determined by seven different high-throughput methods in <it>Saccharomyces cerevisiae</it>. Some methods have been shown to be biased towards counting more interactions for abundant proteins, a fact that could be important since abundant proteins are known to evolve more slowly. We show that the apparent tendency for interactive proteins to evolve more slowly varies directly with the bias towards counting more interactions for abundant proteins. Interactions studies with no bias show no correlation between evolutionary rate and the number of interactions, and the one study biased towards counting fewer interactions for abundant proteins actually suggests that interactive proteins evolve <it>more rapidly</it>. In all cases, controlling for protein abundance significantly decreases the observed correlation between interactions and evolutionary rate. Finally, we disprove the hypothesis that small data set size accounts for the failure of some interactions studies to show a correlation between evolutionary rate and the number of interactions.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusions</p>
               </st>
               <p>The only correlation supported by a careful analysis of the data is between evolutionary rate and protein abundance. The reported correlation between evolutionary rate and protein&#8211;protein interactions cannot be separated from the biases of some protein&#8211;protein interactions studies to count more interactions for abundant proteins.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>Different proteins in the same organism evolve at different rates. An understanding of the factors that cause these differences in rates has important ramifications for genetics, molecular evolution, and evolutionary biology. Factors that are thought to influence a protein's evolutionary rate include its abundance <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>, whether its function is encoded in a robust manner <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>, and the amount of recombination that it undergoes <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>. Of these factors, abundance is the strongest correlate of evolutionary rate, and recent work has shown the importance of adequately controlling for abundance when examining correlations between evolutionary rate and other protein properties <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>.</p>
         <p>Another factor that has been suggested to influence a protein's evolutionary rate is its number of interaction partners, with recent studies claiming that interactive proteins evolve more slowly because they have more functionally constrained residues <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr></abbrgrp>. However, this reported association between evolutionary rate and the number of protein&#8211;protein interactions has proven controversial, with studies using different interactions data sets reaching different conclusions <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B6">6</abbr></abbrgrp>.</p>
         <p>The original claim by Fraser <it>et al. </it><abbrgrp><abbr bid="B5">5</abbr></abbrgrp> that a protein's evolutionary rate depends on the number of different proteins it interacts with was based on a negative statistical correlation between evolutionary rate as determined from an alignment of orthologs, and the number of interactions as determined by pooling data from several studies. However, a second study by Jordan <it>et al. </it><abbrgrp><abbr bid="B7">7</abbr></abbrgrp> using different data sets for both protein&#8211;protein interactions and evolutionary rates failed to find a significant correlation between evolutionary rate and the number of interactions. A third study by Fraser <it>et al. </it><abbrgrp><abbr bid="B6">6</abbr></abbrgrp> using a much larger protein&#8211;protein interactions data set again found a correlation, and also showed that the conflicting results were due to differences in the interactions data sets rather than differences in the evolutionary rates. The authors of this last study suggested that Jordan <it>et al. </it>failed to observe a correlation because of an incomplete set of protein&#8211;protein interactions, yet they offered no explanation of why only some data sets should reveal a correlation.</p>
         <p>The biophysical explanation proposed by Fraser <it>et al. </it><abbrgrp><abbr bid="B5">5</abbr></abbrgrp> for the tendency of proteins with more interactions to evolve more slowly was that interactive proteins have more residues involved in protein&#8211;protein interaction surfaces, and are therefore less tolerant of amino acid substitutions. However, an individual residue does not distinguish between contacts with other residues from the same or a different protein, so there is no obvious reason why residues involved in intermolecular contacts should be more evolutionarily constrained than other residues with the same number of intramolecular contacts. Indeed, analysis of oligomeric proteins has shown that interacting residues are not under the strong selection constraints of enzymatic active site residues, but instead actually change more rapidly than typical core residues and only slightly more slowly than the average for the entire sequence <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>. On these grounds, one would expect the number of interaction partners to have at most a slight effect on the overall rate of sequence evolution, and that other factors such as the ratio of core to surface residues should be more important.</p>
         <p>The sensitivity of the correlation between evolutionary rate and the number of interactions to the particular data set used, as well as the absence of a clear biophysical justification for why proteins with more interaction partners should evolve more slowly, prompted us to analyze the data more carefully. We find that the reported connection between evolutionary rate and the number of interactions is linked to the biases of some protein&#8211;protein interactions studies to count more interactions for abundant proteins.</p>
      </sec>
      <sec>
         <st>
            <p>Results and Discussion</p>
         </st>
         <sec>
            <st>
               <p>Analysis of the different interactions data sets</p>
            </st>
            <p>Protein&#8211;protein interactions data for <it>S. cerevisiae </it>are derived from studies using a variety of distinct methods, each with its own strengths and weaknesses (for a comprehensive discussion, see <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>). In particular, several methods have been shown to be biased towards counting more interactions for abundant proteins <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>. Since abundant proteins are known to evolve more slowly <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>, any examination of the relationship between interactions and evolutionary rate should control for biases towards counting more interactions for abundant proteins.</p>
            <p>We compiled <it>S. cerevisiae </it>protein&#8211;protein interactions sets from nine studies using seven different high-throughput methods, taking data from two studies that identified interactions by mass spectrometry <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr></abbrgrp>, two studies that identified interactions with the yeast two-hybrid system <abbrgrp><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr></abbrgrp>, and studies that identified interactions by correlated mRNA expression (synexpression) <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>, identification of conserved gene neighborhoods <abbrgrp><abbr bid="B14">14</abbr><abbr bid="B15">15</abbr><abbr bid="B9">9</abbr></abbrgrp>, cooccurrence of genes in sequenced genomes <abbrgrp><abbr bid="B16">16</abbr><abbr bid="B9">9</abbr></abbrgrp>, identification of gene fusion events <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr><abbr bid="B9">9</abbr></abbrgrp>, and synthetic lethality in knockouts <abbrgrp><abbr bid="B19">19</abbr><abbr bid="B9">9</abbr></abbrgrp>. The mass spectrometry studies <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr></abbrgrp> involved tagging and overexpression of one of the proteins, which may lead to non-native interactions, so for these studies we also compiled data sets that counted the interactions only for the untagged proteins. We also compiled a comprehensive list of all of the interactions from all studies, as well as the interactions found independently by two and three of the studies.</p>
            <p>We also gathered information on the evolutionary rates and abundances of <it>S. cerevisiae </it>proteins. The sequence evolution rates were based on alignments with orthologs from <it>Candida albicans </it>compiled by Fraser <it>et al. </it><abbrgrp><abbr bid="B6">6</abbr></abbrgrp> according to the method of <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>. We used two established proxies for protein abundance: mRNA transcript levels from gene microarrays <abbrgrp><abbr bid="B21">21</abbr><abbr bid="B22">22</abbr></abbrgrp> and codon adaptation indices (CAI) calculated from gene sequences <abbrgrp><abbr bid="B23">23</abbr><abbr bid="B24">24</abbr></abbrgrp>. We used this information to create sets of proteins that participated in at least one interaction and for which evolutionary rate and abundance information were available; the size of the resulting protein set for each interactions study is shown in Table <tblr tid="T1">1</tblr>.</p>
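            <p>As a minimal sketch of how abundance can be controlled for, the partial rank correlation between evolutionary rate (E) and the number of interactions (I) with abundance (A) held fixed can be computed from the three pairwise Kendall correlations. The sketch assumes the standard first-order formula &#964;<sub>EI.A</sub> = (&#964;<sub>EI</sub> - &#964;<sub>EA</sub>&#964;<sub>AI</sub>) / [(1 - &#964;<sub>EA</sub><sup>2</sup>)(1 - &#964;<sub>AI</sub><sup>2</sup>)]<sup>1/2</sup>; the function name and the Python implementation are illustrative assumptions and are not taken from the original analysis.</p>

```python
import math


def partial_kendall_tau(tau_ei, tau_ai, tau_ea):
    """First-order partial rank correlation of E and I with A held fixed.

    Assumes the standard formula for Kendall's partial tau, applied to
    pairwise correlations that have already been computed elsewhere.
    The function name and argument names are hypothetical.
    """
    denom = math.sqrt((1.0 - tau_ea ** 2) * (1.0 - tau_ai ** 2))
    if denom == 0.0:
        # Happens only if abundance is perfectly rank-correlated with E or I.
        raise ValueError("partial correlation is undefined when tau_EA or tau_AI is +1 or -1")
    return (tau_ei - tau_ea * tau_ai) / denom


# Pairwise values reported for the Ho data set in Table 1(A).
tau_ei_a = partial_kendall_tau(tau_ei=-0.18, tau_ai=0.15, tau_ea=-0.41)
print(round(tau_ei_a, 2))
```

            <p>Applied to the rounded pairwise values listed for the Ho data set in Table <tblr tid="T1">1</tblr>(A), this sketch gives about -0.13, in agreement with the tabulated &#964;<sub>EI.A</sub>. It does not reproduce the significance tests, which require the underlying per-protein data.</p>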
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>The correlations among evolutionary rate, the number of interactions, and protein abundance for all studies when abundance is measured by (A) microarray expression level or (B) CAI. <it>N</it><sub>p </sub>and <it>N</it><sub>i </sub>are the number of proteins and interactions for each data set. The Kendall's rank correlations between variables are given by &#964;<sub>EI</sub>, &#964;<sub>AI</sub>, and &#964;<sub>EA</sub>. The Kendall's partial rank correlation between evolutionary rate and the number of interactions when abundance is controlled for is given by &#964;<sub>EI.A</sub>. All correlations have two-tailed significances of <it>P </it>&lt; 10<sup>-3 </sup>unless another <it>P </it>value is given in parentheses.</p>
               </caption>
               <tblbdy cols="7">
                  <r>
                     <c cspan="7" ca="left">
                        <p>(A)</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <b>Study</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>N<sub>p</sub></p>
                     </c>
                     <c ca="center">
                        <p>N<sub>i</sub></p>
                     </c>
                     <c ca="center">
                        <p>&#964;<sub>EI</sub></p>
                     </c>
                     <c ca="center">
                        <p>&#964;<sub>AI</sub></p>
                     </c>
                     <c ca="center">
                        <p>&#964;<sub>EA</sub></p>
                     </c>
                     <c ca="center">
                        <p>&#964;<sub>EI.A</sub></p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Ito <abbrgrp><abbr bid="B12">12</abbr></abbrgrp></p>
                     </c>
                     <c ca="center">
                        <p>505</p>
                     </c>
                     <c ca="center">
                        <p>1007</p>
                     </c>
                     <c ca="center">
                        <p>0.03 (0.30)</p>
                     </c>
                     <c ca="center">
                        <p>-0.03 (0.36)</p>
                     </c>
                     <c ca="center">
                        <p>-0.34</p>
                     </c>
                     <c ca="center">
                        <p>0.02 (0.51)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Uetz <abbrgrp><abbr bid="B13">13</abbr></abbrgrp></p>
                     </c>
                     <c ca="center">
                        <p>607</p>
                     </c>
                     <c ca="center">
                        <p>1183</p>
                     </c>
                     <c ca="center">
                        <p>0.01 (0.63)</p>
                     </c>
                     <c ca="center">
                        <p>-0.01 (0.67)</p>
                     </c>
                     <c ca="center">
                        <p>-0.35</p>
                     </c>
                     <c ca="center">
                        <p>0.01 (0.75)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>2H studies <abbrgrp><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr></abbrgrp></p>
                     </c>
                     <c ca="center">
                        <p>893</p>
                     </c>
                     <c ca="center">
                        <p>2034</p>
                     </c>
                     <c ca="center">
                        <p>0.02 (0.29)</p>
                     </c>
                     <c ca="center">
                        <p>-0.02 (0.40)</p>
                     </c>
                     <c ca="center">
                        <p>-0.36</p>
                     </c>
                     <c ca="center">
                        <p>0.02 (0.48)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Gavin <abbrgrp><abbr bid="B10">10</abbr></abbrgrp></p>
                     </c>
                     <c ca="center">
                        <p>1039</p>
                     </c>
                     <c ca="center">
                        <p>15224</p>
                     </c>
                     <c ca="center">
                        <p>-0.12</p>
                     </c>
                     <c ca="center">
                        <p>0.09</p>
                     </c>
                     <c ca="center">
                        <p>-0.45</p>
                     </c>
                     <c ca="center">
                        <p>-0.09</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Gavin <abbrgrp><abbr bid="B10">10</abbr></abbrgrp> untagged</p>
                     </c>
                     <c ca="center">
                        <p>1018</p>
                     </c>
                     <c ca="center">
                        <p>8568</p>
                     </c>
                     <c ca="center">
                        <p>-0.08</p>
                     </c>
                     <c ca="center">
                        <p>0.09</p>
                     </c>
                     <c ca="center">
                        <p>-0.44</p>
                     </c>
                     <c ca="center">
                        <p>-0.04 (0.04)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Ho <abbrgrp><abbr bid="B11">11</abbr></abbrgrp></p>
                     </c>
                     <c ca="center">
                        <p>1183</p>
                     </c>
                     <c ca="center">
                        <p>5879</p>
                     </c>
                     <c ca="center">
                        <p>-0.18</p>
                     </c>
                     <c ca="center">
                        <p>0.15</p>
                     </c>
                     <c ca="center">
                        <p>-0.41</p>
                     </c>
                     <c ca="center">
                        <p>-0.13</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Ho <abbrgrp><abbr bid="B11">11</abbr></abbrgrp> untagged</p>
                     </c>
                     <c ca="center">
                        <p>991</p>
                     </c>
                     <c ca="center">
                        <p>2990</p>
                     </c>
                     <c ca="center">
                        <p>-0.28</p>
                     </c>
                     <c ca="center">
                        <p>0.30</p>
                     </c>
                     <c ca="center">
                        <p>-0.40</p>
                     </c>
                     <c ca="center">
                        <p>-0.18</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>MS studies <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr></abbrgrp></p>
                     </c>
                     <c ca="center">
                        <p>1698</p>
                     </c>
                     <c ca="center">
                        <p>20708</p>
                     </c>
                     <c ca="center">
                        <p>-0.13</p>
                     </c>
                     <c ca="center">
                        <p>0.12</p>
                     </c>
                     <c ca="center">
                        <p>-0.42</p>
                     </c>
                     <c ca="center">
                        <p>-0.09</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>MS studies <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr></abbrgrp> untagged</p>
                     </c>
                     <c ca="center">
                        <p>1543</p>
                     </c>
                     <c ca="center">
                        <p>11424</p>
                     </c>
                     <c ca="center">
                        <p>-0.14</p>
                     </c>
                     <c ca="center">
                        <p>0.16</p>
                     </c>
                     <c ca="center">
                        <p>-0.42</p>
                     </c>
                     <c ca="center">
                        <p>-0.08</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>synexpression <abbrgrp><abbr bid="B9">9</abbr></abbrgrp></p>
                     </c>
                     <c ca="center">
                        <p>1114</p>
                     </c>
                     <c ca="center">
                        <p>19188</p>
                     </c>
                     <c ca="center">
                        <p>0.09</p>
                     </c>
                     <c ca="center">
                        <p>-0.12</p>
                     </c>
                     <c ca="center">
                        <p>-0.40</p>
                     </c>
                     <c ca="center">
                        <p>0.05 (0.02)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>gene neighborhood <abbrgrp><abbr bid="B9">9</abbr></abbrgrp></p>
                     </c>
                     <c ca="center">
                        <p>765</p>
                     </c>
                     <c ca="center">
                        <p>9882</p>
                     </c>
                     <c ca="center">
                        <p>-0.10</p>
                     </c>
                     <c ca="center">
                        <p>0.15</p>
                     </c>
                     <c ca="center">
                        <p>-0.44</p>
                     </c>
                     <c ca="center">
                        <p>-0.04 (0.15)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>synthetic lethality <abbrgrp><abbr bid="B9">9</abbr></abbrgrp></p>
                     </c>
                     <c ca="center">
                        <p>524</p>
                     </c>
                     <c ca="center">
                        <p>1463</p>
                     </c>
                     <c ca="center">
                        <p>0.02 (0.50)</p>
                     </c>
                     <c ca="center">
                        <p>0.01 (0.71)</p>
                     </c>
                     <c ca="center">
                        <p>-0.43</p>
                     </c>
                     <c ca="center">
                        <p>0.03 (0.40)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>gene cooccurrence <abbrgrp><abbr bid="B9">9</abbr></abbrgrp></p>
                     </c>
                     <c ca="center">
                        <p>298</p>
                     </c>
                     <c ca="center">
                        <p>1718</p>
                     </c>
                     <c ca="center">
                        <p>-0.15</p>
                     </c>
                     <c ca="center">
                        <p>0.07 (0.08)</p>
                     </c>
                     <c ca="center">
                        <p>-0.36</p>
                     </c>
                     <c ca="center">
                        <p>-0.13 (0.002)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>gene fusion <abbrgrp><abbr bid="B9">9</abbr></abbrgrp></p>
                     </c>
                     <c ca="center">
                        <p>222</p>
                     </c>
                     <c ca="center">
                        <p>535</p>
                     </c>
                     <c ca="center">
                        <p>0.04 (0.40)</p>
                     </c>
                     <c ca="center">
                        <p>-0.03 (0.51)</p>
                     </c>
                     <c ca="center">
                        <p>-0.37</p>
                     </c>
                     <c ca="center">
                        <p>0.03 (0.56)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>all interactions</p>
                     </c>
                     <c ca="center">
                        <p>2846</p>
                     </c>
                     <c ca="center">
                        <p>54258</p>
                     </c>
                     <c ca="center">
                        <p>-0.16</p>
                     </c>
                     <c ca="center">
                        <p>0.13</p>
                     </c>
                     <c ca="center">
                        <p>-0.39</p>
                     </c>
                     <c ca="center">
                        <p>-0.12</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>two studies</p>
                     </c>
                     <c ca="center">
                        <p>1112</p>
                     </c>
                     <c ca="center">
                        <p>2792</p>
                     </c>
                     <c ca="center">
                        <p>-0.17</p>
                     </c>
                     <c ca="center">
                        <p>0.14</p>
                     </c>
                     <c ca="center">
                        <p>-0.42</p>
                     </c>
                     <c ca="center">
                        <p>-0.13</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>three studies</p>
                     </c>
                     <c ca="center">
                        <p>329</p>
                     </c>
                     <c ca="center">
                        <p>556</p>
                     </c>
                     <c ca="center">
                        <p>0.03 (0.43)</p>
                     </c>
                     <c ca="center">
                        <p>-0.03 (0.42)</p>
                     </c>
                     <c ca="center">
                        <p>-0.36</p>
                     </c>
                     <c ca="center">
                        <p>0.02 (0.65)</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c cspan="7" ca="left">
                        <p>(B)</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <b>Study</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>N<sub>p</sub></p>
                     </c>
                     <c ca="center">
                        <p>N<sub>i</sub></p>
                     </c>
                     <c ca="center">
                        <p>&#964;<sub>EI</sub></p>
                     </c>
                     <c ca="center">
                        <p>&#964;<sub>AI</sub></p>
                     </c>
                     <c ca="center">
                        <p>&#964;<sub>EA</sub></p>
                     </c>
                     <c ca="center">
                        <p>&#964;<sub>EI.A</sub></p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Ito <abbrgrp><abbr bid="B12">12</abbr></abbrgrp></p>
                     </c>
                     <c ca="center">
                        <p>528</p>
                     </c>
                     <c ca="center">
                        <p>1049</p>
                     </c>
                     <c ca="center">
                        <p>0.03 (0.34)</p>
                     </c>
                     <c ca="center">
                        <p>-0.04 (0.21)</p>
                     </c>
                     <c ca="center">
                        <p>-0.31</p>
                     </c>
                     <c ca="center">
                        <p>0.02 (0.61)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Uetz <abbrgrp><abbr bid="B13">13</abbr></abbrgrp></p>
                     </c>
                     <c ca="center">
                        <p>630</p>
                     </c>
                     <c ca="center">
                        <p>1213</p>
                     </c>
                     <c ca="center">
                        <p>0.01 (0.65)</p>
                     </c>
                     <c ca="center">
                        <p>-0.06 (0.02)</p>
                     </c>
                     <c ca="center">
                        <p>-0.32</p>
                     </c>
                     <c ca="center">
                        <p>-0.01 (0.78)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>2H studies <abbrgrp><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr></abbrgrp></p>
                     </c>
                     <c ca="center">
                        <p>931</p>
                     </c>
                     <c ca="center">
                        <p>2104</p>
                     </c>
                     <c ca="center">
                        <p>0.02 (0.27)</p>
                     </c>
                     <c ca="center">
                        <p>-0.06 (0.003)</p>
                     </c>
                     <c ca="center">
                        <p>-0.33</p>
                     </c>
                     <c ca="center">
                        <p>0.00 (0.90)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Gavin <abbrgrp><abbr bid="B10">10</abbr></abbrgrp></p>
                     </c>
                     <c ca="center">
                        <p>1055</p>
                     </c>
                     <c ca="center">
                        <p>15836</p>
                     </c>
                     <c ca="center">
                        <p>-0.12</p>
                     </c>
                     <c ca="center">
                        <p>0.08</p>
                     </c>
                     <c ca="center">
                        <p>-0.38</p>
                     </c>
                     <c ca="center">
                        <p>-0.10</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Gavin <abbrgrp><abbr bid="B10">10</abbr></abbrgrp> untagged</p>
                     </c>
                     <c ca="center">
                        <p>1033</p>
                     </c>
                     <c ca="center">
                        <p>8648</p>
                     </c>
                     <c ca="center">
                        <p>-0.08</p>
                     </c>
                     <c ca="center">
                        <p>0.07</p>
                     </c>
                     <c ca="center">
                        <p>-0.37</p>
                     </c>
                     <c ca="center">
                        <p>-0.06 (0.01)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Ho <abbrgrp><abbr bid="B11">11</abbr></abbrgrp></p>
                     </c>
                     <c ca="center">
                        <p>1209</p>
                     </c>
                     <c ca="center">
                        <p>5941</p>
                     </c>
                     <c ca="center">
                        <p>-0.18</p>
                     </c>
                     <c ca="center">
                        <p>0.16</p>
                     </c>
                     <c ca="center">
                        <p>-0.42</p>
                     </c>
                     <c ca="center">
                        <p>-0.13</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Ho <abbrgrp><abbr bid="B11">11</abbr></abbrgrp> untagged</p>
                     </c>
                     <c ca="center">
                        <p>1013</p>
                     </c>
                     <c ca="center">
                        <p>3019</p>
                     </c>
                     <c ca="center">
                        <p>-0.28</p>
                     </c>
                     <c ca="center">
                        <p>0.33</p>
                     </c>
                     <c ca="center">
                        <p>-0.42</p>
                     </c>
                     <c ca="center">
                        <p>-0.16</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>MS studies <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr></abbrgrp></p>
                     </c>
                     <c ca="center">
                        <p>1735</p>
                     </c>
                     <c ca="center">
                        <p>20930</p>
                     </c>
                     <c ca="center">
                        <p>-0.13</p>
                     </c>
                     <c ca="center">
                        <p>0.11</p>
                     </c>
                     <c ca="center">
                        <p>-0.40</p>
                     </c>
                     <c ca="center">
                        <p>-0.10</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>MS studies <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr></abbrgrp> untagged</p>
                     </c>
                     <c ca="center">
                        <p>1575</p>
                     </c>
                     <c ca="center">
                        <p>11531</p>
                     </c>
                     <c ca="center">
                        <p>-0.14</p>
                     </c>
                     <c ca="center">
                        <p>0.15</p>
                     </c>
                     <c ca="center">
                        <p>-0.40</p>
                     </c>
                     <c ca="center">
                        <p>-0.09</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>synexpression <abbrgrp><abbr bid="B9">9</abbr></abbrgrp></p>
                     </c>
                     <c ca="center">
                        <p>1163</p>
                     </c>
                     <c ca="center">
                        <p>20291</p>
                     </c>
                     <c ca="center">
                        <p>0.09</p>
                     </c>
                     <c ca="center">
                        <p>-0.09</p>
                     </c>
                     <c ca="center">
                        <p>-0.41</p>
                     </c>
                     <c ca="center">
                        <p>0.05 (0.06)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>gene neighborhood <abbrgrp><abbr bid="B9">9</abbr></abbrgrp></p>
                     </c>
                     <c ca="center">
                        <p>790</p>
                     </c>
                     <c ca="center">
                        <p>10186</p>
                     </c>
                     <c ca="center">
                        <p>-0.09</p>
                     </c>
                     <c ca="center">
                        <p>0.08</p>
                     </c>
                     <c ca="center">
                        <p>-0.49</p>
                     </c>
                     <c ca="center">
                        <p>-0.06 (0.02)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>synthetic lethality <abbrgrp><abbr bid="B9">9</abbr></abbrgrp></p>
                     </c>
                     <c ca="center">
                        <p>533</p>
                     </c>
                     <c ca="center">
                        <p>1505</p>
                     </c>
                     <c ca="center">
                        <p>0.03 (0.36)</p>
                     </c>
                     <c ca="center">
                        <p>-0.02 (0.60)</p>
                     </c>
                     <c ca="center">
                        <p>-0.38</p>
                     </c>
                     <c ca="center">
                        <p>0.02 (0.49)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>gene cooccurrence <abbrgrp><abbr bid="B9">9</abbr></abbrgrp></p>
                     </c>
                     <c ca="center">
                        <p>309</p>
                     </c>
                     <c ca="center">
                        <p>1767</p>
                     </c>
                     <c ca="center">
                        <p>-0.14</p>
                     </c>
                     <c ca="center">
                        <p>0.07 (0.07)</p>
                     </c>
                     <c ca="center">
                        <p>-0.44</p>
                     </c>
                     <c ca="center">
                        <p>-0.13 (0.02)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>gene fusion <abbrgrp><abbr bid="B9">9</abbr></abbrgrp></p>
                     </c>
                     <c ca="center">
                        <p>233</p>
                     </c>
                     <c ca="center">
                        <p>559</p>
                     </c>
                     <c ca="center">
                        <p>0.04 (0.41)</p>
                     </c>
                     <c ca="center">
                        <p>-0.01 (0.74)</p>
                     </c>
                     <c ca="center">
                        <p>-0.40</p>
                     </c>
                     <c ca="center">
                        <p>0.03 (0.50)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>all interactions</p>
                     </c>
                     <c ca="center">
                        <p>2960</p>
                     </c>
                     <c ca="center">
                        <p>56058</p>
                     </c>
                     <c ca="center">
                        <p>-0.16</p>
                     </c>
                     <c ca="center">
                        <p>0.12</p>
                     </c>
                     <c ca="center">
                        <p>-0.38</p>
                     </c>
                     <c ca="center">
                        <p>-0.13</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>two studies</p>
                     </c>
                     <c ca="center">
                        <p>1131</p>
                     </c>
                     <c ca="center">
                        <p>2822</p>
                     </c>
                     <c ca="center">
                        <p>-0.17</p>
                     </c>
                     <c ca="center">
                        <p>0.08</p>
                     </c>
                     <c ca="center">
                        <p>-0.41</p>
                     </c>
                     <c ca="center">
                        <p>-0.15</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>three studies</p>
                     </c>
                     <c ca="center">
                        <p>332</p>
                     </c>
                     <c ca="center">
                        <p>562</p>
                     </c>
                     <c ca="center">
                        <p>0.03 (0.45)</p>
                     </c>
                     <c ca="center">
                        <p>-0.07 (0.06)</p>
                     </c>
                     <c ca="center">
                        <p>-0.41</p>
                     </c>
                     <c ca="center">
                        <p>0.00 (0.99)</p>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
            <p>We confirmed that abundant proteins evolved more slowly in all coverage sets (Table <tblr tid="T1">1</tblr>, Figure <figr fid="F1">1A</figr>), in agreement with established results <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. The tendency for abundant proteins to evolve more slowly was both substantial and robust, with all coverage sets showing significant correlations (Kendall's &#964; ranged from -0.31 to -0.49, <it>P </it>&lt; 10<sup>-3</sup>) regardless of whether abundance was measured by expression level or CAI.</p>
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>(A) shows the relationship between evolutionary rate and expression level as measured by gene microarrays <abbrgrp><abbr bid="B21">21</abbr></abbrgrp></p>
               </caption>
               <text>
                  <p>(A) shows the relationship between evolutionary rate and expression level as measured by gene microarrays <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>. (B) shows the relationship between expression level and the total number of interactions from all studies. (C) shows the relationship between evolutionary rate and the total number of interactions from all studies. Some outlying data points are not shown, but are included in the calculations of the correlations in Table <tblr tid="T1">1</tblr>.</p>
               </text>
               <graphic file="1471-2148-3-21-1"/>
            </fig>
            <p>We also confirmed <abbrgrp><abbr bid="B9">9</abbr></abbrgrp> that some of the interactions studies are biased towards counting more interactions for abundant proteins (Table <tblr tid="T1">1</tblr>, Figure <figr fid="F1">1B</figr>). Among the experimentally-based studies that look for direct evidence of interactions, the mass spectrometry studies <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr></abbrgrp> were consistently biased towards counting more interactions for abundant proteins (&#964; ranged from 0.07 to 0.33, <it>P </it>&lt; 10<sup>-3</sup>, Table <tblr tid="T1">1</tblr>), while the yeast two-hybrid studies <abbrgrp><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr></abbrgrp> showed no substantial bias towards counting more interactions for abundant proteins (<it>P </it>> 0.25, Table <tblr tid="T1">1</tblr>). The existence of a bias in the mass spectrometry but not the yeast two-hybrid studies can be explained by considering the experimental methods. The yeast two-hybrid studies involve over-expression of both interacting proteins, and so the probability of observing an interaction is unrelated to a protein's native concentration. In contrast, in the mass-spectrometry studies only the tagged protein is over-expressed, and so the probability of observing an interaction depends on the choice of which proteins to tag, as well as the native concentrations of the untagged proteins.</p>
            <p>Among the bioinformatics-based methods, the gene neighborhood data are substantially biased towards counting more interactions for abundant proteins (&#964; = 0.15 or 0.08, <it>P </it>&lt; 10<sup>-3</sup>, Table <tblr tid="T1">1</tblr>), the gene cooccurrence data are mildly biased towards counting more interactions for abundant proteins (&#964; = 0.07, <it>P </it>= 0.07 or 0.08, Table <tblr tid="T1">1</tblr>), while the synexpression data are actually biased towards counting fewer interactions for abundant proteins (&#964; = -0.12 or -0.09, <it>P </it>&lt; 10<sup>-3</sup>, Table <tblr tid="T1">1</tblr>). The synthetic lethality and gene fusion studies are unbiased with respect to protein abundance (<it>P </it>> 0.5, Table <tblr tid="T1">1</tblr>), as is the set of interactions found independently by three studies (<it>P </it>> 0.05, Table <tblr tid="T1">1</tblr>). The set of interactions found by two studies and the set of all interactions are both biased towards counting more interactions for abundant proteins (&#964; ranged from 0.08 to 0.19, <it>P </it>&lt; 10<sup>-5</sup>, Table <tblr tid="T1">1</tblr>), presumably because both of these sets are dominated by interactions found by the mass spectrometry studies (see Table <tblr tid="T1">1</tblr> and discussion below).</p>
            <p>We found that proteins with more interactions appeared to evolve more slowly only when the interactions data set was biased towards counting more interactions for abundant proteins (Table <tblr tid="T1">1</tblr>, Figure <figr fid="F1">1C</figr>). The yeast two-hybrid, the synthetic lethality, the gene fusion, and the three-study data sets are all unbiased with respect to abundance, and none of these data sets suggested any significant correlation between evolutionary rate and the number of interactions (<it>P </it>> 0.25 in all cases, Table <tblr tid="T1">1</tblr>). The mass spectrometry, the gene neighborhood, the gene cooccurrence, the two-study, and the combined data sets are all biased towards counting more interactions for abundant proteins, and all of them suggested that proteins with more interactions evolve more slowly (&#964; ranges from -0.08 to -0.28, <it>P </it>&lt; 10<sup>-3</sup>, Table <tblr tid="T1">1</tblr>). The synexpression data are biased towards counting fewer interactions for abundant proteins, and they suggest that proteins with more interactions actually evolve more rapidly (&#964; = 0.09, <it>P </it>&lt; 10<sup>-3</sup>, Table <tblr tid="T1">1</tblr>).</p>
            <p>If the bias of some studies to count more interactions for abundant proteins explains the correlation between the number of interactions and the evolutionary rate, then there should be a direct relationship between the bias and the observed correlation. We examined this relationship for all 17 data sets in Table <tblr tid="T1">1</tblr>, and confirmed that there was a simple linear relationship between the correlation of abundance with the number of interactions and the correlation of the number of interactions with the evolutionary rate, as shown in Figure <figr fid="F2">2</figr>.</p>
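            <p>As a rough numerical companion to this comparison, the sketch below computes a Pearson correlation between the bias (&#964;<sub>AI</sub>) and the apparent rate dependence (&#964;<sub>EI</sub>) across the 17 rows of Table <tblr tid="T1">1</tblr> as they appear above. The choice of Pearson's r as a summary statistic and the names <it>TABLE1</it> and <it>pearson_r</it> are assumptions made for illustration; the analysis in the text rests on the scatter plots in Figure <figr fid="F2">2</figr>, not on this statistic.</p>
            <p><![CDATA[
```python
import math

# (tau_AI, tau_EI) pairs transcribed from the 17 rows of Table 1 shown above,
# in row order (Ito ... three studies). P values in parentheses are omitted.
TABLE1 = [
    (-0.04, 0.03), (-0.06, 0.01), (-0.06, 0.02),                 # two-hybrid sets
    (0.08, -0.12), (0.07, -0.08), (0.16, -0.18), (0.33, -0.28),  # Gavin, Ho (+ untagged)
    (0.11, -0.13), (0.15, -0.14),                                # MS studies (+ untagged)
    (-0.09, 0.09), (0.08, -0.09), (-0.02, 0.03),                 # synexpression, neighborhood, lethality
    (0.07, -0.14), (-0.01, 0.04),                                # cooccurrence, fusion
    (0.12, -0.16), (0.08, -0.17), (-0.07, 0.03),                 # all, two studies, three studies
]


def pearson_r(pairs):
    """Plain Pearson correlation coefficient for a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(TABLE1)
print(f"Pearson r between tau_AI and tau_EI over {len(TABLE1)} data sets: {r:.2f}")
```
]]></p>
            <p>A strongly negative r here means that data sets with a larger bias towards abundant proteins are the ones that report a more negative correlation between evolutionary rate and the number of interactions.</p>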
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>The correlation between evolutionary rate and the number of interactions is directly related to the bias towards counting more interactions for abundant proteins, both when abundance is measured by (A) gene microarray expression levels and (B) CAI</p>
               </caption>
               <text>
                  <p>The correlation between evolutionary rate and the number of interactions is directly related to the bias towards counting more interactions for abundant proteins, both when abundance is measured by (A) gene microarray expression levels and (B) CAI. Correlations are Kendall's rank correlation &#964;, and points are for all data sets listed in Table <tblr tid="T1">1</tblr>.</p>
               </text>
               <graphic file="1471-2148-3-21-2"/>
            </fig>
            <p>The trends described here are not sensitive to the evolutionary rates used. When evolutionary rates are derived from alignments of <it>S. cerevisiae </it>and <it>Schizosaccharomyces pombe </it>orthologs by Fraser <it>et al. </it><abbrgrp><abbr bid="B6">6</abbr></abbrgrp>, there is again a consistent correlation between evolutionary rate and abundance, but a correlation between evolutionary rate and interactions emerges only for interactions data sets biased towards counting more interactions for abundant proteins (data not shown).</p>
         </sec>
         <sec>
            <st>
               <p>Controlling for bias reduces apparent correlation between evolutionary rate and interactions</p>
            </st>
            <p>The relationship between the correlation of evolutionary rate with the number of interactions and the bias towards counting more interactions for abundant proteins (Figure <figr fid="F2">2</figr>) suggests that the bias contributes to the observed correlation. To obtain a statistical view of this effect, we used a partial correlation statistic (Kendall's partial &#964;) to measure the correlation between evolutionary rate and the number of interactions when protein abundance is controlled for. In all data sets where there is a significant correlation between evolutionary rate and the number of interactions, controlling for protein abundance reduces the magnitude of the correlation (Table <tblr tid="T1">1</tblr>). We determined the significance of this reduction by performing 10<sup>4 </sup>randomizations of the protein abundances. In none of the cases where there was a highly significant correlation between evolutionary rate and the number of interactions (the mass spectrometry, synexpression, gene neighborhood, gene cooccurrence, two study, and combined data sets) did the randomized abundances give a partial &#964; with a magnitude as small as for the actual data, demonstrating that the reductions in the correlation due to controlling for abundance were highly significant (<it>P </it>&lt; 10<sup>-4</sup>).</p>
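            <p>The partial correlation and the randomization procedure described above can be sketched as follows. This is a minimal illustration, not the code used for the analysis: it assumes the standard form of Kendall's partial &#964;, namely &#964;<sub>EI.A</sub> = (&#964;<sub>EI</sub> - &#964;<sub>EA</sub>&#964;<sub>AI</sub>) / sqrt((1 - &#964;<sub>EA</sub><sup>2</sup>)(1 - &#964;<sub>AI</sub><sup>2</sup>)), a tie-corrected &#964; (tau-b), and a shuffle of the abundance values across proteins. The function names and the default number of randomizations are hypothetical.</p>
            <p><![CDATA[
```python
import math
import random


def kendall_tau(x, y):
    """Kendall's tau-b (tie-corrected), by direct O(n^2) pair counting."""
    score = 0      # concordant minus discordant pairs
    untied_x = 0   # pairs whose x values differ
    untied_y = 0   # pairs whose y values differ
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            untied_x += dx != 0
            untied_y += dy != 0
            if dx * dy > 0:
                score += 1
            elif dx * dy < 0:
                score -= 1
    # Assumes neither variable is constant (otherwise the denominator is zero).
    return score / math.sqrt(untied_x * untied_y)


def partial_tau(t_xy, t_xz, t_yz):
    """Kendall's partial tau between x and y, controlling for z."""
    return (t_xy - t_xz * t_yz) / math.sqrt((1 - t_xz ** 2) * (1 - t_yz ** 2))


def randomization_test(rate, n_int, abundance, n_rand=10_000, seed=0):
    """Shuffle abundances and count how often the partial tau is at least as
    small in magnitude as the one obtained with the real abundances."""
    rng = random.Random(seed)
    t_ei = kendall_tau(rate, n_int)
    actual = partial_tau(t_ei, kendall_tau(rate, abundance),
                         kendall_tau(n_int, abundance))
    shuffled = list(abundance)
    hits = 0
    for _ in range(n_rand):
        rng.shuffle(shuffled)
        rand = partial_tau(t_ei, kendall_tau(rate, shuffled),
                           kendall_tau(n_int, shuffled))
        if abs(rand) <= abs(actual):
            hits += 1
    return actual, hits / n_rand


# Check of the formula against the "Ho" row of Table 1:
# tau_EI = -0.18, tau_EA = -0.42, tau_AI = 0.16.
print(round(partial_tau(-0.18, -0.42, 0.16), 2))
```
]]></p>
            <p>Applied to the rounded values in the Ho row of Table <tblr tid="T1">1</tblr>, the formula gives approximately -0.13, which agrees with the tabulated &#964;<sub>EI.A</sub>. The pair-counting loop is quadratic in the number of proteins, so a faster &#964; implementation would be preferable for data sets of the size considered here.</p>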
            <p>Although controlling for protein abundance always reduces the magnitude of any significant correlation between evolutionary rate and the number of interactions, in some cases the remaining partial correlation is still statistically significant. However, this remaining correlation appears to be due to an incomplete correction for protein abundance rather than a real correlation between evolutionary rate and the number of interactions. As Figure <figr fid="F3">3</figr> shows, the remaining partial correlation between evolutionary rate and the number of interactions is still directly related to the bias towards counting more interactions for abundant proteins, suggesting that this bias is still the primary factor causing the partial correlation. Note also that the partial correlation between evolutionary rate and the number of interactions for the synexpression data set still suggests that proteins with more interactions evolve more rapidly, again suggesting that the partial correlation statistic does not completely correct for biases in the interactions data set.</p>
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>Controlling for abundance reduces the magnitude of the correlations between evolutionary rate and the number of interactions from those shown in Figure <figr fid="F2">2</figr>, and the remaining partial correlation still depends on the bias towards counting more interactions for abundant proteins, both when abundance is measured by (A) gene microarray expression levels and (B) CAI</p>
               </caption>
               <text>
                  <p>Controlling for abundance reduces the magnitude of the correlations between evolutionary rate and the number of interactions from those shown in Figure <figr fid="F2">2</figr>, and the remaining partial correlation still depends on the bias towards counting more interactions for abundant proteins, both when abundance is measured by (A) gene microarray expression levels and (B) CAI. The partial correlations are Kendall's partial &#964;, the correlation between interactions and abundance is Kendall's rank correlation &#964;, and points are for all data sets listed in Table <tblr tid="T1">1</tblr>.</p>
               </text>
               <graphic file="1471-2148-3-21-3"/>
            </fig>
            <p>There are several reasons why the partial correlation statistic may be unable to completely correct for experimental biases. Both microarray expression data and CAI are imperfect proxies for true protein abundance (indeed, the Spearman correlation between these two proxies is only 0.62) <abbrgrp><abbr bid="B22">22</abbr><abbr bid="B24">24</abbr></abbrgrp>, and so statistically controlling with these variables does not completely correct for effects due to actual protein abundances or expression levels. In addition, the evolutionary rates and expression data for the large set of proteins considered here may underestimate the true tendency for abundant proteins to evolve more slowly. Pal <it>et al. </it><abbrgrp><abbr bid="B1">1</abbr></abbrgrp> analyzed the correlation between evolutionary rate and protein abundance using a carefully culled set of well-characterized proteins, and reported Pearson correlations of evolutionary rate with the logarithm of microarray expression levels and with CAI of -0.584 and -0.617 respectively (<it>P </it>&lt; 10<sup>-6</sup>). In comparison, the same Pearson correlations are substantially smaller in magnitude (-0.423 and -0.356 respectively, <it>P </it>&lt; 10<sup>-6</sup>) for the set of all proteins considered here, possibly because the larger set of proteins here necessitates using less clean data. Such an underestimation of the strength of the relationship between evolutionary rate and abundance would cause the partial correlation statistic to incompletely correct for the bias. The fact that the remaining partial correlation still directly depends on the extent of the bias is evidence for this incomplete correction.</p>
            <p>In addition, the different native concentrations of proteins are only one source of bias in the counting of interactions by the mass spectrometry studies. There is also an inherent asymmetry in the counting of interactions in the mass spectrometry studies, because some proteins are tagged and over-expressed while others are present only at their native levels. If the experimenter tends to select more abundant proteins for tagging, biases towards counting more interactions for abundant proteins would be amplified in a way that cannot be controlled for by transcript level. One way to examine this effect is to consider only the interactions of the untagged proteins in the mass spectrometry studies. When this is done for study <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>, the bias to count more interactions for abundant proteins is slightly reduced and there is a concomitant decrease in the association between evolutionary rate and the number of interactions (Table <tblr tid="T1">1</tblr>). But when this is done for study <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>, the bias to count more interactions for abundant proteins increases and the association between evolutionary rate and the number of interactions becomes larger (Table <tblr tid="T1">1</tblr>). Therefore, the effect of the experimental choice of tagged proteins differs between the studies, but in both cases, an increased tendency to count more interactions for abundant proteins increases the apparent correlation between evolutionary rate and interactions.</p>
         </sec>
         <sec>
            <st>
               <p>Protein&#8211;protein interactions and evolutionary rates in bacteria</p>
            </st>
            <p>We suggest a simple explanation for the failure of a previous analysis to observe a correlation between evolutionary rate and the number of interactions in the bacterium <it>Helicobacter pylori </it><abbrgrp><abbr bid="B7">7</abbr></abbrgrp>. That analysis was based on protein&#8211;protein interactions data obtained from a yeast two-hybrid study <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>. Based on our analysis here, we would therefore expect these data to have no bias towards counting more interactions for abundant proteins, and hence to show no correlation between evolutionary rate and the number of interactions.</p>
         </sec>
         <sec>
            <st>
               <p>Data set size or accuracy are not plausible explanations for absence of correlation</p>
            </st>
            <p>The most recent study by Fraser <it>et al. </it><abbrgrp><abbr bid="B6">6</abbr></abbrgrp> claiming a correlation between evolutionary rate and the number of interactions suggested that the correlation may not be apparent if the interactions data set is too small, and stressed the importance of always using the largest possible data set. To evaluate this claim, we investigated the effect of data set size on the correlation between evolutionary rate and the number of interactions.</p>
            <p>If the dependence of evolutionary rate on the number of interactions only becomes obvious for large interactions data sets, we would expect that larger data sets would show a greater correlation. Figure <figr fid="F4">4(A)</figr> shows how the correlation depends on the size of the interactions data set. There is no obvious trend of larger data sets yielding a larger correlation &#8211; indeed, the strongest correlation is found using relatively small data sets with strong biases towards counting more interactions for abundant proteins.</p>
            <fig id="F4">
               <title>
                  <p>Figure 4</p>
               </title>
               <caption>
                  <p>The correlation between evolutionary rate and the number of interactions does not depend on the size of the interactions data set as it would in the absence of bias in the counting of interactions</p>
               </caption>
               <text>
                  <p>The correlation between evolutionary rate and the number of interactions does not depend on the size of the interactions data set as it would in the absence of bias in the counting of interactions. (A) shows the correlation and data set sizes for all sets in Table <tblr tid="T1">1</tblr>. (B) shows how the mean and standard deviation of the correlation should depend on the data set size in the absence of experimental bias in the counting of interactions, based on sampling simulations of the mass spectrometry (green) and yeast two-hybrid (red) method of counting interactions.</p>
               </text>
               <graphic file="1471-2148-3-21-4"/>
            </fig>
            <p>To investigate how the spread in observed correlations between evolutionary rate and the number of interactions would be expected to depend on data set size if the bias towards counting more interactions for abundant proteins were unimportant, we performed sampling simulations on the set of all interactions. These simulations mimicked both the method of the mass spectrometry studies (counting all interactions for selected proteins) and that of the yeast two-hybrid studies (counting only interactions between pairs of selected proteins). The results of these simulations are shown in Figure <figr fid="F4">4(B)</figr> &#8211; they show that the observed correlation should be roughly constant regardless of the interactions data set size. Although the spread does increase for smaller data sets, this increase is not large enough to explain the observed spread in correlations. This demonstrates that differences in the data set sizes or sampling methods do not explain the variation in the observed correlations.</p>
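            <p>The two sampling schemes can be illustrated with a short sketch. The text states only that the mass spectrometry scheme counts all interactions for selected proteins and that the two-hybrid scheme counts only interactions between pairs of selected proteins; everything else below (the function names, the uniform random choice of proteins, and the toy interaction list) is an assumption made for illustration and is not the simulation code behind Figure <figr fid="F4">4(B)</figr>.</p>
            <p><![CDATA[
```python
import random
from collections import Counter


def random_subset(proteins, fraction, rng):
    """Pick a uniformly random subset containing about `fraction` of the proteins."""
    k = max(1, round(fraction * len(proteins)))
    return set(rng.sample(sorted(proteins), k))


def count_interactions(edges, selected, scheme):
    """Count interactions per protein under one of two sampling schemes.

    scheme == "ms": keep every interaction involving at least one selected
                    protein (all interactions of the selected proteins).
    scheme == "2h": keep only interactions whose two partners are both selected.
    """
    if scheme == "ms":
        kept = [(a, b) for a, b in edges if a in selected or b in selected]
    elif scheme == "2h":
        kept = [(a, b) for a, b in edges if a in selected and b in selected]
    else:
        raise ValueError("scheme must be 'ms' or '2h'")
    degree = Counter()
    for a, b in kept:
        degree[a] += 1
        degree[b] += 1
    return degree


# Toy interaction list (hypothetical protein names).
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
selected = {"A", "C"}
print(dict(count_interactions(edges, selected, "ms")))
print(dict(count_interactions(edges, selected, "2h")))
```
]]></p>
            <p>In a full simulation, one would repeat the random selection many times at each subset size and recompute the correlation between evolutionary rate and the sampled interaction counts, which yields a mean and spread for each data set size.</p>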
            <p>The inadequacy of data set size as an explanation for the failure to observe a correlation for some sets is most obvious in a comparison of Figures <figr fid="F2">2</figr> and <figr fid="F4">4(A)</figr>. Data set size bears no clear relationship to the correlation between evolutionary rate and the number of interactions, but the experimental bias towards counting more interactions for abundant proteins is an excellent predictor of this correlation.</p>
            <p>We also considered the possibility that the accuracy of the interactions data might affect the strength of the observed correlation. In their review of protein interactions studies, von Mering and coworkers <abbrgrp><abbr bid="B9">9</abbr></abbrgrp> provide estimates of the accuracies of the different studies. According to their measure of accuracy, synthetic lethality is the single most accurate method for detecting interactions, interactions detected by two different studies are more accurate than those detected by any one study, and interactions detected by three studies are more accurate still <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>. The set of interactions detected by two different studies does show a tendency for more interactive proteins to evolve more slowly; however, it is also strongly biased towards counting more interactions for abundant proteins (Table <tblr tid="T1">1</tblr>). This can be explained by noting that over 54% of the interactions in this data set were identified only by the two mass spectrometry studies, and that for 69% of the interactions in this set, one of the two identifications was by a mass spectrometry study. When this heavy slant towards the mass spectrometry studies is ameliorated by requiring the interactions to be identified by three different studies (meaning that at least one of the studies must use a method other than mass spectrometry), both the bias towards counting more interactions for abundant proteins and the tendency of interactive proteins to evolve more slowly disappear (Table <tblr tid="T1">1</tblr>). The data from the synthetic lethality method show no bias towards counting more interactions for abundant proteins and no tendency for interactive proteins to evolve more slowly (Table <tblr tid="T1">1</tblr>). 
We also note that Jordan <it>et al. </it><abbrgrp><abbr bid="B7">7</abbr></abbrgrp> observed no significant correlation between evolutionary rate and the number of interactions when they used a set of manually curated interactions that might be expected to be of higher accuracy than those from any single high-throughput method. Therefore, the accuracy of the interactions data does not appear to explain the apparent correlation between evolutionary rate and the number of interactions.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Conclusions</p>
         </st>
         <p>We have examined the relationship among evolutionary rate, protein abundance, and the number of protein&#8211;protein interactions for data from different high-throughput studies. We have shown that while there is a consistent tendency for abundant proteins to evolve more slowly, proteins with more interactions only appear to evolve more slowly when using interactions data from studies biased towards counting more interactions for abundant proteins. The strength of the correlation between evolutionary rate and the number of interactions is directly dependent on the strength of the bias towards counting more interactions for abundant proteins &#8211; when there is no bias, there is no correlation, and in the one case where the bias is towards counting fewer interactions for abundant proteins, interactive proteins actually appear to evolve more rapidly instead. We have shown that this effect is not explained by the size or accuracy of the interactions data sets. This suggests that the apparent tendency of interactive proteins to evolve more slowly is due to the fact that abundant proteins evolve more slowly, combined with a bias towards counting more interactions for abundant proteins.</p>
         <p>Our findings underscore the importance of considering experimental methods when analyzing biological data. The failure of Jordan <it>et al. </it><abbrgrp><abbr bid="B7">7</abbr></abbrgrp> to observe a correlation between evolutionary rate and the number of interactions in a data set of several thousand interactions should have raised a red flag, yet the approach of Fraser <it>et al. </it><abbrgrp><abbr bid="B6">6</abbr></abbrgrp> was simply to pool all available data and recalculate the correlations. But while pooling data may yield higher statistical confidences, statistics are only as good as the quality of the data to which they are applied. In our analysis of data from individual studies, it appears that the correlation is contingent on a bias towards counting more interactions for abundant proteins. Since this bias cannot be properly controlled for with the presently available data, there is no basis to conclude that there is any association between evolutionary rate and the number of interactions.</p>
         <p>Recent advances in genomic and proteomic technologies are providing vast amounts of information about proteins and genes, including their sequences and chromosomal locations, expression levels <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>, recombination rates <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>, functions and dispensability <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>, evolutionary rates <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>, and interactions <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>. Many of these properties are interdependent, and in addition many of the high-throughput studies are subject to systematic biases. A major challenge of bioinformatics is to adequately correct for these interdependencies and biases in order to extract meaningful trends from the available data sets <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>. We have shown here how careful consideration of the biases of individual studies can explain correlations in pooled biological data sets.</p>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <sec>
            <st>
               <p>Gathering of Data</p>
            </st>
            <p>Protein evolutionary rate data were obtained from Fraser <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>, compiled according to the method of <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>, and are based on the alignment of <it>S. cerevisiae </it>and <it>C. albicans </it>orthologs. Information on gene expression was taken from <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>, where the authors have estimated the number of mRNA molecules per cell based on microarray analysis of yeast grown to the mid-log phase in YPD (yeast-extract, peptone, dextrose) media and presented these data online at <url>http://web.wi.mit.edu/young/pub/data/orf_transcriptome.txt</url>. CAI values for the yeast genes were calculated <abbrgrp><abbr bid="B23">23</abbr></abbrgrp> using gene sequences from the MIPS yeast database <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>. Mass spectrometry protein&#8211;protein interaction data from <abbrgrp><abbr bid="B10">10</abbr></abbrgrp> were parsed from Table S3 of the supplementary material, counting only binary interactions between the tagged and untagged proteins in a complex.</p>
            <p>Mass spectrometry protein&#8211;protein interaction data from <abbrgrp><abbr bid="B11">11</abbr></abbrgrp> were taken from <url>http://www.mdsp.com/yeast/</url>, again counting interactions as binary between the tagged and untagged proteins in a complex. The mass spectrometry data set in Table <tblr tid="T1">1</tblr> was the combined results of these two studies <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr></abbrgrp>. In the untagged-only mass spectrometry data sets for these studies, the interactions were counted only for the untagged proteins in a complex. Yeast two-hybrid protein&#8211;protein interaction data from <abbrgrp><abbr bid="B13">13</abbr></abbrgrp> were parsed from Table 2 of the paper. Yeast two-hybrid protein&#8211;protein interaction data from <abbrgrp><abbr bid="B12">12</abbr></abbrgrp> were downloaded from the core data list at <url>http://genome.c.kanazawa-u.ac.jp/Y2H/</url>. The yeast two-hybrid data set in Table <tblr tid="T1">1</tblr> was the combined results of these two studies <abbrgrp><abbr bid="B13">13</abbr><abbr bid="B12">12</abbr></abbrgrp>. The high confidence, synexpression, gene neighborhood, synthetic lethality, gene cooccurrence, and gene fusion interactions data sets were parsed from supplementary Table 4 of <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>. The combined data set of all proteins was formed from all interactions from these nine studies. The sets of interactions found by two and three of these studies were independently listed in at least that many of the nine studies. In the interaction counts listed in Table <tblr tid="T1">1</tblr>, a binary interaction was counted once for each partner except for self-interactions, in which case the interaction was only counted once. 
The interaction counts given in Table <tblr tid="T1">1</tblr> are the sums of the number of interactions assigned to all proteins in the data sets for which both evolutionary rate and abundance (expression or CAI) information was available. When combining interactions data sets, duplicate interactions were removed. All data will be made available upon request.</p>
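            <p>The counting rule described above can be illustrated with a minimal sketch. It assumes that an interaction is an unordered pair of protein identifiers, so that a pair listed in both orders is treated as a duplicate; the function name and the identifiers P1, P2 and P3 are hypothetical and are not taken from the data sets.</p>

```python
from collections import Counter


def count_interactions(pairs):
    """Count interactions per protein from an iterable of (a, b) pairs.

    Assumption: interactions are unordered, so (a, b) and (b, a) are
    duplicates and only one copy is kept. Each remaining interaction adds
    one to both partners; a self-interaction adds one to that protein.
    """
    unique = {tuple(sorted(pair)) for pair in pairs}
    counts = Counter()
    for a, b in unique:
        counts[a] += 1
        if b != a:
            counts[b] += 1
    return counts


# Hypothetical identifiers, for illustration only.
example = [("P1", "P2"), ("P2", "P1"), ("P1", "P1"), ("P2", "P3")]
counts = count_interactions(example)
print(dict(counts))
```

            <p>In this toy input the reversed duplicate of the P1 and P2 interaction is dropped, and the P1 self-interaction contributes a single count to P1.</p>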
         </sec>
         <sec>
            <st>
               <p>Statistical Analysis</p>
            </st>
            <p>Statistical analyses were performed using Kendall's &#964; rank correlation coefficients, and two-tailed <it>P </it>values were calculated as described in <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>. Briefly, the Kendall's correlation between <it>x </it>and <it>y </it>was calculated as <graphic file="1471-2148-3-21-i1.gif"/> where <it>C </it>is the number of concordant pairs, <it>D </it>is the number of discordant pairs, <it>n </it>is the number of pairs, and <graphic file="1471-2148-3-21-i2.gif"/> and <graphic file="1471-2148-3-21-i3.gif"/> are corrections for tied values computed by summing over the number of observations <it>t </it>and <it>u </it>that are tied at any given value for the <it>x </it>and <it>y </it>data sets respectively. Kendall's partial correlation between <it>x </it>and <it>y </it>controlling for <it>z </it>was calculated as <graphic file="1471-2148-3-21-i4.gif"/>. For Kendall's partial &#964; correlation, two-tailed <it>P </it>values were calculated using 10<sup>4 </sup>randomizations of the abundances and the evolutionary rates. The calculations of the significances of the change in Kendall's partial &#964; correlation were performed by determining what fraction of 10<sup>4 </sup>randomizations of the abundances (preserving the interactions and evolutionary rates) yielded an increase or decrease in the partial &#964; larger than that observed for the actual data.</p>
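            <p>As an illustration of these statistics, the following sketch computes a tie-corrected Kendall's &#964; by direct pair counting, the partial &#964;, and a randomization <it>P </it>value. Because the formulas appear above only as images, the code assumes the standard tie-corrected (tau-b) form and the standard partial correlation formula. The randomization scheme, which shuffles evolutionary rates and abundances together relative to the interaction counts, is one plausible reading of the description and is likewise an assumption; all function names and the toy data are hypothetical.</p>

```python
import math
import random
from itertools import combinations


def kendall_tau(x, y):
    """Tie-corrected Kendall tau (tau-b form, assumed) by O(n^2) pair counting."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    for i, j in combinations(range(len(x)), 2):
        dx = x[i] - x[j]
        dy = y[i] - y[j]
        if dx * dy > 0:
            concordant += 1
        elif dx * dy != 0:
            discordant += 1
        elif dx == 0 and dy != 0:
            tied_x_only += 1
        elif dy == 0 and dx != 0:
            tied_y_only += 1
        # pairs tied in both x and y contribute nothing
    untied = concordant + discordant
    denom = math.sqrt((untied + tied_x_only) * (untied + tied_y_only))
    if denom == 0:
        return float("nan")
    return (concordant - discordant) / denom


def partial_tau(x, y, z):
    """Kendall partial tau between x and y controlling for z (standard formula)."""
    t_xy = kendall_tau(x, y)
    t_xz = kendall_tau(x, z)
    t_yz = kendall_tau(y, z)
    denom = math.sqrt((1 - t_xz ** 2) * (1 - t_yz ** 2))
    if denom == 0:
        return float("nan")
    return (t_xy - t_xz * t_yz) / denom


def randomization_p_value(rate, interactions, abundance, n_perm=10 ** 4, seed=0):
    """Two-tailed P value for the partial tau.

    Assumed scheme: rate and abundance are shuffled together, so their
    mutual relationship is preserved while their link to the interaction
    counts is broken.
    """
    rng = random.Random(seed)
    observed = abs(partial_tau(rate, interactions, abundance))
    order = list(range(len(rate)))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(order)
        rate_perm = [rate[i] for i in order]
        abundance_perm = [abundance[i] for i in order]
        if abs(partial_tau(rate_perm, interactions, abundance_perm)) >= observed:
            hits += 1
    return hits / n_perm


# Toy data, for illustration only.
rate = [1, 2, 3, 4, 5]
interactions = [2, 1, 4, 3, 5]
abundance = [1, 3, 2, 5, 4]
print(partial_tau(rate, interactions, abundance))
print(randomization_p_value(rate, interactions, abundance, n_perm=200))
```

            <p>The pair-counting loop is quadratic in the number of proteins, so for data sets of several thousand proteins a faster implementation would be preferable; the sketch favours transparency over speed.</p>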
            <p>For the sampling simulations, we began with a list of all non-duplicate interactions from the combined data set. For the mass-spectrometry simulation, we randomly selected <it>n </it>proteins and all of their interactions to add to the interactions sample set, where <it>n </it>was iterated from 3341 to 10, performing <graphic file="1471-2148-3-21-i5.gif"/> trials at each <it>n</it>. For the yeast two-hybrid simulation, we selected proteins in the same way, but only counted an interaction if both of the proteins participating in the interaction were among the selected proteins. Kendall's &#964; correlation between the evolutionary rate and the number of interactions for each sample set was calculated, and results were binned according to the total number of interactions in the sample set into bins of exponentially scaled size with centers shown in Figure <figr fid="F4">4(B)</figr>. The mean and standard deviation of the correlation were calculated for each bin.</p>
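            <p>The two sampling schemes can be sketched as follows. This is a minimal illustration under stated assumptions rather than the code used for the simulations: interactions are represented as pairs of protein identifiers, the function name and the mode labels are hypothetical, and the binning and correlation steps are omitted.</p>

```python
import random


def sample_interactions(interactions, proteins, n, mode, rng):
    """Draw n proteins at random and return the sampled interaction set.

    mode "mass_spec":  keep every interaction involving a selected protein.
    mode "two_hybrid": keep an interaction only if both partners are selected.
    """
    selected = set(rng.sample(sorted(proteins), n))
    if mode == "mass_spec":
        return {(a, b) for a, b in interactions if a in selected or b in selected}
    if mode == "two_hybrid":
        return {(a, b) for a, b in interactions if a in selected and b in selected}
    raise ValueError("mode must be 'mass_spec' or 'two_hybrid'")


# Toy interaction list with hypothetical identifiers.
interactions = {("A", "B"), ("B", "C"), ("C", "D")}
proteins = {"A", "B", "C", "D"}

# The same seed gives the same protein selection for both schemes.
ms_sample = sample_interactions(interactions, proteins, 2, "mass_spec", random.Random(1))
y2h_sample = sample_interactions(interactions, proteins, 2, "two_hybrid", random.Random(1))
print(sorted(ms_sample), sorted(y2h_sample))
```

            <p>For the same selection of proteins, the two-hybrid sample is always a subset of the mass spectrometry sample, which is why the two schemes yield different numbers of interactions at a given <it>n</it>.</p>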
         </sec>
      </sec>
      <sec>
         <st>
            <p>Authors' Contributions</p>
         </st>
         <p>JDB gathered the data, performed the statistical analysis, and wrote the manuscript. CA provided guidance on the analysis and edited the manuscript. Both authors read and approved the final manuscript.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>We thank Frances H. Arnold for helpful comments and advice. We also thank an anonymous reviewer for insightful comments that greatly improved our work. JDB is supported by a Howard Hughes Medical Institute Predoctoral Fellowship. CA is supported by the NSF under contract number DEB-9981397. Part of this work was carried out at the Jet Propulsion Laboratory, California Institute of Technology, under a contract with the National Aeronautics and Space Administration.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Highly expressed genes in yeast evolve more slowly</p>
            </title>
            <aug>
               <au>
                  <snm>Pal</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Papp</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Hurst</snm>
                  <fnm>LD</fnm>
               </au>
            </aug>
            <source>Genetics</source>
            <pubdate>2001</pubdate>
            <volume>158</volume>
            <fpage>927</fpage>
            <lpage>931</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11430355</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Evolution of mutational robustness</p>
            </title>
            <aug>
               <au>
                  <snm>Wilke</snm>
                  <fnm>CO</fnm>
               </au>
               <au>
                  <snm>Adami</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>Mut Res</source>
            <pubdate>2003</pubdate>
            <volume>523</volume>
            <fpage>3</fpage>
            <lpage>11</lpage>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Does the recombination rate affect the efficiency of purifying selection? The yeast genome provides a partial answer</p>
            </title>
            <aug>
               <au>
                  <snm>Pal</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Papp</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Hurst</snm>
                  <fnm>LD</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>2001</pubdate>
            <volume>18</volume>
            <fpage>2323</fpage>
            <lpage>2326</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11719582</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Rate of evolution and gene dispensability</p>
            </title>
            <aug>
               <au>
                  <snm>Pal</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Papp</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Hurst</snm>
                  <fnm>LD</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2003</pubdate>
            <volume>421</volume>
            <fpage>496</fpage>
            <lpage>498</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/421496b</pubid>
                  <pubid idtype="pmpid" link="fulltext">12556881</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Evolutionary rate in the protein interaction network</p>
            </title>
            <aug>
               <au>
                  <snm>Fraser</snm>
                  <fnm>HB</fnm>
               </au>
               <au>
                  <snm>Hirsh</snm>
                  <fnm>AE</fnm>
               </au>
               <au>
                  <snm>Steinmetz</snm>
                  <fnm>LM</fnm>
               </au>
               <au>
                  <snm>Scharfe</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Feldman</snm>
                  <fnm>MW</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2002</pubdate>
            <volume>296</volume>
            <fpage>750</fpage>
            <lpage>752</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1068696</pubid>
                  <pubid idtype="pmpid" link="fulltext">11976460</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>A simple dependence between protein evolution rate and the number of protein&#8211;protein interactions</p>
            </title>
            <aug>
               <au>
                  <snm>Fraser</snm>
                  <fnm>HB</fnm>
               </au>
               <au>
                  <snm>Wall</snm>
                  <fnm>DP</fnm>
               </au>
               <au>
                  <snm>Hirsh</snm>
                  <fnm>AE</fnm>
               </au>
            </aug>
            <source>BMC Evol Biol</source>
            <pubdate>2003</pubdate>
            <volume>3</volume>
            <fpage>11</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">166126</pubid>
                  <pubid idtype="pmpid" link="fulltext">12769820</pubid>
                  <pubid idtype="doi">10.1186/1471-2148-3-11</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>No simple dependence between protein evolution rate and the number of protein&#8211;protein interactions: only the most prolific interactors tend to evolve slowly</p>
            </title>
            <aug>
               <au>
                  <snm>Jordan</snm>
                  <fnm>IK</fnm>
               </au>
               <au>
                  <snm>Wolf</snm>
                  <fnm>YI</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
            </aug>
            <source>BMC Evol Biol</source>
            <pubdate>2003</pubdate>
            <volume>3</volume>
            <fpage>1</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">140311</pubid>
                  <pubid idtype="pmpid" link="fulltext">12515583</pubid>
                  <pubid idtype="doi">10.1186/1471-2148-3-1</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>The subunit interfaces of oligomeric enzymes are conserved to a similar extent to the overall protein sequences</p>
            </title>
            <aug>
               <au>
                  <snm>Grishin</snm>
                  <fnm>NV</fnm>
               </au>
               <au>
                  <snm>Phillips</snm>
                  <fnm>MA</fnm>
               </au>
            </aug>
            <source>Protein Sci</source>
            <pubdate>1994</pubdate>
            <volume>3</volume>
            <fpage>2455</fpage>
            <lpage>2458</lpage>
            <xrefbib>
               <pubid idtype="pmpid">7757001</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Comparative assessment of large&#8211;scale data sets of protein&#8211;protein interactions</p>
            </title>
            <aug>
               <au>
                  <snm>von Mering</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Krause</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Snel</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Cornell</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Oliver</snm>
                  <fnm>SG</fnm>
               </au>
               <au>
                  <snm>Fields</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Bork</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2002</pubdate>
            <volume>417</volume>
            <fpage>399</fpage>
            <lpage>403</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature750</pubid>
                  <pubid idtype="pmpid" link="fulltext">12000970</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Functional organization of the yeast proteome by systematic analysis of protein complexes</p>
            </title>
            <aug>
               <au>
                  <snm>Gavin</snm>
                  <fnm>AC</fnm>
               </au>
               <au>
                  <snm>Bosche</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Krause</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Grandi</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Marzioch</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Bauer</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Schultz</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Rick</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Michon</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Cruciat</snm>
                  <fnm>CM</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nature</source>
            <pubdate>2002</pubdate>
            <volume>415</volume>
            <fpage>141</fpage>
            <lpage>147</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/415141a</pubid>
                  <pubid idtype="pmpid" link="fulltext">11805826</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Systematic identification of protein complexes in <it>Saccharomyces cerevisiae </it>by mass spectrometry</p>
            </title>
            <aug>
               <au>
                  <snm>Ho</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Gruhler</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Heilbut</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Bader</snm>
                  <fnm>GD</fnm>
               </au>
               <au>
                  <snm>Moore</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Adams</snm>
                  <fnm>SL</fnm>
               </au>
               <au>
                  <snm>Millar</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Taylor</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Bennett</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Boutilier</snm>
                  <fnm>K</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nature</source>
            <pubdate>2002</pubdate>
            <volume>415</volume>
            <fpage>180</fpage>
            <lpage>183</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/415180a</pubid>
                  <pubid idtype="pmpid" link="fulltext">11805837</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>A comprehensive two&#8211;hybrid analysis to explore the yeast protein interactome</p>
            </title>
            <aug>
               <au>
                  <snm>Ito</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Chiba</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Ozawa</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Yoshida</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Hattori</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Sakaki</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2001</pubdate>
            <volume>98</volume>
            <fpage>4569</fpage>
            <lpage>4574</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">31875</pubid>
                  <pubid idtype="pmpid" link="fulltext">11283351</pubid>
                  <pubid idtype="doi">10.1073/pnas.061034498</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>A comprehensive analysis of protein&#8211;protein interactions in <it>Saccharomyces cerevisiae</it></p>
            </title>
            <aug>
               <au>
                  <snm>Uetz</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Giot</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Cagney</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Mansfield</snm>
                  <fnm>TA</fnm>
               </au>
               <au>
                  <snm>Judson</snm>
                  <fnm>RS</fnm>
               </au>
               <au>
                  <snm>Knight</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Lockshon</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Narayan</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Srinivasan</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Pochart</snm>
                  <fnm>P</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nature</source>
            <pubdate>2000</pubdate>
            <volume>403</volume>
            <fpage>623</fpage>
            <lpage>627</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/35001009</pubid>
                  <pubid idtype="pmpid" link="fulltext">10688190</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>The use of gene clusters to infer functional coupling</p>
            </title>
            <aug>
               <au>
                  <snm>Overbeek</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Fonstein</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>D'Souza</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Pusch</snm>
                  <fnm>GD</fnm>
               </au>
               <au>
                  <snm>Maltsev</snm>
                  <fnm>N</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1999</pubdate>
            <volume>96</volume>
            <fpage>2896</fpage>
            <lpage>2901</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">15866</pubid>
                  <pubid idtype="pmpid" link="fulltext">10077608</pubid>
                  <pubid idtype="doi">10.1073/pnas.96.6.2896</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Predicting protein function by genomic context: quantitative evaluation and qualitative inferences</p>
            </title>
            <aug>
               <au>
                  <snm>Huynen</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Snel</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Lathe</snm>
                  <fnm>W</fnm>
                  <suf>III</suf>
               </au>
               <au>
                  <snm>Bork</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2000</pubdate>
            <volume>10</volume>
            <fpage>1204</fpage>
            <lpage>1210</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gr.10.8.1204</pubid>
                  <pubid idtype="pmpid" link="fulltext">10958638</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Assigning protein functions by comparative genome analysis: protein phylogenetic profiles</p>
            </title>
            <aug>
               <au>
                  <snm>Pellegrini</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Marcotte</snm>
                  <fnm>EM</fnm>
               </au>
               <au>
                  <snm>Thompson</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Eisenberg</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Yeates</snm>
                  <fnm>TO</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1999</pubdate>
            <volume>96</volume>
            <fpage>4285</fpage>
            <lpage>4288</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">16324</pubid>
                  <pubid idtype="pmpid" link="fulltext">10200254</pubid>
                  <pubid idtype="doi">10.1073/pnas.96.8.4285</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Protein interaction maps for complete genomes based on gene fusion events</p>
            </title>
            <aug>
               <au>
                  <snm>Enright</snm>
                  <fnm>AJ</fnm>
               </au>
               <au>
                  <snm>Iliopoulos</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Kyrpides</snm>
                  <fnm>NC</fnm>
               </au>
               <au>
                  <snm>Ouzounis</snm>
                  <fnm>CA</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>1999</pubdate>
            <volume>402</volume>
            <fpage>86</fpage>
            <lpage>90</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10573422</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Detecting protein function and protein&#8211;protein interactions from genome sequences</p>
            </title>
            <aug>
               <au>
                  <snm>Marcotte</snm>
                  <fnm>EM</fnm>
               </au>
               <au>
                  <snm>Pellegrini</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Ng</snm>
                  <fnm>HL</fnm>
               </au>
               <au>
                  <snm>Rice</snm>
                  <fnm>DW</fnm>
               </au>
               <au>
                  <snm>Yeates</snm>
                  <fnm>TO</fnm>
               </au>
               <au>
                  <snm>Eisenberg</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1999</pubdate>
            <volume>285</volume>
            <fpage>751</fpage>
            <lpage>753</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.285.5428.751</pubid>
                  <pubid idtype="pmpid" link="fulltext">10427000</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>Systematic genetic analysis with ordered arrays of yeast deletion mutants</p>
            </title>
            <aug>
               <au>
                  <snm>Tong</snm>
                  <fnm>AHY</fnm>
               </au>
               <au>
                  <snm>Evangelista</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Parsons</snm>
                  <fnm>AB</fnm>
               </au>
               <au>
                  <snm>Xu</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Bader</snm>
                  <fnm>GD</fnm>
               </au>
               <au>
                  <snm>Page</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Robinson</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Raghibizadeh</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Hogue</snm>
                  <fnm>CWV</fnm>
               </au>
               <au>
                  <snm>Bussey</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Andrews</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Tyers</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Boone</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2001</pubdate>
            <volume>294</volume>
            <fpage>2364</fpage>
            <lpage>2368</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1065810</pubid>
                  <pubid idtype="pmpid" link="fulltext">11743205</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>Detecting putative orthologs</p>
            </title>
            <aug>
               <au>
                  <snm>Wall</snm>
                  <fnm>DP</fnm>
               </au>
               <au>
                  <snm>Fraser</snm>
                  <fnm>HB</fnm>
               </au>
               <au>
                  <snm>Hirsh</snm>
                  <fnm>AE</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <fpage>1710</fpage>
            <lpage>1711</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btg213</pubid>
                  <pubid idtype="pmpid" link="fulltext">12967969</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Dissecting the regulatory circuitry of a eukaryotic genome</p>
            </title>
            <aug>
               <au>
                  <snm>Holstege</snm>
                  <fnm>FCP</fnm>
               </au>
               <au>
                  <snm>Jennings</snm>
                  <fnm>EG</fnm>
               </au>
               <au>
                  <snm>Wyrick</snm>
                  <fnm>JJ</fnm>
               </au>
               <au>
                  <snm>Lee</snm>
                  <fnm>TI</fnm>
               </au>
               <au>
                  <snm>Hengartner</snm>
                  <fnm>CJ</fnm>
               </au>
               <au>
                  <snm>Green</snm>
                  <fnm>MR</fnm>
               </au>
               <au>
                  <snm>Golub</snm>
                  <fnm>TR</fnm>
               </au>
               <au>
                  <snm>Lander</snm>
                  <fnm>ES</fnm>
               </au>
               <au>
                  <snm>Young</snm>
                  <fnm>RA</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>1998</pubdate>
            <volume>95</volume>
            <fpage>717</fpage>
            <lpage>728</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">9845373</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>Correlation between protein and mRNA abundance in yeast</p>
            </title>
            <aug>
               <au>
                  <snm>Gygi</snm>
                  <fnm>SP</fnm>
               </au>
               <au>
                  <snm>Rochon</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Franza</snm>
                  <fnm>BR</fnm>
               </au>
               <au>
                  <snm>Aebersold</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Mol Cell Biol</source>
            <pubdate>1999</pubdate>
            <volume>19</volume>
            <fpage>1720</fpage>
            <lpage>1730</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">83965</pubid>
                  <pubid idtype="pmpid" link="fulltext">10022859</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>The codon adaptation index &#8211; a measure of directional synonymous codon usage bias, and its potential applications</p>
            </title>
            <aug>
               <au>
                  <snm>Sharp</snm>
                  <fnm>PM</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>WH</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1987</pubdate>
            <volume>15</volume>
            <fpage>1281</fpage>
            <lpage>1295</lpage>
            <xrefbib>
               <pubid idtype="pmpid">3547335</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>Relationship of codon bias to mRNA concentration and protein length in <it>Saccharomyces cerevisiae</it></p>
            </title>
            <aug>
               <au>
                  <snm>Coghlan</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Wolfe</snm>
                  <fnm>KH</fnm>
               </au>
            </aug>
            <source>Yeast</source>
            <pubdate>2000</pubdate>
            <volume>16</volume>
            <fpage>1131</fpage>
            <lpage>1145</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/1097-0061(20000915)16:12&lt;1131::AID-YEA609&gt;3.0.CO;2-F</pubid>
                  <pubid idtype="pmpid" link="fulltext">10953085</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>The protein&#8211;protein interaction map of <it>Helicobacter pylori</it></p>
            </title>
            <aug>
               <au>
                  <snm>Rain</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Selig</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>De Reuse</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Battaglia</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Reverdy</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Simon</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Lenzen</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Petel</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Wojcik</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Schachter</snm>
                  <fnm>V</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nature</source>
            <pubdate>2001</pubdate>
            <volume>409</volume>
            <fpage>211</fpage>
            <lpage>215</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/35051615</pubid>
                  <pubid idtype="pmpid" link="fulltext">11196647</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>Global mapping of meiotic recombination hotspots and coldspots in the yeast <it>Saccharomyces cerevisiae</it></p>
            </title>
            <aug>
               <au>
                  <snm>Gerton</snm>
                  <fnm>JL</fnm>
               </au>
               <au>
                  <snm>DeRisi</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Shroff</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Lichten</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>PO</fnm>
               </au>
               <au>
                  <snm>Petes</snm>
                  <fnm>TD</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2000</pubdate>
            <volume>97</volume>
            <fpage>11383</fpage>
            <lpage>11390</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">17209</pubid>
                  <pubid idtype="pmpid" link="fulltext">11027339</pubid>
                  <pubid idtype="doi">10.1073/pnas.97.21.11383</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>Functional characterization of the <it>S. cerevisiae</it> genome by gene deletion and parallel analysis</p>
            </title>
            <aug>
               <au>
                  <snm>Winzeler</snm>
                  <fnm>EA</fnm>
               </au>
               <au>
                  <snm>Shoemaker</snm>
                  <fnm>DD</fnm>
               </au>
               <au>
                  <snm>Astromoff</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Liang</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1999</pubdate>
            <volume>285</volume>
            <fpage>901</fpage>
            <lpage>906</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.285.5429.901</pubid>
                  <pubid idtype="pmpid" link="fulltext">10436161</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>What determines the rate of sequence evolution?</p>
            </title>
            <aug>
               <au>
                  <snm>Brookfield</snm>
                  <fnm>JFY</fnm>
               </au>
            </aug>
            <source>Curr Biol</source>
            <pubdate>2000</pubdate>
            <volume>10</volume>
            <fpage>R410</fpage>
            <lpage>R411</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0960-9822(00)00506-6</pubid>
                  <pubid idtype="pmpid" link="fulltext">10837241</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>MIPS: a database for genomes and protein sequences</p>
            </title>
            <aug>
               <au>
                  <snm>Mewes</snm>
                  <fnm>HW</fnm>
               </au>
               <au>
                  <snm>Frishman</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Guldener</snm>
                  <fnm>U</fnm>
               </au>
               <au>
                  <snm>Mannhaupt</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Mayer</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Mokrejs</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Morgenstern</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Munsterkotter</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Rudd</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Weil</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2002</pubdate>
            <volume>30</volume>
            <fpage>31</fpage>
            <lpage>34</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">99165</pubid>
                  <pubid idtype="pmpid" link="fulltext">11752246</pubid>
                  <pubid idtype="doi">10.1093/nar/30.1.31</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>Nonparametric Measures of Association</p>
            </title>
            <aug>
               <au>
                  <snm>Gibbons</snm>
                  <fnm>JD</fnm>
               </au>
            </aug>
            <source>In Quantitative Applications in the Social Sciences</source>
            <publisher>Sage Publications</publisher>
            <pubdate>1993</pubdate>
            <volume>91</volume>
         </bibl>
      </refgrp>
   </bm>
</art>
