<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2148-8-23</ui>
   <ji>1471-2148</ji>
   <fm>
      <dochead>Research article</dochead>
      <bibl>
         <title>
            <p>The impact of horizontal gene transfer in shaping operons and protein interaction networks &#8211; direct evidence of preferential attachment</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Davids</snm>
               <fnm>Wagied</fnm>
               <insr iid="I1"/>
               <email>wagied.davids@utoronto.ca</email>
            </au>
            <au id="A2" ca="yes">
               <snm>Zhang</snm>
               <fnm>Zhaolei</fnm>
               <insr iid="I1"/>
               <insr iid="I2"/>
               <email>zhaolei.zhang@utoronto.ca</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Banting &amp; Best Department of Medical Research (BBDMR), Donnelly Centre for Cellular &amp; Biomolecular Research (CCBR), University of Toronto, 160 College Street, Toronto, ON M5S 3E1, Canada</p>
            </ins>
            <ins id="I2">
               <p>Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 3E1, Canada</p>
            </ins>
         </insg>
         <source>BMC Evolutionary Biology</source>
         <issn>1471-2148</issn>
         <pubdate>2008</pubdate>
         <volume>8</volume>
         <issue>1</issue>
         <fpage>23</fpage>
         <url>http://www.biomedcentral.com/1471-2148/8/23</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">18218112</pubid>
               <pubid idtype="doi">10.1186/1471-2148-8-23</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>19</day>
               <month>8</month>
               <year>2007</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>24</day>
               <month>1</month>
               <year>2008</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>24</day>
               <month>1</month>
               <year>2008</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2008</year>
         <collab>Davids and Zhang; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Despite the prevalence of horizontal gene transfer (HGT) in bacteria, to date there have been few studies on HGT in the context of gene expression, operons and protein-protein interactions. Using the recently available data set on the <it>E. coli </it>protein-protein interaction network, we sought to explore the impact of HGT on genome structure and protein networks.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>We classified the <it>E. coli </it>genes into three categories based on their evolutionary conservation: a set of 2158 <it>Core </it>genes that are shared by all <it>E. coli </it>strains, a set of 1044 <it>Non-core </it>genes that are strain-specific, and a set of 1053 genes that were putatively acquired by horizontal transfer. We observed a clear correlation between gene expressivity (measured by Codon Adaptation Index), evolutionary rates, and node connectivity across these categories of genes. Specifically, we found that the <it>Core </it>genes are the most highly expressed and the most slowly evolving, while the <it>HGT </it>genes are expressed at the lowest level and evolve at the highest rate. <it>Core </it>genes are the most likely and <it>HGT </it>genes are the least likely to be members of operons. In addition, we found that the <it>Core </it>genes are on average more highly connected than <it>Non-core </it>and <it>HGT </it>genes in the protein interaction network. However, in the defence COG functional category the <it>HGT </it>genes displayed a significantly higher mean node degree than the <it>Core </it>and <it>Non-core </it>genes. Interestingly, <it>HGT </it>genes are more likely to be connected to <it>Core </it>genes than expected by chance, which suggests a model of differential attachment in the expansion of cellular networks.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>Results from our analysis shed light on the mode and mechanism of the integration of horizontally transferred genes into operons and protein interaction networks.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>It is generally accepted that horizontal gene transfer (HGT) is an important process in bacterial genome evolution, both providing novel metabolic capabilities and catalyzing the diversification of bacterial lineages <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr></abbrgrp>. Although the extent of the evolutionary impact of HGT is still under debate <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>, roughly 10&#8211;40% of the protein-coding genes in the <it>E. coli </it>K12 genome are thought to have been introduced by HGT <abbrgrp><abbr bid="B4">4</abbr></abbrgrp> since the divergence of the species from the <it>Salmonella </it>lineage approximately 100 million years ago <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>.</p>
         <p>Currently, no plausible mechanisms have been proposed for the incorporation of HGT genes into their recipient genomes. We envisage that successful incorporation of a horizontally transferred gene needs not only its successful transcription and translation, but also its integration into the existing functional cellular network. We foresee a number of barriers that potentially exist against the incorporation and expression of horizontally transferred genes in a new recipient genome.</p>
         <p>The first step in the integration of horizontally transferred genes is their incorporation into the host transcription machinery. Bacterial genes are often organized into groups called operons, which enable a simple and unified mechanism of gene regulation. Integrating into operons may be regarded as beneficial for the foreign invading genes, since they gain the opportunity to be both co-regulated and co-expressed with resident genes. Secondly, HGT genes may need to optimize their codon usage to be compatible with the host in order to be efficiently transcribed and translated. Thirdly, the protein product has to be integrated into the functional cellular network in order to gain interaction partners and contribute fitness benefits to the organism. Failure to achieve any of the above steps may result in eventual degradation and pseudogenization.</p>
         <p>Considering the prevalence of horizontal gene transfer during bacterial genome evolution, studies exploring the mode of evolution and expression of horizontally transferred genes, and their impact on genomic organization and protein interactions, would further our understanding of this process. The emergence of high-throughput functional genomics and proteomics data offers a unique opportunity to answer these questions. Our specific aims in this paper were therefore to address the following questions:</p>
         <sec>
            <st>
               <p>(i). Evolutionary Rates and Gene Expression characteristics of Core, Non-core and HGT genes</p>
            </st>
            <p>Bacterial genomes are known to be dynamic, consisting of genes with different evolutionary histories. Some genes are evolutionarily conserved, while others can be gained and lost in a lineage-specific fashion or by horizontal gene transfer events. Prior studies on yeast and vertebrates have suggested that the genes that are the most evolutionarily conserved and most highly expressed evolve at the slowest rate <abbrgrp><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr></abbrgrp>. Therefore, to investigate the effect of selection on these various gene categories, we classified <it>E. coli </it>genes according to their evolutionary conservation into <it>Core</it>, <it>Non-core </it>and <it>HGT </it>genes (see <b>Methods</b>). We hypothesize that the cumulative effect of selection acting on these different gene categories would leave footprints in their sequence and gene expression characteristics.</p>
         </sec>
         <sec>
            <st>
               <p>(ii). The contribution of HGT to operon formation</p>
            </st>
            <p>It is known that horizontally transferred genes can be inserted into existing operons and thus contribute to the dynamic nature of the gene order and membership of these operons <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr><abbr bid="B14">14</abbr><abbr bid="B15">15</abbr></abbrgrp>. Although a few studies have investigated the evolutionary stability and the conservation of gene order of operons <abbrgrp><abbr bid="B16">16</abbr><abbr bid="B17">17</abbr></abbrgrp>, the relative contribution of HGT to the evolutionary composition of operons remains unclear. We therefore aimed to explore the prevalence of <it>HGT </it>genes in operons by cataloguing the operons that contain <it>Core</it>, <it>Non-core </it>and <it>HGT </it>genes.</p>
         </sec>
         <sec>
            <st>
               <p>(iii). The impact of HGT on protein-protein interactions and networks</p>
            </st>
            <p>An aspect largely missing from the study of HGT events is that of protein-protein interactions and cellular networks. A few studies have concentrated on the impact of horizontal gene transfer on metabolic networks <abbrgrp><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr></abbrgrp>. Unfortunately, very little is known about the effect of horizontal gene transfer on global protein interaction networks, mostly because cellular interaction data for bacteria were lacking until recently.</p>
            <p>It has been suggested that the scale-free properties of biological networks may in part be due to a model of preferential attachment by means of gene duplication, whereby new nodes preferentially attach to existing highly connected nodes. In networks that have evolved via preferential attachment, older nodes should have a higher average connectivity than younger nodes <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>. Horizontal gene transfer can be considered as an additional biological mechanism within the existing model of preferential attachment. The two mechanisms are nevertheless distinct: network growth by gene duplication yields a duplicate protein copy with the same or a similar function, whereas growth by HGT may introduce novel functions. In this regard, proteins encoded by <it>HGT </it>genes can be seen as competing with resident genes in establishing and gaining protein interactions.</p>
            <p>We investigated both operons and protein interactions as a means of detecting successful incorporation of putative horizontally transferred genes in the <it>E. coli </it>genome. We explored the possibility that successful <it>HGT </it>genes would require integration at the level of operons to be expressed and integration at the network level to establish fitness benefits to the organism. We found horizontally transferred genes exhibit lower gene expressivity and evolve at faster evolutionary rates than evolutionarily conserved core genes. In addition, although proteins encoded by horizontally transferred genes have lower network connectivity, they preferentially attach to resident <it>Core </it>proteins rather than <it>Non-core </it>proteins within the protein interaction network. We conclude that a small proportion of the low connectivity proteins may have arisen from HGT events.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <sec>
            <st>
               <p>Data</p>
            </st>
            <p>Genome sequences available for the various <it>E. coli </it>strains were downloaded from the NCBI (<it>Escherichia coli K12 MG1655 </it>&#8211; NC_000913; <it>Escherichia coli O157H7 </it>&#8211; NC_002695; <it>Escherichia coli O157H7_EDL933 </it>&#8211; NC_002655; <it>Escherichia coli CFT073 </it>&#8211; NC_004431; <it>Escherichia coli UTI89 </it>&#8211; NC_007946).</p>
         </sec>
         <sec>
            <st>
               <p>Deriving a set of <it>HGT </it>genes in <it>E. coli</it></p>
            </st>
            <p>Our primary data set consisted of horizontal gene transfer events that were identified using a combination of the gene phylogeny and the pattern of gene presence and gene absence <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>. This approach is similar to gene presence/absence analyses <abbrgrp><abbr bid="B21">21</abbr><abbr bid="B22">22</abbr></abbrgrp>.</p>
            <p>For detection of horizontal gene transfer events, a total of 326 complete bacterial genome sequences divided into 8 bacterial clades were downloaded from the MicrobesOnline database <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>. Using each protein sequence encoded in the <it>E. coli </it>K12 genome as a query, BLASTP sequence similarity searches were conducted against all 326 bacterial proteomes. The resulting BLAST hits were categorized into "BestN" hit categories, with the Best0 category referring to the <it>E. coli </it>K12 gene itself. Each gene was assigned to a relative age category (i.e. clade) based on the BLASTP hit with the highest score. The method classifies each gene in the <it>E. coli </it>K12 genome as belonging to (i) a horizontally transferred gene set (named HGT), (ii) a native gene set restricted to the <it>E. coli </it>lineage (named Native) or (iii) a gene set with no known sequence homologs (named ORFan). Thus the BLASTP scores gradually decrease in groups with increasing phylogenetic distance from <it>E. coli </it>K12.</p>
            <p>Multiple sequence alignments of the protein sequences were then constructed using the MUSCLE sequence alignment software <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>, and fast neighbour-joining trees <abbrgrp><abbr bid="B25">25</abbr></abbrgrp> were constructed for each protein sequence alignment. Genes that lack "close" homologs in consecutive groups of related bacteria were then confirmed using a quartet test available within the software package TREE-PUZZLE <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>. To infer a horizontal gene transfer event, gene trees were compared with the MicrobesOnline species tree (see above). If a strongly supported clade in the gene tree was present in disparate genomes, so that three or more deletion events would be required to explain the distribution of the subfamily on the species tree, then an HGT event was assigned.</p>
            <p>In addition, we compared the horizontally transferred genes obtained by our combined phylogenetic and gene presence/absence based method <abbrgrp><abbr bid="B15">15</abbr></abbrgrp> with those obtained by three surrogate (non-tree based) methods, namely HGT-DB <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>, the method published by Mrazek and Karlin <abbrgrp><abbr bid="B28">28</abbr></abbrgrp> and a support vector machine-based method (HGT_SVM) developed by Tsirigos and Rigoutsos <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>, both in terms of overall counts and in terms of their distribution across Clusters of Orthologous Groups (COG) functional categories (see Additional File <supplr sid="S1">1</supplr>).</p>
            <suppl id="S1">
               <title>
                  <p>Additional file 1</p>
               </title>
               <text>
                  <p>Comparison between different HGT gene detection methods. (A) This is a 4-way comparison Venn diagram illustrating the intersection and differences between the horizontal gene transfer detection methods investigated. The comparison included a non-surrogate phylogeny and gene presence/absence based method developed by Price et al. <abbrgrp><abbr bid="B15">15</abbr></abbrgrp> versus three surrogate methods, which included HGT-DB <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>, the method published by Mrazek and Karlin <abbrgrp><abbr bid="B28">28</abbr></abbrgrp> and a support vector machine-based method (HGT_SVM) developed by Tsirigos and Rigoutsos <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>. (B) This is a comparison of Clusters of Orthologous Groups (COG) functional categories between <it>Core</it>, <it>Non-core </it>and <it>HGT </it>gene sets obtained using the same four methods of horizontal gene transfer detection.</p>
               </text>
               <file name="1471-2148-8-23-S1.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <p>Overall, the method developed by Price et al. predicts more <it>HGT </it>genes in <it>E. coli </it>K12 than the three surrogate methods. It is known that base compositional differences between resident and invading genes are "ameliorated" over a few million years <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>. Surrogate methods that use a compositional approach may preferentially detect recent horizontal gene transfer events and genes with atypical base compositions <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>. Thus, a cross-phylum approach that combines phylogenetic tree based methods with gene phyletic profiles is more likely to detect both ancient and recent horizontal gene transfer events.</p>
         </sec>
         <sec>
            <st>
               <p>Deriving a set of <it>Core </it>and a set of <it>Non-core E. coli </it>genes</p>
            </st>
            <p>Our operational definition of a <it>Core </it>set of genes was meant to reflect the evolutionary retention of a set of common genes in all <it>E. coli </it>strains. Our <it>Non-core </it>set reflects genes that are found in at least one strain but not in all strains, and <it>HGT </it>genes correspond to genes that are derived from putative recent horizontal gene transfer events. This distinction between <it>Core </it>and <it>Non-core </it>genes serves to illustrate the difference between a stable and invariant <it>Core </it>component and a variable <it>Non-core </it>component that is specific to <it>E. coli </it>strains. The <it>Non-core </it>genes thus have a restricted phylogenetic distribution limited to one or more <it>E. coli </it>strains, and can be lost or gained in a strain-specific manner. To ensure that there is no overlap between the evolutionary gene categories, we filtered the <it>Core</it>, <it>Non-core </it>and <it>HGT </it>gene sets so that each gene belongs to exactly one category.</p>
            <p>We derived an evolutionary <it>Core </it>set of 2158 <it>E. coli </it>genes based on the criteria of using phylogenetic gene conservation and genomic context (positional gene conservation). Starting with an all-vs-all protein sequence comparison consisting of the five <it>E. coli </it>genomes, we grouped <it>E. coli </it>K12 genes based on their phylogenetic gene conservation profiles within all five strains. To ensure a high quality <it>Core </it>gene set, we extracted and compared the chromosomal locations of all <it>Core </it>genes. It is known that genes which evolve vertically between closely related species can be divided into those that retain homologous chromosomal positions (positional orthologs) and those that do not <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>. In addition, phylogenetic trees were constructed based on selected protein sequences to verify the phylogenetic relationship between the five <it>E. coli </it>strains.</p>
            <p>Our <it>Non-core </it>gene set was obtained by filtering the BLAST sequence comparison results for <it>E. coli </it>K12 genes that had BLAST hits in at least one other <it>E. coli </it>genome but were not present in all genomes. We also extracted and compared the chromosomal locations of the genes in this set and constructed phylogenetic trees for further investigation. Consistent with its lower phylogenetic conservation, this gene set was also positionally conserved to a lesser extent.</p>
            <p>In addition, results from correspondence analysis of codon usage also revealed a distinction between our <it>Core</it>, <it>Non-core </it>and <it>HGT </it>gene categories (Additional File <supplr sid="S2">2</supplr>). The <it>Core</it>, <it>Non-core </it>and <it>HGT </it>gene lists can be found in Additional Files.</p>
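            <p>The presence/absence logic behind the <it>Core </it>and <it>Non-core </it>definitions can be sketched as follows. This is a minimal illustration rather than the pipeline used in this study: the gene names and the hit table are hypothetical, the positional-conservation check is omitted, and <it>HGT </it>genes are simply set aside first so that the three categories stay disjoint.</p>

```python
# Strain labels follow the genomes listed under Data; K12 is the reference.
OTHER_STRAINS = ("O157H7", "O157H7_EDL933", "CFT073", "UTI89")


def classify_genes(hits, hgt_genes):
    """Split K12 genes into disjoint Core, Non-core and HGT sets.

    hits maps each K12 gene to the set of other strains with a homolog
    (hypothetical input; in practice derived from BLAST comparisons).
    """
    core, non_core = set(), set()
    for gene, strains in hits.items():
        if gene in hgt_genes:
            continue  # HGT takes precedence so the categories never overlap
        found = set(strains).intersection(OTHER_STRAINS)
        if len(found) == len(OTHER_STRAINS):
            core.add(gene)      # present in all five strains
        elif found:
            non_core.add(gene)  # present in some other strains, but not all
        # Assumption: genes with no hit in any other strain stay unclassified.
    return core, non_core, set(hgt_genes).intersection(hits)


# Toy example with made-up gene names.
example_hits = {
    "geneA": {"O157H7", "O157H7_EDL933", "CFT073", "UTI89"},
    "geneB": {"O157H7"},
    "geneC": {"O157H7", "O157H7_EDL933", "CFT073", "UTI89"},
    "geneD": set(),
}
core, non_core, hgt = classify_genes(example_hits, hgt_genes={"geneC"})
print(sorted(core), sorted(non_core), sorted(hgt))
```

            <p>In this toy input, geneC is conserved in every strain but is assigned to the <it>HGT </it>set because the filter gives that category precedence; a different precedence order would be an equally valid assumption.</p>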
            <suppl id="S2">
               <title>
                  <p>Additional file 2</p>
               </title>
               <text>
                  <p>Codon usage of <it>Core</it>, <it>Non-core </it>and <it>HGT </it>genes. This is a correspondence analysis of codon usage from <it>E. coli Core</it>, <it>Non-core</it>, and putative <it>HGT </it>genes using the first two principal components.</p>
               </text>
               <file name="1471-2148-8-23-S2.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
         </sec>
         <sec>
            <st>
               <p><it>E. coli </it>operons</p>
            </st>
            <p>Data pertaining to <it>Escherichia coli </it>operons and transcriptional units were downloaded from RegulonDB version 5.7 <abbrgrp><abbr bid="B32">32</abbr><abbr bid="B33">33</abbr></abbrgrp>. RegulonDB is a manually curated database that focuses on transcriptional regulation in <it>E. coli </it>with information extracted from literature as well as sequence databases such as GenBank. Its basic structural unit is the operon, which describes the elements and properties of transcriptional regulation. Thus in keeping with this definition, we refer to an operon as a poly-cistronic transcribed unit with its associated regulatory sites, whereas a regulon is defined as a group of operons controlled by a single regulator. As of RegulonDB version 5.7, 4570 <it>E. coli </it>genes have been annotated and organized into 2684 operons.</p>
         </sec>
         <sec>
            <st>
               <p>Analysis of <it>E. coli </it>gene expression</p>
            </st>
            <p><it>E. coli </it>K12 MG1655 microarray gene expression data were downloaded from the NCBI GEO microarray database <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>. We selected the GDS2600 data set, which closely approximates growth under normal conditions. This data set is a time course that monitors the expression of 4405 <it>E. coli </it>genes using spotted cDNAs during stationary phase in LB medium. Log2-transformed gene expression values were used, and genes with missing data were excluded from the analysis. For each gene, the mean expression value across time points was calculated and used for subsequent analysis.</p>
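            <p>The per-gene summary described above can be sketched as follows. This is an illustrative sketch only: the input format (a mapping from gene to raw intensities, with None marking a missing value) and the gene names are assumptions, and values that are already log2-transformed in the source file would skip the transformation step.</p>

```python
import math


def mean_log2_expression(intensities):
    """Return {gene: mean log2 value} for genes with complete data.

    intensities maps gene -> list of positive raw values over time points,
    with None marking a missing measurement (hypothetical input format).
    """
    means = {}
    for gene, values in intensities.items():
        if not values or any(v is None for v in values):
            continue  # exclude genes with missing data
        logs = [math.log2(v) for v in values]
        means[gene] = sum(logs) / len(logs)
    return means


example = {
    "geneA": [8.0, 16.0, 32.0],  # log2 values 3, 4 and 5
    "geneB": [4.0, None, 2.0],   # dropped: one time point is missing
}
print(mean_log2_expression(example))  # {'geneA': 4.0}
```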
         </sec>
         <sec>
            <st>
               <p>Protein-Protein interaction networks</p>
            </st>
            <p>For construction of the <it>E. coli </it>interaction network, we extracted the protein-protein interaction data from a recently published mass spectrometry study <abbrgrp><abbr bid="B35">35</abbr></abbrgrp>. We examined this data set carefully to confirm that it was not biased towards particular pathways or functional categories using the KEGG pathways and COG functional classification databases respectively. The whole analysis was also re-performed using the protein interaction data from Arifuzzaman et al. <abbrgrp><abbr bid="B36">36</abbr></abbrgrp> (Additional Files <supplr sid="S3">3</supplr> and <supplr sid="S4">4</supplr>).</p>
            <suppl id="S3">
               <title>
                  <p>Additional file 3</p>
               </title>
               <text>
                  <p>Comparison between two <it>E. coli </it>interaction studies. This is a comparison of COG functional classes between the Arifuzzaman et al. (2006) and Butland et al. (2005) <it>E. coli </it>protein interaction networks.</p>
               </text>
               <file name="1471-2148-8-23-S3.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <suppl id="S4">
               <title>
                  <p>Additional file 4</p>
               </title>
               <text>
                  <p>Comparison between two <it>E. coli </it>interaction studies. This is a comparison between the protein interaction data sets published by Arifuzzaman et al. (2006) and Butland et al. (2005).</p>
               </text>
               <file name="1471-2148-8-23-S4.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
         </sec>
         <sec>
            <st>
               <p>Software</p>
            </st>
            <p>Detection of orthologs was performed using a reciprocal best-hits approach as implemented in the RSD algorithm <abbrgrp><abbr bid="B37">37</abbr></abbrgrp>. Multiple sequence alignments were constructed from protein sequences using the ClustalW package <abbrgrp><abbr bid="B38">38</abbr></abbrgrp>. Phylogenetic tree reconstructions were performed using the neighbour-joining method <abbrgrp><abbr bid="B39">39</abbr></abbrgrp>. Evolutionary substitution rates were estimated using the CODEML program available from the PAML package <abbrgrp><abbr bid="B40">40</abbr></abbrgrp>. Network analyses were performed using algorithms implemented in the NetworkX package <abbrgrp><abbr bid="B41">41</abbr></abbrgrp> and visualised using PAJEK <abbrgrp><abbr bid="B42">42</abbr></abbrgrp>. Statistical analyses were performed using the R programming environment.</p>
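            <p>The general idea of reciprocal best hits can be sketched as follows. This is a plain score-based illustration and not the RSD algorithm itself; the score tables and sequence names are hypothetical, and ties are resolved in favour of the first hit encountered.</p>

```python
def best_hits(scores):
    """Map each query to its highest-scoring subject.

    scores maps (query, subject) -> similarity score (hypothetical table,
    for example parsed from BLASTP output).
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {q: s for q, (s, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return pairs (a, b) where a's best hit is b and b's best hit is a."""
    forward = best_hits(a_vs_b)
    backward = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if backward.get(b) == a)


# Toy example: a2 also hits b1, but b1's best hit is a1, so only one pair remains.
a_vs_b = {("a1", "b1"): 200, ("a1", "b2"): 50, ("a2", "b1"): 120}
b_vs_a = {("b1", "a1"): 198, ("b2", "a1"): 48}
print(reciprocal_best_hits(a_vs_b, b_vs_a))  # [('a1', 'b1')]
```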
         </sec>
      </sec>
      <sec>
         <st>
            <p>Results and Discussion</p>
         </st>
         <sec>
            <st>
               <p><it>HGT </it>genes evolve faster and have lower expression levels</p>
            </st>
            <p>To investigate the selective pressure acting on organizational units, we classified <it>E. coli </it>genes according to their evolutionary conservation into three categories, namely (i) <b><it>Core Set</it></b>: a conserved core set of genes that exist in all <it>E. coli </it>strains; (ii) <b><it>Non-core Set</it></b>: genes that are found in at least one strain but not in all strains; and (iii) <b><it>HGT Set</it></b>: genes that are derived from putative recent horizontal gene transfer events after the divergence of <it>E. coli </it>and <it>Salmonella</it>. By delineating genes according to their evolutionary conservation, we can more clearly identify the evolutionary forces to which the various evolutionary classes of genes are subjected.</p>
            <p>Direct measurements of <it>E. coli </it>gene expression were obtained from microarray gene expression experiments (see <b>Methods</b>). In addition, we used the codon adaptation index (CAI) as a proxy for gene expression, which we refer to as "gene expressivity" <abbrgrp><abbr bid="B43">43</abbr></abbrgrp>.</p>
            <p>Figure <figr fid="F1">1</figr> shows that the <it>Core </it>genes have higher CAI gene expressivity levels (Figure <figr fid="F1">1A</figr>) as well as higher log2 expression values (Figure <figr fid="F1">1B</figr>) than <it>Non-core </it>and <it>HGT </it>genes (t-test and Wilcoxon rank-sum test, p-value &lt; 0.001). This can be explained by the different evolutionary histories of these three groups of genes. The <it>Core </it>genes, being the oldest resident genes in the genome, have had sufficient time to adapt and optimise their codon usage patterns, which explains their higher levels of gene expressivity, whereas recently transferred genes may need an adaptation period during which their base composition and codon usage patterns are optimised to their new resident genome.</p>
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>Box plot of <b>(A) </b>gene expressivity (CAI) values and <b>(B) </b>log2 gene expression values between <it>Core</it>, <it>Non-core </it>and <it>HGT </it>genes</p>
               </caption>
               <text>
                  <p>Box plot of <b>(A) </b>gene expressivity (CAI) values and <b>(B) </b>log2 gene expression values between <it>Core</it>, <it>Non-core </it>and <it>HGT </it>genes. <it>Core </it>genes display higher expressivity than <it>Non-core </it>and <it>HGT </it>genes (P-value &lt; 0.001).</p>
               </text>
               <graphic file="1471-2148-8-23-1"/>
            </fig>
            <p>Figure <figr fid="F2">2</figr> shows that amongst the three categories of gene sets, the <it>Core </it>genes evolve at the lowest substitution rates (<it>dN/dS</it>) and the <it>HGT </it>genes evolve at the fastest rates, using <it>E. coli K12 </it>as reference for comparison (Wilcoxon-Mann-Whitney test, p-value &lt; 0.001). The high evolutionary rates observed for <it>HGT </it>genes may be explained by either of the following two hypotheses: (i) they result from reduced negative selection pressure, which enables the invading genes to be purged from the genome, or (ii) they result from increased positive selection, whereby <it>HGT </it>genes contribute to the phenotypic character of <it>E. coli </it>strains <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>. Accordingly, it is thought that the strain-specific <it>Non-core </it>genes and <it>HGT </it>genes may contribute to the pathogenic character separating the enterohemorrhagic and uropathogenic strains from the benign <it>E. coli K12 </it>strain; these genes would therefore be under positive selection pressure.</p>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>Distribution of evolutionary rates (<it>dN/dS</it>) for various <it>E. coli </it>strains overlaid on a phylogenetic tree using <it>E. coli K12 </it>as reference for genome comparisons</p>
               </caption>
               <text>
                  <p>Distribution of evolutionary rates (<it>dN/dS</it>) for various <it>E. coli </it>strains overlaid on a phylogenetic tree using <it>E. coli K12 </it>as reference for genome comparisons. <it>Core </it>genes evolve slower than <it>Non-core </it>and <it>HGT </it>genes (P-value &lt; 0.001).</p>
               </text>
               <graphic file="1471-2148-8-23-2"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>Genes in Operons and Networks Display Higher Gene Expressivity</p>
            </st>
            <p>There is increasing evidence to suggest that the chromosomal gene order in organisms is not always random <abbrgrp><abbr bid="B44">44</abbr></abbrgrp>. It is known that proteins of linked genes evolve at comparable rates, and that natural selection may promote the conservation of linkage of co-expressed genes <abbrgrp><abbr bid="B45">45</abbr></abbrgrp>. Accordingly, genes in the same operon occur in close physical proximity and are often known to be co-transcribed as units. In addition, genes encoding subunits of protein complexes also need to be expressed at similar times.</p>
            <p>To investigate the relative contributions of the various evolutionary gene categories to organizational structures, we surveyed both operons and the protein interaction network for their content of <it>Core</it>, <it>Non-core </it>and <it>HGT </it>genes. The <it>Core </it>set forms the predominant portion of operons: 47% (2129 out of 4506 genes catalogued in RegulonDB version 5.7) of operon genes are <it>Core </it>genes, whereas <it>Non-core </it>genes and <it>HGT </it>genes account for 21% (948 out of 4506) and 23% (1020 out of 4506) of the gene constituents of operons, respectively (Figure <figr fid="F3">3A</figr>). Similarly, proteins encoded by <it>Core </it>genes account for 67.5% (852 out of 1262) of the protein interaction network as reported by Butland et al. <abbrgrp><abbr bid="B46">46</abbr></abbrgrp>, whereas <it>Non-core </it>genes and <it>HGT </it>genes account for 14.1% (178 out of 1262) and 18.4% (232 out of 1262), respectively (Figure <figr fid="F3">3B</figr>).</p>
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>Number of <it>E. coli </it>genes in the genome organizations: <b>(A) </b>operons, <b>(B) </b>protein interaction network (PIN)</p>
               </caption>
               <text>
                  <p>Number of <it>E. coli </it>genes in the genome organizations: <b>(A) </b>operons, <b>(B) </b>protein interaction network (PIN). Genes are classified into three evolutionary categories <it>Core</it>, <it>Non</it>-<it>core </it>and <it>HGT </it>genes. <it>Core </it>genes predominantly occur in both operons and protein interaction networks (P-value &lt; 0.001).</p>
               </text>
               <graphic file="1471-2148-8-23-3"/>
            </fig>
            <p>The tendency of operons to be enriched in <it>Core </it>genes (Chi-squared test, p-value &lt; 0.001) can be explained by a need to simplify regulation, since genes residing in operons are known to be under control of the same promoter. This may facilitate horizontal gene transfer by enabling genes to be inherited as a physically and functionally cohesive group rather than as separate individual genes. In regard to the protein interaction network, it is thought that the <it>Core </it>genes form the ancestral backbone of the network, to which new functionalities are added via protein nodes; this strengthens a model by which pathways expand <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>.</p>
            <p>To understand the impact of higher order organization of genes (i.e. operons) and proteins (i.e. interaction complexes) on properties such as expression or evolution, we investigated the gene expressivity characteristics and evolutionary substitution rates of the various categories of gene sets. We found that <it>Core </it>genes in organizational clusters (both operons and the protein interaction network, or PIN) have higher gene expressivity (CAI) values (Figure <figr fid="F4">4A</figr> and <figr fid="F4">4B</figr>) as well as higher log2 expression values (Figure <figr fid="F4">4C</figr> and <figr fid="F4">4D</figr>) relative to <it>Non-core </it>and <it>HGT </it>genes (t-test and Wilcoxon test for both operons and PIN, p-value &lt; 0.001). For the PIN, this trend was robust against removal of ribosomal proteins.</p>
            <fig id="F4">
               <title>
                  <p>Figure 4</p>
               </title>
               <caption>
                  <p>Gene expressivity (CAI) values and log2 gene expression values between <it>Core</it>, <it>Non-core </it>and <it>HGT </it>genes in different genome organizations</p>
               </caption>
               <text>
                  <p>Gene expressivity (CAI) values and log2 gene expression values between <it>Core</it>, <it>Non-core </it>and <it>HGT </it>genes in different genome organizations. <b>(A) </b>Box plot of CAI values between <it>Core</it>, <it>Non-core </it>and <it>HGT </it>genes in operons; <b>(B) </b>Box plot of gene CAI values between <it>Core</it>, <it>Non-core </it>and <it>HGT </it>genes in protein interaction network (PIN); <b>(C) </b>Box plot of log2 gene expression values between <it>Core</it>, <it>Non-core </it>and <it>HGT </it>genes in operons; <b>(D) </b>Box plot of log2 gene expression values between <it>Core</it>, <it>Non-core </it>and <it>HGT </it>genes in protein interaction network (PIN).</p>
               </text>
               <graphic file="1471-2148-8-23-4"/>
            </fig>
            <p>The overall trend from surveying operons and the protein interaction network indicates that <it>Core </it>genes tend to be found more often in organizational units such as operons and networks. The evolutionary composition may be the reason that highly clustered proteins in the protein interaction network display apparently high gene expressivity and low substitution rates.</p>
         </sec>
         <sec>
            <st>
               <p>Distribution of COG Functional Categories between <it>Core, Non-core </it>and <it>HGT </it>genes within the Operons and Protein Interaction Network</p>
            </st>
            <p>We analyzed and compared the distribution of the Clusters of Orthologous Groups (COG) functional categories of the <it>Core</it>, <it>Non-core </it>and <it>HGT </it>genes within the <it>E. coli </it>K12 genome, protein interaction network and operons (Figures <figr fid="F5">5</figr>, <figr fid="F6">6</figr> and <figr fid="F7">7</figr>). The various gene categories differ significantly in their COG distribution in the genome, the protein interaction network and operons (Scheirer-Ray-Hare test, p-value &lt; 0.001, see Additional File <supplr sid="S5">5</supplr>).</p>
            <suppl id="S5">
               <title>
                  <p>Additional file 5</p>
               </title>
               <text>
                  <p>Statistical tests for the COG distribution. (A) Kruskal-Wallis ANOVA with Scheirer-Ray-Hare extension on the ranks of COG category counts in the Genome. (B) Kruskal-Wallis ANOVA with Scheirer-Ray-Hare extension on the ranks of COG category counts in the Operons. (C) Kruskal-Wallis ANOVA with Scheirer-Ray-Hare extension on the ranks of COG category counts in the protein interaction network (PPI).</p>
               </text>
               <file name="1471-2148-8-23-S5.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <fig id="F5">
               <title>
                  <p>Figure 5</p>
               </title>
               <caption>
                  <p>Distribution among the COG categories for all the E. coli genes</p>
               </caption>
               <text>
                  <p>Distribution among the COG categories for all the E. coli genes. Counts were estimated for each evolutionary gene category, and expressed as percentages per total number of genes per COG category. The <it>Core</it>, <it>Non-core </it>and <it>HGT </it>gene sets differ in their distribution of COG functional categories (P-value &lt; 0.001).</p>
               </text>
               <graphic file="1471-2148-8-23-5"/>
            </fig>
            <fig id="F6">
               <title>
                  <p>Figure 6</p>
               </title>
               <caption>
                  <p>Distribution among the COG categories for those E. coli genes that are members of operons</p>
               </caption>
               <text>
                  <p>Distribution among the COG categories for those E. coli genes that are members of operons. Counts were estimated for each evolutionary gene category, and expressed as percentages per total number of genes per COG category. The <it>Core</it>, <it>Non-core </it>and <it>HGT </it>gene sets contained within operons differ in their distribution of COG functional categories (P-value &lt; 0.001).</p>
               </text>
               <graphic file="1471-2148-8-23-6"/>
            </fig>
            <fig id="F7">
               <title>
                  <p>Figure 7</p>
               </title>
               <caption>
                  <p>Distribution among the COG categories for those E. coli genes that are members of the protein-protein interaction network</p>
               </caption>
               <text>
                  <p>Distribution among the COG categories for those E. coli genes that are members of the protein-protein interaction network. Counts were estimated for each evolutionary gene category, and expressed as percentages per total number of genes per COG category. The <it>Core</it>, <it>Non-core </it>and <it>HGT </it>gene sets contained within the protein interaction network differ in their distribution of COG functional categories (P-value &lt; 0.001).</p>
               </text>
               <graphic file="1471-2148-8-23-7"/>
            </fig>
            <p>In the overall gene comparison of the <it>E. coli </it>K12 <it>Core</it>, <it>Non-core </it>and <it>HGT </it>gene sets, the <it>Core </it>genes constituted the major evolutionary gene set present in all COG categories (Figure <figr fid="F5">5</figr>). The <it>Non-core </it>gene set in comparison to the <it>HGT </it>gene set was markedly abundant in two COG categories: <b>O </b>(<it>Posttranslational modification, protein turnover, chaperones</it>) and <b>T </b>(<it>Signal transduction mechanisms</it>). The <it>HGT </it>gene set was more abundant than the <it>Non-core </it>gene set in the COG categories <b>C </b>(<it>Energy production and conversion</it>), <b>F </b>(<it>Nucleotide transport and metabolism</it>), <b>G </b>(<it>Carbohydrate transport and metabolism</it>), <b>I </b>(<it>Lipid metabolism</it>), <b>K </b>(<it>Transcription</it>) and <b>V </b>(<it>Defense mechanisms</it>). For the operons, the <it>Core </it>genes occur predominantly in all COG functional categories, whereas the <it>Non-core </it>genes are over-represented in COG categories <b>S </b>(<it>Function unknown</it>) and <b>U </b>(<it>Intracellular trafficking, secretion, and vesicular transport</it>) and the <it>HGT </it>genes are over-represented in comparison to the <it>Non-core </it>genes in COG functional categories <b>C</b>, <b>E </b>(<it>Amino acid transport and metabolism</it>), <b>G</b>, <b>H </b>(<it>Coenzyme metabolism</it>), <b>R </b>(<it>General function prediction only</it>) and <b>V </b>(<it>Defense mechanisms</it>) (Figure <figr fid="F6">6</figr>).</p>
            <p>For the protein interaction network, the <it>HGT </it>genes are over-represented in COG functional categories, most notably <b>C</b>, <b>G</b>, <b>H</b>, and <b>V </b>(Figure <figr fid="F7">7</figr>). A notable example in this regard is the COG category <b>V</b>, in which the <it>HGT </it>gene set within the <it>E. coli </it>protein interaction network has a significantly higher mean node degree than the <it>Core </it>and <it>Non-core </it>gene sets. The overall statistical difference in distribution of COG functional categories between the <it>Core</it>, <it>Non-core </it>and <it>HGT </it>gene sets therefore seems to argue against the notion of a <it>Core</it>-versus-<it>Non-core </it>or <it>Core</it>-versus-acquired gene category consisting of <it>Non-core </it>and <it>HGT </it>genes, and instead strengthens the notion of a distinct separate category for <it>Non-core </it>genes.</p>
         </sec>
         <sec>
            <st>
               <p>Network topology of the <it>E. coli </it>genes</p>
            </st>
            <p>To investigate the mode and mechanism of integration of horizontally transferred genes into the <it>E. coli </it>protein-protein interaction network, we systematically investigated the network characteristics of proteins encoded by the various evolutionary categories of genes (Table <tblr tid="T1">1</tblr>). We found that proteins corresponding to the <it>Core </it>gene set represent the most highly connected protein nodes, with an average connectivity of 10.9 interactors (Chi-squared test, p-value &lt; 0.05). In contrast, <it>Non-core </it>proteins and proteins encoded by <it>HGT </it>genes have lower average connectivities of 4.1 and 3.4 interactors, respectively. This is consistent with our hypothesis that <it>Core </it>genes, being the most highly conserved genes, have resided in the genome for much longer and have thus had more opportunities to evolve interactions.</p>
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>Protein interaction network characteristics of <it>E. coli Core, Non-core </it>and <it>HGT </it>genes</p>
               </caption>
               <tblbdy cols="4">
                  <r>
                     <c ca="left">
                        <p>
                           <b>Characteristic</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>
                              <it>Core</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>
                              <it>Non-core</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>
                              <it>HGT</it>
                           </b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="4">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Total nodes (1276)</p>
                     </c>
                     <c ca="left">
                        <p>852 (66.8%)</p>
                     </c>
                     <c ca="left">
                        <p>178 (14.0%)</p>
                     </c>
                     <c ca="left">
                        <p>232 (18.2%)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Ave. Node degree</p>
                     </c>
                     <c ca="left">
                        <p>10.9</p>
                     </c>
                     <c ca="left">
                        <p>4.1</p>
                     </c>
                     <c ca="left">
                        <p>3.4</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Ave. clustering coefficient</p>
                     </c>
                     <c ca="left">
                        <p>0.100</p>
                     </c>
                     <c ca="left">
                        <p>0.072</p>
                     </c>
                     <c ca="left">
                        <p>0.039</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Ave. betweenness centrality</p>
                     </c>
                     <c ca="left">
                        <p>2.59e-3</p>
                     </c>
                     <c ca="left">
                        <p>6.3e-4</p>
                     </c>
                     <c ca="left">
                        <p>5.5e-4</p>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
            <p>We also analyzed two additional network properties: <it>betweenness centrality </it>and <it>clustering coefficient </it>(Table <tblr tid="T1">1</tblr>). <it>Betweenness centrality </it>characterizes how essential a node is in maintaining communication between each pair of nodes in a network <abbrgrp><abbr bid="B48">48</abbr></abbrgrp>. Depending on its position within the network, removal of a node can have very different effects on the connectivity, topology and flux of the network. Some nodes can be removed without any harmful effect, while the removal of others separates a connected network into disconnected sub-graphs. <it>Betweenness centrality </it>is a measure devised to describe the fraction of shortest paths going through a given node, with high values indicating that a node lies on many of the shortest paths between other nodes. Removal of nodes with high centrality makes it more difficult to reach one node from another, and thus lengthens the paths between nodes. The <it>clustering coefficient </it>describes the local transitivity in a network, that is, the tendency of two nodes that share a common neighbour to be neighbours of each other <abbrgrp><abbr bid="B49">49</abbr></abbrgrp>.</p>
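            <p>As an illustration only, the two measures described above can be sketched on a small hypothetical network. The node names, the toy edge list and the function names below are assumptions made for this example; they are not data or software from this study, and the betweenness values are normalised by the number of node pairs, which may differ from the normalisation used for Table 1.</p>

```python
from collections import deque
from itertools import combinations

# Hypothetical toy network (not data from this study):
# each node maps to the set of its neighbours.
TOY_NETWORK = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}


def clustering_coefficient(graph, node):
    """Fraction of neighbour pairs of `node` that are themselves linked."""
    pairs = list(combinations(sorted(graph[node]), 2))
    if not pairs:
        return 0.0
    linked = sum(1 for u, v in pairs if v in graph[u])
    return linked / len(pairs)


def shortest_path_counts(graph, source):
    """Breadth-first search: distances and numbers of shortest paths from source."""
    dist = {source: 0}
    count = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                count[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                count[v] += count[u]
    return dist, count


def betweenness_centrality(graph, node):
    """Average fraction of shortest paths between other node pairs that pass through `node`."""
    info = {s: shortest_path_counts(graph, s) for s in graph}
    others = [n for n in graph if n != node]
    total = 0.0
    for s, t in combinations(others, 2):
        dist_s, count_s = info[s]
        dist_t, count_t = info[t]
        # Skip pairs that are not connected to each other or to `node`.
        if t not in dist_s or node not in dist_s or node not in dist_t:
            continue
        # `node` lies on a shortest s-t path only if the distances add up.
        if dist_s[node] + dist_t[node] == dist_s[t]:
            total += count_s[node] * count_t[node] / count_s[t]
    n_pairs = len(others) * (len(others) - 1) / 2
    return total / n_pairs if n_pairs else 0.0


for name in sorted(TOY_NETWORK):
    print(name,
          round(clustering_coefficient(TOY_NETWORK, name), 3),
          round(betweenness_centrality(TOY_NETWORK, name), 3))
```

            <p>In this toy network, node c bridges the triangle a-b-c and the chain d-e, so it combines a high betweenness centrality with a moderate clustering coefficient, whereas the peripheral node e scores zero on both measures.</p>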
            <p>Table <tblr tid="T1">1</tblr> shows that the <it>HGT </it>genes have lower <it>betweenness centrality </it>than the <it>Core </it>and <it>Non-core </it>genes, which suggests that they are less important in cellular communications. Interestingly, the <it>Non-core </it>genes have higher <it>betweenness centrality </it>than the <it>Core </it>genes, the implication of which needs to be further explored. On the other hand, <it>Core </it>genes have the highest <it>clustering coefficients</it>, with any two <it>Core </it>genes having a common neighbour being more likely to be neighbours of each other. The results in Table <tblr tid="T1">1</tblr> indicate that the <it>HGT </it>genes are the least important in maintaining the overall connectivity of the protein interaction network; in other words, they are more likely to be <it>peripheral nodes</it>.</p>
            <p>Our analysis of the distribution of COG functional categories of the <it>Core</it>, <it>Non-core </it>and <it>HGT </it>nodes within the <it>E. coli </it>protein interaction network reveals that the <it>Core </it>genes are the most abundant and cover all the major COG functional categories in comparison to the <it>Non-core </it>and <it>HGT </it>gene sets (Figure <figr fid="F7">7</figr>). Although the <it>Non-core </it>and <it>HGT </it>genes show similar COG distribution profiles within the protein interaction network, differences exist in COG categories <b>C</b>, <b>G</b>, <b>H </b>and <b>V</b>, in which the <it>HGT </it>genes are relatively more abundant than <it>Non-core </it>genes. A notable result in this regard is the COG defense category (<b>V</b>), in which the <it>HGT </it>gene set within the <it>E. coli </it>protein interaction network has a significantly higher mean node degree than the <it>Core </it>and <it>Non-core </it>gene sets.</p>
         </sec>
         <sec>
            <st>
               <p>Preferential Attachment of HGT proteins to <it>Core </it>proteins</p>
            </st>
            <p>We further investigated the evolutionary profiles of the interaction partners in the network. Table <tblr tid="T2">2</tblr> shows that about 74% of all the interactions are between a pair of <it>Core </it>genes, and 11.2% of the interactions are between a <it>Core </it>gene and a <it>Non-core </it>gene. Together, these two classes account for about 85% of the interactions; when <it>Core</it>-<it>HGT </it>interactions (12.8%) are included, about 98% of all interactions involve at least one <it>Core </it>gene. Among the interactions involving <it>HGT </it>genes, a large percentage (89%) was between an <it>HGT </it>gene and a <it>Core </it>gene, while interactions between <it>Non-core </it>and <it>HGT </it>genes account for only 1% of all interactions. This is surprising since the ratio between <it>Core </it>genes and <it>Non-core </it>genes is only ~5:1, much smaller than the 9:1 ratio (89%: 10%) that we observed in the network. This discrepancy in ratio implies that an <it>HGT </it>gene has a higher propensity to establish an interaction with a <it>Core </it>gene than with a <it>Non-core </it>gene. Indeed, the proportion of HGT-Core interactions is higher than expected by chance (Chi-squared test, p-value &lt; 0.001).</p>
            <tbl id="T2">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p>Classification of interactions based on the evolutionary profiles of interaction partners.</p>
               </caption>
               <tblbdy cols="2">
                  <r>
                     <c ca="left">
                        <p>
                           <b>Category of interacting partners</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>Number of Interactions</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="2">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Core to Core</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>3981 (74.0%)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Non-core to Non-core</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>35 (0.6%)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>HGT to HGT</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>24 (0.4%)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Core to Non-core</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>606 (11.2%)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Core to HGT</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>687 (12.8%)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>Non-core to HGT</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>55 (1.0%)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Total Interactions</p>
                     </c>
                     <c ca="left">
                        <p>5388 (100%)</p>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
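            <p>The proportions discussed above can be re-derived from the counts in Tables 1 and 2 with a few lines of arithmetic. The sketch below is illustrative only: the null model (partners of <it>HGT </it>proteins drawn in proportion to the numbers of <it>Core </it>and <it>Non-core </it>nodes), the restriction to <it>Core </it>and <it>Non-core </it>partners, and the tabulated critical value of 10.828 (chi-squared, one degree of freedom, p = 0.001) are assumptions made for this example and are not necessarily the test performed in this study.</p>

```python
# Interaction counts copied from Table 2.
INTERACTIONS = {
    ("Core", "Core"): 3981,
    ("Non-core", "Non-core"): 35,
    ("HGT", "HGT"): 24,
    ("Core", "Non-core"): 606,
    ("Core", "HGT"): 687,
    ("Non-core", "HGT"): 55,
}

# Node counts copied from Table 1.
NODES = {"Core": 852, "Non-core": 178, "HGT": 232}

total = sum(INTERACTIONS.values())

# Share of all interactions that involve at least one Core protein.
with_core = sum(n for pair, n in INTERACTIONS.items() if "Core" in pair)
percent_with_core = 100.0 * with_core / total

# Share of HGT-involving interactions whose partner is a Core protein.
hgt_total = sum(n for pair, n in INTERACTIONS.items() if "HGT" in pair)
hgt_core_share = 100.0 * INTERACTIONS[("Core", "HGT")] / hgt_total

# Assumed null model: HGT proteins pick Core or Non-core partners in
# proportion to how many nodes of each class exist in the network.
observed = {
    "Core": INTERACTIONS[("Core", "HGT")],
    "Non-core": INTERACTIONS[("Non-core", "HGT")],
}
pool = NODES["Core"] + NODES["Non-core"]
n_observed = sum(observed.values())
expected = {k: n_observed * NODES[k] / pool for k in observed}

# Pearson chi-squared goodness-of-fit statistic (one degree of freedom).
chi2 = sum((observed[k] - expected[k]) ** 2 / expected[k] for k in observed)

# Tabulated critical value for one degree of freedom at p = 0.001.
CRITICAL_1DF_P001 = 10.828
significant = chi2 > CRITICAL_1DF_P001

print(f"total interactions: {total}")
print(f"involving a Core protein: {percent_with_core:.1f}%")
print(f"HGT interactions with a Core partner: {hgt_core_share:.1f}%")
print(f"chi-squared statistic: {chi2:.2f}, exceeds critical value: {significant}")
```

            <p>Under these assumptions the statistic lies well above the critical value, which agrees in direction with the excess of HGT-Core interactions reported in the text.</p>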
            <p>Such a model of preferential attachment has previously been proposed to explain the growth of protein interaction networks in <it>S. cerevisiae </it><abbrgrp><abbr bid="B20">20</abbr><abbr bid="B50">50</abbr><abbr bid="B51">51</abbr></abbrgrp> and was also suggested recently for <it>E. coli </it><abbrgrp><abbr bid="B52">52</abbr></abbrgrp>; however, it has remained mostly unproven because it has been difficult to trace back the evolutionary history of protein networks. Along this line, the <it>HGT </it>genes in <it>E. coli </it>offer a unique opportunity to test this theory, since these genes are indeed "new genes" that were only added to the network after the HGT event ~100 million years ago <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>. Our observation provides direct evidence and support for this model, which has not been reported previously.</p>
            <sec>
               <st>
                  <p>Data Availability</p>
               </st>
               <p>Additional file <supplr sid="S6">6</supplr> contains the data used and produced in this study.</p>
            </sec>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>To our knowledge, our analysis represents the first time that HGT events have been investigated in the context of protein-protein interactions and cellular networks. This is important because horizontal gene transfer is known to be prevalent in bacterial genome evolution and in shaping genome content, and has had an impact on the stability and evolution of protein interactions and networks.</p>
         <p>From our analyses, the distinguishing characteristics that set the <it>HGT </it>gene category apart from the <it>Non-core </it>and <it>Core </it>gene categories are (i) higher evolutionary substitution rates (<it>dN/dS</it>), (ii) protein interaction network statistical properties such as protein <it>degree connectivity, average clustering coefficient and betweenness centrality</it>, and (iii) preferential attachment with regard to the number of interactions formed by <it>HGT </it>genes, which indicates that <it>HGT </it>proteins preferentially associate with <it>Core </it>proteins rather than with each other or with <it>Non-core </it>proteins within the <it>E. coli </it>protein interaction network.</p>
         <p>Results from our study revealed a clear relationship between gene expressivity, evolutionary rate and protein connectivity for the three evolutionary classes of genes (Figure <figr fid="F8">8</figr>). The conserved <it>Core </it>set of genes generally displays higher gene expressivity and protein connectivity than strain-specific <it>Non-core </it>and <it>HGT </it>genes. However, both gene expressivity and protein connectivity are inversely related to evolutionary rates, with the most highly conserved genes evolving the slowest. In contrast, horizontally transferred genes evolve at considerably higher evolutionary rates, and have lower gene expressivity and protein connectivity. In addition, proteins encoded by horizontally transferred genes attach preferentially to <it>Core </it>proteins within the <it>E. coli </it>protein interaction network. Consistent with this finding is the general idea that <it>Core </it>genes are the oldest resident genes and form the backbone of the protein interaction network to which new proteins are attached. These results may also suggest that a proportion of the lowest-connectivity proteins in bacterial protein interaction networks are encoded by genes that are more likely to have been recently transferred and incorporated into the genome, as observed here for <it>E. coli</it>.</p>
         <fig id="F8">
            <title>
               <p>Figure 8</p>
            </title>
            <caption>
               <p>Summary of the relationship between protein connectivity, gene expressivity (CAI) and evolutionary rates (<it>dN/dS</it>) in <it>E. coli</it></p>
            </caption>
            <text>
               <p>Summary of the relationship between protein connectivity, gene expressivity (CAI) and evolutionary rates (<it>dN/dS</it>) in <it>E. coli</it>.</p>
            </text>
            <graphic file="1471-2148-8-23-8"/>
         </fig>
         <p>This is reminiscent of the so-called "Complexity Hypothesis", which was proposed to explain why the successful horizontal transfer of a gene is less probable if the connectivity of the protein it encodes is large <abbrgrp><abbr bid="B52">52</abbr></abbrgrp>, and its later modification called the "Extended Complexity Hypothesis" <abbrgrp><abbr bid="B53">53</abbr></abbrgrp>, which aims to explain why adaptive evolution is the least likely for proteins with high complexity. Although the Complexity Hypothesis and its modified version aim to describe which types of genes are more or less likely to be subjected to horizontal gene transfer, they fail to provide a mode and mechanism for subsequent integration of the horizontally transferred gene into its new recipient genome. The results from our analysis support these hypotheses with genomics and evolutionary data.</p>
         <p>Considering the prevalence of HGT in bacteria, the relative contribution of HGT as a mechanism additional to gene duplication may prove increasingly important for network evolution. Thus, with the availability of proteomics data for more bacteria, we will most likely gain more insight into the impact of HGT on the evolution of networks.</p>
      </sec>
      <sec>
         <st>
            <p>List of abbreviations used</p>
         </st>
         <p>HGT: Horizontal Gene Transfer</p>
         <p>PIN: Protein Interaction Network</p>
         <p>CAI: Codon Adaptation Index</p>
         <p>BLAST: Basic Local Alignment Search Tool</p>
         <p>KEGG: Kyoto Encyclopedia of Genes and Genomes <url>http://www.genome.jp/kegg</url></p>
         <p>COG: Clusters of Orthologous Groups <url>http://www.ncbi.nih.gov/COG</url></p>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>WD collected data, carried out the calculations, performed statistical analyses, and drafted the manuscript. WD and ZZ designed the study. ZZ participated in writing the manuscript. All authors read and approved the final manuscript.</p>
         <suppl id="S6">
            <title>
               <p>Additional file 6</p>
            </title>
            <text>
               <p>Data_2008_0117.zip. Compressed zip file containing data used in the study</p>
            </text>
            <file name="1471-2148-8-23-S6.zip">
               <p>Click here for file</p>
            </file>
         </suppl>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>We gratefully acknowledge Morgan Price for providing the data pertaining to horizontally transferred genes; we are also grateful for the helpful suggestions provided by anonymous reviewers. The authors acknowledge funding support from Genome Canada through the Ontario Genomics Institute.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Lateral gene transfer and the nature of bacterial innovation</p>
            </title>
            <aug>
               <au>
                  <snm>Ochman</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Lawrence</snm>
                  <fnm>JG</fnm>
               </au>
               <au>
                  <snm>Groisman</snm>
                  <fnm>EA</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2000</pubdate>
            <volume>405</volume>
            <issue>6784</issue>
            <fpage>299</fpage>
            <lpage>304</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/35012500</pubid>
                  <pubid idtype="pmpid" link="fulltext">10830951</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Reconciling the many faces of lateral gene transfer</p>
            </title>
            <aug>
               <au>
                  <snm>Lawrence</snm>
                  <fnm>JG</fnm>
               </au>
               <au>
                  <snm>Ochman</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Trends Microbiol</source>
            <pubdate>2002</pubdate>
            <volume>10</volume>
            <issue>1</issue>
            <fpage>1</fpage>
            <lpage>4</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0966-842X(01)02282-X</pubid>
                  <pubid idtype="pmpid" link="fulltext">11755071</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Horizontal gene transfer: a critical view</p>
            </title>
            <aug>
               <au>
                  <snm>Kurland</snm>
                  <fnm>CG</fnm>
               </au>
               <au>
                  <snm>Canback</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Berg</snm>
                  <fnm>OG</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2003</pubdate>
            <volume>100</volume>
            <issue>17</issue>
            <fpage>9658</fpage>
            <lpage>9662</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">187805</pubid>
                  <pubid idtype="pmpid" link="fulltext">12902542</pubid>
                  <pubid idtype="doi">10.1073/pnas.1632870100</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Molecular archaeology of the Escherichia coli genome</p>
            </title>
            <aug>
               <au>
                  <snm>Lawrence</snm>
                  <fnm>JG</fnm>
               </au>
               <au>
                  <snm>Ochman</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1998</pubdate>
            <volume>95</volume>
            <issue>16</issue>
            <fpage>9413</fpage>
            <lpage>9417</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">21352</pubid>
                  <pubid idtype="pmpid" link="fulltext">9689094</pubid>
                  <pubid idtype="doi">10.1073/pnas.95.16.9413</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Determining divergence times of the major kingdoms of living organisms with a protein clock</p>
            </title>
            <aug>
               <au>
                  <snm>Doolittle</snm>
                  <fnm>RF</fnm>
               </au>
               <au>
                  <snm>Feng</snm>
                  <fnm>DF</fnm>
               </au>
               <au>
                  <snm>Tsang</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Cho</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Little</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1996</pubdate>
            <volume>271</volume>
            <issue>5248</issue>
            <fpage>470</fpage>
            <lpage>477</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.271.5248.470</pubid>
                  <pubid idtype="pmpid" link="fulltext">8560259</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Gene expression intensity shapes evolutionary rates of the proteins encoded by the vertebrate genome</p>
            </title>
            <aug>
               <au>
                  <snm>Subramanian</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kumar</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Genetics</source>
            <pubdate>2004</pubdate>
            <volume>168</volume>
            <issue>1</issue>
            <fpage>373</fpage>
            <lpage>381</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1448110</pubid>
                  <pubid idtype="pmpid" link="fulltext">15454550</pubid>
                  <pubid idtype="doi">10.1534/genetics.104.028944</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Why highly expressed proteins evolve slowly</p>
            </title>
            <aug>
               <au>
                  <snm>Drummond</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Bloom</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Adami</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Wilke</snm>
                  <fnm>CO</fnm>
               </au>
               <au>
                  <snm>Arnold</snm>
                  <fnm>FH</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2005</pubdate>
            <volume>102</volume>
            <issue>40</issue>
            <fpage>14338</fpage>
            <lpage>14343</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1242296</pubid>
                  <pubid idtype="pmpid" link="fulltext">16176987</pubid>
                  <pubid idtype="doi">10.1073/pnas.0504070102</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Selfish operons: horizontal transfer may drive the evolution of gene clusters</p>
            </title>
            <aug>
               <au>
                  <snm>Lawrence</snm>
                  <fnm>JG</fnm>
               </au>
               <au>
                  <snm>Roth</snm>
                  <fnm>JR</fnm>
               </au>
            </aug>
            <source>Genetics</source>
            <pubdate>1996</pubdate>
            <volume>143</volume>
            <issue>4</issue>
            <fpage>1843</fpage>
            <lpage>1860</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1207444</pubid>
                  <pubid idtype="pmpid" link="fulltext">8844169</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Selfish operons and speciation by gene transfer</p>
            </title>
            <aug>
               <au>
                  <snm>Lawrence</snm>
                  <fnm>JG</fnm>
               </au>
            </aug>
            <source>Trends Microbiol</source>
            <pubdate>1997</pubdate>
            <volume>5</volume>
            <issue>9</issue>
            <fpage>355</fpage>
            <lpage>359</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0966-842X(97)01110-4</pubid>
                  <pubid idtype="pmpid" link="fulltext">9294891</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Selfish operons: the evolutionary impact of gene clustering in prokaryotes and eukaryotes</p>
            </title>
            <aug>
               <au>
                  <snm>Lawrence</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Curr Opin Genet Dev</source>
            <pubdate>1999</pubdate>
            <volume>9</volume>
            <issue>6</issue>
            <fpage>642</fpage>
            <lpage>648</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0959-437X(99)00025-8</pubid>
                  <pubid idtype="pmpid" link="fulltext">10607610</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Evolution of mosaic operons by horizontal gene transfer and gene displacement in situ</p>
            </title>
            <aug>
               <au>
                  <snm>Omelchenko</snm>
                  <fnm>MV</fnm>
               </au>
               <au>
                  <snm>Makarova</snm>
                  <fnm>KS</fnm>
               </au>
               <au>
                  <snm>Wolf</snm>
                  <fnm>YI</fnm>
               </au>
               <au>
                  <snm>Rogozin</snm>
                  <fnm>IB</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2003</pubdate>
            <volume>4</volume>
            <issue>9</issue>
            <fpage>R55</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">193655</pubid>
                  <pubid idtype="pmpid" link="fulltext">12952534</pubid>
                  <pubid idtype="doi">10.1186/gb-2003-4-9-r55</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Operon formation is driven by co-regulation and not by horizontal gene transfer</p>
            </title>
            <aug>
               <au>
                  <snm>Price</snm>
                  <fnm>MN</fnm>
               </au>
               <au>
                  <snm>Huang</snm>
                  <fnm>KH</fnm>
               </au>
               <au>
                  <snm>Arkin</snm>
                  <fnm>AP</fnm>
               </au>
               <au>
                  <snm>Alm</snm>
                  <fnm>EJ</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2005</pubdate>
            <volume>15</volume>
            <issue>6</issue>
            <fpage>809</fpage>
            <lpage>819</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1142471</pubid>
                  <pubid idtype="pmpid" link="fulltext">15930492</pubid>
                  <pubid idtype="doi">10.1101/gr.3368805</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>The role of laterally transferred genes in adaptive evolution</p>
            </title>
            <aug>
               <au>
                  <snm>Marri</snm>
                  <fnm>PR</fnm>
               </au>
               <au>
                  <snm>Hao</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Golding</snm>
                  <fnm>GB</fnm>
               </au>
            </aug>
            <source>BMC Evol Biol</source>
            <pubdate>2007</pubdate>
            <volume>7</volume>
            <issue>Suppl 1</issue>
            <fpage>S8</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1796617</pubid>
                  <pubid idtype="pmpid" link="fulltext">17288581</pubid>
                  <pubid idtype="doi">10.1186/1471-2148-7-S1-S8</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>The fate of laterally transferred genes: life in the fast lane to adaptation or death</p>
            </title>
            <aug>
               <au>
                  <snm>Hao</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Golding</snm>
                  <fnm>GB</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2006</pubdate>
            <volume>16</volume>
            <issue>5</issue>
            <fpage>636</fpage>
            <lpage>643</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1457040</pubid>
                  <pubid idtype="pmpid" link="fulltext">16651664</pubid>
                  <pubid idtype="doi">10.1101/gr.4746406</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Horizontal gene transfer and the evolution of transcriptional regulation in Escherichia coli</p>
            </title>
            <aug>
               <au>
                  <snm>Price</snm>
                  <fnm>MN</fnm>
               </au>
               <au>
                  <snm>Dehal</snm>
                  <fnm>PS</fnm>
               </au>
               <au>
                  <snm>Arkin</snm>
                  <fnm>AP</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2008</pubdate>
            <volume>9</volume>
            <issue>1</issue>
            <fpage>R4</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1186/gb-2008-9-1-r4</pubid>
                  <pubid idtype="pmpid" link="fulltext">18179685</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Evolutionary instability of operon structures disclosed by sequence comparisons of complete microbial genomes</p>
            </title>
            <aug>
               <au>
                  <snm>Itoh</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Takemoto</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Mori</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Gojobori</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>1999</pubdate>
            <volume>16</volume>
            <issue>3</issue>
            <fpage>332</fpage>
            <lpage>346</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10331260</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Operon conservation from the point of view of Escherichia coli, and inference of functional interdependence of gene products from genome context</p>
            </title>
            <aug>
               <au>
                  <snm>Moreno-Hagelsieb</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Collado-Vides</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>In Silico Biol</source>
            <pubdate>2002</pubdate>
            <volume>2</volume>
            <issue>2</issue>
            <fpage>87</fpage>
            <lpage>95</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12066843</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Preferential attachment in the evolution of metabolic networks</p>
            </title>
            <aug>
               <au>
                  <snm>Light</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kraulis</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Elofsson</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>BMC Genomics</source>
            <pubdate>2005</pubdate>
            <volume>6</volume>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1316878</pubid>
                  <pubid idtype="pmpid" link="fulltext">16281983</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>Adaptive evolution of bacterial metabolic networks by horizontal gene transfer</p>
            </title>
            <aug>
               <au>
                  <snm>Pal</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Papp</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Lercher</snm>
                  <fnm>MJ</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2005</pubdate>
            <volume>37</volume>
            <issue>12</issue>
            <fpage>1372</fpage>
            <lpage>1375</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng1686</pubid>
                  <pubid idtype="pmpid" link="fulltext">16311593</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>Emergence of scaling in random networks</p>
            </title>
            <aug>
               <au>
                  <snm>Barabasi</snm>
                  <fnm>AL</fnm>
               </au>
               <au>
                  <snm>Albert</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1999</pubdate>
            <volume>286</volume>
            <issue>5439</issue>
            <fpage>509</fpage>
            <lpage>512</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.286.5439.509</pubid>
                  <pubid idtype="pmpid" link="fulltext">10521342</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Distributional profiles of homologous open reading frames among bacterial phyla: implications for vertical and lateral transmission</p>
            </title>
            <aug>
               <au>
                  <snm>Ragan</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Charlebois</snm>
                  <fnm>RL</fnm>
               </au>
            </aug>
            <source>Int J Syst Evol Microbiol</source>
            <pubdate>2002</pubdate>
            <volume>52</volume>
            <issue>Pt 3</issue>
            <fpage>777</fpage>
            <lpage>787</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1099/ijs.0.02026-0</pubid>
                  <pubid idtype="pmpid" link="fulltext">12054238</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>Quartet mapping and the extent of lateral transfer in bacterial genomes</p>
            </title>
            <aug>
               <au>
                  <snm>Daubin</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Ochman</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>2004</pubdate>
            <volume>21</volume>
            <issue>1</issue>
            <fpage>86</fpage>
            <lpage>89</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/molbev/msg234</pubid>
                  <pubid idtype="pmpid" link="fulltext">12949130</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>microbesonline.org</p>
            </title>
            <url>http://www.microbesonline.org</url>
         </bibl>
         <bibl id="B24">
            <title>
               <p>MUSCLE: multiple sequence alignment with high accuracy and high throughput</p>
            </title>
            <aug>
               <au>
                  <snm>Edgar</snm>
                  <fnm>RC</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2004</pubdate>
            <volume>32</volume>
            <issue>5</issue>
            <fpage>1792</fpage>
            <lpage>1797</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">390337</pubid>
                  <pubid idtype="pmpid" link="fulltext">15034147</pubid>
                  <pubid idtype="doi">10.1093/nar/gkh340</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>QuickTree: building huge Neighbour-Joining trees of protein sequences</p>
            </title>
            <aug>
               <au>
                  <snm>Howe</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Bateman</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Durbin</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2002</pubdate>
            <volume>18</volume>
            <issue>11</issue>
            <fpage>1546</fpage>
            <lpage>1547</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/18.11.1546</pubid>
                  <pubid idtype="pmpid" link="fulltext">12424131</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>TREE-PUZZLE: maximum likelihood phylogenetic analysis using quartets and parallel computing</p>
            </title>
            <aug>
               <au>
                  <snm>Schmidt</snm>
                  <fnm>HA</fnm>
               </au>
               <au>
                  <snm>Strimmer</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Vingron</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>von Haeseler</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2002</pubdate>
            <volume>18</volume>
            <issue>3</issue>
            <fpage>502</fpage>
            <lpage>504</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/18.3.502</pubid>
                  <pubid idtype="pmpid" link="fulltext">11934758</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>HGT-DB: a database of putative horizontally transferred genes in prokaryotic complete genomes</p>
            </title>
            <aug>
               <au>
                  <snm>Garcia-Vallve</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Guzman</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Montero</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Romeu</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2003</pubdate>
            <volume>31</volume>
            <issue>1</issue>
            <fpage>187</fpage>
            <lpage>189</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">165451</pubid>
                  <pubid idtype="pmpid" link="fulltext">12519978</pubid>
                  <pubid idtype="doi">10.1093/nar/gkg004</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>Detecting alien genes in bacterial genomes</p>
            </title>
            <aug>
               <au>
                  <snm>Mrazek</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Karlin</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Ann N Y Acad Sci</source>
            <pubdate>1999</pubdate>
            <volume>870</volume>
            <fpage>314</fpage>
            <lpage>329</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1111/j.1749-6632.1999.tb08893.x</pubid>
                  <pubid idtype="pmpid" link="fulltext">10415493</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>A sensitive, support-vector-machine method for the detection of horizontal gene transfers in viral, archaeal and bacterial genomes</p>
            </title>
            <aug>
               <au>
                  <snm>Tsirigos</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Rigoutsos</snm>
                  <fnm>I</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2005</pubdate>
            <volume>33</volume>
            <issue>12</issue>
            <fpage>3699</fpage>
            <lpage>3707</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1174904</pubid>
                  <pubid idtype="pmpid" link="fulltext">16006619</pubid>
                  <pubid idtype="doi">10.1093/nar/gki660</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>Amelioration of bacterial genomes: rates of change and exchange</p>
            </title>
            <aug>
               <au>
                  <snm>Lawrence</snm>
                  <fnm>JG</fnm>
               </au>
               <au>
                  <snm>Ochman</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>J Mol Evol</source>
            <pubdate>1997</pubdate>
            <volume>44</volume>
            <issue>4</issue>
            <fpage>383</fpage>
            <lpage>397</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1007/PL00006158</pubid>
                  <pubid idtype="pmpid" link="fulltext">9089078</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>Codon bias and base composition are poor indicators of horizontally transferred genes</p>
            </title>
            <aug>
               <au>
                  <snm>Koski</snm>
                  <fnm>LB</fnm>
               </au>
               <au>
                  <snm>Morton</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Golding</snm>
                  <fnm>GB</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>2001</pubdate>
            <volume>18</volume>
            <issue>3</issue>
            <fpage>404</fpage>
            <lpage>412</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11230541</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>RegulonDB: a database on transcriptional regulation in Escherichia coli</p>
            </title>
            <aug>
               <au>
                  <snm>Huerta</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Salgado</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Thieffry</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Collado-Vides</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1998</pubdate>
            <volume>26</volume>
            <issue>1</issue>
            <fpage>55</fpage>
            <lpage>59</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">147189</pubid>
                  <pubid idtype="pmpid" link="fulltext">9399800</pubid>
                  <pubid idtype="doi">10.1093/nar/26.1.55</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B33">
            <title>
               <p>RegulonDB (version 5.0): Escherichia coli K-12 transcriptional regulatory network, operon organization, and growth conditions</p>
            </title>
            <aug>
               <au>
                  <snm>Salgado</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Gama-Castro</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Peralta-Gil</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Diaz-Peredo</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Sanchez-Solano</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Santos-Zavaleta</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Martinez-Flores</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Jimenez-Jacinto</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Bonavides-Martinez</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Segura-Salazar</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Martinez-Antonio</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Collado-Vides</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2006</pubdate>
            <volume>34</volume>
            <issue>Database issue</issue>
            <fpage>D394</fpage>
            <lpage>D397</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1347518</pubid>
                  <pubid idtype="pmpid" link="fulltext">16381895</pubid>
                  <pubid idtype="doi">10.1093/nar/gkj156</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B34">
            <title>
               <p>NCBI GEO: mining millions of expression profiles &#8211; database and tools</p>
            </title>
            <aug>
               <au>
                  <snm>Barrett</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Suzek</snm>
                  <fnm>TO</fnm>
               </au>
               <au>
                  <snm>Troup</snm>
                  <fnm>DB</fnm>
               </au>
               <au>
                  <snm>Wilhite</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Ngau</snm>
                  <fnm>WC</fnm>
               </au>
               <au>
                  <snm>Ledoux</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Rudnev</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Lash</snm>
                  <fnm>AE</fnm>
               </au>
               <au>
                  <snm>Fujibuchi</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Edgar</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2005</pubdate>
            <volume>33</volume>
            <issue>Database issue</issue>
            <fpage>D562</fpage>
            <lpage>D566</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">539976</pubid>
                  <pubid idtype="pmpid" link="fulltext">15608262</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B35">
            <title>
               <p>Interaction network containing conserved and essential protein complexes in Escherichia coli</p>
            </title>
            <aug>
               <au>
                  <snm>Butland</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Peregrin-Alvarez</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Yang</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Yang</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Canadien</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Starostine</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Richards</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Beattie</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Krogan</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Davey</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Parkinson</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Greenblatt</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Emili</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2005</pubdate>
            <volume>433</volume>
            <issue>7025</issue>
            <fpage>531</fpage>
            <lpage>537</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature03239</pubid>
                  <pubid idtype="pmpid" link="fulltext">15690043</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B36">
            <title>
               <p>Large-scale identification of protein-protein interaction of Escherichia coli K-12</p>
            </title>
            <aug>
               <au>
                  <snm>Arifuzzaman</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Maeda</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Itoh</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Nishikata</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Takita</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Saito</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Ara</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Nakahigashi</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Huang</snm>
                  <fnm>HC</fnm>
               </au>
               <au>
                  <snm>Hirai</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Tsuzuki</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Nakamura</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Altaf-Ul-Amin</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Oshima</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Baba</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Yamamoto</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Kawamura</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Ioka-Nakamichi</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Kitagawa</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Tomita</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Kanaya</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Wada</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Mori</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2006</pubdate>
            <volume>16</volume>
            <issue>5</issue>
            <fpage>686</fpage>
            <lpage>691</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1457052</pubid>
                  <pubid idtype="pmpid" link="fulltext">16606699</pubid>
                  <pubid idtype="doi">10.1101/gr.4527806</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B37">
            <title>
               <p>Detecting putative orthologs</p>
            </title>
            <aug>
               <au>
                  <snm>Wall</snm>
                  <fnm>DP</fnm>
               </au>
               <au>
                  <snm>Fraser</snm>
                  <fnm>HB</fnm>
               </au>
               <au>
                  <snm>Hirsh</snm>
                  <fnm>AE</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <issue>13</issue>
            <fpage>1710</fpage>
            <lpage>1711</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btg213</pubid>
                  <pubid idtype="pmpid" link="fulltext">15593400</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B38">
            <title>
               <p>CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice</p>
            </title>
            <aug>
               <au>
                  <snm>Thompson</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Higgins</snm>
                  <fnm>DG</fnm>
               </au>
               <au>
                  <snm>Gibson</snm>
                  <fnm>TJ</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1994</pubdate>
            <volume>22</volume>
            <issue>22</issue>
            <fpage>4673</fpage>
            <lpage>4680</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">308517</pubid>
                  <pubid idtype="pmpid" link="fulltext">7984417</pubid>
                  <pubid idtype="doi">10.1093/nar/22.22.4673</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B39">
            <title>
               <p>The neighbor-joining method: a new method for reconstructing phylogenetic trees</p>
            </title>
            <aug>
               <au>
                  <snm>Saitou</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Nei</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>1987</pubdate>
            <volume>4</volume>
            <issue>4</issue>
            <fpage>406</fpage>
            <lpage>425</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">3447015</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B40">
            <title>
               <p>PAML: a program package for phylogenetic analysis by maximum likelihood</p>
            </title>
            <aug>
               <au>
                  <snm>Yang</snm>
                  <fnm>Z</fnm>
               </au>
            </aug>
            <source>Comput Appl Biosci</source>
            <pubdate>1997</pubdate>
            <volume>13</volume>
            <issue>5</issue>
            <fpage>555</fpage>
            <lpage>556</lpage>
            <xrefbib>
               <pubid idtype="pmpid">9367129</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B41">
            <title>
               <p>NetworkX package</p>
            </title>
            <url>https://networkx.lanl.gov</url>
         </bibl>
         <bibl id="B42">
            <title>
               <p>Pajek &#8211; Program for large network analysis</p>
            </title>
            <aug>
               <au>
                  <snm>Batagelj</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Mrvar</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Connections</source>
            <pubdate>1998</pubdate>
            <volume>21</volume>
            <issue>2</issue>
            <fpage>47</fpage>
            <lpage>57</lpage>
         </bibl>
         <bibl id="B43">
            <title>
               <p>Codon usage in bacteria: correlation with gene expressivity</p>
            </title>
            <aug>
               <au>
                  <snm>Gouy</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Gautier</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1982</pubdate>
            <volume>10</volume>
            <issue>22</issue>
            <fpage>7055</fpage>
            <lpage>7074</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">326988</pubid>
                  <pubid idtype="pmpid" link="fulltext">6760125</pubid>
                  <pubid idtype="doi">10.1093/nar/10.22.7055</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B44">
            <title>
               <p>The proteins of linked genes evolve at similar rates</p>
            </title>
            <aug>
               <au>
                  <snm>Williams</snm>
                  <fnm>EJ</fnm>
               </au>
               <au>
                  <snm>Hurst</snm>
                  <fnm>LD</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2000</pubdate>
            <volume>407</volume>
            <issue>6806</issue>
            <fpage>900</fpage>
            <lpage>903</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/35038066</pubid>
                  <pubid idtype="pmpid" link="fulltext">11057667</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B45">
            <title>
               <p>Natural selection promotes the conservation of linkage of co-expressed genes</p>
            </title>
            <aug>
               <au>
                  <snm>Hurst</snm>
                  <fnm>LD</fnm>
               </au>
               <au>
                  <snm>Williams</snm>
                  <fnm>EJ</fnm>
               </au>
               <au>
                  <snm>Pal</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>2002</pubdate>
            <volume>18</volume>
            <issue>12</issue>
            <fpage>604</fpage>
            <lpage>606</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0168-9525(02)02813-5</pubid>
                  <pubid idtype="pmpid" link="fulltext">12446137</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B46">
            <title>
               <p>The complete DNA sequence and analysis of the large virulence plasmid of Escherichia coli O157:H7</p>
            </title>
            <aug>
               <au>
                  <snm>Burland</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Shao</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Perna</snm>
                  <fnm>NT</fnm>
               </au>
               <au>
                  <snm>Plunkett</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Sofia</snm>
                  <fnm>HJ</fnm>
               </au>
               <au>
                  <snm>Blattner</snm>
                  <fnm>FR</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1998</pubdate>
            <volume>26</volume>
            <issue>18</issue>
            <fpage>4196</fpage>
            <lpage>4204</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">147824</pubid>
                  <pubid idtype="pmpid" link="fulltext">9722640</pubid>
                  <pubid idtype="doi">10.1093/nar/26.18.4196</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B47">
            <title>
               <p>Enzyme recruitment in evolution of new function</p>
            </title>
            <aug>
               <au>
                  <snm>Jensen</snm>
                  <fnm>RA</fnm>
               </au>
            </aug>
            <source>Annu Rev Microbiol</source>
            <pubdate>1976</pubdate>
            <volume>30</volume>
            <fpage>409</fpage>
            <lpage>425</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1146/annurev.mi.30.100176.002205</pubid>
                  <pubid idtype="pmpid" link="fulltext">791073</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B48">
            <title>
               <p>A set of measures of centrality based on betweenness</p>
            </title>
            <aug>
               <au>
                  <snm>Freeman</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>Sociometry</source>
            <pubdate>1977</pubdate>
            <volume>40</volume>
            <fpage>35</fpage>
            <lpage>41</lpage>
            <xrefbib>
               <pubid idtype="doi">10.2307/3033543</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B49">
            <title>
               <p>Collective dynamics of 'small-world' networks</p>
            </title>
            <aug>
               <au>
                  <snm>Watts</snm>
                  <fnm>DJ</fnm>
               </au>
               <au>
                  <snm>Strogatz</snm>
                  <fnm>SH</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>1998</pubdate>
            <volume>393</volume>
            <issue>6684</issue>
            <fpage>440</fpage>
            <lpage>442</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/30918</pubid>
                  <pubid idtype="pmpid" link="fulltext">9623998</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B50">
            <title>
               <p>Preferential attachment in the protein network evolution</p>
            </title>
            <aug>
               <au>
                  <snm>Eisenberg</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Levanon</snm>
                  <fnm>EY</fnm>
               </au>
            </aug>
            <source>Phys Rev Lett</source>
            <pubdate>2003</pubdate>
            <volume>91</volume>
            <issue>13</issue>
            <fpage>138701</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1103/PhysRevLett.91.138701</pubid>
                  <pubid idtype="pmpid" link="fulltext">14525344</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B51">
            <title>
               <p>Functional evolution of the yeast protein interaction network</p>
            </title>
            <aug>
               <au>
                  <snm>Kunin</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Pereira-Leal</snm>
                  <fnm>JB</fnm>
               </au>
               <au>
                  <snm>Ouzounis</snm>
                  <fnm>CA</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>2004</pubdate>
            <volume>21</volume>
            <issue>7</issue>
            <fpage>1171</fpage>
            <lpage>1176</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/molbev/msh085</pubid>
                  <pubid idtype="pmpid" link="fulltext">15071090</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B52">
            <title>
               <p>Horizontal gene transfer among genomes: the complexity hypothesis</p>
            </title>
            <aug>
               <au>
                  <snm>Jain</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Rivera</snm>
                  <fnm>MC</fnm>
               </au>
               <au>
                  <snm>Lake</snm>
                  <fnm>JA</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1999</pubdate>
            <volume>96</volume>
            <issue>7</issue>
            <fpage>3801</fpage>
            <lpage>3806</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">22375</pubid>
                  <pubid idtype="pmpid" link="fulltext">10097118</pubid>
                  <pubid idtype="doi">10.1073/pnas.96.7.3801</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B53">
            <title>
               <p>Determinants of adaptive evolution at the molecular level: the extended complexity hypothesis</p>
            </title>
            <aug>
               <au>
                  <snm>Aris-Brosou</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>2005</pubdate>
            <volume>22</volume>
            <issue>2</issue>
            <fpage>200</fpage>
            <lpage>209</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/molbev/msi006</pubid>
                  <pubid idtype="pmpid" link="fulltext">15483330</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>
