<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1752-0509-5-S3-S10</ui>
   <ji>1752-0509</ji>
   <fm>
      <dochead>Research article</dochead>
      <bibl>
         <title>
            <p>Integration of breast cancer gene signatures based on graph centrality</p>
         </title>
         <aug>
            <au ca="yes" ce="yes" id="A1"><snm>Wang</snm><fnm>Jianxin</fnm><insr iid="I1"/><email>jxwang@mail.csu.edu.cn</email></au>
            <au ce="yes" id="A2"><snm>Chen</snm><fnm>Gang</fnm><insr iid="I1"/><email>chengangcs@gmail.com</email></au>
            <au id="A3"><snm>Li</snm><fnm>Min</fnm><insr iid="I1"/><insr iid="I2"/><email>limin@mail.csu.edu.cn</email></au>
            <au ca="yes" id="A4"><snm>Pan</snm><fnm>Yi</fnm><insr iid="I1"/><insr iid="I2"/><email>pan@cs.gsu.edu</email></au>
         </aug>
         <insg>
            <ins id="I1"><p>School of Information Science and Engineering, Central South University, Changsha, 410083, China</p></ins>
            <ins id="I2"><p>Department of Computer Science, Georgia State University, Atlanta, GA30303, USA</p></ins>
         </insg>
         <source>BMC Systems Biology</source>
         
         
         <supplement><title><p>The 2010 International Conference on Bioinformatics and Computational Biology (BIOCOMP 2010): Systems Biology</p></title><editor>Ke Zhang, Yunlong Liu, Hamid R Arabnia</editor><note>Research</note><url>1752-0509-5-S3.pdf</url></supplement><conference><title><p>BIOCOMP 2010 - The 2010 International Conference on Bioinformatics and Computational Biology</p></title><location>Las Vegas, NV, USA</location><date-range>12-15 July 2010</date-range></conference><issn>1752-0509</issn>
         <pubdate>2011</pubdate>
         <volume>5</volume>
         <issue>Suppl 3</issue>
         <fpage>S10</fpage>
         <url>http://www.biomedcentral.com/1752-0509/5/S3/S10</url>
         <xrefbib><pubid idtype="doi">10.1186/1752-0509-5-S3-S10</pubid></xrefbib>
      </bibl>
      <history><pub><date><day>23</day><month>12</month><year>2011</year></date></pub></history>
      <cpyrt><year>2011</year><collab>Wang et al.</collab><note>This is an open access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note></cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Various gene-expression signatures for breast cancer are available for the prediction of clinical outcome. However, because the overlap between different signatures is small, it is challenging to integrate the existing disjoint signatures into a unified view of the association between gene expression and clinical outcome.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>In this paper, we propose a method to integrate different breast cancer gene signatures by using graph centrality in a context-constrained protein interaction network (PIN). The context-constrained PIN for breast cancer is built by integrating the complete PIN with various gene signatures reported in the literature. We then use graph centralities to quantify the importance of genes to breast cancer. Finally, we obtain reliable gene signatures that consist of the genes with high graph centrality. Well-known breast cancer genes, such as TP53 and BRCA1, are ranked extremely high in our results. Compared with previous results by functional enrichment analysis, gene signatures based on graph centralities, especially eigenvector centrality and subgraph centrality, are more tightly related to breast cancer. We validated these signatures on a genome-wide microarray dataset and found a strong association between the expression of the signature genes and pathologic parameters.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusions</p>
               </st>
               <p>In summary, graph centralities provide a novel way to connect different cancer signatures and to understand the relationship between gene expression and the clinical outcome of breast cancer. Moreover, this method can be applied not only to breast cancer but also to other gene-expression-related diseases and to drug studies.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>A gene signature is a group of genes whose expression pattern represents the status of a gene-expression-related disease <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. Using microarray technology, which has developed rapidly in the last ten years, various gene signatures have been developed for complex diseases, especially cancer. Since researchers found in 2002 that gene-expression signatures are able to predict the clinical outcome of breast cancer <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr></abbrgrp>, the topic has attracted the attention of both biologists and oncologists. Signatures for various phenotypes, such as poor prognosis <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>, invasiveness <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>, recurrence <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>, and metastasis <abbrgrp><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr></abbrgrp>, have been experimentally derived from patient groups and biological hypotheses. However, distinct signatures share very few genes, even though they paradoxically occupy a common prognosis space. For both cancer biologists and oncologists, a critical problem is whether these disjoint genetic signatures can provide a unified insight into the relationship between gene expression and clinical outcome.</p>
         <p>The complex heterogeneity of signatures, caused by different probe designs, different platforms, or inadequate patient samples, is an obstacle to integrating the various signatures of breast cancer. Gene Ontology enrichment, pathway analysis, and some genome-scale methods have been proposed to explain the lack of overlap <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr></abbrgrp>. The authors of <abbrgrp><abbr bid="B8">8</abbr></abbrgrp> list five possible explanations for the small overlap between signatures:</p>
         <p indent="1">1. Heterogeneity in expression due to different platform technologies and references;</p>
         <p indent="1">2. Differences in supervised protocols with which signatures are extracted;</p>
         <p indent="1">3. Although the genes are not exactly the same, they represent the same set of pathways;</p>
         <p indent="1">4. Differences in clinical composition between datasets (i.e. sample heterogeneity);</p>
         <p indent="1">5. Small sample size problems that cause inaccurate signatures.</p>
         <p>Through a large-scale analysis performed on 947 breast cancer samples from the Affymetrix platform, the authors of <abbrgrp><abbr bid="B8">8</abbr></abbrgrp> conclude that the small signature overlap is most likely due to the small sample size problem (explanation 5). However, this conclusion might be specific to the datasets and techniques used in their work. By comparing three prognostic gene expression signatures for breast cancer, the authors of <abbrgrp><abbr bid="B9">9</abbr></abbrgrp> suggested that the overlap between the different prognostic gene signatures is small because these signatures represent largely overlapping biological processes (explanation 3). By taking into account the biological knowledge shared among different signatures, the authors of <abbrgrp><abbr bid="B10">10</abbr></abbrgrp> found that different signatures are similar at the biological level rather than at the gene level (explanation 3). Much work has been done to understand the small overlap between gene signatures, but so far there is no widely accepted explanation.</p>
         <p>Meanwhile, computational biologists have developed protein interaction networks (PINs), which have been used effectively to analyze the protein interactions underpinning shared sub-phenotypes among otherwise seemingly disparate diseases, such as retinitis pigmentosa, epithelial ovarian cancer, inflammatory bowel disease, amyotrophic lateral sclerosis, Alzheimer disease, type 2 diabetes, coronary heart disease <abbrgrp><abbr bid="B11">11</abbr></abbrgrp> and head and neck tumor metastasis <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>. For individual expression signatures in breast cancer, protein interaction networks have been used successfully to predict prognosis <abbrgrp><abbr bid="B13">13</abbr></abbrgrp> and to detect subnetwork signatures of metastatic disease <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>. More recently, in <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>, the authors applied a univariate Cox model and the Relief algorithm to genome-wide coexpression networks for different disease states in order to select the genes that are most predictive of clinical outcome and construct a gene signature for lung cancer. A 13-gene lung cancer prognosis signature with significant prognostic stratifications was identified by this method. Using Single Protein Analysis of Networks (SPAN <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>) and conservative permutation re-sampling, a small but more biologically significant breast cancer signature consisting of 54 genes was identified from a protein interaction network that includes 250 cancer-related genes curated from the literature <abbrgrp><abbr bid="B15">15</abbr><abbr bid="B16">16</abbr></abbrgrp>. In <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>, by integrating biological knowledge and different signatures, the authors derived a unified signature that is more robust than the original signatures.</p>
         <p>However, to integrate different breast cancer signatures, most existing methods need cancer domain knowledge, such as the cancer-related literature used in <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B15">15</abbr><abbr bid="B16">16</abbr></abbrgrp>. This limits the application of these methods. In this paper, we describe a method to integrate different breast cancer signatures by using graph centrality in a context-constrained PIN for human breast cancer, which is constructed by integrating disjoint gene signatures reported in the literature. Unlike most existing methods, the proposed method is able to integrate distinct gene signatures without cancer domain knowledge. Using Gene Ontology (GO) enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis, and by relating our results to previous biological studies, we show that the genes in centrality-based signatures are tightly related to breast cancer and are able to predict clinical outcome.</p>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <p>To identify a reliable gene signature of breast cancer by integrating various gene signatures, we propose a graph-centrality-based method to identify disease genes from a constrained PIN. An overview of this method is provided in Figure <figr fid="F1">1</figr>. Briefly, the method has three steps:</p>
         <fig id="F1"><title><p>Figure 1</p></title><caption><p>Schematic overview of graph centrality based integration of gene signatures</p></caption><text>
   <p><b>Schematic overview of graph centrality based integration of gene signatures</b>. Schematic overview of graph centrality based integration of distinct breast cancer gene signatures.</p>
</text><graphic file="1752-0509-5-S3-S10-1"/></fig>
         <p indent="1">1. Collect genes from different breast cancer gene signatures, and discard the genes that exist in only one signature.</p>
         <p indent="1">2. Project the genes collected in Step 1 onto the human PIN to construct a context-constrained PIN. To some extent, all genes in this context-constrained network are therefore related to breast cancer. However, we do not yet know which genes are the most important to breast cancer and can be used to predict clinical outcome.</p>
         <p indent="1">3. To determine the relationship between genes and breast cancer, calculate the graph centrality of each gene in this constrained PIN. Since the constrained PIN is built from breast cancer gene signatures, the graph centrality of a gene in this network indicates its relationship to breast cancer. Output a given number of genes with the highest graph centrality as the new unified breast cancer signature.</p>
         <p>Details of the three steps are described in the following three subsections, followed by the validation methods.</p>
         <sec>
            <st>
               <p>Collecting genes from different signatures</p>
            </st>
            <p>GeneSigDB (<url>http://compbio.dfci.harvard.edu/genesigdb/</url>) <abbrgrp><abbr bid="B17">17</abbr></abbrgrp> is a curated database that collects gene signatures for various species and diseases. We searched GeneSigDB for human breast cancer gene signatures using the keywords "breast cancer" for disease and "human" for species. We obtained 94 distinct human breast cancer signatures, reported in 58 different publications. Since genes that are included in only one gene signature may have been selected by chance, we discard these unreliable genes.</p>
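            <p>The filtering step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name <it>collect_recurrent_genes</it>, the parameter <it>min_signatures</it>, and the toy signatures are all hypothetical.</p>

```python
from collections import Counter


def collect_recurrent_genes(signatures, min_signatures=2):
    """Return the genes that occur in at least `min_signatures` signatures.

    `signatures` maps a signature name to an iterable of gene symbols.
    """
    counts = Counter()
    for genes in signatures.values():
        counts.update(set(genes))  # count each gene once per signature
    return {gene for gene, n in counts.items() if n >= min_signatures}


# Toy data: hypothetical signature names and memberships.
toy_signatures = {
    "sig_A": ["TP53", "BRCA1", "ESR1"],
    "sig_B": ["TP53", "MKI67"],
    "sig_C": ["BRCA1", "TP53", "CCNB1"],
}
recurrent = collect_recurrent_genes(toy_signatures)
```

            <p>In this toy example only TP53 and BRCA1 occur in more than one signature, so they are the only genes kept.</p>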
         </sec>
         <sec>
            <st>
               <p>Construction of context-constrained PIN</p>
            </st>
            <p>A complete human PIN is constructed by integrating protein interaction data from the Human Protein Reference Database (HPRD) and the BioGRID interaction database <abbrgrp><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr></abbrgrp>. After removal of duplicate edges and self-interactions, we obtained a PIN consisting of 51057 distinct interactions among 11465 proteins.</p>
            <p>Then, the genes collected in the first step are projected onto the complete human PIN, and a constrained PIN for human breast cancer is obtained. This constrained PIN contains 2924 proteins and 4698 interactions.</p>
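            <p>One plausible reading of the projection step is to take the subgraph of the complete PIN induced by the collected genes. The sketch below follows that reading; the function names and toy interactions are hypothetical, and dropping proteins that are left without any interaction partner is an assumption, since the text does not say how such proteins are handled.</p>

```python
def build_network(interactions):
    """Undirected adjacency map without duplicate edges or self-interactions."""
    adj = {}
    for a, b in interactions:
        if a == b:
            continue  # remove self-interactions
        adj.setdefault(a, set()).add(b)  # sets absorb duplicate edges
        adj.setdefault(b, set()).add(a)
    return adj


def constrain_network(adj, genes):
    """Subgraph induced by `genes` (assumption: isolated proteins are dropped)."""
    genes = set(genes)
    sub = {}
    for gene in genes:
        partners = adj.get(gene, set()).intersection(genes)
        if partners:
            sub[gene] = partners
    return sub


# Toy data: hypothetical interactions and collected genes.
toy_interactions = [
    ("TP53", "BRCA1"), ("BRCA1", "TP53"),  # duplicate edge
    ("TP53", "TP53"),                      # self-interaction
    ("TP53", "MDM2"), ("BRCA1", "ESR1"),
]
complete_pin = build_network(toy_interactions)
constrained_pin = constrain_network(complete_pin, {"TP53", "BRCA1", "ESR1", "CCNB1"})
n_edges = sum(len(p) for p in constrained_pin.values()) // 2
```

            <p>Here MDM2 is excluded because it is not among the collected genes, and CCNB1 is excluded because it has no interaction in the toy network.</p>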
         </sec>
         <sec>
            <st>
               <p>Use graph centrality to quantify the relationship between genes and breast cancer</p>
            </st>
            <p>Various definitions of graph centrality have been proposed from different perspectives to evaluate the importance of nodes in a graph. The concept has been widely used in bioinformatics, such as discovery of essential proteins in protein networks <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>. Because it is difficult to infer which definition is best for identifying disease genes in the context-constrained network, we evaluated six different definitions in our work.</p>
            <p>For a protein interaction network <it>G(V,E)</it>, the six measures of centrality used in this study are defined as follows:</p>
            <p indent="1">&#8226; Degree centrality (<it>DC</it>): The degree centrality <it>DC</it>(<it>i</it>) of node <it>i </it>is the number of edges connecting node <it>i </it>and its neighbors <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>.</p>
            <p>
               <display-formula id="M1">
                  <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1752-0509-5-S3-S10-i1"><m:mrow>
   <m:mi>D</m:mi>
   <m:mi>C</m:mi>
   <m:mrow>
      <m:mo class="MathClass-open">(</m:mo>
      <m:mrow>
         <m:mi>i</m:mi>
      </m:mrow>
      <m:mo class="MathClass-close">)</m:mo>
   </m:mrow>
   <m:mo class="MathClass-rel">=</m:mo>
   <m:mi>D</m:mi>
   <m:mi>e</m:mi>
   <m:mi>g</m:mi>
   <m:mrow>
      <m:mo class="MathClass-open">(</m:mo>
      <m:mrow>
         <m:mi>i</m:mi>
      </m:mrow>
      <m:mo class="MathClass-close">)</m:mo>
   </m:mrow>
   <m:mo class="MathClass-punc">,</m:mo>
</m:mrow>
</m:math>
               </display-formula>
            </p>
            <p indent="1">where <it>Deg</it>(<it>i</it>) is the degree of node <it>i</it>.</p>
            <p indent="1">&#8226; Betweenness centrality (<it>BC</it>): The betweenness centrality <it>BC</it>(<it>i</it>) of a node <it>i </it>is the average fraction of shortest paths that pass through the node <it>i </it><abbrgrp><abbr bid="B22">22</abbr></abbrgrp>.</p>
            <p>
               <display-formula id="M2">
                  <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1752-0509-5-S3-S10-i2"><m:mrow>
   <m:mi>B</m:mi>
   <m:mi>C</m:mi>
   <m:mrow>
      <m:mo class="MathClass-open">(</m:mo>
      <m:mrow>
         <m:mi>i</m:mi>
      </m:mrow>
      <m:mo class="MathClass-close">)</m:mo>
   </m:mrow>
   <m:mo class="MathClass-rel">=</m:mo>
   <m:munder class="msub">
      <m:mrow>
         <m:mo mathsize="big"> &#8721;</m:mo>
      </m:mrow>
      <m:mrow>
         <m:mi>s</m:mi>
      </m:mrow>
   </m:munder>
   <m:munder class="msub">
      <m:mrow>
         <m:mo mathsize="big">&#8721;</m:mo>
      </m:mrow>
      <m:mrow>
         <m:mi>t</m:mi>
      </m:mrow>
   </m:munder>
   <m:mfrac>
      <m:mrow>
         <m:msub>
            <m:mrow>
               <m:mi>&#963;</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi>s</m:mi>
               <m:mi>t</m:mi>
            </m:mrow>
         </m:msub>
         <m:mrow>
            <m:mo class="MathClass-open">(</m:mo>
            <m:mrow>
               <m:mi>i</m:mi>
            </m:mrow>
            <m:mo class="MathClass-close">)</m:mo>
         </m:mrow>
      </m:mrow>
      <m:mrow>
         <m:msub>
            <m:mrow>
               <m:mi>&#963;</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi>s</m:mi>
               <m:mi>t</m:mi>
            </m:mrow>
         </m:msub>
      </m:mrow>
   </m:mfrac>
   <m:mo class="MathClass-punc">,</m:mo>
   <m:mi>s</m:mi>
   <m:mo class="MathClass-rel">&#8800;</m:mo>
   <m:mi>t</m:mi>
   <m:mo class="MathClass-rel">&#8800;</m:mo>
   <m:mi>i</m:mi>
   <m:mo class="MathClass-punc">,</m:mo>
</m:mrow>
</m:math>
               </display-formula>
            </p>
            <p indent="1">where <it>&#963;</it><sub><it>st </it></sub>denotes the total number of shortest paths between <it>s </it>and <it>t </it>and &#963;<sub><it>st</it></sub>(<it>i</it>) denotes the number of shortest paths from <it>s </it>to <it>t </it>that pass through the node <it>i</it>.</p>
            <p indent="1">&#8226; Closeness centrality (<it>CC</it>): The closeness centrality <it>CC</it>(<it>i</it>) of node <it>i </it>can be defined as <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>:</p>
            <p>
               <display-formula id="M3">
                  <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1752-0509-5-S3-S10-i3"><m:mrow>
   <m:mi>C</m:mi>
   <m:mi>C</m:mi>
   <m:mrow>
      <m:mo class="MathClass-open">(</m:mo>
      <m:mrow>
         <m:mi>i</m:mi>
      </m:mrow>
      <m:mo class="MathClass-close">)</m:mo>
   </m:mrow>
   <m:mo class="MathClass-rel">=</m:mo>
   <m:mfrac>
      <m:mrow>
         <m:mn>1</m:mn>
      </m:mrow>
      <m:mrow>
         <m:msub>
            <m:mrow>
               <m:mo mathsize="big">&#8721;</m:mo>
            </m:mrow>
            <m:mrow>
               <m:mi>j</m:mi>
               <m:mo class="MathClass-rel">&#8800;</m:mo>
               <m:mi>i</m:mi>
            </m:mrow>
         </m:msub>
         <m:msub>
            <m:mrow>
               <m:mi>c</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi>p</m:mi>
            </m:mrow>
         </m:msub>
         <m:mrow>
            <m:mo class="MathClass-open">(</m:mo>
            <m:mrow>
               <m:mi>i</m:mi>
               <m:mo class="MathClass-punc">,</m:mo>
               <m:mi>j</m:mi>
            </m:mrow>
            <m:mo class="MathClass-close">)</m:mo>
         </m:mrow>
      </m:mrow>
   </m:mfrac>
</m:mrow>
</m:math>
               </display-formula>
            </p>
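            <p indent="1">where <it>c</it><sub><it>p</it></sub>(<it>i</it>, <it>j</it>) is taken here to be the length of the shortest path between nodes <it>i </it>and <it>j</it>. <it>CC </it>is a global metric which describes how closely the given node <it>i </it>is connected to the other nodes.</p>
            <p indent="1">&#8226; Subgraph centrality (<it>SC</it>): The subgraph centrality <it>SC</it>(<it>i</it>) of node <it>i </it>can be defined as <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>:</p>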
            <p>
               <display-formula id="M4">
                  <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1752-0509-5-S3-S10-i4"><m:mrow>
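   <m:mi>S</m:mi>
   <m:mi>C</m:mi>
   <m:mrow>
      <m:mo class="MathClass-open">(</m:mo>
      <m:mrow>
         <m:mi>i</m:mi>
      </m:mrow>
      <m:mo class="MathClass-close">)</m:mo>
   </m:mrow>
   <m:mo class="MathClass-rel">=</m:mo>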
   <m:munderover accent="false" accentunder="false">
      <m:mrow>
         <m:mo mathsize="big"> &#8721;</m:mo>
      </m:mrow>
      <m:mrow>
         <m:mi>l</m:mi>
         <m:mo class="MathClass-rel">=</m:mo>
         <m:mn>0</m:mn>
      </m:mrow>
      <m:mrow>
         <m:mi>&#8734;</m:mi>
      </m:mrow>
   </m:munderover>
   <m:mfrac>
      <m:mrow>
         <m:msub>
            <m:mrow>
               <m:mi>&#956;</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi>l</m:mi>
            </m:mrow>
         </m:msub>
         <m:mrow>
            <m:mo class="MathClass-open">(</m:mo>
            <m:mrow>
               <m:mi>i</m:mi>
            </m:mrow>
            <m:mo class="MathClass-close">)</m:mo>
         </m:mrow>
      </m:mrow>
      <m:mrow>
         <m:mi>l</m:mi>
         <m:mo class="MathClass-punc">!</m:mo>
      </m:mrow>
   </m:mfrac>
   <m:mo class="MathClass-punc">,</m:mo>
</m:mrow>
</m:math>
               </display-formula>
            </p>
            <p indent="1">where <it>&#956;</it><sub><it>l</it></sub>(<it>i</it>) denotes the number of closed walks of length <it>l </it>that start and end at node <it>i</it>.</p>
            <p indent="1">&#8226; Eigenvector centrality (<it>EC</it>): The eigenvector centrality <it>EC</it>(<it>i</it>) of node <it>i </it>is defined as the <it>i</it>th component of the principal eigenvector of <it>A</it>, where <it>A </it>is the adjacency matrix of the network. Let &#955; be an eigenvalue of <it>A </it>and <it>e </it>be the corresponding eigenvector, so that <it>&#955;e </it>= <it>Ae</it>. Then <it>EC(i) = e</it><sub>1</sub>(<it>i</it>), where <it>e</it><sub>1 </sub>is the eigenvector corresponding to the largest eigenvalue of <it>A </it><abbrgrp><abbr bid="B25">25</abbr></abbrgrp>.</p>
            <p indent="1">&#8226; Information centrality (<it>IC</it>): The information centrality <it>IC</it>(<it>i</it>) of node <it>i </it>is defined as <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>:</p>
            <p>
               <display-formula id="M5">
                  <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1752-0509-5-S3-S10-i5"><m:mrow>
   <m:mi>I</m:mi>
   <m:mi>C</m:mi>
   <m:mrow>
      <m:mo class="MathClass-open">(</m:mo>
      <m:mrow>
         <m:mi>i</m:mi>
      </m:mrow>
      <m:mo class="MathClass-close">)</m:mo>
   </m:mrow>
   <m:mo class="MathClass-rel">=</m:mo>
   <m:msup>
      <m:mrow>
         <m:mrow>
            <m:mo class="MathClass-open">[</m:mo>
            <m:mrow>
               <m:mfrac>
                  <m:mrow>
                     <m:mn>1</m:mn>
                  </m:mrow>
                  <m:mrow>
                     <m:mi>n</m:mi>
                  </m:mrow>
               </m:mfrac>
               <m:munder class="msub">
                  <m:mrow>
                     <m:mo mathsize="big">&#8721;</m:mo>
                  </m:mrow>
                  <m:mrow>
                     <m:mi>j</m:mi>
                  </m:mrow>
               </m:munder>
               <m:mfrac>
                  <m:mrow>
                     <m:mn>1</m:mn>
                  </m:mrow>
                  <m:mrow>
                     <m:msub>
                        <m:mrow>
                           <m:mi>I</m:mi>
                        </m:mrow>
                        <m:mrow>
                           <m:mi>i</m:mi>
                           <m:mi>j</m:mi>
                        </m:mrow>
                     </m:msub>
                  </m:mrow>
               </m:mfrac>
            </m:mrow>
            <m:mo class="MathClass-close">]</m:mo>
         </m:mrow>
      </m:mrow>
      <m:mrow>
         <m:mo class="MathClass-bin">-</m:mo>
         <m:mn>1</m:mn>
      </m:mrow>
   </m:msup>
   <m:mo class="MathClass-punc">,</m:mo>
</m:mrow>
</m:math>
               </display-formula>
            </p>
            <p>where <it>n </it>is the number of nodes in graph <it>G </it>and <it>I</it><sub><it>ij </it></sub>= (<it>r</it><sub><it>ii </it></sub>+ <it>r</it><sub><it>jj </it></sub>- <it>r</it><sub><it>ij</it></sub>)<sup>-1</sup>, where <it>r</it><sub><it>ij </it></sub>is an element of the matrix <it>R</it>. Let <it>D </it>be a diagonal matrix of the weighted degree of each node and <it>J </it>be a matrix with all its elements equal to one. Then, we get <it>R </it>= (<it>r</it><sub><it>ij</it></sub>) = [<it>D </it>- <it>A </it>+ <it>J</it>]<sup>-1</sup>. For computational purposes, <it>I</it><sub><it>ii </it></sub>is defined as infinite. Thus, <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1752-0509-5-S3-S10-i6"><m:mfrac>
   <m:mrow>
      <m:mn>1</m:mn>
   </m:mrow>
   <m:mrow>
      <m:msub>
         <m:mrow>
            <m:mi>I</m:mi>
         </m:mrow>
         <m:mrow>
            <m:mi>i</m:mi>
            <m:mi>i</m:mi>
         </m:mrow>
      </m:msub>
   </m:mrow>
</m:mfrac>
<m:mo class="MathClass-rel">=</m:mo>
<m:mn>0</m:mn>
</m:math></inline-formula>.</p>
            <p>A high centrality indicates that a gene is important to the constrained PIN and probably plays an important role in the mechanism of breast cancer development. Therefore, by ranking the genes according to their graph centrality, we get a gene list that is ordered by the genes' importance to human breast cancer. Depending on the specific purpose, a given number of top-ranked genes can be selected to construct a reliable gene signature of breast cancer. This reliable gene signature is the integration of the disjoint original signatures.</p>
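            <p>As an illustration of the ranking step, the sketch below computes three of the six measures (DC, CC and EC) on a toy network and selects the top-ranked genes. It is a simplified example under stated assumptions, not the authors' code: all function names and the toy edges are hypothetical, closeness sums only over reachable nodes, and the eigenvector is approximated by power iteration on <it>A </it>+ <it>I </it>(same eigenvectors as <it>A</it>, but the iteration also converges on bipartite graphs).</p>

```python
from collections import deque


def build_adjacency(edges):
    """Undirected adjacency map from (protein, protein) pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def degree_centrality(adj):
    """DC(i): number of edges incident to node i."""
    return {v: len(nbrs) for v, nbrs in adj.items()}


def closeness_centrality(adj):
    """CC(i): 1 / sum of shortest-path lengths from i (reachable nodes only)."""
    result = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from `source`
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[source] = 1.0 / total if total else 0.0
    return result


def eigenvector_centrality(adj, iterations=200):
    """EC(i): i-th component of the principal eigenvector (power iteration on A + I)."""
    x = {v: 1.0 for v in adj}
    for _ in range(iterations):
        nxt = {v: x[v] + sum(x[w] for w in adj[v]) for v in adj}
        norm = sum(val * val for val in nxt.values()) ** 0.5
        x = {v: val / norm for v, val in nxt.items()}
    return x


def top_genes(scores, k):
    """The k genes with the highest score (ties broken alphabetically)."""
    return sorted(scores, key=lambda g: (-scores[g], g))[:k]


# Toy network: hypothetical interactions.
toy_edges = [("BRCA1", "TP53"), ("BRCA1", "ESR1"), ("BRCA1", "CCNB1"), ("TP53", "MDM2")]
adj = build_adjacency(toy_edges)
dc = degree_centrality(adj)
cc = closeness_centrality(adj)
ec = eigenvector_centrality(adj)
signature = top_genes(ec, 2)
```

            <p>In this toy network BRCA1 is the hub and ranks first under all three measures. Betweenness, subgraph and information centrality are omitted here for brevity.</p>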
         </sec>
         <sec>
            <st>
               <p>KEGG pathway and Gene Ontology enrichment analysis</p>
            </st>
            <p>The <it>p</it>-value based on the hypergeometric distribution is widely used to measure the extent to which clusters are annotated by a specific GO term <abbrgrp><abbr bid="B27">27</abbr><abbr bid="B28">28</abbr><abbr bid="B29">29</abbr><abbr bid="B30">30</abbr></abbrgrp>. The <it>p</it>-value is defined as follows:</p>
            <p>
               <display-formula id="M6">
                  <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1752-0509-5-S3-S10-i7"><m:mrow>
   <m:mi>P</m:mi>
   <m:mo class="MathClass-rel">=</m:mo>
   <m:mn>1</m:mn>
   <m:mo class="MathClass-bin">-</m:mo>
   <m:munderover accent="false" accentunder="false">
      <m:mrow>
         <m:mo mathsize="big">&#8721;</m:mo>
      </m:mrow>
      <m:mrow>
         <m:mi>i</m:mi>
         <m:mo class="MathClass-rel">=</m:mo>
         <m:mn>0</m:mn>
      </m:mrow>
      <m:mrow>
         <m:mi>k</m:mi>
         <m:mo class="MathClass-bin">-</m:mo>
         <m:mn>1</m:mn>
      </m:mrow>
   </m:munderover>
   <m:mfrac>
      <m:mrow>
         <m:mfenced close=")" open="(" separators="">
            <m:mrow>
               <m:mtable class="array" columnlines="none none none none none none none none none none none none none none none none none none none" equalcolumns="false" equalrows="false">
                  <m:mtr>
                     <m:mtd class="array" columnalign="center">
                        <m:mi>C</m:mi>
                     </m:mtd>
                  </m:mtr>
                  <m:mtr>
                     <m:mtd class="array" columnalign="center">
                        <m:mi>i</m:mi>
                     </m:mtd>
                  </m:mtr>
                  <m:mtr>
                     <m:mtd class="array" columnalign="center"/>
                  </m:mtr>
               </m:mtable>
            </m:mrow>
         </m:mfenced>
         <m:mfenced close=")" open="(" separators="">
            <m:mrow>
               <m:mtable class="array" columnlines="none none none none none none none none none none none none none none none none none none none" equalcolumns="false" equalrows="false">
                  <m:mtr>
                     <m:mtd class="array" columnalign="center">
                        <m:mi>G</m:mi>
                     </m:mtd>
                     <m:mtd class="array" columnalign="center">
                        <m:mo class="MathClass-bin">-</m:mo>
                     </m:mtd>
                     <m:mtd class="array" columnalign="center">
                        <m:mi>C</m:mi>
                     </m:mtd>
                  </m:mtr>
                  <m:mtr>
                     <m:mtd class="array" columnalign="center">
                        <m:mi>n</m:mi>
                     </m:mtd>
                     <m:mtd class="array" columnalign="center">
                        <m:mo class="MathClass-bin">-</m:mo>
                     </m:mtd>
                     <m:mtd class="array" columnalign="center">
                        <m:mi>i</m:mi>
                     </m:mtd>
                  </m:mtr>
                  <m:mtr>
                     <m:mtd class="array" columnalign="center"/>
                  </m:mtr>
               </m:mtable>
            </m:mrow>
         </m:mfenced>
      </m:mrow>
      <m:mrow>
         <m:mfenced close=")" open="(" separators="">
            <m:mrow>
               <m:mtable class="array" columnlines="none none none none none none none none none none none none none none none none none none none" equalcolumns="false" equalrows="false">
                  <m:mtr>
                     <m:mtd class="array" columnalign="center">
                        <m:mi>G</m:mi>
                     </m:mtd>
                  </m:mtr>
                  <m:mtr>
                     <m:mtd class="array" columnalign="center">
                        <m:mi>n</m:mi>
                     </m:mtd>
                  </m:mtr>
                  <m:mtr>
                     <m:mtd class="array" columnalign="center"/>
                  </m:mtr>
               </m:mtable>
            </m:mrow>
         </m:mfenced>
      </m:mrow>
   </m:mfrac>
   <m:mo class="MathClass-punc">,</m:mo>
</m:mrow>
</m:math>
               </display-formula>
            </p>
            <p>where <it>C </it>is the size of the gene set, which contains <it>k </it>genes with a given GO term, and <it>G </it>is the size of the universal set of known genes, which contains <it>n </it>genes with the annotation.</p>
            <p>A low <it>P </it>in Equation 6 indicates that the gene set closely corresponds to the GO annotation, because such an overlap is unlikely to occur by chance. To simplify our analysis, we define the <it>p</it>-score of a gene set with respect to an annotation as the negative of <it>log(P) </it><abbrgrp><abbr bid="B31">31</abbr></abbrgrp>.</p>
            <p>Gene set enrichment analysis for KEGG pathways is very similar to the one for GO annotations. In Equation 6, <it>C </it>is then the size of the gene set, which contains <it>k </it>genes that belong to a given KEGG pathway, and <it>G </it>is the size of the universal set of known genes, which contains <it>n </it>genes that belong to the pathway. Similarly, the <it>p</it>-score can be used to measure the relationship between the gene set and a specific KEGG pathway.</p>
            <p>In this study, both KEGG and GO enrichment analyses are performed with DAVID <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>.</p>
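            <p>For illustration, Equation 6 can be evaluated directly as in the sketch below. This is not how DAVID computes its statistics; it only transcribes the formula above. The function names are hypothetical, and the base-10 logarithm in the <it>p</it>-score is an assumption, since the text does not state the base.</p>

```python
from fractions import Fraction
from math import comb, log10


def enrichment_p_value(G, n, C, k):
    """Equation 6: probability of seeing at least k annotated genes in the set.

    G: size of the universal gene set; n: annotated genes in the universe;
    C: size of the gene set;           k: annotated genes in the gene set.
    """
    below = sum(comb(C, i) * comb(G - C, n - i) for i in range(k))
    # Exact rational arithmetic avoids cancellation when P is very small.
    return float(1 - Fraction(below, comb(G, n)))


def p_score(p):
    """Negative logarithm of P (assumption: base 10)."""
    return float("inf") if p == 0 else -log10(p)


# Toy numbers: 10 known genes, 4 annotated, gene set of 3 with 2 annotated.
p = enrichment_p_value(G=10, n=4, C=3, k=2)
score = p_score(p)
```

            <p>With these toy numbers the sum in Equation 6 equals 140/210, so <it>P </it>= 1/3.</p>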
         </sec>
         <sec>
            <st>
               <p>Validation on microarray datasets</p>
            </st>
            <p>To evaluate the signature's ability to predict clinical outcome, we use the expression intensities of the genes in the signature to cluster microarray datasets of breast cancer patients with different pathologic parameters. Patients with similar pathologic parameters should be clustered together. For a given pathologic parameter, the <it>p</it>-value of the clustering result indicates the signature's ability to predict that parameter.</p>
            <p>In this study, the Euclidean distance between samples is calculated from the expression intensities of the genes in the gene signature. Hierarchical clustering is then used to cluster the microarray datasets of breast cancer patients.</p>
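            <p>The clustering step can be sketched as follows. This is a minimal illustration only: the linkage criterion is not stated here, so average linkage is an assumption, and the function name <it>hierarchical_clusters </it>is hypothetical. Each sample is a list of expression intensities restricted to the genes in the signature.</p>
            <p>
```python
from itertools import combinations
from math import dist


def hierarchical_clusters(samples, n_clusters):
    """Agglomerative clustering with Euclidean distance.

    Average linkage is an assumption; clusters are returned as lists of
    sample indices.
    """
    clusters = [[i] for i in range(len(samples))]

    def average_distance(a, b):
        # Mean Euclidean distance over all cross-cluster sample pairs.
        pairs = [(i, j) for i in a for j in b]
        return sum(dist(samples[i], samples[j]) for i, j in pairs) / len(pairs)

    while len(clusters) > n_clusters:
        # Merge the two closest clusters; combinations guarantees a before b.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_distance(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters
```
            </p>
            <p>With two clusters requested, patients whose expression profiles over the signature genes are close end up in the same group, which can then be compared against a pathologic parameter.</p>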
         </sec>
      </sec>
      <sec>
         <st>
            <p>Results</p>
         </st>
         <sec>
            <st>
               <p>Overlap among breast cancer signatures</p>
            </st>
            <p>As mentioned in the Methods section, 94 breast cancer gene signatures are obtained from a public database. Because these signatures were generated by different methods and for various purposes, their sizes differ greatly. The largest signature contains 3260 genes, and the smallest contains only 4 genes. The median size of these signatures is 46.</p>
            <p>To evaluate the similarity among these gene signatures, we analyze the overlap among the 94 gene signatures (see Table <tblr tid="T1">1</tblr>). The result is consistent with the results reported in the literature <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>: very little overlap is found among different signatures. 4143 genes (58.6% of the total) are found in only one signature, only 24 genes (0.4%) are included in 10 or more signatures, and no gene is included in all 94 signatures. This lack of overlap is an obstacle to integrating the various breast cancer signatures.</p>
            <tbl id="T1"><title><p>Table 1</p></title><caption><p>Overlap among 94 breast cancer gene signatures</p></caption><tblbdy cols="17">
      <r>
         <c ca="left">
            <p>
               <b>Frequency</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>1</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>2</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>3</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>4</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>5</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>6</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>7</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>8</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>9</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>10</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>11</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>12</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>14</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>15</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>16</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>17</b>
            </p>
         </c>
      </r>
      <r>
         <c cspan="17">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>
               <b>Num of genes</b>
            </p>
         </c>
         <c ca="center">
            <p>4143</p>
         </c>
         <c ca="center">
            <p>1608</p>
         </c>
         <c ca="center">
            <p>687</p>
         </c>
         <c ca="center">
            <p>323</p>
         </c>
         <c ca="center">
            <p>148</p>
         </c>
         <c ca="center">
            <p>56</p>
         </c>
         <c ca="center">
            <p>40</p>
         </c>
         <c ca="center">
            <p>23</p>
         </c>
         <c ca="center">
            <p>15</p>
         </c>
         <c ca="center">
            <p>13</p>
         </c>
         <c ca="center">
            <p>4</p>
         </c>
         <c ca="center">
            <p>3</p>
         </c>
         <c ca="center">
            <p>1</p>
         </c>
         <c ca="center">
            <p>1</p>
         </c>
         <c ca="center">
            <p>1</p>
         </c>
         <c ca="center">
            <p>1</p>
         </c>
      </r>
   </tblbdy><tblfn>
      <p>There is very little overlap among distinct breast cancer signatures. Most genes exist in only one signature, and only 24 genes are included in 10 or more signatures.</p>
   </tblfn></tbl>
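            <p>The frequency counts in Table 1 amount to counting, for each gene, the number of signatures that contain it, and then tallying how many genes share each count. A minimal Python sketch, with the hypothetical function name <it>overlap_table </it>and signatures represented as lists of gene symbols, is:</p>
            <p>
```python
from collections import Counter


def overlap_table(signatures):
    """Map each frequency to the number of genes found in that many signatures."""
    # set(sig) ensures a gene is counted once per signature.
    gene_frequency = Counter(gene for sig in signatures for gene in set(sig))
    return Counter(gene_frequency.values())
```
            </p>
            <p>For three toy signatures [TP53, EGFR], [TP53, ESR1] and [TP53, EGFR, AR], the sketch reports two genes with frequency 1, one gene with frequency 2 and one gene with frequency 3.</p>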
         </sec>
         <sec>
            <st>
               <p>Centrality based gene signatures</p>
            </st>
            <p>All genes included in the 94 signatures are projected onto the human PIN described in the Methods section. Given the low reliability of individual gene signatures, only the genes that are included in two or more different signatures are used to construct the context-constrained PIN of breast cancer. The resulting context-constrained PIN contains 2924 proteins and 4698 interactions.</p>
            <p>Then, six graph centralities of each gene in this context-constrained PIN are calculated; the values are provided in Additional File <supplr sid="S1">1</supplr>. A higher centrality indicates that the gene is more important to this network and should be more tightly related to breast cancer. In a recent similar study by Chen <it>et al.</it>, 10 published gene signatures were integrated on the basis of a context-constrained network obtained from the literature, yielding a 54-gene signature <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>. This result is referred to as "Chen's signature" in the rest of this paper. For comparison, we also select the 54 most important genes identified by each centrality definition to construct gene signatures. The full lists of genes in these graph centrality based signatures are provided in Table <tblr tid="T2">2</tblr>.</p>
            <suppl id="S1">
               <title>
                  <p>Additional file 1</p>
               </title>
               <text>
                  <p><b>Graph centrality of genes in the context-constrained PIN</b>. In this study, the graph centrality of each gene in the context-constrained PIN is calculated and used to quantify the relationship between genes and breast cancer. The calculation results are provided in this additional file.</p>
               </text>
               <file name="1752-0509-5-S3-S10-S1.pdf">
   <p>Click here for file</p>
</file>
            </suppl>
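            <p>The construction of the context-constrained PIN and the selection of the top-ranked genes can be sketched in Python as follows. The sketch uses degree centrality only, as the simplest of the six measures; the function names, the tie-breaking by gene symbol and the representation of interactions as pairs of gene symbols are assumptions made for illustration.</p>
            <p>
```python
from collections import Counter


def context_constrained_pin(signatures, interactions, min_signatures=2):
    """Keep interactions whose two genes each occur in enough signatures."""
    frequency = Counter(gene for sig in signatures for gene in set(sig))
    kept = {gene for gene, count in frequency.items() if count >= min_signatures}
    # frozenset makes each interaction undirected and removes duplicates.
    return {
        frozenset((a, b))
        for a, b in interactions
        if a in kept and b in kept and a != b
    }


def top_by_degree(edges, k):
    """Return the k genes with the highest degree (ties broken by symbol)."""
    degree = Counter(gene for edge in edges for gene in edge)
    ranked = sorted(degree.items(), key=lambda item: (-item[1], item[0]))
    return [gene for gene, _ in ranked[:k]]
```
            </p>
            <p>Calling <it>top_by_degree </it>with <it>k </it>= 54 on the filtered edge set would give a degree-based signature of the same size as those compared in this section.</p>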
            <tbl id="T2"><title><p>Table 2</p></title><caption><p>Graph centrality based breast cancer signatures</p></caption><tblbdy cols="6">
      <r>
         <c ca="right">
            <p>
               <b>BC</b>
            </p>
         </c>
         <c ca="right">
            <p>
               <b>CC</b>
            </p>
         </c>
         <c ca="right">
            <p>
               <b>DC</b>
            </p>
         </c>
         <c ca="right">
            <p>
               <b>EC</b>
            </p>
         </c>
         <c ca="right">
            <p>
               <b>IC</b>
            </p>
         </c>
         <c ca="right">
            <p>
               <b>SC</b>
            </p>
         </c>
      </r>
      <r>
         <c cspan="6">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>TP53</p>
         </c>
         <c ca="right">
            <p>TP53</p>
         </c>
         <c ca="right">
            <p>TP53</p>
         </c>
         <c ca="right">
            <p>TP53</p>
         </c>
         <c ca="right">
            <p>TP53</p>
         </c>
         <c ca="right">
            <p>TP53</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>EGFR</p>
         </c>
         <c ca="right">
            <p>ESR1</p>
         </c>
         <c ca="right">
            <p>EGFR</p>
         </c>
         <c ca="right">
            <p>ESR1</p>
         </c>
         <c ca="right">
            <p>EGFR</p>
         </c>
         <c ca="right">
            <p>ESR1</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>CTNNB1</p>
         </c>
         <c ca="right">
            <p>AR</p>
         </c>
         <c ca="right">
            <p>EP300</p>
         </c>
         <c ca="right">
            <p>EP300</p>
         </c>
         <c ca="right">
            <p>EP300</p>
         </c>
         <c ca="right">
            <p>EP300</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>SMAD3</p>
         </c>
         <c ca="right">
            <p>EGFR</p>
         </c>
         <c ca="right">
            <p>ESR1</p>
         </c>
         <c ca="right">
            <p>AR</p>
         </c>
         <c ca="right">
            <p>ESR1</p>
         </c>
         <c ca="right">
            <p>AR</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>ESR1</p>
         </c>
         <c ca="right">
            <p>EP300</p>
         </c>
         <c ca="right">
            <p>BRCA1</p>
         </c>
         <c ca="right">
            <p>BRCA1</p>
         </c>
         <c ca="right">
            <p>BRCA1</p>
         </c>
         <c ca="right">
            <p>BRCA1</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>EP300</p>
         </c>
         <c ca="right">
            <p>SMAD3</p>
         </c>
         <c ca="right">
            <p>CREBBP</p>
         </c>
         <c ca="right">
            <p>CREBBP</p>
         </c>
         <c ca="right">
            <p>CREBBP</p>
         </c>
         <c ca="right">
            <p>CREBBP</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>SRC</p>
         </c>
         <c ca="right">
            <p>BRCA1</p>
         </c>
         <c ca="right">
            <p>SMAD3</p>
         </c>
         <c ca="right">
            <p>SMAD3</p>
         </c>
         <c ca="right">
            <p>AR</p>
         </c>
         <c ca="right">
            <p>SMAD3</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>BRCA1</p>
         </c>
         <c ca="right">
            <p>CTNNB1</p>
         </c>
         <c ca="right">
            <p>AR</p>
         </c>
         <c ca="right">
            <p>EGFR</p>
         </c>
         <c ca="right">
            <p>SMAD3</p>
         </c>
         <c ca="right">
            <p>EGFR</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>CREBBP</p>
         </c>
         <c ca="right">
            <p>SRC</p>
         </c>
         <c ca="right">
            <p>CTNNB1</p>
         </c>
         <c ca="right">
            <p>HDAC1</p>
         </c>
         <c ca="right">
            <p>SRC</p>
         </c>
         <c ca="right">
            <p>HDAC1</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>AR</p>
         </c>
         <c ca="right">
            <p>CREBBP</p>
         </c>
         <c ca="right">
            <p>SRC</p>
         </c>
         <c ca="right">
            <p>STAT3</p>
         </c>
         <c ca="right">
            <p>CTNNB1</p>
         </c>
         <c ca="right">
            <p>STAT3</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>UBE2I</p>
         </c>
         <c ca="right">
            <p>AKT1</p>
         </c>
         <c ca="right">
            <p>HDAC1</p>
         </c>
         <c ca="right">
            <p>RB1</p>
         </c>
         <c ca="right">
            <p>HDAC1</p>
         </c>
         <c ca="right">
            <p>RB1</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>DYNLL1</p>
         </c>
         <c ca="right">
            <p>HDAC1</p>
         </c>
         <c ca="right">
            <p>CASP3</p>
         </c>
         <c ca="right">
            <p>CTNNB1</p>
         </c>
         <c ca="right">
            <p>RB1</p>
         </c>
         <c ca="right">
            <p>CTNNB1</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>CASP3</p>
         </c>
         <c ca="right">
            <p>STAT3</p>
         </c>
         <c ca="right">
            <p>RB1</p>
         </c>
         <c ca="right">
            <p>JUN</p>
         </c>
         <c ca="right">
            <p>PIK3R1</p>
         </c>
         <c ca="right">
            <p>JUN</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>ACTB</p>
         </c>
         <c ca="right">
            <p>STAT1</p>
         </c>
         <c ca="right">
            <p>PIK3R1</p>
         </c>
         <c ca="right">
            <p>SRC</p>
         </c>
         <c ca="right">
            <p>CASP3</p>
         </c>
         <c ca="right">
            <p>SRC</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>AKT1</p>
         </c>
         <c ca="right">
            <p>XRCC6</p>
         </c>
         <c ca="right">
            <p>STAT3</p>
         </c>
         <c ca="right">
            <p>AKT1</p>
         </c>
         <c ca="right">
            <p>STAT3</p>
         </c>
         <c ca="right">
            <p>AKT1</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>HDAC1</p>
         </c>
         <c ca="right">
            <p>RB1</p>
         </c>
         <c ca="right">
            <p>UBE2I</p>
         </c>
         <c ca="right">
            <p>SMARCA4</p>
         </c>
         <c ca="right">
            <p>AKT1</p>
         </c>
         <c ca="right">
            <p>SMARCA4</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>PIK3R1</p>
         </c>
         <c ca="right">
            <p>PIK3R1</p>
         </c>
         <c ca="right">
            <p>AKT1</p>
         </c>
         <c ca="right">
            <p>CDKN1A</p>
         </c>
         <c ca="right">
            <p>SHC1</p>
         </c>
         <c ca="right">
            <p>CDKN1A</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>ZBTB16</p>
         </c>
         <c ca="right">
            <p>PML</p>
         </c>
         <c ca="right">
            <p>CDK2</p>
         </c>
         <c ca="right">
            <p>PML</p>
         </c>
         <c ca="right">
            <p>CDK2</p>
         </c>
         <c ca="right">
            <p>PML</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>ACVR1</p>
         </c>
         <c ca="right">
            <p>UBE2I</p>
         </c>
         <c ca="right">
            <p>PCNA</p>
         </c>
         <c ca="right">
            <p>STAT1</p>
         </c>
         <c ca="right">
            <p>JUN</p>
         </c>
         <c ca="right">
            <p>STAT1</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>RB1</p>
         </c>
         <c ca="right">
            <p>HSPA8</p>
         </c>
         <c ca="right">
            <p>SHC1</p>
         </c>
         <c ca="right">
            <p>CDK2</p>
         </c>
         <c ca="right">
            <p>STAT1</p>
         </c>
         <c ca="right">
            <p>CDK2</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>STAT1</p>
         </c>
         <c ca="right">
            <p>CASP3</p>
         </c>
         <c ca="right">
            <p>JUN</p>
         </c>
         <c ca="right">
            <p>NCOA6</p>
         </c>
         <c ca="right">
            <p>UBE2I</p>
         </c>
         <c ca="right">
            <p>NCOA6</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>HGS</p>
         </c>
         <c ca="right">
            <p>CDK2</p>
         </c>
         <c ca="right">
            <p>DYNLL1</p>
         </c>
         <c ca="right">
            <p>HDAC2</p>
         </c>
         <c ca="right">
            <p>CDKN1A</p>
         </c>
         <c ca="right">
            <p>HDAC2</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>CDK2</p>
         </c>
         <c ca="right">
            <p>SMARCA4</p>
         </c>
         <c ca="right">
            <p>STAT1</p>
         </c>
         <c ca="right">
            <p>PIK3R1</p>
         </c>
         <c ca="right">
            <p>PCNA</p>
         </c>
         <c ca="right">
            <p>PIK3R1</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>YWHAZ</p>
         </c>
         <c ca="right">
            <p>CDKN1A</p>
         </c>
         <c ca="right">
            <p>ACTB</p>
         </c>
         <c ca="right">
            <p>UBE2I</p>
         </c>
         <c ca="right">
            <p>HDAC2</p>
         </c>
         <c ca="right">
            <p>UBE2I</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>BCL2</p>
         </c>
         <c ca="right">
            <p>JUN</p>
         </c>
         <c ca="right">
            <p>CDKN1A</p>
         </c>
         <c ca="right">
            <p>XRCC6</p>
         </c>
         <c ca="right">
            <p>SMARCA4</p>
         </c>
         <c ca="right">
            <p>XRCC6</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>XRCC6</p>
         </c>
         <c ca="right">
            <p>CDH1</p>
         </c>
         <c ca="right">
            <p>MAPK14</p>
         </c>
         <c ca="right">
            <p>HIF1A</p>
         </c>
         <c ca="right">
            <p>NFKB1</p>
         </c>
         <c ca="right">
            <p>HIF1A</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>STAT3</p>
         </c>
         <c ca="right">
            <p>YWHAZ</p>
         </c>
         <c ca="right">
            <p>YWHAZ</p>
         </c>
         <c ca="right">
            <p>NFKB1</p>
         </c>
         <c ca="right">
            <p>CDH1</p>
         </c>
         <c ca="right">
            <p>NFKB1</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>PLK1</p>
         </c>
         <c ca="right">
            <p>MAPK14</p>
         </c>
         <c ca="right">
            <p>BCL2</p>
         </c>
         <c ca="right">
            <p>CASP3</p>
         </c>
         <c ca="right">
            <p>ACTB</p>
         </c>
         <c ca="right">
            <p>CASP3</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>PCNA</p>
         </c>
         <c ca="right">
            <p>PRKDC</p>
         </c>
         <c ca="right">
            <p>HDAC2</p>
         </c>
         <c ca="right">
            <p>E2F1</p>
         </c>
         <c ca="right">
            <p>MAPK14</p>
         </c>
         <c ca="right">
            <p>E2F1</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>HSPA8</p>
         </c>
         <c ca="right">
            <p>HIF1A</p>
         </c>
         <c ca="right">
            <p>ATM</p>
         </c>
         <c ca="right">
            <p>CEBPB</p>
         </c>
         <c ca="right">
            <p>LYN</p>
         </c>
         <c ca="right">
            <p>CEBPB</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>FN1</p>
         </c>
         <c ca="right">
            <p>SHC1</p>
         </c>
         <c ca="right">
            <p>LYN</p>
         </c>
         <c ca="right">
            <p>PRKDC</p>
         </c>
         <c ca="right">
            <p>XRCC6</p>
         </c>
         <c ca="right">
            <p>PRKDC</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>VIM</p>
         </c>
         <c ca="right">
            <p>STUB1</p>
         </c>
         <c ca="right">
            <p>NFKB1</p>
         </c>
         <c ca="right">
            <p>SHC1</p>
         </c>
         <c ca="right">
            <p>ATM</p>
         </c>
         <c ca="right">
            <p>SHC1</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>CD44</p>
         </c>
         <c ca="right">
            <p>HDAC2</p>
         </c>
         <c ca="right">
            <p>XRCC6</p>
         </c>
         <c ca="right">
            <p>NCOA2</p>
         </c>
         <c ca="right">
            <p>ZBTB16</p>
         </c>
         <c ca="right">
            <p>NCOA2</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>MAPK14</p>
         </c>
         <c ca="right">
            <p>ERBB2</p>
         </c>
         <c ca="right">
            <p>CDH1</p>
         </c>
         <c ca="right">
            <p>CCND1</p>
         </c>
         <c ca="right">
            <p>E2F1</p>
         </c>
         <c ca="right">
            <p>CCND1</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>GNB2L1</p>
         </c>
         <c ca="right">
            <p>HSPA4</p>
         </c>
         <c ca="right">
            <p>SMARCA4</p>
         </c>
         <c ca="right">
            <p>PCNA</p>
         </c>
         <c ca="right">
            <p>YWHAZ</p>
         </c>
         <c ca="right">
            <p>PCNA</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>FANCA</p>
         </c>
         <c ca="right">
            <p>BCL2</p>
         </c>
         <c ca="right">
            <p>EZR</p>
         </c>
         <c ca="right">
            <p>TDG</p>
         </c>
         <c ca="right">
            <p>BCL2</p>
         </c>
         <c ca="right">
            <p>TDG</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>EEF1A1</p>
         </c>
         <c ca="right">
            <p>ZBTB16</p>
         </c>
         <c ca="right">
            <p>HGS</p>
         </c>
         <c ca="right">
            <p>ERBB2</p>
         </c>
         <c ca="right">
            <p>PML</p>
         </c>
         <c ca="right">
            <p>ERBB2</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>SKIL</p>
         </c>
         <c ca="right">
            <p>LYN</p>
         </c>
         <c ca="right">
            <p>ZBTB16</p>
         </c>
         <c ca="right">
            <p>HSPA4</p>
         </c>
         <c ca="right">
            <p>JAK1</p>
         </c>
         <c ca="right">
            <p>HSPA4</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>SHC1</p>
         </c>
         <c ca="right">
            <p>NCOA6</p>
         </c>
         <c ca="right">
            <p>PLK1</p>
         </c>
         <c ca="right">
            <p>ZBTB16</p>
         </c>
         <c ca="right">
            <p>EZR</p>
         </c>
         <c ca="right">
            <p>ZBTB16</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>USP7</p>
         </c>
         <c ca="right">
            <p>PAK1</p>
         </c>
         <c ca="right">
            <p>CAV1</p>
         </c>
         <c ca="right">
            <p>RXRA</p>
         </c>
         <c ca="right">
            <p>PRKDC</p>
         </c>
         <c ca="right">
            <p>RXRA</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>CAV1</p>
         </c>
         <c ca="right">
            <p>MDM4</p>
         </c>
         <c ca="right">
            <p>E2F1</p>
         </c>
         <c ca="right">
            <p>MAPK14</p>
         </c>
         <c ca="right">
            <p>JAK2</p>
         </c>
         <c ca="right">
            <p>MAPK14</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>JUN</p>
         </c>
         <c ca="right">
            <p>CASP8</p>
         </c>
         <c ca="right">
            <p>JAK1</p>
         </c>
         <c ca="right">
            <p>JAK2</p>
         </c>
         <c ca="right">
            <p>CASP8</p>
         </c>
         <c ca="right">
            <p>JAK2</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>RXRA</p>
         </c>
         <c ca="right">
            <p>CEBPB</p>
         </c>
         <c ca="right">
            <p>RXRA</p>
         </c>
         <c ca="right">
            <p>STUB1</p>
         </c>
         <c ca="right">
            <p>RXRA</p>
         </c>
         <c ca="right">
            <p>STUB1</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>LYN</p>
         </c>
         <c ca="right">
            <p>NFKB1</p>
         </c>
         <c ca="right">
            <p>FN1</p>
         </c>
         <c ca="right">
            <p>CDK7</p>
         </c>
         <c ca="right">
            <p>ERBB2</p>
         </c>
         <c ca="right">
            <p>CDK7</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>CDKN1A</p>
         </c>
         <c ca="right">
            <p>JAK2</p>
         </c>
         <c ca="right">
            <p>HSPA8</p>
         </c>
         <c ca="right">
            <p>PTPN6</p>
         </c>
         <c ca="right">
            <p>HIF1A</p>
         </c>
         <c ca="right">
            <p>PTPN6</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>HIF1A</p>
         </c>
         <c ca="right">
            <p>PTPN6</p>
         </c>
         <c ca="right">
            <p>VAV1</p>
         </c>
         <c ca="right">
            <p>ATM</p>
         </c>
         <c ca="right">
            <p>CAV1</p>
         </c>
         <c ca="right">
            <p>ATM</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>ATM</p>
         </c>
         <c ca="right">
            <p>HGS</p>
         </c>
         <c ca="right">
            <p>ACVR1</p>
         </c>
         <c ca="right">
            <p>BCL3</p>
         </c>
         <c ca="right">
            <p>HSPA8</p>
         </c>
         <c ca="right">
            <p>BCL3</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>EZR</p>
         </c>
         <c ca="right">
            <p>EZR</p>
         </c>
         <c ca="right">
            <p>CASP8</p>
         </c>
         <c ca="right">
            <p>DDX5</p>
         </c>
         <c ca="right">
            <p>VAV1</p>
         </c>
         <c ca="right">
            <p>DDX5</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>PPP1CA</p>
         </c>
         <c ca="right">
            <p>CAV1</p>
         </c>
         <c ca="right">
            <p>HIF1A</p>
         </c>
         <c ca="right">
            <p>RBL1</p>
         </c>
         <c ca="right">
            <p>CDKN2A</p>
         </c>
         <c ca="right">
            <p>RBL1</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>PAK1</p>
         </c>
         <c ca="right">
            <p>PIAS4</p>
         </c>
         <c ca="right">
            <p>JAK2</p>
         </c>
         <c ca="right">
            <p>FOS</p>
         </c>
         <c ca="right">
            <p>NCOA6</p>
         </c>
         <c ca="right">
            <p>FOS</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>PLSCR1</p>
         </c>
         <c ca="right">
            <p>TDG</p>
         </c>
         <c ca="right">
            <p>PML</p>
         </c>
         <c ca="right">
            <p>CDH1</p>
         </c>
         <c ca="right">
            <p>PTPN6</p>
         </c>
         <c ca="right">
            <p>CDH1</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>TGFBR2</p>
         </c>
         <c ca="right">
            <p>IGF1R</p>
         </c>
         <c ca="right">
            <p>CDKN2A</p>
         </c>
         <c ca="right">
            <p>MDM4</p>
         </c>
         <c ca="right">
            <p>DYNLL1</p>
         </c>
         <c ca="right">
            <p>MDM4</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>AURKA</p>
         </c>
         <c ca="right">
            <p>DDX5</p>
         </c>
         <c ca="right">
            <p>EEF1A1</p>
         </c>
         <c ca="right">
            <p>ING1</p>
         </c>
         <c ca="right">
            <p>HGS</p>
         </c>
         <c ca="right">
            <p>ING1</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>KPNB1</p>
         </c>
         <c ca="right">
            <p>MET</p>
         </c>
         <c ca="right">
            <p>ERBB2</p>
         </c>
         <c ca="right">
            <p>SIN3A</p>
         </c>
         <c ca="right">
            <p>SIN3A</p>
         </c>
         <c ca="right">
            <p>SIN3A</p>
         </c>
      </r>
   </tblbdy><tblfn>
      <p>The 54 genes with the highest graph centrality in the context-constrained PIN are selected to form new breast cancer gene signatures. The genes in the six signatures identified by the six graph centrality measures are listed in this table.</p>
   </tblfn></tbl>
            <p>Since all six graph centrality definitions are designed to measure the importance of a node in a graph, the gene signatures identified by the six centrality measures are similar to each other. In the extreme case, the top 54 genes identified by <it>EC </it>and <it>SC </it>are identical. By contrast, the overlap between our results and Chen's gene signature is only 5-8 genes (9.3%-14.8%, p-value &lt; 0.05).</p>
            <p>Our results also recover several well-known cancer genes. Regardless of which centrality measure is used to evaluate the importance of genes in the context-constrained network, TP53, a known tumor suppressor, is always the most important gene. Another example is breast cancer type 1 susceptibility protein (BRCA1): as expected, our results show that BRCA1 plays an important role in breast cancer. Other examples include epidermal growth factor receptor (EGFR), E1A binding protein p300 (EP300) and androgen receptor (AR) (see Table <tblr tid="T2">2</tblr>).</p>
            <p>However, another well-known breast cancer gene, BRCA2, is not included in any signature identified by the six centrality measures. This is because BRCA2 is included in only one original signature and was therefore discarded when we constructed the context-constrained PIN. On the one hand, the absence of BRCA2 is further evidence that the quality of existing breast cancer gene signatures is low; on the other hand, its absence from our signatures indicates that our method can be improved by refining the way signature genes are collected.</p>
         </sec>
         <sec>
            <st>
               <p>Relationship among genes in gene signatures</p>
            </st>
            <p>To investigate the relationships among genes in the gene signatures, seven sub PINs are constructed by projecting the genes in each gene signature onto the complete human PIN. As shown in Table <tblr tid="T3">3</tblr>, the sub PINs formed by the genes in the graph centrality based gene signatures are much denser than the one formed by the genes in Chen's signature. The difference is also clearly visible in Figure <figr fid="F2">2</figr>. The densest networks are formed by the genes identified by <it>EC </it>and <it>SC</it>.</p>
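            <p>The comparison of sub PINs can be made concrete with a short Python sketch that projects a signature onto a PIN, measures the largest connected component and computes the edge density. The function names are hypothetical, and defining density as the fraction of possible gene pairs that interact is an assumption; the text itself compares only the raw numbers of interactions.</p>
            <p>
```python
from collections import defaultdict


def sub_pin(genes, interactions):
    """Interactions of the complete PIN whose two ends are both in genes."""
    genes = set(genes)
    return {
        frozenset((a, b))
        for a, b in interactions
        if a in genes and b in genes and a != b
    }


def largest_component_size(genes, edges):
    """Number of genes in the biggest connected component of the sub PIN."""
    neighbours = defaultdict(set)
    for edge in edges:
        a, b = tuple(edge)
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, best = set(), 0
    for start in genes:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            size += 1
            for nxt in neighbours[node] - seen:
                seen.add(nxt)
                stack.append(nxt)
        best = max(best, size)
    return best


def density(n_nodes, n_edges):
    """Fraction of all possible gene pairs that interact."""
    return n_edges / (n_nodes * (n_nodes - 1) / 2)
```
            </p>
            <p>Under this definition, 54 genes with 352 interactions give a density of about 0.246, whereas 54 genes with 70 interactions give about 0.049.</p>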
            <tbl id="T3"><title><p>Table 3</p></title><caption><p>Topological size of context-constrained PINs.</p></caption><tblbdy cols="3">
      <r>
         <c>
            <p/>
         </c>
         <c ca="center">
            <p>
               <b>Proteins</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>Interactions</b>
            </p>
         </c>
      </r>
      <r>
         <c cspan="3">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>BC</p>
         </c>
         <c ca="center">
            <p>54</p>
         </c>
         <c ca="center">
            <p>238</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>CC</p>
         </c>
         <c ca="center">
            <p>54</p>
         </c>
         <c ca="center">
            <p>330</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>DC</p>
         </c>
         <c ca="center">
            <p>54</p>
         </c>
         <c ca="center">
            <p>279</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>EC</p>
         </c>
         <c ca="center">
            <p>54</p>
         </c>
         <c ca="center">
            <p>352</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>IC</p>
         </c>
         <c ca="center">
            <p>54</p>
         </c>
         <c ca="center">
            <p>312</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>SC</p>
         </c>
         <c ca="center">
            <p>54</p>
         </c>
         <c ca="center">
            <p>352</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>Chen</p>
         </c>
         <c ca="center">
            <p>54(35)</p>
         </c>
         <c ca="center">
            <p>70</p>
         </c>
      </r>
   </tblbdy><tblfn>
      <p>To investigate the differences between the graph centrality based gene signatures and the signature reported in previous literature, the topological sizes of the sub-networks formed by the genes in these gene signatures are calculated and presented in this table. It should be noted that the sub-network formed by the genes in Chen's gene signature is not a connected graph. The number in parentheses is the number of proteins in its largest connected component.</p>
   </tblfn></tbl>
            <fig id="F2"><title><p>Figure 2</p></title><caption><p>Sub-networks formed by the genes of different signatures</p></caption><text>
   <p><b>Sub-networks formed by the genes of different signatures</b>. The first six sub-networks are formed by the signature genes identified by the six graph centrality measurements, respectively. The last one is formed by the signature genes identified by Chen. Compared with Chen's result, all the sub-networks formed by the genes of the signatures identified by graph centrality are connected graphs and are denser.</p>
</text><graphic file="1752-0509-5-S3-S10-2"/></fig>
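            <p>The projection and density comparison described in this section can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the helper names and the toy interaction list are hypothetical, and density is assumed to be the usual ratio of observed to possible edges in an undirected simple graph, 2m / (n(n - 1)). The node and interaction counts are those listed in Table 3.</p>

```python
def induced_edges(interactions, genes):
    """Project a gene set onto an interaction list: keep edges with both ends in the set."""
    keep = set(genes)
    edges = set()
    for u, v in interactions:
        if u != v and u in keep and v in keep:
            edges.add(frozenset((u, v)))  # undirected, duplicates collapse
    return edges


def density(n_nodes, n_edges):
    """Fraction of possible undirected edges that are present."""
    return 2.0 * n_edges / (n_nodes * (n_nodes - 1))


# Toy interaction list (hypothetical identifiers, not taken from the human PIN).
toy_pin = [("g1", "g2"), ("g1", "g3"), ("g3", "g4"), ("g4", "g5")]
print(len(induced_edges(toy_pin, ["g1", "g2", "g3"])), "edges in the toy sub PIN")

# Interaction counts from Table 3; every signature has 54 proteins.
TABLE3_INTERACTIONS = {"BC": 238, "CC": 330, "DC": 279, "EC": 352,
                       "IC": 312, "SC": 352, "Chen": 70}
for name, n_interactions in TABLE3_INTERACTIONS.items():
    print(f"{name}: density = {density(54, n_interactions):.3f}")
```

            <p>Under this definition, the sub PINs of <it>EC </it>and <it>SC </it>have roughly five times the density of the sub PIN of Chen's signature, which matches the ordering visible in Table <tblr tid="T3">3</tblr>.</p>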
         </sec>
         <sec>
            <st>
               <p>KEGG pathways enrichment analysis</p>
            </st>
            <p>As shown in Table <tblr tid="T4">4</tblr>, the most significant KEGG pathway for all six graph centrality-based gene signatures is "pathways in cancer", whereas that of Chen's gene signature is "Cell cycle". Chen's gene signature is also annotated by "pathways in cancer", but with a much lower <it>p</it>-score (2.59). This suggests that the graph centrality based method is more powerful than SPAN for identifying cancer related genes in a constrained PIN.</p>
            <tbl id="T4"><title><p>Table 4</p></title><caption><p>KEGG pathways enrichment analysis results.</p></caption><tblbdy cols="5">
      <r>
         <c ca="center">
            <p>
               <b>Method</b>
            </p>
         </c>
         <c ca="left">
            <p>
               <b>KEGG Pathway</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>Description</b>
            </p>
         </c>
         <c ca="left">
            <p>
               <b>Annotated Genes</b>
            </p>
         </c>
         <c ca="left">
            <p>
               <b>Corrected <it>p</it>-score</b>
            </p>
         </c>
      </r>
      <r>
         <c cspan="5">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>BC</p>
         </c>
         <c ca="left">
            <p>hsa05200</p>
         </c>
         <c ca="center">
            <p>pathways in cancer</p>
         </c>
         <c ca="left">
            <p>23</p>
         </c>
         <c ca="left">
            <p>13.09</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>CC</p>
         </c>
         <c ca="left">
            <p>hsa05200</p>
         </c>
         <c ca="center">
            <p>pathways in cancer</p>
         </c>
         <c ca="left">
            <p>29</p>
         </c>
         <c ca="left">
            <p>18.92</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>DC</p>
         </c>
         <c ca="left">
            <p>hsa05200</p>
         </c>
         <c ca="center">
            <p>pathways in cancer</p>
         </c>
         <c ca="left">
            <p>31</p>
         </c>
         <c ca="left">
            <p>21.59</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>EC</p>
         </c>
         <c ca="left">
            <p>hsa05200</p>
         </c>
         <c ca="center">
            <p>pathways in cancer</p>
         </c>
         <c ca="left">
            <p>28</p>
         </c>
         <c ca="left">
            <p>25.74</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>IC</p>
         </c>
         <c ca="left">
            <p>hsa05200</p>
         </c>
         <c ca="center">
            <p>pathways in cancer</p>
         </c>
         <c ca="left">
            <p>30</p>
         </c>
         <c ca="left">
            <p>20.22</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>SC</p>
         </c>
         <c ca="left">
            <p>hsa05200</p>
         </c>
         <c ca="center">
            <p>pathways in cancer</p>
         </c>
         <c ca="left">
            <p>28</p>
         </c>
         <c ca="left">
            <p>25.74</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>Chen</p>
         </c>
         <c ca="left">
            <p>hsa04110</p>
         </c>
         <c ca="center">
            <p>Cell cycle</p>
         </c>
         <c ca="left">
            <p>25</p>
         </c>
         <c ca="left">
            <p>26.80</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>Chen</p>
         </c>
         <c ca="left">
            <p>hsa05200</p>
         </c>
         <c ca="center">
            <p>pathways in cancer</p>
         </c>
         <c ca="left">
            <p>11</p>
         </c>
         <c ca="left">
            <p>2.59</p>
         </c>
      </r>
   </tblbdy><tblfn>
      <p>To investigate the relationship between the gene signatures and cancer-related pathways, KEGG pathway enrichment analysis is performed on these signatures.</p>
   </tblfn></tbl>
         </sec>
         <sec>
            <st>
               <p>GO enrichment analysis</p>
            </st>
            <p>Biological process GO enrichment analysis is performed on each gene signature. For each signature, the GO biological process with the highest <it>p</it>-score and the corresponding <it>p</it>-score are presented in Table <tblr tid="T5">5</tblr>. Unlike the genes identified in Chen's study <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>, which are annotated by the term "cell cycle", five centrality-based gene signatures are annotated by "positive regulation of macromolecule metabolic process" and only the gene signature identified by <it>BC </it>is annotated by "response to organic substance".</p>
            <tbl id="T5"><title><p>Table 5</p></title><caption><p>Gene Ontology enrichment analysis results.</p></caption><tblbdy cols="4">
      <r>
         <c ca="center">
            <p>
               <b>Method</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>GO Term</b>
            </p>
         </c>
         <c ca="left">
            <p>
               <b>Description</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b><it>p</it>-score</b>
            </p>
         </c>
      </r>
      <r>
         <c cspan="4">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>BC</p>
         </c>
         <c ca="center">
            <p>GO:0010033</p>
         </c>
         <c ca="left">
            <p>response to organic substance</p>
         </c>
         <c ca="center">
            <p>21.44</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>CC</p>
         </c>
         <c ca="center">
            <p>GO:0010604</p>
         </c>
         <c ca="left">
            <p>positive regulation of macromolecule metabolic process</p>
         </c>
         <c ca="center">
            <p>21.44</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>DC</p>
         </c>
         <c ca="center">
            <p>GO:0010604</p>
         </c>
         <c ca="left">
            <p>positive regulation of macromolecule metabolic process</p>
         </c>
         <c ca="center">
            <p>17.50</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>EC</p>
         </c>
         <c ca="center">
            <p>GO:0010604</p>
         </c>
         <c ca="left">
            <p>positive regulation of macromolecule metabolic process</p>
         </c>
         <c ca="center">
            <p>25.74</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>IC</p>
         </c>
         <c ca="center">
            <p>GO:0010033</p>
         </c>
         <c ca="left">
            <p>response to organic substance</p>
         </c>
         <c ca="center">
            <p>19.60</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>SC</p>
         </c>
         <c ca="center">
            <p>GO:0010604</p>
         </c>
         <c ca="left">
            <p>positive regulation of macromolecule metabolic process</p>
         </c>
         <c ca="center">
            <p>25.74</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>Chen</p>
         </c>
         <c ca="center">
            <p>GO:0007049</p>
         </c>
         <c ca="left">
            <p>cell cycle</p>
         </c>
         <c ca="center">
            <p>19.94</p>
         </c>
      </r>
   </tblbdy><tblfn>
      <p>To investigate the relationship between the gene signatures and cancer-related biological processes, Gene Ontology enrichment analysis is performed on these signatures.</p>
   </tblfn></tbl>
            <p>GO terms are organized in a tree-like structure (see Figure <figr fid="F3">3</figr>). The term "regulation of biological process", which is located at the same level as "cell cycle", is an ancestor of "positive regulation of macromolecule metabolic process". According to the G2SBC database <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>, only 190 breast cancer genes are annotated by "cell cycle", but 1159 breast cancer genes are annotated by "regulation of biological process". This indicates that the gene signatures identified by graph centrality are probably more tightly related to breast cancer.</p>
            <fig id="F3"><title><p>Figure 3</p></title><caption><p>Relationship among the most significant GO terms annotated to significant breast cancer genes identified by different methods</p></caption><text>
   <p><b>Relationship among the most significant GO terms annotated to significant breast cancer genes identified by different methods</b>. Green boxes indicate the GO terms annotated to genes identified by centrality, and the red box indicates the GO term annotated to Chen's results.</p>
</text><graphic file="1752-0509-5-S3-S10-3"/></fig>
            <p>According to the <it>p</it>-score, both types of enrichment analysis suggest that <it>EC </it>and <it>SC </it>are probably the best choices for identifying disease genes from the context-constrained network and integrating different gene signatures. As reported in the literature, <it>SC </it>also has superior performance in the discovery of essential proteins in protein interaction networks <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>, and essential proteins tend to have higher correlation with dominant and recessive mutants of disease genes <abbrgrp><abbr bid="B35">35</abbr></abbrgrp>. This is possibly the reason that <it>SC </it>and <it>EC </it>outperformed the other centrality measurements. We also note that the PINs formed by the disease genes identified by <it>EC </it>and <it>SC </it>are denser than those of the other centrality measurements (see Table <tblr tid="T3">3</tblr>). This indicates that breast cancer genes have tight and complicated relationships with each other.</p>
         </sec>
         <sec>
            <st>
               <p>Validation of the prognostic potential of the subgraph centrality based gene signature</p>
            </st>
            <p>Since <it>SC </it>and <it>EC </it>perform best in functional enrichment analysis, we tested the <it>SC </it>and <it>EC </it>based gene signatures on a genome-wide microarray dataset, GSE7390 <abbrgrp><abbr bid="B36">36</abbr></abbrgrp>. This microarray dataset was downloaded from the NCBI GEO database and includes 198 breast cancer patients with different pre-diagnosed pathologic parameters.</p>
            <p>We analyzed the relationship between the <it>SC </it>based signature genes and hormone receptor status using hierarchical clustering. The 198 patients in GSE7390 were divided into two main clusters (Additional File <supplr sid="S2">2</supplr>): the first cluster includes 118 samples, and the second cluster includes 80 samples. The mean values of each attribute of the samples in each cluster are listed in Table <tblr tid="T6">6</tblr>. As shown in Table <tblr tid="T6">6</tblr>, the tumor size and NPI score of the first cluster are smaller than those of the second cluster. Disease-free survival time, overall survival time, distant metastasis-free survival time, 10-year overall survival probability and time to distant metastasis of the patients in the first cluster are longer than those of the patients in the second cluster. In other words, the condition of the patients in the first cluster is much better than that of the patients in the second cluster.</p>
            <suppl id="S2">
               <title>
                  <p>Additional file 2</p>
               </title>
               <text>
                  <p><b>Hierarchical clustering analysis of the breast cancer microarray dataset using the SC-based gene signature</b>. According to the gene signature identified by SC, hierarchical clustering analysis is performed on the breast cancer microarray dataset, GSE7390, which includes 198 breast cancer patients with various pathologic parameters.</p>
               </text>
               <file name="1752-0509-5-S3-S10-S2.pdf">
   <p>Click here for file</p>
</file>
            </suppl>
            <tbl id="T6"><title><p>Table 6</p></title><caption><p>Clinical outcome of two main clusters</p></caption><tblbdy cols="3">
      <r>
         <c>
            <p/>
         </c>
         <c ca="right">
            <p>
               <b>118 samples cluster</b>
            </p>
         </c>
         <c ca="right">
            <p>
               <b>80 samples cluster</b>
            </p>
         </c>
      </r>
      <r>
         <c cspan="3">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>Patient Age</p>
         </c>
         <c ca="right">
            <p>47</p>
         </c>
         <c ca="right">
            <p>45</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>Tumor Size(mm)</p>
         </c>
         <c ca="right">
            <p>2.06</p>
         </c>
         <c ca="right">
            <p>2.36</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>Disease-free survival (days)</p>
         </c>
         <c ca="right">
            <p>3498</p>
         </c>
         <c ca="right">
            <p>3252</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>Overall survival (days)</p>
         </c>
         <c ca="right">
            <p>4391</p>
         </c>
         <c ca="right">
            <p>3792</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>Distant metastasis-free survival (days)</p>
         </c>
         <c ca="right">
            <p>4148</p>
         </c>
         <c ca="right">
            <p>3667</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>Time to distant metastasis (days)</p>
         </c>
         <c ca="right">
            <p>4148</p>
         </c>
         <c ca="right">
            <p>3667</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>NPI Score</p>
         </c>
         <c ca="right">
            <p>3.4</p>
         </c>
         <c ca="right">
            <p>4.17</p>
         </c>
      </r>
      <r>
         <c ca="right">
            <p>10-year overall survival probability</p>
         </c>
         <c ca="right">
            <p>83.38</p>
         </c>
         <c ca="right">
            <p>74.98</p>
         </c>
      </r>
   </tblbdy><tblfn>
      <p>Mean value of each clinical outcome recorded in the microarray dataset of the samples in each cluster is presented in this table.</p>
   </tblfn></tbl>
            <p>In the first cluster, 104 patients were ER+, whereas in the second cluster only 30 were ER+. The <it>p</it>-value calculated by ANOVA is 2.47 &#215; 10<sup>-13</sup>. We also explored the relationship between the signature genes and the time to distant metastasis. In the first cluster, the time to distant metastasis of 97 patients was longer than 2000 days; in the second cluster, that number is only 55. The <it>p</it>-value is 0.04254. A <it>p</it>-value smaller than 0.05 is conventionally taken to mean that the result is statistically significant and unlikely to have arisen by chance. Such small <it>p</it>-values for the clustering result indicate that the <it>SC </it>based signature is able to predict the clinical outcome well.</p>
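            <p>The text attributes the two <it>p</it>-values to ANOVA. As an illustrative alternative that is not confirmed by the source, the sketch below applies a chi-square test with Yates' continuity correction to the two 2 x 2 tables implied by the reported counts (104 of 118 versus 30 of 80 ER+ patients; 97 of 118 versus 55 of 80 patients with a time to distant metastasis longer than 2000 days). The function name and the choice of test are assumptions; under them, the resulting <it>p</it>-values appear to be close to the reported ones.</p>

```python
import math


def yates_chi_square(a, b, c, d):
    """Yates-corrected chi-square test for the 2 x 2 table [[a, b], [c, d]].

    Returns (statistic, p-value). With one degree of freedom the upper tail
    of the chi-square distribution equals erfc(sqrt(statistic / 2)).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    corrected = max(abs(a * d - b * c) - n / 2.0, 0.0)
    statistic = n * corrected ** 2 / (row1 * row2 * col1 * col2)
    return statistic, math.erfc(math.sqrt(statistic / 2.0))


# Rows: cluster 1 (118 patients) and cluster 2 (80 patients).
# Columns: patients with the attribute, patients without it.
er_stat, er_p = yates_chi_square(104, 118 - 104, 30, 80 - 30)
tdm_stat, tdm_p = yates_chi_square(97, 118 - 97, 55, 80 - 55)

print(f"ER status: chi2 = {er_stat:.2f}, p = {er_p:.3g}")
print(f"Time to distant metastasis: chi2 = {tdm_stat:.2f}, p = {tdm_p:.3g}")
```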
         </sec>
      </sec>
      <sec>
         <st>
            <p>Discussion and conclusion</p>
         </st>
         <p>Identifying the genes that play important roles in the development of cancer is a critical open problem in current cancer research. Gene expression signatures provide a way to find significant cancer genes in given groups of patients. Because the overlap between heterogeneous signatures is low, integrating them has become a serious problem. As shown in this paper, graph centralities, especially <it>EC </it>and <it>SC</it>, are useful tools for integrating existing cancer gene signatures.</p>
         <p>It is well known that a weighted protein interaction network can be constructed by integrating functional annotations, and centrality can easily be extended to weighted networks <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>. More promising results may therefore be found in weighted protein interaction networks. In addition, other topological parameters from graph theory may improve this method as well.</p>
      </sec>
      <sec>
         <st>
            <p>Competing interests</p>
         </st>
         <p>The authors declare that they have no competing interests.</p>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>GC conceived and carried out this work under the guidance and supervision of JW. JW, GC and ML drafted the manuscript together. YP participated in revising the draft. All authors have read and approved the manuscript.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>This work is supported in part by the National Natural Science Foundation of China under Grant No.61003124 and No.61073036, the Ph.D. Programs Foundation of Ministry of Education of China No.20090162120073, the Freedom Explore Program of Central South University No.201012200124, and the U.S. National Science Foundation under Grants CCF-0514750, CCF-0646102, and CNS-0831634.</p>
            <p>This article has been published as part of <it>BMC Systems Biology </it>Volume 5 Supplement 3, 2011: BIOCOMP 2010 - The 2010 International Conference on Bioinformatics &amp; Computational Biology: Systems Biology.  The full contents of the supplement are available online at <url>http://www.biomedcentral.com/1752-0509/5?issue=S3</url>.</p>
         </sec>
      </ack>
      <refgrp><bibl id="B1"><title><p>Can systems biology understand pathway activation? Gene expression signatures as surrogate markers for understanding the complexity of pathway activation</p></title><aug><au><snm>Itadani</snm><fnm>H</fnm></au><au><snm>Mizuarai</snm><fnm>S</fnm></au><au><snm>Kotani</snm><fnm>H</fnm></au></aug><source>Curr Genomics</source><pubdate>2008</pubdate><volume>9</volume><issue>5</issue><fpage>349</fpage><lpage>360</lpage><xrefbib><pubidlist><pubid idtype="doi">10.2174/138920208785133235</pubid><pubid idtype="pmcid">2694555</pubid><pubid idtype="pmpid">19517027</pubid></pubidlist></xrefbib></bibl><bibl id="B2"><title><p>Gene expression profiling predicts clinical outcome of breast cancer</p></title><aug><au><snm>van 't Veer</snm><fnm>LJ</fnm></au><au><snm>Dai</snm><fnm>H</fnm></au><au><snm>van de Vijver</snm><fnm>MJ</fnm></au><etal/></aug><source>Nature</source><pubdate>2002</pubdate><volume>415</volume><issue>6871</issue><fpage>530</fpage><lpage>536</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/415530a</pubid><pubid idtype="pmpid" link="fulltext">11823860</pubid></pubidlist></xrefbib></bibl><bibl id="B3"><title><p>A Gene-Expression Signature as a Predictor of Survival in Breast Cancer</p></title><aug><au><snm>van de Vijver</snm><fnm>MJ</fnm></au><au><snm>He</snm><fnm>YD</fnm></au><au><snm>van 't Veer</snm><fnm>LJ</fnm></au><etal/></aug><source>New England Journal of Medicine</source><pubdate>2002</pubdate><volume>347</volume><issue>25</issue><fpage>1999</fpage><lpage>2009</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1056/NEJMoa021967</pubid><pubid idtype="pmpid" link="fulltext">12490681</pubid></pubidlist></xrefbib></bibl><bibl id="B4"><title><p>Network-based classification of breast cancer metastasis</p></title><aug><au><snm>Chuang</snm><fnm>HY</fnm></au><au><snm>Lee</snm><fnm>E</fnm></au><au><snm>Liu</snm><fnm>YT</fnm></au><au><snm>Lee</snm><fnm>D</fnm></au><au><snm>Ideker</snm><fnm>T</fnm></au></aug><source>Mol Syst 
Biol</source><pubdate>2007</pubdate><volume>3</volume><fpage>140</fpage><xrefbib><pubidlist><pubid idtype="pmcid">2063581</pubid><pubid idtype="pmpid" link="fulltext">17940530</pubid></pubidlist></xrefbib></bibl><bibl id="B5"><title><p>A multigene assay to predict recurrence of tamoxifen-treated, node-negative breast cancer</p></title><aug><au><snm>Paik</snm><fnm>S</fnm></au><au><snm>Shak</snm><fnm>S</fnm></au><au><snm>Tang</snm><fnm>G</fnm></au><etal/></aug><source>N Engl J Med</source><pubdate>2004</pubdate><volume>351</volume><issue>27</issue><fpage>2817</fpage><lpage>2826</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1056/NEJMoa041588</pubid><pubid idtype="pmpid" link="fulltext">15591335</pubid></pubidlist></xrefbib></bibl><bibl id="B6"><title><p>Genes that mediate breast cancer metastasis to lung</p></title><aug><au><snm>Minn</snm><fnm>AJ</fnm></au><au><snm>Gupta</snm><fnm>GP</fnm></au><au><snm>Siegel</snm><fnm>PM</fnm></au><etal/></aug><source>Nature</source><pubdate>2005</pubdate><volume>436</volume><issue>7050</issue><fpage>518</fpage><lpage>524</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nature03799</pubid><pubid idtype="pmcid">1283098</pubid><pubid idtype="pmpid" link="fulltext">16049480</pubid></pubidlist></xrefbib></bibl><bibl id="B7"><title><p>Gene-expression profiles to predict distant metastasis of lymph-node-negative primary breast cancer</p></title><aug><au><snm>Wang</snm><fnm>Y</fnm></au><au><snm>Klijn</snm><fnm>JGM</fnm></au><au><snm>Zhang</snm><fnm>Y</fnm></au><etal/></aug><source>Lancet</source><pubdate>2005</pubdate><volume>365</volume><issue>9460</issue><fpage>671</fpage><lpage>679</lpage><xrefbib><pubid idtype="pmpid" link="fulltext">15721472</pubid></xrefbib></bibl><bibl id="B8"><title><p>Pooling breast cancer datasets has a synergetic effect on classification performance and improves signature stability</p></title><aug><au><snm>van 
Vliet</snm><fnm>MH</fnm></au><au><snm>Reyal</snm><fnm>F</fnm></au><au><snm>Horlings</snm><fnm>HM</fnm></au><etal/></aug><source>BMC Genomics</source><pubdate>2008</pubdate><volume>9</volume><fpage>375</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2164-9-375</pubid><pubid idtype="pmcid">2527336</pubid><pubid idtype="pmpid" link="fulltext">18684329</pubid></pubidlist></xrefbib></bibl><bibl id="B9"><title><p>Comparison of prognostic gene expression signatures for breast cancer</p></title><aug><au><snm>Haibe-Kains</snm><fnm>B</fnm></au><au><snm>Desmedt</snm><fnm>C</fnm></au><au><snm>Piette</snm><fnm>F</fnm></au><au><snm>Buyse</snm><fnm>M</fnm></au><au><snm>Cardoso</snm><fnm>F</fnm></au><au><snm>Veer</snm><fnm>LV</fnm></au><au><snm>Piccart</snm><fnm>M</fnm></au><au><snm>Bontempi</snm><fnm>G</fnm></au><au><snm>Sotiriou</snm><fnm>C</fnm></au></aug><source>BMC Genomics</source><pubdate>2008</pubdate><volume>9</volume><fpage>394</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2164-9-394</pubid><pubid idtype="pmcid">2533026</pubid><pubid idtype="pmpid" link="fulltext">18717985</pubid></pubidlist></xrefbib></bibl><bibl id="B10"><title><p>Complementary gene signature integration in multi-platform microarray experiments</p></title><aug><au><snm>Blazadonakis</snm><fnm>ME</fnm></au><au><snm>Zervakis</snm><fnm>ME</fnm></au><au><snm>Kafetzopoulos</snm><fnm>D</fnm></au></aug><source>IEEE Trans Inf Technol Biomed</source><pubdate>2011</pubdate><volume>15</volume><fpage>155</fpage><lpage>163</lpage><xrefbib><pubid idtype="pmpid" link="fulltext">20813648</pubid></xrefbib></bibl><bibl id="B11"><title><p>A human phenome-interactome network of protein complexes implicated in genetic disorders</p></title><aug><au><snm>Lage</snm><fnm>K</fnm></au><au><snm>Karlberg</snm><fnm>EO</fnm></au><au><snm>St&#248;rling</snm><fnm>ZM</fnm></au><etal/></aug><source>Nat 
Biotechnol</source><pubdate>2007</pubdate><volume>25</volume><issue>3</issue><fpage>309</fpage><lpage>316</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nbt1295</pubid><pubid idtype="pmpid" link="fulltext">17344885</pubid></pubidlist></xrefbib></bibl><bibl id="B12"><title><p>Network modeling identifies molecular functions targeted by miR-204 to suppress head and neck tumor metastasis</p></title><aug><au><snm>Lee</snm><fnm>Y</fnm></au><au><snm>Yang</snm><fnm>X</fnm></au><au><snm>Huang</snm><fnm>Y</fnm></au><au><snm>Fan</snm><fnm>H</fnm></au><etal/></aug><source>PLoS Comput Biol</source><pubdate>2010</pubdate><volume>6</volume><issue>4</issue><fpage>e1000730</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1371/journal.pcbi.1000730</pubid><pubid idtype="pmcid">2848541</pubid><pubid idtype="pmpid" link="fulltext">20369013</pubid></pubidlist></xrefbib></bibl><bibl id="B13"><title><p>Distinct molecular signature of inflammatory breast cancer by cDNA microarray analysis</p></title><aug><au><snm>Laere</snm><fnm>SV</fnm></au><au><snm>der Auwera</snm><fnm>IV</fnm></au><au><snm>den Eynden</snm><fnm>GGV</fnm></au><etal/></aug><source>Breast Cancer Res Treat</source><pubdate>2005</pubdate><volume>93</volume><issue>3</issue><fpage>237</fpage><lpage>246</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1007/s10549-005-5157-z</pubid><pubid idtype="pmpid" link="fulltext">16172796</pubid></pubidlist></xrefbib></bibl><bibl id="B14"><title><p>A novel network model identified a 13-gene lung cancer prognostic signature</p></title><aug><au><snm>Guo</snm><fnm>NL</fnm></au><au><snm>Wan</snm><fnm>YW</fnm></au><au><snm>Bose</snm><fnm>S</fnm></au><au><snm>Denvir</snm><fnm>J</fnm></au><au><snm>Kashon</snm><fnm>ML</fnm></au><au><snm>Andrew</snm><fnm>ME</fnm></au></aug><source>Int J Comput Biol Drug Des</source><pubdate>2011</pubdate><volume>4</volume><fpage>19</fpage><lpage>39</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1504/IJCBDD.2011.038655</pubid><pubid 
idtype="pmcid">3095973</pubid><pubid idtype="pmpid">21330692</pubid></pubidlist></xrefbib></bibl><bibl id="B15"><title><p>Protein interaction network underpins concordant prognosis among heterogeneous breast cancer signatures</p></title><aug><au><snm>Chen</snm><fnm>J</fnm></au><au><snm>Sam</snm><fnm>L</fnm></au><au><snm>Huang</snm><fnm>Y</fnm></au><etal/></aug><source>Journal of Biomedical Informatics</source><pubdate>2010</pubdate><volume>43</volume><issue>3</issue><fpage>385</fpage><lpage>396</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.jbi.2010.03.009</pubid><pubid idtype="pmcid">2878851</pubid><pubid idtype="pmpid" link="fulltext">20350617</pubid></pubidlist></xrefbib></bibl><bibl id="B16"><title><p>Protein-network modeling of prostate cancer gene signatures reveals essential pathways in disease recurrence</p></title><aug><au><snm>Chen</snm><fnm>JL</fnm></au><au><snm>Li</snm><fnm>J</fnm></au><au><snm>Stadler</snm><fnm>WM</fnm></au><au><snm>Lussier</snm><fnm>YA</fnm></au></aug><source>J Am Med Inform Assoc</source><pubdate>2011</pubdate><volume>18</volume><issue>4</issue><fpage>392</fpage><lpage>402</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1136/amiajnl-2011-000178</pubid><pubid idtype="pmcid">3128407</pubid><pubid idtype="pmpid" link="fulltext">21672909</pubid></pubidlist></xrefbib></bibl><bibl id="B17"><title><p>GeneSigDB-a curated database of gene expression signatures</p></title><aug><au><snm>Culhane</snm><fnm>AC</fnm></au><au><snm>Schwarzl</snm><fnm>T</fnm></au><au><snm>Sultana</snm><fnm>R</fnm></au><etal/></aug><source>Nucleic Acids Res</source><pubdate>2010</pubdate><issue>38 Database</issue><fpage>D716</fpage><lpage>D725</lpage></bibl><bibl id="B18"><title><p>The BioGRID Interaction Database: 2008 update</p></title><aug><au><snm>Breitkreutz</snm><fnm>BJ</fnm></au><au><snm>Stark</snm><fnm>C</fnm></au><au><snm>Reguly</snm><fnm>T</fnm></au><etal/></aug><source>Nucleic Acids Res</source><pubdate>2008</pubdate><issue>36 
Database</issue><fpage>D637</fpage><lpage>D640</lpage></bibl><bibl id="B19"><title><p>Human Protein Reference Database-2009 update</p></title><aug><au><snm>Prasad</snm><fnm>TSK</fnm></au><au><snm>Goel</snm><fnm>R</fnm></au><au><snm>Kandasamy</snm><fnm>K</fnm></au><etal/></aug><source>Nucleic Acids Res</source><pubdate>2009</pubdate><issue>37 Database</issue><fpage>D767</fpage><lpage>D772</lpage></bibl><bibl id="B20"><title><p>Essential Proteins Discovery from Weighted Protein Interaction Networks</p></title><aug><au><snm>Li</snm><fnm>M</fnm></au><au><snm>Wang</snm><fnm>J</fnm></au><au><snm>Wang</snm><fnm>H</fnm></au><au><snm>Pan</snm><fnm>Y</fnm></au></aug><source>Bioinformatics Research and Applications</source><pubdate>2010</pubdate><fpage>89</fpage><lpage>100</lpage></bibl><bibl id="B21"><title><p>Lethality and centrality in protein networks</p></title><aug><au><snm>Jeong</snm><fnm>H</fnm></au><au><snm>Mason</snm><fnm>SP</fnm></au><au><snm>Barab&#225;si</snm><fnm>AL</fnm></au><au><snm>Oltvai</snm><fnm>ZN</fnm></au></aug><source>Nature</source><pubdate>2001</pubdate><volume>411</volume><issue>6833</issue><fpage>41</fpage><lpage>42</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/35075138</pubid><pubid idtype="pmpid" link="fulltext">11333967</pubid></pubidlist></xrefbib></bibl><bibl id="B22"><title><p>High-betweenness proteins in the yeast protein interaction network</p></title><aug><au><snm>Joy</snm><fnm>MP</fnm></au><au><snm>Brock</snm><fnm>A</fnm></au><au><snm>Ingber</snm><fnm>DE</fnm></au><au><snm>Huang</snm><fnm>S</fnm></au></aug><source>J Biomed Biotechnol</source><pubdate>2005</pubdate><volume>2005</volume><issue>2</issue><fpage>96</fpage><lpage>103</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1155/JBB.2005.96</pubid><pubid idtype="pmcid">1184047</pubid><pubid idtype="pmpid" link="fulltext">16046814</pubid></pubidlist></xrefbib></bibl><bibl id="B23"><title><p>Centers of complex 
networks</p></title><aug><au><snm>Wuchty</snm><fnm>S</fnm></au><au><snm>Stadler</snm><fnm>PF</fnm></au></aug><source>J Theor Biol</source><pubdate>2003</pubdate><volume>223</volume><fpage>45</fpage><lpage>53</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/S0022-5193(03)00071-7</pubid><pubid idtype="pmpid" link="fulltext">12782116</pubid></pubidlist></xrefbib></bibl><bibl id="B24"><title><p>Subgraph centrality in complex networks</p></title><aug><au><snm>Estrada</snm><fnm>E</fnm></au><au><snm>Rodr&#237;guez-Vel&#225;zquez</snm><fnm>JA</fnm></au></aug><source>Phys Rev E Stat Nonlin Soft Matter Phys</source><pubdate>2005</pubdate><volume>71</volume><issue>5 Pt 2</issue><fpage>056103</fpage><xrefbib><pubid idtype="pmpid" link="fulltext">16089598</pubid></xrefbib></bibl><bibl id="B25"><title><p>Power and Centrality: A Family of Measures</p></title><aug><au><snm>Bonacich</snm><fnm>P</fnm></au></aug><source>The American Journal of Sociology</source><pubdate>1987</pubdate><volume>92</volume><issue>5</issue><fpage>1170</fpage><lpage>1182</lpage><xrefbib><pubid idtype="doi">10.1086/228631</pubid></xrefbib></bibl><bibl id="B26"><title><p>Rethinking centrality: Methods and examples</p></title><aug><au><snm>Stephenson</snm><fnm>K</fnm></au><au><snm>Zelen</snm><fnm>M</fnm></au></aug><source>Social Networks</source><pubdate>1989</pubdate><volume>11</volume><fpage>1</fpage><lpage>37</lpage><xrefbib><pubid idtype="doi">10.1016/0378-8733(89)90016-6</pubid></xrefbib></bibl><bibl id="B27"><title><p>A novel functional module detection algorithm for protein-protein interaction networks</p></title><aug><au><snm>Hwang</snm><fnm>W</fnm></au><au><snm>Cho</snm><fnm>Y</fnm></au><au><snm>Zhang</snm><fnm>A</fnm></au><au><snm>Ramanathan</snm><fnm>M</fnm></au></aug><source>Algorithms Mol Biol</source><pubdate>2006</pubdate><volume>1</volume><fpage>24</fpage></bibl><bibl id="B28"><title><p>Topological structure analysis of the protein-protein interaction network in budding 
yeast</p></title><aug><au><snm>Bu</snm><fnm>D</fnm></au></aug><source>Nucleic Acids Res</source><pubdate>2003</pubdate><volume>31</volume><fpage>2443</fpage><lpage>2450</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/nar/gkg340</pubid><pubid idtype="pmcid">154226</pubid><pubid idtype="pmpid" link="fulltext">12711690</pubid></pubidlist></xrefbib></bibl><bibl id="B29"><title><p>Modifying the DPClus algorithm for identifying protein complexes based on new topological structures</p></title><aug><au><snm>Li</snm><fnm>M</fnm></au><au><snm>Chen</snm><fnm>Je</fnm></au><au><snm>Wang</snm><fnm>Jx</fnm></au><etal/></aug><source>BMC Bioinformatics</source><pubdate>2008</pubdate><volume>9</volume><fpage>398</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2105-9-398</pubid><pubid idtype="pmcid">2570695</pubid><pubid idtype="pmpid" link="fulltext">18816408</pubid></pubidlist></xrefbib></bibl><bibl id="B30"><title><p>Protein complexes and functional modules in molecular networks</p></title><aug><au><snm>Spirin</snm><fnm>V</fnm></au><au><snm>Mirny</snm><fnm>L</fnm></au></aug><source>PNAS</source><pubdate>2003</pubdate><volume>100</volume><fpage>12123</fpage><lpage>12128</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1073/pnas.2032324100</pubid><pubid idtype="pmcid">218723</pubid><pubid idtype="pmpid" link="fulltext">14517352</pubid></pubidlist></xrefbib></bibl><bibl id="B31"><title><p>Identification of Overlapping Functional Modules in Protein Interaction Networks: Information Flow-based Approach</p></title><aug><au><snm>Cho</snm><fnm>YR</fnm></au><au><snm>Hwang</snm><fnm>W</fnm></au><au><snm>Zhang</snm><fnm>A</fnm></au></aug><source>ICDMW '06</source><publisher>Washington, DC, USA: IEEE Computer Society</publisher><pubdate>2006</pubdate><fpage>147</fpage><lpage>152</lpage></bibl><bibl id="B32"><title><p>Systematic and integrative analysis of large gene lists using DAVID bioinformatics 
resources</p></title><aug><au><snm>Huang</snm><fnm>DW</fnm></au><au><snm>Sherman</snm><fnm>BT</fnm></au><au><snm>Lempicki</snm><fnm>RA</fnm></au></aug><source>Nat Protoc</source><pubdate>2009</pubdate><volume>4</volume><fpage>44</fpage><lpage>57</lpage><xrefbib><pubid idtype="pmpid" link="fulltext">19131956</pubid></xrefbib></bibl><bibl id="B33"><title><p>A multilevel data integration resource for breast cancer study</p></title><aug><au><snm>Mosca</snm><fnm>E</fnm></au><au><snm>Alfieri</snm><fnm>R</fnm></au><au><snm>Merelli</snm><fnm>I</fnm></au><etal/></aug><source>BMC Syst Biol</source><pubdate>2010</pubdate><volume>4</volume><fpage>76</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1752-0509-4-76</pubid><pubid idtype="pmcid">2900226</pubid><pubid idtype="pmpid" link="fulltext">20525248</pubid></pubidlist></xrefbib></bibl><bibl id="B34"><title><p>Virtual identification of essential proteins within the protein interaction network of yeast</p></title><aug><au><snm>Estrada</snm><fnm>E</fnm></au></aug><source>Proteomics</source><pubdate>2006</pubdate><volume>6</volume><fpage>35</fpage><lpage>40</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1002/pmic.200500209</pubid><pubid idtype="pmpid" link="fulltext">16281187</pubid></pubidlist></xrefbib></bibl><bibl id="B35"><title><p>Differences in the evolutionary history of disease genes affected by dominant or recessive mutations</p></title><aug><au><snm>Furney</snm><fnm>SJ</fnm></au><au><snm>Alb&#224;</snm><fnm>MM</fnm></au><au><snm>L&#243;pez-Bigas</snm><fnm>N</fnm></au></aug><source>BMC Genomics</source><pubdate>2006</pubdate><volume>7</volume><fpage>165</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2164-7-165</pubid><pubid idtype="pmcid">1534034</pubid><pubid idtype="pmpid" link="fulltext">16817963</pubid></pubidlist></xrefbib></bibl><bibl id="B36"><title><p>Strong time dependence of the 76-gene prognostic signature for node-negative breast cancer patients in the TRANSBIG multicenter independent 
validation series</p></title><aug><au><snm>Desmedt</snm><fnm>C</fnm></au><au><snm>Piette</snm><fnm>F</fnm></au><au><snm>Loi</snm><fnm>S</fnm></au><etal/></aug><source>Clin Cancer Res</source><pubdate>2007</pubdate><volume>13</volume><issue>11</issue><fpage>3207</fpage><lpage>3214</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1158/1078-0432.CCR-06-2765</pubid><pubid idtype="pmpid" link="fulltext">17545524</pubid></pubidlist></xrefbib></bibl></refgrp>
   </bm>
</art>