<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1755-8794-1-28</ui>
   <ji>1755-8794</ji>
   <fm>
      <dochead>Research article</dochead>
      <bibl>
         <title>
            <p>Pathway analysis reveals functional convergence of gene expression profiles in breast cancer</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Shen</snm>
               <fnm>Ronglai</fnm>
               <insr iid="I1"/>
               <email>shenr@mskcc.org</email>
            </au>
            <au id="A2" ca="yes">
               <snm>Chinnaiyan</snm>
               <mi>M</mi>
               <fnm>Arul</fnm>
               <insr iid="I2"/>
               <email>arul@med.umich.edu</email>
            </au>
            <au id="A3" ca="yes">
               <snm>Ghosh</snm>
               <fnm>Debashis</fnm>
               <insr iid="I3"/>
               <email>ghoshd@psu.edu</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Department of Epidemiology and Biostatistics, Memorial Sloan-Kettering Cancer Center, New York, NY, USA</p>
            </ins>
            <ins id="I2">
               <p>Department of Pathology and Urology, University of Michigan, Ann Arbor, MI, USA</p>
            </ins>
            <ins id="I3">
               <p>Departments of Statistics and Public Health Sciences, Penn State University, University Park, PA, USA</p>
            </ins>
         </insg>
         <source>BMC Medical Genomics</source>
         <issn>1755-8794</issn>
         <pubdate>2008</pubdate>
         <volume>1</volume>
         <issue>1</issue>
         <fpage>28</fpage>
         <url>http://www.biomedcentral.com/1755-8794/1/28</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">18588682</pubid>
               <pubid idtype="doi">10.1186/1755-8794-1-28</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>04</day>
               <month>3</month>
               <year>2008</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>27</day>
               <month>6</month>
               <year>2008</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>27</day>
               <month>6</month>
               <year>2008</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2008</year>
         <collab>Shen et al; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>A recent study has shown high concordance of several breast-cancer gene signatures in predicting disease recurrence despite minimal overlap of the gene lists. This raises the question of whether common themes underlie such prediction concordance that are not apparent at the individual gene level. We therefore studied the similarity of these gene signatures on the basis of their functional annotations.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>We found that the signatures did not identify the same set of genes but converged on the activation of a similar set of oncogenic and clinically relevant pathways. A clear and consistent pattern across the four breast cancer signatures is the activation of the estrogen-signaling pathway. Other common features include the BRCA1-regulated pathway, RECK pathways, and insulin signaling associated with the ER-positive disease signatures, all providing possible explanations for the prediction concordance.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>This work explains why independent breast cancer signatures that appear to perform equally well at predicting patient prognosis show minimal overlap in gene membership.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>Many studies have demonstrated that gene-expression "signatures" derived from DNA microarray data can define cancer subtypes, predict disease recurrence, and guide treatment decisions. In breast cancer, van't Veer <it>et al</it>. <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> derived a 70-gene profile to predict a patient's risk of developing distant metastases. Perou <it>et al</it>. <abbrgrp><abbr bid="B2">2</abbr></abbrgrp> and Sorlie <it>et al</it>. <abbrgrp><abbr bid="B3">3</abbr></abbrgrp> developed an intrinsic-subtype signature that classifies breast tumors into molecular subtypes with distinct differences in prognosis. From a cancer biology perspective, Chang <it>et al</it>. <abbrgrp><abbr bid="B4">4</abbr></abbrgrp> studied the links between the wound healing process and cancer progression. Based on the expression pattern of a wound-response signature of 512 genes, they classified each tumor as having either an activated or a quiescent response and found this classification to be a significant prognostic predictor of tumor metastasis. These are promising results, and a few of these signatures have begun to be assessed in clinical settings. Two questions have often been asked: (1) are these signatures identifying the same set of genes, and (2) will they generate similar prediction performance when tested in new data sets?</p>
         <p>The answer to the first question has been discouraging. Any pair of these signatures shares only a few genes. Suggested reasons include differences in patient cohort characteristics (such as the distribution of age or stage of the disease), lack of comparability and reproducibility of data generated on different microarray platforms, and varying statistical procedures used to generate the gene lists. Nevertheless, Ein-Dor <it>et al</it>. <abbrgrp><abbr bid="B5">5</abbr></abbrgrp> showed that the inconsistency persists even when all three differences are eliminated. In particular, the authors repeated the same analysis in a single data set and identified many lists of genes equally predictive of the outcome; any two of these gene lists share only a small number of genes. In another study, Son <it>et al</it>. <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> reported that any randomly selected subgroup of around 100 differentially expressed genes generates similar hierarchical clustering results in the same data set.</p>
         <p>Ein-Dor <it>et al</it>. <abbrgrp><abbr bid="B7">7</abbr></abbrgrp> further suggested that the main source of the problem may lie in the small sample size and the large number of genes from which the signatures were derived. For several published breast cancer data sets, the authors estimated that several thousand samples would be needed to achieve a typical gene overlap of 50%. On the other hand, the problem is compounded by analyzing and interpreting genes in isolation. A common approach to gene selection involves selecting a handful of top-ranking genes that best differentiate sample classes (such as tumor vs. normal tissue) or are most predictive of clinical outcome. This univariate selection procedure ignores correlation between genes, implicitly treating them as independent; the biological and statistical validity of such an assumption seems tenuous. As a result, gene-set based approaches have emerged in recent years to identify sets of biologically related genes that are deregulated as a group. Examples of gene-set analysis include Gene Set Enrichment Analysis (GSEA) <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>, Significance Analysis of Function and Expression (SAFE) <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>, and the globaltest package <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>. These methods focus on groups of genes that share common biological functions such as cell cycle regulation; metabolic or signaling pathways defined by Gene Ontology (GO); online databases such as BioCarta, KEGG, and signaling databases; or a literature-defined gene set subject to experimental perturbations such as a drug treatment or an oncogene activation. In addition, Rhodes <it>et al</it>. <abbrgrp><abbr bid="B11">11</abbr></abbrgrp> introduced a Molecular Concepts Map (MCM) that provides an expanded analytic framework for exploring the network of relationships among biologically related gene sets.</p>
         <p>The motivation for this study came from a recent paper by Fan <it>et al</it>. <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>, which addresses the second question described above. The authors demonstrated a high degree of prediction concordance among five breast cancer gene-signatures despite minimal gene-wise overlap. In an independent data set of 295 tumors, the authors showed that the intrinsic subtypes <abbrgrp><abbr bid="B3">3</abbr></abbrgrp> of basal-like, HER2+/ER-, and luminal B were consistently classified as having a poor 70-gene profile <abbrgrp><abbr bid="B13">13</abbr></abbrgrp> prognosis, an activated wound response <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>, and a high recurrence score <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>. This raises the possibility that gene overlap is not the most relevant measure of the robustness and reproducibility of the gene-signatures. There may be common themes shared across these signatures that are not apparent at the individual gene level. As an example, the cell cycle gene Cyclin E1 (CCNE1) was included in the 70-gene profile, while Cyclin E2 (CCNE2) was included in the intrinsic subtype signature. The two signatures apparently share commonality in the activation of the Cyclin family genes. As another example, ERBB2 and EGFR are both receptor tyrosine kinases involved in the estrogen pathway. Inclusion of one or the other in two different signatures apparently converges at the pathway level, with both indicating the activation of the estrogen-signaling pathway.</p>
         <p>In this study, we assess the potential functional convergence of these gene-signatures on the basis of activated oncogenic pathways. This involves first annotating each gene-signature to identify significantly enriched functional modules (e.g., cell growth, response to estrogen, myb-regulated pathways, etc.). The modules can be defined using Gene Ontology (GO) terms, online pathway databases such as BioCarta and KEGG, or literature-defined concepts. In the next step, the overlapping functional modules are obtained by intersecting the annotated sets. We investigated six breast cancer signatures (four of which were compared in Fan <it>et al</it>. <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>) that share high prediction concordance. We found eighteen common modules including estrogen-signaling, responses to tamoxifen treatment, and BRCA1 expression. The degree of the functional overlap across the six breast cancer signatures is highly significant (P = 0.0002) under a bootstrapped null distribution.</p>
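         <p>The two steps described above (annotating each signature for enriched modules, then intersecting the annotated sets) can be illustrated with a minimal sketch. It assumes an upper-tail hypergeometric test for enrichment and omits any multiple-testing adjustment; the function names, the toy module names, and the toy annotation are hypothetical and are not taken from the study's data or software.</p>

```python
from math import comb


def enrichment_p_value(universe_size, module_size, signature_size, hits):
    """Upper-tail hypergeometric P-value: the chance that a random signature
    of the same size contains at least `hits` genes from the module."""
    total = comb(universe_size, signature_size)
    most_possible = min(module_size, signature_size)
    tail = sum(
        comb(module_size, k) * comb(universe_size - module_size, signature_size - k)
        for k in range(hits, most_possible + 1)
    )
    return tail / total


def common_modules(enriched_by_signature):
    """Intersect the sets of enriched modules found for each signature."""
    module_sets = [set(modules) for modules in enriched_by_signature.values()]
    return set.intersection(*module_sets)


# Hypothetical toy annotation, not the paper's data.
toy_enriched = {
    "signature_A": {"estrogen_signaling", "cell_cycle", "wound_healing"},
    "signature_B": {"estrogen_signaling", "cell_cycle"},
    "signature_C": {"estrogen_signaling", "insulin_signaling"},
}
shared = common_modules(toy_enriched)  # modules enriched in every signature
```

         <p>In this toy example only one module is enriched in all three signatures, even though the signatures need not share any individual gene. Assessing whether such an overlap is larger than expected by chance would additionally require a null distribution, which this sketch does not attempt to reproduce.</p>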
      </sec>
      <sec>
         <st>
            <p>Results and Discussion</p>
         </st>
         <sec>
            <st>
               <p>Prediction concordance across five breast-cancer gene-signatures</p>
            </st>
            <p>Following the approach in <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>, we cross-tabulated the prediction results of the gene-signatures listed in Table <tblr tid="T1">1</tblr> for the 295 breast cancer patients in the van de Vijver study <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>. In Table <tblr tid="T2">2</tblr>, all the signatures consistently classified the basal-like and HER2+/ER- subtype tumors as having a high risk of recurrence. The 70-gene profile and wound-response signatures both classify the luminal B subtype as a high-risk group, while the meta-signature classifies the luminal A and the normal-like subtypes as low-risk groups. Overall, the signatures showed a certain degree of prediction concordance. The kappa coefficient measuring the classification agreement across the signatures is estimated to be 0.67.</p>
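            <p>As an illustration of how a kappa coefficient corrects raw agreement for chance, the following minimal sketch computes Cohen's kappa for two classifiers. This is an assumption made for illustration: the text does not state which kappa variant was used to summarize agreement across several signatures, and the function name and toy labels below are hypothetical.</p>

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: observed agreement between two classifiers,
    corrected for the agreement expected from their label frequencies."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    # Chance agreement: probability both pick the same label independently.
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical toy calls for four tumors, not the study's data.
calls_one = ["poor", "poor", "good", "good"]
calls_two = ["poor", "poor", "good", "poor"]
kappa = cohen_kappa(calls_one, calls_two)  # 0.75 observed vs 0.5 by chance
```

            <p>Here the two classifiers agree on three of four tumors, but half of that agreement is expected by chance, so kappa is lower than the raw agreement rate.</p>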
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>Breast cancer gene-signatures.</p>
               </caption>
               <tblbdy cols="4">
                  <r>
                     <c ca="left">
                        <p>
                           <b>Gene-signature</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Number of genes</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Number of samples</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Experiment summary</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="4">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>1. 70-gene profile <abbrgrp><abbr bid="B1">1</abbr></abbrgrp></p>
                     </c>
                     <c ca="center">
                        <p>70</p>
                     </c>
                     <c ca="center">
                        <p>78</p>
                     </c>
                     <c ca="left">
                        <p>Inkjet oligonucleotide array on 25,000 genes</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>2. Wound-response <abbrgrp><abbr bid="B4">4</abbr></abbrgrp></p>
                     </c>
                     <c ca="center">
                        <p>512</p>
                     </c>
                     <c ca="center">
                        <p>50</p>
                     </c>
                     <c ca="left">
                        <p>cDNA microarrays profiled over 36,000 genes</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>3. Intrinsic subtype <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr></abbrgrp></p>
                     </c>
                     <c ca="center">
                        <p>427</p>
                     </c>
                     <c ca="center">
                        <p>78</p>
                     </c>
                     <c ca="left">
                        <p>cDNA microarrays on a core set of 8,102 genes</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>4. meta-90 <abbrgrp><abbr bid="B11">11</abbr></abbrgrp></p>
                     </c>
                     <c ca="center">
                        <p>90</p>
                     </c>
                     <c ca="center">
                        <p>305</p>
                     </c>
                     <c ca="left">
                        <p>Integrative analysis of 4 microarray studies on a set of 2,555 genes</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="3" ca="center">
                        <p>ER+ signature</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>5. Recurrence score <abbrgrp><abbr bid="B14">14</abbr></abbrgrp></p>
                     </c>
                     <c ca="center">
                        <p>21</p>
                     </c>
                     <c ca="center">
                        <p>2892</p>
                     </c>
                     <c ca="left">
                        <p>RT-PCR on 250 genes selected from the literature</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>6. Wang ER+ profile <abbrgrp><abbr bid="B18">18</abbr></abbrgrp></p>
                     </c>
                     <c ca="center">
                        <p>60</p>
                     </c>
                     <c ca="center">
                        <p>80</p>
                     </c>
                     <c ca="left">
                        <p>Affymetrix GeneChips on 22,000 transcripts</p>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
            <tbl id="T2">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p>Classification concordance of the breast cancer gene signatures (Kappa coefficient = 0.67)</p>
               </caption>
               <tblbdy cols="8">
                  <r>
                     <c ca="center">
                        <p>
                           <b>Intrinsic Subtype</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>No. of Patients</b>
                        </p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>
                           <b>70-Gene Profile</b>
                        </p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>
                           <b>Wound Response</b>
                        </p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>
                           <b>Meta90</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="8">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p/>
                     </c>
                     <c ca="center">
                        <p>Classification</p>
                     </c>
                     <c ca="center">
                        <p>No. of Patients</p>
                     </c>
                     <c ca="center">
                        <p>Classification</p>
                     </c>
                     <c ca="center">
                        <p>No. of Patients</p>
                     </c>
                     <c ca="center">
                        <p>Classification</p>
                     </c>
                     <c ca="center">
                        <p>No. of Patients</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="8">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Basal-like</p>
                     </c>
                     <c ca="center">
                        <p>36</p>
                     </c>
                     <c ca="center">
                        <p>Good</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>Quiescent</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>Low</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>Poor</p>
                     </c>
                     <c ca="center">
                        <p>36</p>
                     </c>
                     <c ca="center">
                        <p>Activated</p>
                     </c>
                     <c ca="center">
                        <p>36</p>
                     </c>
                     <c ca="center">
                        <p>High</p>
                     </c>
                     <c ca="center">
                        <p>36</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Luminal A</p>
                     </c>
                     <c ca="center">
                        <p>91</p>
                     </c>
                     <c ca="center">
                        <p>Good</p>
                     </c>
                     <c ca="center">
                        <p>69</p>
                     </c>
                     <c ca="center">
                        <p>Quiescent</p>
                     </c>
                     <c ca="center">
                        <p>34</p>
                     </c>
                     <c ca="center">
                        <p>Low</p>
                     </c>
                     <c ca="center">
                        <p>89</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>Poor</p>
                     </c>
                     <c ca="center">
                        <p>22</p>
                     </c>
                     <c ca="center">
                        <p>Activated</p>
                     </c>
                     <c ca="center">
                        <p>57</p>
                     </c>
                     <c ca="center">
                        <p>High</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Luminal B</p>
                     </c>
                     <c ca="center">
                        <p>41</p>
                     </c>
                     <c ca="center">
                        <p>Good</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>Quiescent</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>Low</p>
                     </c>
                     <c ca="center">
                        <p>16</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>Poor</p>
                     </c>
                     <c ca="center">
                        <p>36</p>
                     </c>
                     <c ca="center">
                        <p>Activated</p>
                     </c>
                     <c ca="center">
                        <p>40</p>
                     </c>
                     <c ca="center">
                        <p>High</p>
                     </c>
                     <c ca="center">
                        <p>25</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>HER2+ and ER-</p>
                     </c>
                     <c ca="center">
                        <p>28</p>
                     </c>
                     <c ca="center">
                        <p>Good</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>Quiescent</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>Low</p>
                     </c>
                     <c ca="center">
                        <p>8</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>Poor</p>
                     </c>
                     <c ca="center">
                        <p>25</p>
                     </c>
                     <c ca="center">
                        <p>Activated</p>
                     </c>
                     <c ca="center">
                        <p>28</p>
                     </c>
                     <c ca="center">
                        <p>High</p>
                     </c>
                     <c ca="center">
                        <p>20</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Normal-like</p>
                     </c>
                     <c ca="center">
                        <p>23</p>
                     </c>
                     <c ca="center">
                        <p>Good</p>
                     </c>
                     <c ca="center">
                        <p>12</p>
                     </c>
                     <c ca="center">
                        <p>Quiescent</p>
                     </c>
                     <c ca="center">
                        <p>12</p>
                     </c>
                     <c ca="center">
                        <p>Low</p>
                     </c>
                     <c ca="center">
                        <p>22</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>Poor</p>
                     </c>
                     <c ca="center">
                        <p>11</p>
                     </c>
                     <c ca="center">
                        <p>Activated</p>
                     </c>
                     <c ca="center">
                        <p>11</p>
                     </c>
                     <c ca="center">
                        <p>High</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
         </sec>
         <sec>
            <st>
               <p>Common "oncogenic" sets underpinning breast cancer outcome prediction</p>
            </st>
            <p>For pairs of the six signatures, there is a fair number of overlapping literature-defined concepts (MCMs), and many of the overlaps are highly significant (Figure <figr fid="F1">1A</figr>). For example, a set of 142 enriched MCM modules is shared between the 70-gene profile and the wound-response signature (<it>P </it>&lt; 0.00001), while only two genes were identified by both. Furthermore, signatures 1&#8211;3 showed marginal significance in metabolic and signaling pathway overlaps (Figure <figr fid="F1">1B</figr>).</p>
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>Pair-wise functional overlap of the six breast cancer gene-signatures</p>
               </caption>
               <text>
                  <p><b>Pair-wise functional overlap of the six breast cancer gene-signatures</b>. 1. 70-gene profile; 2. Wound response; 3. Intrinsic subtype; 4. Meta90; 5. Recurrence score; 6. Wang ER+ profile. A. The number of overlapping literature-defined oncogenic concepts (MCM) and the corresponding P-value heatmap indicating the significance of the overlap under a bootstrapped null distribution. B. The number of overlapping pathway sets (MSigDB) and the corresponding P-value heatmap.</p>
               </text>
               <graphic file="1755-8794-1-28-1"/>
            </fig>
            <p>We found 18 common MCMs (<it>P </it>= 0.0002) and 5 common metabolic and signaling pathways (<it>P </it>= 0.04) across signatures 1&#8211;4. Tables <tblr tid="T3">3</tblr> and <tblr tid="T4">4</tblr> list these common sets ordered by the overall significance of enrichment (summarized hypergeometric test P-value adjusted for multiple testing). Among the top sets are deregulated genes in androgen-sensitive prostate cancer cell lines in response to MSA (MCM 258), Myb-regulated transcriptional changes in the estrogen-dependent human breast cancer cell line MCF7 (MCM 458), and several MCMs comprising genes responsive to antiestrogen hormonal treatment (MCM 691, 379, 375, 673). Clearly, a dominant common characteristic underpinning the four breast-cancer signatures is closely related to the estrogen-receptor status of the tumor, which is a main prognostic factor in breast cancer. Another common prognostic set of interest is response to BRCA1 expression (MCM 513), which many studies have shown to be a characteristic of sporadic basal-like cancers.</p>
            <tbl id="T3">
               <title>
                  <p>Table 3</p>
               </title>
               <caption>
                  <p>Eighteen common literature-defined oncogenic concepts (MCM) across the four breast cancer gene signatures (significance of overlap, P = 0.0002)</p>
               </caption>
               <tblbdy cols="7">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>
                           <b>70-gene profile</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>Wound Response</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>Intrinsic Subtype</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>Meta-90</b>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>Common GeneSet</b>
                        </p>
                     </c>
                     <c cspan="4" ca="center">
                        <p>
                           <b>No. of mapped genes (Enrichment p-value**)</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>MCM size</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Description</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>MCM258</p>
                     </c>
                     <c ca="left">
                        <p>7 (3e-04)</p>
                     </c>
                     <c ca="left">
                        <p>25 (2e-04)</p>
                     </c>
                     <c ca="left">
                        <p>19 (0.21)</p>
                     </c>
                     <c ca="left">
                        <p>6 (0.09)</p>
                     </c>
                     <c ca="left">
                        <p>350</p>
                     </c>
                     <c ca="left">
                        <p>Downregulated genes in prostate cancer cells in response to MSA (full list)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>MCM458</p>
                     </c>
                     <c ca="left">
                        <p>2 (0.16)</p>
                     </c>
                     <c ca="left">
                        <p>24 (1e-04)</p>
                     </c>
                     <c ca="left">
                        <p>34 (0.005)</p>
                     </c>
                     <c ca="left">
                        <p>6 (0.24)</p>
                     </c>
                     <c ca="left">
                        <p>322</p>
                     </c>
                     <c ca="left">
                        <p>Differentially expressed genes in MCF7 cells expressing Myb</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>MCM396</p>
                     </c>
                     <c ca="left">
                        <p>10 (0.13)</p>
                     </c>
                     <c ca="left">
                        <p>89 (2e-04)</p>
                     </c>
                     <c ca="left">
                        <p>117 (0.005)</p>
                     </c>
                     <c ca="left">
                        <p>20 (0.21)</p>
                     </c>
                     <c ca="left">
                        <p>2265</p>
                     </c>
                     <c ca="left">
                        <p>Upregulated genes in U937 cells expressing the PLZF/RAR fusion protein</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>MCM691</p>
                     </c>
                     <c ca="left">
                        <p>1 (0.1)</p>
                     </c>
                     <c ca="left">
                        <p>6 (0.06)</p>
                     </c>
                     <c ca="left">
                        <p>19 (6e-04)</p>
                     </c>
                     <c ca="left">
                        <p>3 (0.16)</p>
                     </c>
                     <c ca="left">
                        <p>101</p>
                     </c>
                     <c ca="left">
                        <p>Up-regulated genes in untreated or permanently tamoxifen-treated MaCa 3366/TAM compared with MaCa 3366</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>MCM513</p>
                     </c>
                     <c ca="left">
                        <p>2 (0.22)</p>
                     </c>
                     <c ca="left">
                        <p>24 (6e-04)</p>
                     </c>
                     <c ca="left">
                        <p>29 (0.13)</p>
                     </c>
                     <c ca="left">
                        <p>12 (0.04)</p>
                     </c>
                     <c ca="left">
                        <p>375</p>
                     </c>
                     <c ca="left">
                        <p>Differentially expressed genes in EcR-293 cells in response to BRCA1 expression</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>MCM277</p>
                     </c>
                     <c ca="left">
                        <p>1 (0.01)</p>
                     </c>
                     <c ca="left">
                        <p>3 (0.02)</p>
                     </c>
                     <c ca="left">
                        <p>5 (0.05)</p>
                     </c>
                     <c ca="left">
                        <p>1 (0.16)</p>
                     </c>
                     <c ca="left">
                        <p>22</p>
                     </c>
                     <c ca="left">
                        <p>Upregulated genes in NCCIT cells in response to Wnt-3A</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>MCM30</p>
                     </c>
                     <c ca="left">
                        <p>1 (0.01)</p>
                     </c>
                     <c ca="left">
                        <p>4 (0.004)</p>
                     </c>
                     <c ca="left">
                        <p>3 (0.25)</p>
                     </c>
                     <c ca="left">
                        <p>1 (0.15)</p>
                     </c>
                     <c ca="left">
                        <p>24</p>
                     </c>
                     <c ca="left">
                        <p>Upregulated genes in colorectal cancer cells</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>MCM673</p>
                     </c>
                     <c ca="left">
                        <p>2 (0.01)</p>
                     </c>
                     <c ca="left">
                        <p>7 (0.008)</p>
                     </c>
                     <c ca="left">
                        <p>7 (0.28)</p>
                     </c>
                     <c ca="left">
                        <p>3 (0.15)</p>
                     </c>
                     <c ca="left">
                        <p>79</p>
                     </c>
                     <c ca="left">
                        <p>Androgen</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>MCM6209872</p>
                     </c>
                     <c ca="left">
                        <p>2 (0.003)</p>
                     </c>
                     <c ca="left">
                        <p>2 (0.13)</p>
                     </c>
                     <c ca="left">
                        <p>5 (0.09)</p>
                     </c>
                     <c ca="left">
                        <p>1 (0.16)</p>
                     </c>
                     <c ca="left">
                        <p>34</p>
                     </c>
                     <c ca="left">
                        <p>Skin</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>MCM349</p>
                     </c>
                     <c ca="left">
                        <p>1 (0.01)</p>
                     </c>
                     <c ca="left">
                        <p>2 (0.05)</p>
                     </c>
                     <c ca="left">
                        <p>2 (0.25)</p>
                     </c>
                     <c ca="left">
                        <p>2 (0.04)</p>
                     </c>
                     <c ca="left">
                        <p>23</p>
                     </c>
                     <c ca="left">
                        <p>Downregulated genes in hSNF5/INI1-deficient malignant rhabdoid tumor cell line upon hSNF5/INI1 expression</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>MCM12</p>
                     </c>
                     <c ca="left">
                        <p>2 (0.008)</p>
                     </c>
                     <c ca="left">
                        <p>2 (0.25)</p>
                     </c>
                     <c ca="left">
                        <p>7 (0.09)</p>
                     </c>
                     <c ca="left">
                        <p>2 (0.14)</p>
                     </c>
                     <c ca="left">
                        <p>56</p>
                     </c>
                     <c ca="left">
                        <p>Angiogenic and non-angiogenic tumours signature</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>MCM363</p>
                     </c>
                     <c ca="left">
                        <p>5 (0.13)</p>
                     </c>
                     <c ca="left">
                        <p>37 (0.007)</p>
                     </c>
                     <c ca="left">
                        <p>46 (0.28)</p>
                     </c>
                     <c ca="left">
                        <p>14 (0.15)</p>
                     </c>
                     <c ca="left">
                        <p>808</p>
                     </c>
                     <c ca="left">
                        <p>Upregulated genes in monocytes in response to IL-10 stimulation for 1 and 4 hours</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>MCM574</p>
                     </c>
                     <c ca="left">
                        <p>4 (0.04)</p>
                     </c>
                     <c ca="left">
                        <p>22 (0.03)</p>
                     </c>
                     <c ca="left">
                        <p>23 (0.27)</p>
                     </c>
                     <c ca="left">
                        <p>6 (0.13)</p>
                     </c>
                     <c ca="left">
                        <p>497</p>
                     </c>
                     <c ca="left">
                        <p>Upregulated genes in advanced papillary serous tumor specimens</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>MCM683</p>
                     </c>
                     <c ca="left">
                        <p>1 (0.1)</p>
                     </c>
                     <c ca="left">
                        <p>6 (0.06)</p>
                     </c>
                     <c ca="left">
                        <p>11 (0.06)</p>
                     </c>
                     <c ca="left">
                        <p>2 (0.17)</p>
                     </c>
                     <c ca="left">
                        <p>111</p>
                     </c>
                     <c ca="left">
                        <p>Downregulated genes with respect to 3,5-diaryl-1,2,4-oxadiazole (MX-126374)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>MCM379</p>
                     </c>
                     <c ca="left">
                        <p>1 (0.13)</p>
                     </c>
                     <c ca="left">
                        <p>7 (0.07)</p>
                     </c>
                     <c ca="left">
                        <p>12 (0.12)</p>
                     </c>
                     <c ca="left">
                        <p>2 (0.29)</p>
                     </c>
                     <c ca="left">
                        <p>129</p>
                     </c>
                     <c ca="left">
                        <p>Unique genes regulated by tamoxifen, but not estradiol in osteosarcoma cells</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>MCM1067</p>
                     </c>
                     <c ca="left">
                        <p>1 (0.05)</p>
                     </c>
                     <c ca="left">
                        <p>3 (0.15)</p>
                     </c>
                     <c ca="left">
                        <p>5 (0.27)</p>
                     </c>
                     <c ca="left">
                        <p>1 (0.21)</p>
                     </c>
                     <c ca="left">
                        <p>64</p>
                     </c>
                     <c ca="left">
                        <p>Upregulated genes in immortalized epithelial cells in response to Ad5-GFP infection</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>MCM375</p>
                     </c>
                     <c ca="left">
                        <p>1 (0.13)</p>
                     </c>
                     <c ca="left">
                        <p>6 (0.12)</p>
                     </c>
                     <c ca="left">
                        <p>10 (0.14)</p>
                     </c>
                     <c ca="left">
                        <p>2 (0.27)</p>
                     </c>
                     <c ca="left">
                        <p>127</p>
                     </c>
                     <c ca="left">
                        <p>Unique genes regulated by estradiol, but not raloxifene in osteosarcoma cells</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>MCM402</p>
                     </c>
                     <c ca="left">
                        <p>1 (0.11)</p>
                     </c>
                     <c ca="left">
                        <p>4 (0.25)</p>
                     </c>
                     <c ca="left">
                        <p>8 (0.28)</p>
                     </c>
                     <c ca="left">
                        <p>3 (0.14)</p>
                     </c>
                     <c ca="left">
                        <p>116</p>
                     </c>
                     <c ca="left">
                        <p>Downregulated genes in HepG2 T1 treated cells resulting from MIZ depletion</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>**Hypergeometric test enrichment P-value adjusted for multiple testing.</p>
               </tblfn>
            </tbl>
            <tbl id="T4">
               <title>
                  <p>Table 4</p>
               </title>
               <caption>
                  <p>Five common pathway sets (MSigDB) across the four breast cancer gene signatures (significance of overlap, <it>P </it>= 0.04).</p>
               </caption>
               <tblbdy cols="6">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <b>70-gene profile</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Wound Response</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Intrinsic Subtype</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Meta-90</b>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>Common GeneSet</b>
                        </p>
                     </c>
                     <c cspan="4" ca="center">
                        <p>
                           <b>No. of mapped genes (Enrichment p-value**)</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>Description</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>breast cancer estrogen signaling</p>
                     </c>
                     <c ca="center">
                        <p>1 (0.07)</p>
                     </c>
                     <c ca="center">
                        <p>11 (0.001)</p>
                     </c>
                     <c ca="center">
                        <p>11 (0.22)</p>
                     </c>
                     <c ca="center">
                        <p>4 (0.11)</p>
                     </c>
                     <c ca="left">
                        <p>GEArray</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>EMT DOWN</p>
                     </c>
                     <c ca="center">
                        <p>1 (0.02)</p>
                     </c>
                     <c ca="center">
                        <p>2 (0.19)</p>
                     </c>
                     <c ca="center">
                        <p>4 (0.29)</p>
                     </c>
                     <c ca="center">
                        <p>1 (0.21)</p>
                     </c>
                     <c ca="left">
                        <p>Jechlinger et al 2003</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>CR DNA MET AND MOD</p>
                     </c>
                     <c ca="center">
                        <p>1 (0.01)</p>
                     </c>
                     <c ca="center">
                        <p>1 (0.24)</p>
                     </c>
                     <c ca="center">
                        <p>3 (0.24)</p>
                     </c>
                     <c ca="center">
                        <p>1 (0.12)</p>
                     </c>
                     <c ca="left">
                        <p>PNAS 2007</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>SA REG CASCADE OF CYCLIN EXPR</p>
                     </c>
                     <c ca="center">
                        <p>1 (0.006)</p>
                     </c>
                     <c ca="center">
                        <p>1 (0.12)</p>
                     </c>
                     <c ca="center">
                        <p>2 (0.23)</p>
                     </c>
                     <c ca="center">
                        <p>1 (0.07)</p>
                     </c>
                     <c ca="left">
                        <p>SigmaAldrich</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>reckPathway</p>
                     </c>
                     <c ca="center">
                        <p>1 (0.005)</p>
                     </c>
                     <c ca="center">
                        <p>1 (0.09)</p>
                     </c>
                     <c ca="center">
                        <p>1 (0.28)</p>
                     </c>
                     <c ca="center">
                        <p>1 (0.09)</p>
                     </c>
                     <c ca="left">
                        <p>BioCarta</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>**Hypergeometric test enrichment P-value adjusted for multiple testing.</p>
               </tblfn>
            </tbl>
            <p>Table <tblr tid="T4">4</tblr> lists the five common metabolic and signaling pathways identified using the functional subset of the MSigDB annotation data. All of the signatures include genes from a set, customized on a commercial array platform, that represents the breast cancer estrogen signaling pathway [see Additional File <supplr sid="S1">1</supplr>].</p>
            <suppl id="S1">
               <title>
                  <p>Additional file 1</p>
               </title>
               <text>
                  <p>List of module genes involved in A. estrogen signaling and B. response to MSA in androgen-dependent prostate cell lines.</p>
               </text>
               <file name="1755-8794-1-28-S1.doc">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <p>The gene signatures listed in Table <tblr tid="T1">1</tblr> were constructed using different types of endpoints and different supervised learning algorithms. In combining results across the signatures, we assume that there is an underlying tumorigenic mechanism that manifests itself in the endpoints used by the different authors. One such mechanism might be tumor metastasis.</p>
         </sec>
         <sec>
            <st>
               <p>ER-positive relapse signatures</p>
            </st>
            <p>Both ER+ relapse signatures showed evidence of E2F activation, response to interleukin-6 (IL6), and activation of insulin-signaling pathways, some of which have been reported in the literature to be specific to ER+ disease [see Additional Files <supplr sid="S2">2</supplr> and <supplr sid="S3">3</supplr>]. For example, studies have shown that, in estrogen-sensitive breast cancer cell lines, treatment with the widely used antiestrogen tamoxifen inhibits insulin signaling. The degree of such inhibition can reflect the effectiveness of the tamoxifen treatment and thus correlate with a patient's risk of recurrence <abbrgrp><abbr bid="B16">16</abbr><abbr bid="B17">17</abbr></abbrgrp>.</p>
            <suppl id="S2">
               <title>
                  <p>Additional file 2</p>
               </title>
               <text>
                  <p>List of the fifty-two common MCM sets shared between the two ER+ gene-signatures.</p>
               </text>
               <file name="1755-8794-1-28-S2.doc">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <suppl id="S3">
               <title>
                  <p>Additional file 3</p>
               </title>
               <text>
                  <p>List of the five common metabolic and signaling pathway sets (MSigDB) shared between the two ER+ gene-signatures.</p>
               </text>
               <file name="1755-8794-1-28-S3.doc">
                  <p>Click here for file</p>
               </file>
            </suppl>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>Cancer gene-expression signatures derived from microarray experiments are beginning to be tested in clinical trials, although the exact biology that enables these gene signatures to accurately predict tumor metastasis and patient survival remains unclear. Microarray experiments are often limited in power because the number of samples used to derive a panel of prognostic genes is small relative to the large number of features on the array. In addition, sets of biologically related genes are often co-regulated, whereas many feature selection procedures are univariate in nature. As a result, gene signatures developed by different studies typically share very few common components. A recent study showed high prediction concordance of several breast cancer gene signatures despite minimal overlap in gene identity. That finding was the main motivation for investigating common oncogenic themes that may not be apparent at the individual gene level. This study explored this hypothesis by evaluating the functional overlap of the signatures on the basis of annotated gene sets. When the gene signatures are mapped to the deregulated pathway space, two things become clear. First, there is a significant degree of functional overlap in oncogenic and prognostic pathways. Second, many of these common pathways provide plausible explanations of the tumor biology through which these signatures predict patient outcome. Several conclusions can be drawn from this study. First, this work explains why independent signatures that appear to perform equally well at predicting patient prognosis show minimal overlap in gene membership: the genes are different members of pathways and processes that are relevant to prognosis. Thus, the lack of gene overlap found between the various signatures listed in Table <tblr tid="T1">1</tblr> should not be considered problematic. The implication of our study is that most of these signatures will do well in clinical trials, given that they seem to be picking up the same pathway signals. We can thus be assured that the gene lists found by different investigators are consistent, even if they do not contain the same genes.</p>
         <p>Second, the results suggest that interpreting and delineating how diverse cancer gene expression signatures work is more likely to be attainable at the pathway level than at the individual gene level. Moreover, as many studies have already suggested, feature selection methods should be based on biologically related gene sets that are deregulated as a group <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr></abbrgrp>. However, it is not straightforward to construct a prognostic signature based on pathways that are composed of overlapping sets of genes. New statistical methods need to be established in this area. This is beyond the scope of the present study and is currently under investigation.</p>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <p>Table <tblr tid="T1">1</tblr> lists the six BR-signatures that are compared in this study. Fan <it>et al</it>. <abbrgrp><abbr bid="B10">10</abbr></abbrgrp> showed high prediction concordance of signatures 1&#8211;3 and signature 5. In addition, a 90-gene meta-signature <abbrgrp><abbr bid="B13">13</abbr></abbrgrp> is included. This signature was derived in a meta-analysis framework by integrating four microarray data sets, which included the van't Veer data set and the Sorlie data set. Another signature included here is the 60-gene subset of the profile from Wang <it>et al</it>. <abbrgrp><abbr bid="B18">18</abbr></abbrgrp> that was derived in tumors with estrogen receptor (ER) positive status. The recurrence-score signature <abbrgrp><abbr bid="B14">14</abbr></abbrgrp> is also an ER+ disease signature. It has been shown in a clinical trial to identify patients who have a very low risk of recurrence on hormone therapy with tamoxifen alone and who therefore do not require adjuvant chemotherapy.</p>
         <sec>
            <st>
               <p>Annotation</p>
            </st>
            <p>The set of signature genes was annotated using two different annotation sources:</p>
            <sec>
               <st>
                  <p>Literature-defined module</p>
               </st>
               <p>A collection of 661 literature-defined modules was obtained from the Molecular Concept (MCM) database, which focuses on human cancer studies. These modules include gene sets from peer-reviewed publications that used microarrays to study gene expression changes under experimental perturbations such as drug treatment or candidate gene activation.</p>
            </sec>
            <sec>
               <st>
                  <p>Pathway module</p>
               </st>
               <p>The functional subsets of the Molecular Signatures Database (MSigDB) from GSEA were used, including modules representing metabolic and signaling pathways imported from online pathway databases such as BioCarta <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>, the signalling pathway database <abbrgrp><abbr bid="B20">20</abbr></abbrgrp> and the Kyoto Encyclopedia of Genes and Genomes (KEGG) <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>.</p>
               <p>Enrichment analysis was performed using hypergeometric tests. In particular, the procedure tests whether the proportion of module genes (e.g., estrogen pathway genes) in the signature is significantly greater than the "population" proportion of module genes in the experimental set from which the signature was selected. P-values were adjusted for multiple testing using the Benjamini-Hochberg procedure <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>.</p>
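               <p>The enrichment test described above can be illustrated with a minimal Python sketch. It assumes a one-sided (upper-tail) hypergeometric test and the standard Benjamini-Hochberg step-up adjustment. The function names and all counts in the example are hypothetical and are not taken from the study; the sketch is not the implementation used to produce the reported results.</p>
               <p>
```python
from math import comb  # requires Python 3.8 or later


def hypergeom_upper_tail(k, n, m, N):
    """Probability of seeing at least k module genes in a signature.

    k: module genes observed in the signature
    n: signature size
    m: module genes present in the experimental set
    N: size of the experimental set the signature was selected from
    """
    top = min(n, m)
    total = comb(N, n)
    hits = sum(comb(m, x) * comb(N - m, n - x) for x in range(k, top + 1))
    return hits / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    count = len(pvalues)
    order = sorted(range(count), key=lambda idx: pvalues[idx])
    adjusted = [0.0] * count
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the
    # adjusted values stay monotone in the raw p-value ranking.
    for rank in range(count, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * count / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical example: three modules tested against one signature of
# 70 genes drawn from an experimental set of 5000 genes.
# Each tuple is (module genes in signature, module genes in the set).
modules = [(7, 100), (2, 120), (1, 40)]
raw = [hypergeom_upper_tail(k, 70, m, 5000) for k, m in modules]
print(benjamini_hochberg(raw))
```
               </p>
               <p>Exact binomial coefficients are used here to keep the sketch free of external dependencies; for large gene sets a statistical library would be the more practical choice.</p>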
            </sec>
         </sec>
         <sec>
            <st>
               <p>Notation and methods</p>
            </st>
            <p>For a set of <it>K </it>gene-signatures, let <it>n<sub><it>i </it></sub></it>be the number of genes in signature <it>i </it>and <it>N<sub><it>i </it></sub></it>be the total number of genes in the experimental set from which the signature genes were selected. Furthermore, let <it>J </it>= 661 or 552 denote the number of literature-defined concepts and the number of metabolic and signaling pathways in the two annotation databases, MCM and MSigDB, respectively. For a gene signature, we first perform a module enrichment analysis using a hypergeometric test. As mentioned earlier, the basic idea is to test whether the proportion of the module genes in the signature of size <it>n<sub><it>i </it></sub></it>is significantly larger than the "population" fraction of the module genes in the experimental set of size <it>N<sub><it>i</it></sub></it>. The <it>j</it>th module is considered enriched in the <it>i</it>th signature if the hypergeometric test p-value is less than 0.3. Across the <it>K </it>signatures under comparison, this threshold corresponds to a p-value of less than 0.05 under a conventional meta-analysis that combines the hypergeometric p-values as <inline-formula><m:math name="1755-8794-1-28-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:mo>&#8722;</m:mo><m:mn>2</m:mn><m:mstyle displaystyle="true"><m:munderover><m:mo>&#8721;</m:mo><m:mrow><m:mi>i</m:mi><m:mo>=</m:mo><m:mn>1</m:mn></m:mrow><m:mi>K</m:mi></m:munderover><m:mrow><m:mi>log</m:mi><m:mo>&#8289;</m:mo><m:msub><m:mi>P</m:mi><m:mi>i</m:mi></m:msub></m:mrow></m:mstyle></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaeyOeI0IaeGOmaiZaaabCaeaacyGGSbaBcqGGVbWBcqGGNbWzcqWGqbaudaWgaaWcbaGaemyAaKgabeaaaeaacqWGPbqAcqGH9aqpcqaIXaqmaeaacqWGlbWsa0GaeyyeIuoaaaa@3B29@</m:annotation></m:semantics></m:math></inline-formula> across the four signatures based on a chi-square distribution with 2<it>K </it>degrees of freedom. Let <it>X<sub><it>ij </it></sub></it>be the indicator variable where <it>X</it><sub><it>ij </it></sub>= 1 if the <it>j</it>th module is enriched in the <it>i</it>th (<it>i </it>= 1, ..., <it>K</it>) signature and <it>X</it><sub><it>ij </it></sub>= 0 otherwise. As a result,</p>
            <p>
               <display-formula>
                  <m:math name="1755-8794-1-28-i2" xmlns:m="http://www.w3.org/1998/Math/MathML">
                     <m:semantics>
                        <m:mrow>
                           <m:msub>
                              <m:mi>m</m:mi>
                              <m:mi>i</m:mi>
                           </m:msub>
                           <m:mo>=</m:mo>
                           <m:mstyle displaystyle="true">
                              <m:mrow>
                                 <m:munderover>
                                    <m:mo>&#8721;</m:mo>
                                    <m:mrow>
                                       <m:mi>j</m:mi>
                                       <m:mo>=</m:mo>
                                       <m:mn>1</m:mn>
                                    </m:mrow>
                                    <m:mi>J</m:mi>
                                 </m:munderover>
                                 <m:mrow>
                                    <m:msub>
                                       <m:mi>X</m:mi>
                                       <m:mrow>
                                          <m:mi>i</m:mi>
                                          <m:mi>j</m:mi>
                                       </m:mrow>
                                    </m:msub>
                                 </m:mrow>
                              </m:mrow>
                           </m:mstyle>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemyBa02aaSbaaSqaaiabdMgaPbqabaGccqGH9aqpdaWdXbqaaiabdIfaynaaBaaaleaacqWGPbqAcqWGQbGAaeqaaaqaaiabdQgaQjabg2da9iabigdaXaqaaiabdQeakbqdcqGHris5aaaa@3AE1@</m:annotation>
                     </m:semantics>
                  </m:math>
               </display-formula>
            </p>
            <p>is the total number of enriched modules in signature <it>i</it>. Then for the set of <it>K </it>signatures, the amount of functional overlap is <inline-formula><m:math name="1755-8794-1-28-i3" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:mi>Y</m:mi><m:mo>=</m:mo><m:mstyle displaystyle="true"><m:munderover><m:mo>&#8721;</m:mo><m:mrow><m:mi>j</m:mi><m:mo>=</m:mo><m:mn>1</m:mn></m:mrow><m:mi>J</m:mi></m:munderover><m:mrow><m:mstyle displaystyle="true"><m:munderover><m:mo>&#8719;</m:mo><m:mrow><m:mi>i</m:mi><m:mo>=</m:mo><m:mn>1</m:mn></m:mrow><m:mi>K</m:mi></m:munderover><m:mrow><m:msub><m:mi>X</m:mi><m:mrow><m:mi>i</m:mi><m:mi>j</m:mi></m:mrow></m:msub></m:mrow></m:mstyle></m:mrow></m:mstyle></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemywaKLaeyypa0ZaaabCaeaadaqeWbqaaiabdIfaynaaBaaaleaacqWGPbqAcqWGQbGAaeqaaaqaaiabdMgaPjabg2da9iabigdaXaqaaiabdUealbqdcqGHpis1aaWcbaGaemOAaOMaeyypa0JaeGymaedabaGaemOsaOeaniabggHiLdaaaa@3F7B@</m:annotation></m:semantics></m:math></inline-formula>.</p>
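            <p>The notation above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the p-values are held in a hypothetical <it>K</it>-by-<it>J </it>nested list, all function names are illustrative, the toy p-values are made up, and the chi-square tail uses the closed form that holds for an even number (2<it>K</it>) of degrees of freedom.</p>

```python
from math import exp, log, prod

ENRICHMENT_THRESHOLD = 0.3  # per-signature cutoff stated in the text


def fisher_combined_pvalue(pvalues):
    """Combine K p-values with the statistic -2 * sum(log P_i) and return
    (statistic, upper-tail probability under chi-square with 2K df)."""
    K = len(pvalues)
    statistic = -2.0 * sum(log(p) for p in pvalues)
    half = statistic / 2.0
    # For 2K degrees of freedom the chi-square survival function equals
    # exp(-x/2) * sum_{k=0}^{K-1} (x/2)^k / k!
    term = 1.0
    series = 1.0
    for k in range(1, K):
        term *= half / k
        series += term
    return statistic, exp(-half) * series


def enrichment_indicators(pvalue_matrix, threshold=ENRICHMENT_THRESHOLD):
    """X[i][j] = 1 if module j is enriched in signature i, else 0."""
    return [[1 if threshold > p else 0 for p in row] for row in pvalue_matrix]


def enriched_module_counts(X):
    """m_i: number of enriched modules in signature i (row sums of X)."""
    return [sum(row) for row in X]


def functional_overlap(X):
    """Y: sum over modules j of the product over signatures i of X[i][j],
    i.e. the number of modules enriched in all K signatures."""
    return sum(prod(column) for column in zip(*X))


# Toy example (made-up p-values): K = 3 signatures, J = 4 modules.
toy_pvalues = [
    [0.01, 0.50, 0.20, 0.90],
    [0.10, 0.25, 0.29, 0.70],
    [0.05, 0.60, 0.02, 0.31],
]
X = enrichment_indicators(toy_pvalues)
m = enriched_module_counts(X)
Y = functional_overlap(X)
# Combined evidence for the first module across the three signatures.
stat, combined_p = fisher_combined_pvalue([row[0] for row in toy_pvalues])
print(X, m, Y, stat, combined_p)
```

            <p>Because each <it>X</it><sub><it>ij </it></sub>is 0 or 1, the product over signatures is 1 only when a module is enriched in every signature, so <it>Y </it>simply counts the modules shared by all <it>K </it>signatures.</p>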
            <p>The significance of overlap is defined as <it>P </it>(<it>Y </it>> <it>y</it><sup><it>obs</it></sup>) under a bootstrapped null distribution. The bootstrap procedure is described elsewhere [see Additional File <supplr sid="S4">4</supplr>].</p>
            <suppl id="S4">
               <title>
                  <p>Additional file 4</p>
               </title>
               <text>
                  <p>Description of algorithm used to test for significance of overlap of datasets.</p>
               </text>
               <file name="1755-8794-1-28-S4.rtf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <p>We used <it>B </it>= 100,000 in the procedure. The bootstrapped null distribution of <it>Y </it>preserves 1) potential correlation of the signature size <it>n<sub><it>i </it></sub></it>and the number of enriched modules <it>m<sub><it>i</it></sub></it>, and 2) the module-module dependence due to the one-to-many mapping of a gene to the annotation data.</p>
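            <p>The exact algorithm is given only in Additional file 4 and is not reproduced here. Purely as a hedged sketch of how a resampled null distribution for <it>Y </it>might be built, the Python code below assumes that each null replicate draws a random gene set of size <it>n<sub><it>i </it></sub></it>from each experimental set and re-runs the enrichment step against the real gene-to-module mapping. This resampling scheme, the function names, and the toy data are assumptions made for illustration and may differ from the authors' procedure.</p>

```python
import random
from math import comb


def hypergeom_upper_tail(k, n, M, N):
    """P(X >= k) for n draws without replacement from N genes, M in module."""
    tail = sum(comb(M, x) * comb(N - M, n - x) for x in range(k, min(n, M) + 1))
    return tail / comb(N, n)


def overlap_statistic(gene_sets, universes, modules, threshold=0.3):
    """Y: number of modules whose enrichment p-value is below the
    threshold in every gene set."""
    y = 0
    for module in modules:
        enriched_everywhere = True
        for genes, universe in zip(gene_sets, universes):
            M = len(module.intersection(universe))
            k = len(module.intersection(genes))
            p = hypergeom_upper_tail(k, len(genes), M, len(universe))
            if p >= threshold:
                enriched_everywhere = False
                break
        y += enriched_everywhere
    return y


def resampled_overlap_pvalue(signatures, universes, modules, B=1000, seed=0):
    """Estimate P(Y > y_obs) from B null replicates (assumed scheme).
    Drawing sets of the observed sizes keeps the link between n_i and m_i,
    and reusing the real modules keeps module-module dependence."""
    rng = random.Random(seed)
    y_obs = overlap_statistic(signatures, universes, modules)
    pools = [sorted(universe) for universe in universes]
    exceed = 0
    for _ in range(B):
        null_sets = [set(rng.sample(pool, len(sig)))
                     for pool, sig in zip(pools, signatures)]
        if overlap_statistic(null_sets, universes, modules) > y_obs:
            exceed += 1
    return y_obs, exceed / B


# Toy example (made-up data): two signatures from the same 20-gene set.
toy_universes = [set(range(20)), set(range(20))]
toy_modules = [set(range(0, 5)), set(range(5, 10))]
toy_signatures = [{0, 1, 2, 3}, {0, 1, 2, 4}]
# The text reports B = 100,000; a small B keeps this toy run fast.
toy_y_obs, toy_p = resampled_overlap_pvalue(
    toy_signatures, toy_universes, toy_modules, B=200, seed=1)
print(toy_y_obs, toy_p)
```

            <p>The strict inequality follows the definition <it>P </it>(<it>Y </it>> <it>y</it><sup><it>obs</it></sup>) given above; fixing the seed makes the toy run reproducible.</p>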
         </sec>
      </sec>
      <sec>
         <st>
            <p>Competing interests</p>
         </st>
         <p>The authors declare that they have no competing interests.</p>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>DG and RS conceived the method and prepared the manuscript. RS performed the analyses. AC contributed to the discussion. All authors have read and approved the final manuscript.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>We would like to thank D. Rhodes and S. Kalyana-Sundaram for providing the MCM data set. RS, DG, and AMC participated in the conception and design of the study. RS performed the analysis and drafted the manuscript. DG and AMC reviewed the manuscript. RS is supported in part by NCI 2 P30 CA008748-43; DG is supported in part by NIH grant GM72007 and the Huck Institute for Life Sciences; AMC is supported by a Clinical Translational Science Award from the Burroughs Wellcome Foundation.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Gene expression profiling predicts clinical outcome of breast cancer</p>
            </title>
            <aug>
               <au>
                  <snm>van't Veer</snm>
                  <fnm>LJ</fnm>
               </au>
               <au>
                  <snm>Dai</snm>
                  <fnm>HY</fnm>
               </au>
               <au>
                  <snm>Vijver</snm>
                  <mnm>van de</mnm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>He</snm>
                  <fnm>YDD</fnm>
               </au>
               <au>
                  <snm>Hart</snm>
                  <fnm>AAM</fnm>
               </au>
               <au>
                  <snm>Mao</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Peterse</snm>
                  <fnm>HL</fnm>
               </au>
               <au>
                  <snm>Kooy</snm>
                  <mnm>van der</mnm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Marton</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Witteveen</snm>
                  <fnm>AT</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nature</source>
            <pubdate>2002</pubdate>
            <volume>415</volume>
            <issue>6871</issue>
            <fpage>530</fpage>
            <lpage>536</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/415530a</pubid>
                  <pubid idtype="pmpid" link="fulltext">11823860</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Molecular portraits of human breast tumours</p>
            </title>
            <aug>
               <au>
                  <snm>Perou</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Sorlie</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Eisen</snm>
                  <fnm>MB</fnm>
               </au>
               <au>
                  <snm>Rijn</snm>
                  <mnm>van de</mnm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Jeffrey</snm>
                  <fnm>SS</fnm>
               </au>
               <au>
                  <snm>Rees</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Pollack</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Ross</snm>
                  <fnm>DT</fnm>
               </au>
               <au>
                  <snm>Johnsen</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Akslen</snm>
                  <fnm>LA</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nature</source>
            <pubdate>2000</pubdate>
            <volume>406</volume>
            <issue>6797</issue>
            <fpage>747</fpage>
            <lpage>752</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/35021093</pubid>
                  <pubid idtype="pmpid" link="fulltext">10963602</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications</p>
            </title>
            <aug>
               <au>
                  <snm>Sorlie</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Perou</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Tibshirani</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Aas</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Geisler</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Johnsen</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Hastie</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Eisen</snm>
                  <fnm>MB</fnm>
               </au>
               <au>
                  <snm>Rijn</snm>
                  <mnm>van de</mnm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Jeffrey</snm>
                  <fnm>SS</fnm>
               </au>
               <etal/>
            </aug>
            <source>Proceedings of the National Academy of Sciences of the United States of America</source>
            <pubdate>2001</pubdate>
            <volume>98</volume>
            <issue>19</issue>
            <fpage>10869</fpage>
            <lpage>10874</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">58566</pubid>
                  <pubid idtype="pmpid" link="fulltext">11553815</pubid>
                  <pubid idtype="doi">10.1073/pnas.191367098</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Gene expression signature of fibroblast serum response predicts human cancer progression: Similarities between tumors and wounds</p>
            </title>
            <aug>
               <au>
                  <snm>Chang</snm>
                  <fnm>HY</fnm>
               </au>
               <au>
                  <snm>Sneddon</snm>
                  <fnm>JB</fnm>
               </au>
               <au>
                  <snm>Alizadeh</snm>
                  <fnm>AA</fnm>
               </au>
               <au>
                  <snm>Sood</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>West</snm>
                  <fnm>RB</fnm>
               </au>
               <au>
                  <snm>Montgomery</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Chi</snm>
                  <fnm>JT</fnm>
               </au>
               <au>
                  <snm>Rijn</snm>
                  <mnm>van de</mnm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Botstein</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>PO</fnm>
               </au>
            </aug>
            <source>PLoS Biology</source>
            <pubdate>2004</pubdate>
            <volume>2</volume>
            <issue>2</issue>
            <fpage>206</fpage>
            <lpage>214</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1371/journal.pbio.0020007</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Outcome signature genes in breast cancer: is there a unique set?</p>
            </title>
            <aug>
               <au>
                  <snm>Ein-Dor</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Kela</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Getz</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Givol</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Domany</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2005</pubdate>
            <volume>21</volume>
            <issue>2</issue>
            <fpage>171</fpage>
            <lpage>178</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/bth469</pubid>
                  <pubid idtype="pmpid" link="fulltext">15308542</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Database of mRNA gene expression profiles of multiple human organs</p>
            </title>
            <aug>
               <au>
                  <snm>Son</snm>
                  <fnm>CG</fnm>
               </au>
               <au>
                  <snm>Bilke</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Davis</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Greer</snm>
                  <fnm>BT</fnm>
               </au>
               <au>
                  <snm>Wei</snm>
                  <fnm>JS</fnm>
               </au>
               <au>
                  <snm>Whiteford</snm>
                  <fnm>CC</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>QR</fnm>
               </au>
               <au>
                  <snm>Cenacchi</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Khan</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Genome Research</source>
            <pubdate>2005</pubdate>
            <volume>15</volume>
            <issue>3</issue>
            <fpage>443</fpage>
            <lpage>450</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">551571</pubid>
                  <pubid idtype="pmpid" link="fulltext">15741514</pubid>
                  <pubid idtype="doi">10.1101/gr.3124505</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Thousands of samples are needed to generate a robust gene list for predicting outcome in cancer</p>
            </title>
            <aug>
               <au>
                  <snm>Ein-Dor</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Zuk</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Domany</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>Proceedings of the National Academy of Sciences of the United States of America</source>
            <pubdate>2006</pubdate>
            <volume>103</volume>
            <issue>15</issue>
            <fpage>5923</fpage>
            <lpage>5928</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1458674</pubid>
                  <pubid idtype="pmpid" link="fulltext">16585533</pubid>
                  <pubid idtype="doi">10.1073/pnas.0601231103</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles</p>
            </title>
            <aug>
               <au>
                  <snm>Subramanian</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Tamayo</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Mootha</snm>
                  <fnm>VK</fnm>
               </au>
               <au>
                  <snm>Mukherjee</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Ebert</snm>
                  <fnm>BL</fnm>
               </au>
               <au>
                  <snm>Gillette</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Paulovich</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Pomeroy</snm>
                  <fnm>SL</fnm>
               </au>
               <au>
                  <snm>Golub</snm>
                  <fnm>TR</fnm>
               </au>
               <au>
                  <snm>Lander</snm>
                  <fnm>ES</fnm>
               </au>
               <etal/>
            </aug>
            <source>Proceedings of the National Academy of Sciences of the United States of America</source>
            <pubdate>2005</pubdate>
            <volume>102</volume>
            <issue>43</issue>
            <fpage>15545</fpage>
            <lpage>15550</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1239896</pubid>
                  <pubid idtype="pmpid" link="fulltext">16199517</pubid>
                  <pubid idtype="doi">10.1073/pnas.0506580102</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Significance analysis of functional categories in gene expression studies: a structured permutation approach</p>
            </title>
            <aug>
               <au>
                  <snm>Barry</snm>
                  <fnm>WT</fnm>
               </au>
               <au>
                  <snm>Nobel</snm>
                  <fnm>AB</fnm>
               </au>
               <au>
                  <snm>Wright</snm>
                  <fnm>FA</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2005</pubdate>
            <volume>21</volume>
            <issue>9</issue>
            <fpage>1943</fpage>
            <lpage>1949</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/bti260</pubid>
                  <pubid idtype="pmpid" link="fulltext">15647293</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>A global test for groups of genes: testing association with a clinical outcome</p>
            </title>
            <aug>
               <au>
                  <snm>Goeman</snm>
                  <fnm>JJ</fnm>
               </au>
               <au>
                  <snm>Geer</snm>
                  <mnm>van de</mnm>
                  <fnm>SA</fnm>
               </au>
               <au>
                  <snm>de Kort</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>van Houwelingen</snm>
                  <fnm>HC</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2004</pubdate>
            <volume>20</volume>
            <issue>1</issue>
            <fpage>93</fpage>
            <lpage>99</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btg382</pubid>
                  <pubid idtype="pmpid" link="fulltext">14693814</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Molecular concepts analysis links tumors, pathways, mechanisms, and drugs</p>
            </title>
            <aug>
               <au>
                  <snm>Rhodes</snm>
                  <fnm>DR</fnm>
               </au>
               <au>
                  <snm>Kalyana-Sundaram</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Tomlins</snm>
                  <fnm>SA</fnm>
               </au>
               <au>
                  <snm>Mahavisno</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Kasper</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Varambally</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Barrette</snm>
                  <fnm>TR</fnm>
               </au>
               <au>
                  <snm>Ghosh</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Varambally</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Chinnaiyan</snm>
                  <fnm>AM</fnm>
               </au>
            </aug>
            <source>Neoplasia</source>
            <pubdate>2007</pubdate>
            <volume>9</volume>
            <issue>5</issue>
            <fpage>443</fpage>
            <lpage>454</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1877973</pubid>
                  <pubid idtype="pmpid" link="fulltext">17534450</pubid>
                  <pubid idtype="doi">10.1593/neo.07292</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Concordance among gene-expression-based predictors for breast cancer</p>
            </title>
            <aug>
               <au>
                  <snm>Fan</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Oh</snm>
                  <fnm>DS</fnm>
               </au>
               <au>
                  <snm>Wessels</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Weigelt</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Nuyten</snm>
                  <fnm>DSA</fnm>
               </au>
               <au>
                  <snm>Nobel</snm>
                  <fnm>AB</fnm>
               </au>
               <au>
                  <snm>van't Veer</snm>
                  <fnm>LJ</fnm>
               </au>
               <au>
                  <snm>Perou</snm>
                  <fnm>CM</fnm>
               </au>
            </aug>
            <source>New England Journal of Medicine</source>
            <pubdate>2006</pubdate>
            <volume>355</volume>
            <issue>6</issue>
            <fpage>560</fpage>
            <lpage>569</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1056/NEJMoa052933</pubid>
                  <pubid idtype="pmpid" link="fulltext">16899776</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Prognostic meta-signature of breast cancer developed by two-stage mixture modeling of microarray data</p>
            </title>
            <aug>
               <au>
                  <snm>Shen</snm>
                  <fnm>RL</fnm>
               </au>
               <au>
                  <snm>Ghosh</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Chinnaiyan</snm>
                  <fnm>AM</fnm>
               </au>
            </aug>
            <source>BMC Genomics</source>
            <pubdate>2004</pubdate>
            <volume>5</volume>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">544889</pubid>
                  <pubid idtype="pmpid" link="fulltext">15598354</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>A multigene assay to predict recurrence of tamoxifen-treated, node-negative breast cancer</p>
            </title>
            <aug>
               <au>
                  <snm>Paik</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Shak</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Tang</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Kim</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Baker</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Cronin</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Baehner</snm>
                  <fnm>FL</fnm>
               </au>
               <au>
                  <snm>Walker</snm>
                  <fnm>MG</fnm>
               </au>
               <au>
                  <snm>Watson</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Park</snm>
                  <fnm>T</fnm>
               </au>
               <etal/>
            </aug>
            <source>New England Journal of Medicine</source>
            <pubdate>2004</pubdate>
            <volume>351</volume>
            <issue>27</issue>
            <fpage>2817</fpage>
            <lpage>2826</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1056/NEJMoa041588</pubid>
                  <pubid idtype="pmpid" link="fulltext">15591335</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>A gene-expression signature as a predictor of survival in breast cancer</p>
            </title>
            <aug>
               <au>
                  <snm>Vijver</snm>
                  <mnm>van de</mnm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>He</snm>
                  <fnm>YD</fnm>
               </au>
               <au>
                  <snm>van 't Veer</snm>
                  <fnm>LJ</fnm>
               </au>
               <au>
                  <snm>Dai</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Hart</snm>
                  <fnm>AAM</fnm>
               </au>
               <au>
                  <snm>Voskuil</snm>
                  <fnm>DW</fnm>
               </au>
               <au>
                  <snm>Schreiber</snm>
                  <fnm>GJ</fnm>
               </au>
               <au>
                  <snm>Peterse</snm>
                  <fnm>JL</fnm>
               </au>
               <au>
                  <snm>Roberts</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Marton</snm>
                  <fnm>MJ</fnm>
               </au>
               <etal/>
            </aug>
            <source>New England Journal of Medicine</source>
            <pubdate>2002</pubdate>
            <volume>347</volume>
            <issue>25</issue>
            <fpage>1999</fpage>
            <lpage>2009</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1056/NEJMoa021967</pubid>
                  <pubid idtype="pmpid" link="fulltext">12490681</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Tamoxifen interferes with the insulin-like growth factor I receptor (IGF-IR) signaling pathway in breast cancer cells</p>
            </title>
            <aug>
               <au>
                  <snm>Guvakova</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Surmacz</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>Cancer Research</source>
            <pubdate>1997</pubdate>
            <volume>57</volume>
            <issue>13</issue>
            <fpage>2606</fpage>
            <lpage>2610</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">9205064</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Tamoxifen resistance in breast tumors is driven by growth factor receptor signaling with repression of classic estrogen receptor genomic function</p>
            </title>
            <aug>
               <au>
                  <snm>Massarweh</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Osborne</snm>
                  <fnm>CK</fnm>
               </au>
               <au>
                  <snm>Creighton</snm>
                  <fnm>CJ</fnm>
               </au>
               <au>
                  <snm>Qin</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Tsimelzon</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Huang</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Weiss</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Rimawi</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Schiff</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Cancer Research</source>
            <pubdate>2008</pubdate>
            <volume>68</volume>
            <issue>3</issue>
            <fpage>826</fpage>
            <lpage>833</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1158/0008-5472.CAN-07-2707</pubid>
                  <pubid idtype="pmpid" link="fulltext">18245484</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Gene expression profiles and molecular markers to predict distant metastasis of early stage breast cancers</p>
            </title>
            <aug>
               <au>
                  <snm>Wang</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Atkins</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Yang</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Jatkoe</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Talantov</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Sieuwerts</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Timmermans</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Berns</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Klijn</snm>
                  <fnm>J</fnm>
               </au>
               <etal/>
            </aug>
            <source>Breast Cancer Research and Treatment</source>
            <pubdate>2003</pubdate>
            <volume>82</volume>
            <fpage>S120</fpage>
            <lpage>S120</lpage>
         </bibl>
         <bibl id="B19">
            <title>
               <p>BioCarta</p>
            </title>
            <url>http://www.biocarta.com</url>
         </bibl>
         <bibl id="B20">
            <title>
               <p>Signalling pathway database</p>
            </title>
            <url>http://www.grt.kyushu-u.ac.jp/spad</url>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Kyoto Encyclopedia of Genes and Genomes</p>
            </title>
            <url>http://www.genome.jp/kegg/</url>
         </bibl>
         <bibl id="B22">
            <title>
               <p>Controlling the False Discovery Rate &#8211; a Practical and Powerful Approach to Multiple Testing</p>
            </title>
            <aug>
               <au>
                  <snm>Benjamini</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Hochberg</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>Journal of the Royal Statistical Society, Series B (Methodological)</source>
            <pubdate>1995</pubdate>
            <volume>57</volume>
            <issue>1</issue>
            <fpage>289</fpage>
            <lpage>300</lpage>
         </bibl>
      </refgrp>
      <sec>
         <st>
            <p>Pre-publication history</p>
         </st>
         <p>The pre-publication history for this paper can be accessed here:</p>
         <p>
            <url>http://www.biomedcentral.com/1755-8794/1/28/prepub</url>
         </p>
      </sec>
   </bm>
</art>
