<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2105-10-S1-S54</ui>
   <ji>1471-2105</ji>
   <fm>
      <dochead>Research</dochead>
      <bibl>
         <title>
            <p>Semi-supervised gene shaving method for predicting low variation biological pathways from genome-wide data</p>
         </title>
         <aug>
            <au ca="yes" id="A1">
               <snm>Zhu</snm>
               <fnm>Dongxiao</fnm>
               <insr iid="I1"/>
               <insr iid="I2"/>
               <email>dzhu@cs.uno.edu</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Department of Computer Science, University of New Orleans, New Orleans, LA 70148, USA</p>
            </ins>
            <ins id="I2">
               <p>Research Institute for Children, Children's Hospital, New Orleans, LA 70118, USA</p>
            </ins>
         </insg>
         <source>BMC Bioinformatics</source>
         <supplement>
            <title>
               <p>Selected papers from the Seventh Asia-Pacific Bioinformatics Conference (APBC 2009)</p>
            </title>
            <editor>Michael Q Zhang, Michael S Waterman and Xuegong Zhang</editor>
            <note>Research</note>
         </supplement>
         <conference>
            <title>
               <p>The Seventh Asia Pacific Bioinformatics Conference (APBC 2009)</p>
            </title>
            <location>Beijing, China</location>
            <date-range>13&#8211;16 January 2009</date-range>
            <url>http://bioinfo.au.tsinghua.edu.cn/apbc2009/</url>
         </conference>
         <issn>1471-2105</issn>
         <pubdate>2009</pubdate>
         <volume>10</volume>
         <issue>Suppl 1</issue>
         <fpage>S54</fpage>
         <url>http://www.biomedcentral.com/1471-2105/10/S1/S54</url>
         <xrefbib>
            
         <pubidlist><pubid idtype="pmpid">19208157</pubid><pubid idtype="doi">10.1186/1471-2105-10-S1-S54</pubid></pubidlist></xrefbib>
      </bibl>
      <history>
         <pub>
            <date>
               <day>30</day>
               <month>1</month>
               <year>2009</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2009</year>
         <collab>Zhu; licensee BioMed Central Ltd.</collab>
         <note>This is an open access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>The gene shaving algorithm and many other clustering algorithms identify gene clusters showing high variation across samples. However, gene expression in many signaling pathways shows only modest and concordant changes that these methods fail to identify. The increasingly available prior knowledge of signaling pathways provides a new opportunity to solve this problem.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>We propose an innovative semi-supervised gene clustering algorithm, in which the original gene shaving algorithm is extended and generalized so that prior knowledge of signaling pathways can be incorporated. Unlike other methods, our method identifies gene clusters showing concerted and modest expression variation as well as strong expression correlation. Using available pathway gene sets as prior knowledge, whether complete or incomplete, our algorithm is capable of forming tightly regulated gene clusters showing modest variation across samples. We demonstrate the advantages of our algorithm over the original gene shaving algorithm using two microarray data sets. The stability of the gene clusters was assessed using a jackknife approach.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>Our algorithm is one of the first clustering algorithms specifically designed to identify signaling pathways with low and concordant gene expression variation. The discriminating power is achieved by constructing a principal component enriched by signaling pathways.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>Gene clustering, which assigns group membership(s) to each gene, is a widespread pattern extraction technique. Genes sharing the same membership are often hypothesized to be regulated by the same defined or undefined genomic influence, such as a cellular pathway. Model-free clustering techniques such as K-means and hierarchical clustering <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr></abbrgrp> are widely used. One limitation of these approaches, as pointed out by many researchers, e.g. <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>, is that each gene can only belong to a single cluster. These types of gene clustering algorithms are thus called mutually exclusive clustering. In the context of cellular pathways, they assume that one gene can only be regulated by one pathway at a time, which is clearly not the case. Model-based clustering or soft clustering <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr></abbrgrp> provides mechanisms to relax this stringent assumption by introducing "probabilistic" or "fuzzy" memberships. However, these "soft" memberships do not biologically account for the fact that one gene is often simultaneously regulated by multiple genomic influences.</p>
         <p>Singular value decomposition (SVD) <abbrgrp><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr></abbrgrp> has shown great promise for deconvolving channels of genomic influence. Assuming that rows of the data matrix correspond to genes and columns correspond to the physiological/genetic conditions under which gene expression abundance was interrogated using gene chips, the SVD factors the data matrix into three matrices. The first matrix, which contains most of the information, is called a gene coefficient matrix, where each column (principal component, PC) defines a preliminary gene cluster that might be regulated by a specific genomic influence. We will describe more details of SVD in the method section. SVD has been repeatedly shown to be able to deconvolve the observed gene expression signal into a composite of multiple overlapping genomic influences, many of which correspond to signaling pathways <abbrgrp><abbr bid="B9">9</abbr><abbr bid="B11">11</abbr></abbrgrp>.</p>
         <p>Thus SVD provides a methodological basis for non-mutually exclusive clustering. The gene clusters generated by SVD are often preliminary because many non-relevant genes might contaminate the PC's that define gene clusters. Hastie et al <abbrgrp><abbr bid="B4">4</abbr></abbrgrp> proposed removing non-relevant genes in an iterative fashion, in which the genes least correlated with the leading PC are treated as non-relevant. The gene shaving algorithm quickly became an important tool in the pattern discovery arsenal. It iteratively searches for clusters of genes showing high variation across the samples and high correlation across the genes <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>. The former is achieved by working with the leading PC and the latter is achieved by iteratively discarding genes that are not relevant to the cluster. There are other types of non-mutually exclusive clustering methods as well, such as the plaid model <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>.</p>
         <p>The underlying assumption of the gene shaving algorithm is that the leading PC, which accounts for the largest portion of variation, is always of exclusive interest to the investigator <abbrgrp><abbr bid="B4">4</abbr><abbr bid="B13">13</abbr></abbrgrp>. Consequently the algorithm iteratively refines the first gene cluster defined by the first PC by shaving off a proportion of genes that are least correlated with the leading PC. The second gene cluster is formed by performing the same procedure on the orthogonal data, resulting from the residuals of regression, and so on. However, this assumption does not always hold. In fact, gene expression in many signaling pathways shows modest but concordant changes. The gene shaving algorithm would most likely fail in these cases because it works exclusively with the leading PC.</p>
         <p>Gene set based methods, such as Gene Set Enrichment Analysis (GSEA), were designed to overcome this limitation. Since its first introduction in 2003 <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>, GSEA has been widely applied to interpret genome-wide expression profiles <abbrgrp><abbr bid="B15">15</abbr><abbr bid="B16">16</abbr></abbrgrp>. However, the approach only ranks pre-compiled gene sets according to their relevance to the data and does not predict any new genes in the gene sets. Therefore, it strictly depends on the availability and validity of <it>a priori </it>defined gene sets. In reality a gene set is not always available in a complete and accurate format. What is typically available is a partial pathway learned from empirical experimental studies.</p>
         <p>We seek a seamless combination of the strengths of the two methodological frameworks. We construct a PC that is most enriched by prior knowledge (the signaling pathway of interest). By performing the analysis iteratively, we are able to identify the gene cluster showing modest but concordant changes. In many cases, we are further interested in finding genes that are concordantly up- or down-regulated by genomic influences. Therefore, it might be beneficial to turn our attention not only to the PC in which the prior knowledge is most enriched, but also to the positive PC and the negative PC separately. This hypothesis is supported by previous work showing that positive and negative PC's can be enriched by completely different biological functions, e.g. <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>.</p>
         <p>In our work, we eliminate non-relevant genes iteratively, following and improving the procedure used in the gene shaving algorithm <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>. In each iteration, a weighted average expression profile is calculated and used as the seed profile to rank genes. Because the heuristic removes non-relevant genes in the early iterations and some relevant genes toward the end, the enrichment of prior knowledge rises sharply and then gradually decreases. We therefore propose a trace-back step to retrieve the gene cluster in which enrichment of prior knowledge is maximized (Figure <figr fid="F1">1</figr>).</p>
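         <p>The shaving and trace-back loop can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the implementation used in this work: the seed profile is an unweighted mean (the weights of the weighted average are not specified in this section), genes are ranked by signed Pearson correlation with the seed, and the names shave_with_traceback, score and min_size are hypothetical. The score argument stands for any enrichment measure of prior knowledge, such as a negative log-transformed <it>P</it>-value.</p>
         <p>
```python
from math import ceil, sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy) if sxx * syy > 0 else 0.0


def shave_with_traceback(expr, score, alpha=0.10, min_size=2):
    """Iteratively shave genes, then trace back to the best-scoring cluster.

    expr  : dict mapping gene name to its expression profile (list of floats)
    score : callable taking a list of gene names, returning an enrichment value
    alpha : proportion of genes shaved off per iteration (10% in Figure 1)
    """
    cluster = list(expr)
    history = [(score(cluster), list(cluster))]
    while len(cluster) > min_size:
        n_samples = len(expr[cluster[0]])
        # Seed profile: unweighted mean of the current cluster (an assumption).
        seed = [sum(expr[g][j] for g in cluster) / len(cluster)
                for j in range(n_samples)]
        # Rank genes from most to least correlated with the seed profile.
        ranked = sorted(cluster, key=lambda g: pearson(expr[g], seed), reverse=True)
        # Shave off alpha of the genes (at least one), never going below min_size.
        n_drop = min(max(1, ceil(alpha * len(ranked))), len(ranked) - min_size)
        cluster = ranked[:len(ranked) - n_drop]
        history.append((score(cluster), list(cluster)))
    # Trace back: the first cluster with the maximal score across all iterations.
    best_score, best_cluster = max(history, key=lambda item: item[0])
    return best_cluster, history
```
         </p>
         <p>The returned history holds one (score, cluster) pair per iteration, so the rise and subsequent decline of enrichment can be inspected directly before the best cluster is chosen.</p>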
         <fig id="F1">
            <title>
               <p>Figure 1</p>
            </title>
            <caption>
               <p>The schematic diagram of the proposed algorithm</p>
            </caption>
            <text>
               <p><b>The schematic diagram of the proposed algorithm</b>. "Enrichment test" means to determine the PC(s) that are most enriched by a prior knowledge gene set. <it>&#945;</it>% is set to 10% following Hastie et al <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>.</p>
            </text>
            <graphic file="1471-2105-10-S1-S54-1"/>
         </fig>
      </sec>
      <sec>
         <st>
            <p>Results</p>
         </st>
         <p>We aim to demonstrate that the proposed algorithm is capable of identifying tightly regulated gene sets showing modest and concerted variation using incomplete prior knowledge and real-world microarray data sets. Ground truth, that is, a "complete" gene set of the kind required as a precondition for applying the GSEA algorithm <abbrgrp><abbr bid="B14">14</abbr><abbr bid="B16">16</abbr></abbrgrp>, is desirable for demonstrating the claimed advantages of our algorithm, but it is often not available. Therefore, we use four "high-amplitude" and four "low-amplitude" gene sets identified in <abbrgrp><abbr bid="B17">17</abbr></abbrgrp> as ground truth to evaluate the ability of our algorithm to recover them from subsets of various sizes. The high and low amplitude genes used in this example are well-studied genes in the cell cycle, and many of them are co-regulated by a number of signaling pathways <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr></abbrgrp>. We then use incomplete prior knowledge supplied by our collaborator and apply our algorithm to predict new WNT and NOTCH pathway genes in the somitogenesis process.</p>
         <sec>
            <st>
               <p>Recovering low and high amplitude gene sets using incomplete prior knowledge</p>
            </st>
            <p>As a proof of concept, we first analyzed a cell cycle data set originally reported in <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>. The data set consists of whole yeast genome expression profiles interrogated over two full cell cycles (20 evenly spaced time points) synchronized by elutriation. We considered the same 308 genes as in the paper derived using Fourier transform. In each of the four gene sets, genes were further classified into high-amplitude and low-amplitude groups according to magnitude of variation. The processed data are available from the authors' website at <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>.</p>
            <p>We treated the high-amplitude genes and low-amplitude genes in each gene set as "complete", as assumed in classical GSEA analysis. We sampled subsets of increasing sizes from 5 to complete (e.g. 40) with a step size of 5. In each step of the experiment, we generated 500 subsets of the same size (with replicates), and for each subset we applied our algorithm and measured its ability to recover the full gene set using the hypergeometric test explained in the method section. The <it>P</it>-values of the tests were used as a measure of this ability. For convenience of visualization, the <it>P</it>-values were negative log-transformed, so that a higher value corresponds to better recovery of the complete gene set.</p>
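            <p>The recovery measure can be illustrated with a short Python sketch. It assumes the standard upper-tail hypergeometric test and the base-2 logarithm shown in Figure <figr fid="F2">2</figr>; the exact formulation used in this work is the one given in the method section, and the function names hypergeom_pvalue and recovery_score are hypothetical.</p>
            <p>
```python
from math import comb, log2


def hypergeom_pvalue(population, set_size, cluster_size, overlap):
    """Upper-tail hypergeometric P-value: the chance that a cluster of
    `cluster_size` genes drawn at random from `population` genes contains
    at least `overlap` of the `set_size` genes in the complete gene set."""
    upper = min(set_size, cluster_size)
    tail = sum(
        comb(set_size, k) * comb(population - set_size, cluster_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / comb(population, cluster_size)


def recovery_score(population, set_size, cluster_size, overlap):
    """Negative log2 P-value; a higher value means better recovery."""
    return -log2(hypergeom_pvalue(population, set_size, cluster_size, overlap))
```
            </p>
            <p>For example, recovery_score(308, 40, 50, 30) would score a hypothetical 50-gene cluster that contains 30 genes of a 40-gene complete set among the 308 genes considered; only the total of 308 comes from the data set, and the other numbers are purely illustrative.</p>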
            <p>The high-amplitude and low-amplitude complete gene sets are plotted in Figure <figr fid="F2">2a</figr> (upper panel of Figure <figr fid="F2">2</figr>). In both Figure <figr fid="F2">2b</figr> (lower left panel of Figure <figr fid="F2">2</figr>) and Figure <figr fid="F2">2c</figr> (lower right panel of Figure <figr fid="F2">2</figr>), the ability to recover the complete gene set (ground truth) is plotted against increasing subset size. The observed monotonic increase indicates that the larger the subsets (prior knowledge) are, the better the complete gene set is recovered. It is worth mentioning that Figure <figr fid="F2">2b</figr> demonstrates the capability of our algorithm to recover a low-amplitude gene set, and Figure <figr fid="F2">2c</figr> demonstrates the capability of the gene shaving algorithm <abbrgrp><abbr bid="B4">4</abbr></abbrgrp> to recover a high-amplitude gene set.</p>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>Demonstration of the claimed advantages of our algorithm using the "ground truth" reported in <abbrgrp><abbr bid="B17">17</abbr></abbrgrp></p>
               </caption>
               <text>
                  <p><b>Demonstration of the claimed advantages of our algorithm using the "ground truth" reported in </b><abbrgrp><abbr bid="B17">17</abbr></abbrgrp>. (a) Plots of expression profiles of high-amplitude and low-amplitude gene sets. (b) Evaluating the capability of our algorithm to recover a complete low-amplitude gene set. The gene shaving algorithm <abbrgrp><abbr bid="B4">4</abbr></abbrgrp> fails in this case because it works exclusively with the leading PC. The X-axis represents the increasing sizes of the subsets, and the Y-axis represents the -<it>log2P </it>of the enrichment, indicating increased capacity to recover a complete gene set. (c) Evaluating the capability of the gene shaving algorithm <abbrgrp><abbr bid="B4">4</abbr></abbrgrp> to recover a complete high-amplitude gene set.</p>
               </text>
               <graphic file="1471-2105-10-S1-S54-2"/>
            </fig>
            <p>Our algorithm can be viewed as a generalization of the gene shaving algorithm. The gene shaving algorithm works exclusively with the leading PC. Therefore, it is only capable of identifying high-amplitude signaling pathways. Our algorithm adaptively works with the PC that is most enriched by prior knowledge. Therefore, it is capable of identifying either high-amplitude or low-amplitude signaling pathways wherever prior knowledge is available. Comparing Figure <figr fid="F2">2b</figr> to Figure <figr fid="F2">2c</figr> more closely, it is evident that our algorithm recovers low-amplitude gene sets even better than the gene shaving algorithm recovers high-amplitude ones. This is demonstrated by uniformly larger mean values and overall smaller variance on the vertical axis. The results of analyzing other complete gene sets of appropriate size lead to the same conclusion (see additional file <supplr sid="S1">1</supplr>). The proof-of-concept analysis provides compelling evidence that our algorithm is particularly suitable for identifying sets of tightly regulated genes with modest variation.</p>
            <suppl id="S1">
               <title>
                  <p>Additional file 1</p>
               </title>
               <text>
                  <p>Supplemental figures.</p>
               </text>
               <file name="1471-2105-10-S1-S54-S1.doc">
                  <p>Click here for file</p>
               </file>
            </suppl>
         </sec>
         <sec>
            <st>
               <p>Predicting WNT and NOTCH pathway genes using prior knowledge</p>
            </st>
            <sec>
               <st>
                  <p>Microarray data and prior knowledge</p>
               </st>
               <p>We then re-analyzed microarray data originally reported in Dequeant et al <abbrgrp><abbr bid="B20">20</abbr></abbrgrp> to predict genes in the WNT and NOTCH pathways. In this experiment, genome-wide gene expression was interrogated over 17 developmental stages using the Affymetrix GeneChip 430A. The top 687 genes ranked using the Lomb-Scargle periodogram <abbrgrp><abbr bid="B21">21</abbr></abbrgrp> were used for gene clustering so that all prior knowledge genes are included. Microarray data are available at ArrayExpress at <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>.</p>
               <p>Prior knowledge corresponds to a list of experimentally validated cyclic genes regulated by the segmentation clock, a molecular oscillator acting during somitogenesis <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>. The segmentation clock is a set of periodic processes linked to the formation of the vertebrate embryo segments (somites) that give rise to the segments in the adult body plan of a vertebrate animal. Malfunction of cyclic genes is the direct cause of many developmental diseases, such as Noonan syndrome and truncated tail <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>. Therefore, predicted cyclic genes are potential human disease genes. In particular, we have incomplete sets of 11 genes in the WNT pathway and 9 genes in the NOTCH pathway as our prior knowledge. Our objective is to predict more WNT and NOTCH genes using prior knowledge, microarray data and our proposed algorithm.</p>
            </sec>
            <sec>
               <st>
                  <p>Finding the most enriched PC using prior knowledge</p>
               </st>
               <p>In each iteration of our algorithm, we search for the PC that is most enriched by known WNT and NOTCH genes. We filtered the gene coefficients in each PC using the cutoff and tested enrichment of known pathway genes using the hypergeometric test (see method section). Figure <figr fid="F3">3</figr> shows what happened in the first iteration, where all 11 known WNT genes and all 9 known NOTCH genes are included in the second PC (enrichment level is <it>E </it>- 06). After separating positive and negative PC's, in Figure <figr fid="F4">4</figr>, all known WNT genes are included in the second negative PC and all known NOTCH genes are included in the second positive PC (enrichment level is <it>E </it>- 10). The marked decrease in P-value reveals that separating the positive PC from the negative PC is a key to better enrichment of prior knowledge. The fact that prior knowledge is most enriched in PC's other than the leading one indicates that gene expression in the NOTCH and WNT pathways shows only modest and concordant changes. The enrichment of prior knowledge in the gene cluster can be further improved as our algorithm iterates. In the next section, we present results of generating the "best" WNT and NOTCH clusters in which enrichment of prior knowledge is optimized.</p>
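               <p>A toy Python sketch may help show why splitting a PC sharpens enrichment: the same prior knowledge genes are then tested against a smaller candidate list. The sketch assumes an upper-tail hypergeometric test and a user-supplied cutoff (its value is not stated in this section); the function names split_pc and enrichment_pvalue are hypothetical and this is not the implementation used in this work.</p>
               <p>
```python
from math import comb


def split_pc(coefficients, cutoff):
    """Split one PC into positive and negative gene lists.

    coefficients : dict mapping gene name to its coefficient in the PC
    cutoff       : magnitude a coefficient must exceed to be kept
    """
    positive = [g for g, c in coefficients.items() if c > cutoff]
    negative = [g for g, c in coefficients.items() if -c > cutoff]
    return positive, negative


def enrichment_pvalue(n_genes, prior, selected):
    """Upper-tail hypergeometric P-value for the overlap of `selected` with `prior`.

    n_genes  : total number of genes in the data matrix
    prior    : set of prior knowledge genes (all assumed to be in the matrix)
    selected : genes retained from a PC after filtering
    """
    overlap = len(set(prior).intersection(selected))
    n_sel = len(selected)
    tail = sum(
        comb(len(prior), k) * comb(n_genes - len(prior), n_sel - k)
        for k in range(overlap, min(len(prior), n_sel) + 1)
    )
    return tail / comb(n_genes, n_sel)
```
               </p>
               <p>Calling enrichment_pvalue once on the combined list (positive plus negative genes) and once on the positive list alone gives a smaller <it>P</it>-value for the split list whenever the prior knowledge genes all fall on one side, which is the effect reported in Figures <figr fid="F3">3</figr> and <figr fid="F4">4</figr>.</p>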
               <fig id="F3">
                  <title>
                     <p>Figure 3</p>
                  </title>
                  <caption>
                     <p>SVD analysis without splitting negative and positive PC's</p>
                  </caption>
                  <text>
                     <p><b>SVD analysis without splitting negative and positive PC's</b>. WNT and NOTCH genes are maximally enriched (P-value: E-06) in the second PC (red lines), not the leading PC.</p>
                  </text>
                  <graphic file="1471-2105-10-S1-S54-3"/>
               </fig>
               <fig id="F4">
                  <title>
                     <p>Figure 4</p>
                  </title>
                  <caption>
                     <p>SVD analysis with splitting negative and positive PC's</p>
                  </caption>
                  <text>
                     <p><b>SVD analysis with splitting negative and positive PC's</b>. Further, WNT and NOTCH genes are maximally enriched (P-value: E-10) in the second negative PC and the second positive PC (red lines), and the level of enrichment is dramatically increased because the sizes of negative and positive PC's decrease.</p>
                  </text>
                  <graphic file="1471-2105-10-S1-S54-4"/>
               </fig>
            </sec>
            <sec>
               <st>
                  <p>Comparing our semi-supervised algorithms with the gene shaving algorithm</p>
               </st>
               <p>We aim to show that our semi-supervised algorithm, unlike the gene shaving algorithm, is able to identify low variation signaling pathway genes. For predicting the WNT cluster, our algorithm terminates after 18 iterations, and for predicting the NOTCH cluster, it terminates after 20 iterations. We then traced back to retrieve the optimized clusters. Both WNT and NOTCH clusters were retrieved at the 9th iteration, at which prior knowledge is most enriched, and were the smallest clusters containing all prior knowledge genes (Figure <figr fid="F5">5c</figr>). As Figure <figr fid="F5">5a</figr> shows, the original gene shaving algorithm <abbrgrp><abbr bid="B4">4</abbr></abbrgrp> failed in this case, with no enrichment of prior knowledge at all. The reason is, as discussed before, that WNT and NOTCH pathway genes are concordantly regulated with modest magnitude, while the gene shaving algorithm only works with the leading PC. Figures <figr fid="F5">5b</figr> and <figr fid="F5">5c</figr> present the prior knowledge enrichment achieved by two variants of our semi-supervised algorithm: without and with separating positive PC's from negative PC's, respectively. It is evident that splitting PC's gives rise to better clustering performance.</p>
               <fig id="F5">
                  <title>
                     <p>Figure 5</p>
                  </title>
                  <caption>
                     <p>Algorithm comparisons</p>
                  </caption>
                  <text>
                     <p><b>Algorithm comparisons</b>. The horizontal axis represents the number of iterations in both the upper and lower panels. The vertical axis of the upper panel corresponds to the -<it>log2P</it>-value of the enrichment of prior knowledge. The vertical axis of the lower panel corresponds to the number of genes in the cluster (upper) and size of the cluster (lower). (a) The performance of the original gene shaving algorithm gauged by prior knowledge enrichment over iterations <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>. (b) The performance of our semi-supervised gene shaving algorithm without splitting positive and negative PC's. (c) The performance of our semi-supervised gene shaving algorithm with splitting positive and negative PC's.</p>
                  </text>
                  <graphic file="1471-2105-10-S1-S54-5"/>
               </fig>
               <p>The left panel of Figure <figr fid="F6">6</figr> plots gene expression profiles of the predicted NOTCH cluster, and the right panel displays the annotation of those genes. Genes in the shaded areas are from our prior knowledge <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>; genes indicated by red arrows have been experimentally validated as positive, and genes indicated by blue arrows are potentially relevant according to a literature search. Note that the two pathways are far from well understood, and therefore many predicted genes, although not currently supported by experimental evidence, are likely to be validated later.</p>
               <fig id="F6">
                  <title>
                     <p>Figure 6</p>
                  </title>
                  <caption>
                     <p>The predicted NOTCH cluster</p>
                  </caption>
                  <text>
                     <p><b>The predicted NOTCH cluster</b>. Highlighted genes are prior knowledge. Genes indicated by red arrows correspond to experimentally validated NOTCH genes, and genes indicated by blue arrows correspond to potentially interesting genes according to expert opinion and literature search. The complete lists of prior knowledge genes and predictions are available in supplemental tables.</p>
                  </text>
                  <graphic file="1471-2105-10-S1-S54-6"/>
               </fig>
               <p>To make our predictions useful for improving current understanding of the mechanisms of the WNT and NOTCH pathways in somitogenesis, we performed an analysis to infer which biological functions (defined by Gene Ontology, GO) are most enriched in the pathways, and which transcription factors (inferred through ChIP-chip experiments) are most likely to be involved in regulating the two pathways. Table <tblr tid="T1">1</tblr> presents the results of this enrichment analysis. The analysis was done through the web-server of the Segal lab: <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>. In Table <tblr tid="T1">1</tblr>, the results appear to be meaningful since many significantly enriched GO terms (column 3) are related to embryonic development, and both enriched transcription factors (column 4), MyoG and MyoD, are closely related to cell differentiation <abbrgrp><abbr bid="B24">24</abbr><abbr bid="B25">25</abbr></abbrgrp>. In particular, MyoD and MyoG have distinct regulatory roles at a similar set of target genes. The role of MyoG in mediating terminal differentiation is partially to enhance expression of a subset of genes previously turned on by MyoD <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>.</p>
               <tbl id="T1">
                  <title>
                     <p>Table 1</p>
                  </title>
                  <caption>
                     <p>Biological function enrichment analysis and transcription factor association analysis.<abbrgrp><abbr bid="B23"/></abbrgrp></p>
                  </caption>
                  <tblbdy cols="4">
                     <r>
                        <c ca="center">
                           <p>Gene Set</p>
                        </c>
                        <c ca="center">
                           <p>Size</p>
                        </c>
                        <c ca="center">
                           <p>GO Annotation</p>
                        </c>
                        <c ca="center">
                           <p>Transcription Factors</p>
                        </c>
                     </r>
                     <r>
                        <c cspan="4">
                           <hr/>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>WNT</p>
                        </c>
                        <c ca="center">
                           <p>45</p>
                        </c>
                        <c ca="center">
                           <p>embryonic development (1.13E-04)</p>
                        </c>
                        <c ca="center">
                           <p>MyoG_Myotubes (9.47E-03) <abbrgrp><abbr bid="B24">24</abbr></abbrgrp></p>
                        </c>
                     </r>
                     <r>
                        <c>
                           <p/>
                        </c>
                        <c>
                           <p/>
                        </c>
                        <c ca="center">
                           <p>cytosol (9.15E-06)</p>
                        </c>
                        <c ca="center">
                           <p>MyoD_Growing cells (1.99E-05) <abbrgrp><abbr bid="B24">24</abbr></abbrgrp></p>
                        </c>
                     </r>
                     <r>
                        <c>
                           <p/>
                        </c>
                        <c>
                           <p/>
                        </c>
                        <c ca="center">
                           <p>cytosolic part (4.48E-08)</p>
                        </c>
                        <c>
                           <p/>
                        </c>
                     </r>
                     <r>
                        <c>
                           <p/>
                        </c>
                        <c>
                           <p/>
                        </c>
                        <c ca="center">
                           <p>iron ion binding (3.92E-06)</p>
                        </c>
                        <c>
                           <p/>
                        </c>
                     </r>
                     <r>
                        <c>
                           <p/>
                        </c>
                        <c>
                           <p/>
                        </c>
                        <c ca="center">
                           <p>tube development (3.86E-04)</p>
                        </c>
                        <c>
                           <p/>
                        </c>
                     </r>
                     <r>
                        <c>
                           <p/>
                        </c>
                        <c>
                           <p/>
                        </c>
                        <c ca="center">
                           <p>branching morphogenesis of a tube (9.88E-06)</p>
                        </c>
                        <c>
                           <p/>
                        </c>
                     </r>
                     <r>
                        <c>
                           <p/>
                        </c>
                        <c>
                           <p/>
                        </c>
                        <c ca="center">
                           <p>tube morphogenesis (7.26E-05)</p>
                        </c>
                        <c>
                           <p/>
                        </c>
                     </r>
                     <r>
                        <c>
                           <p/>
                        </c>
                        <c>
                           <p/>
                        </c>
                        <c ca="center">
                           <p>patterning of blood vessels (3.57E-05)</p>
                        </c>
                        <c>
                           <p/>
                        </c>
                     </r>
                     <r>
                        <c>
                           <p/>
                        </c>
                        <c>
                           <p/>
                        </c>
                        <c ca="center">
                           <p>embryonic pattern specification (1.11E-04)</p>
                        </c>
                        <c>
                           <p/>
                        </c>
                     </r>
                     <r>
                        <c>
                           <p/>
                        </c>
                        <c>
                           <p/>
                        </c>
                        <c ca="center">
                           <p>oxygen binding (6.89E-14)</p>
                        </c>
                        <c>
                           <p/>
                        </c>
                     </r>
                     <r>
                        <c>
                           <p/>
                        </c>
                        <c>
                           <p/>
                        </c>
                        <c ca="center">
                           <p>gas transport (4.60E-14)</p>
                        </c>
                        <c>
                           <p/>
                        </c>
                     </r>
                     <r>
                        <c>
                           <p/>
                        </c>
                        <c>
                           <p/>
                        </c>
                        <c ca="center">
                           <p>hemoglobin complex (1.12E-14)</p>
                        </c>
                        <c>
                           <p/>
                        </c>
                     </r>
                     <r>
                        <c cspan="4">
                           <hr/>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>NOTCH</p>
                        </c>
                        <c ca="center">
                           <p>36</p>
                        </c>
                        <c ca="center">
                           <p>developmental maturation (3.86E-04)</p>
                        </c>
                        <c ca="center">
                           <p>MyoG_Myotubes (9.47E-03) <abbrgrp><abbr bid="B24">24</abbr></abbrgrp></p>
                        </c>
                     </r>
                     <r>
                        <c>
                           <p/>
                        </c>
                        <c>
                           <p/>
                        </c>
                        <c ca="center">
                           <p>negative regulation of cell differentiation (3.01E-04)</p>
                        </c>
                        <c ca="center">
                           <p>MyoD_Growing cells (1.99E-05) <abbrgrp><abbr bid="B24">24</abbr></abbrgrp></p>
                        </c>
                     </r>
                     <r>
                        <c>
                           <p/>
                        </c>
                        <c>
                           <p/>
                        </c>
                        <c ca="center">
                           <p>ectoderm development (1.91E-05)</p>
                        </c>
                        <c>
                           <p/>
                        </c>
                     </r>
                     <r>
                        <c>
                           <p/>
                        </c>
                        <c>
                           <p/>
                        </c>
                        <c ca="center">
                           <p>cell maturation (1.94E-04)</p>
                        </c>
                        <c>
                           <p/>
                        </c>
                     </r>
                     <r>
                        <c>
                           <p/>
                        </c>
                        <c>
                           <p/>
                        </c>
                        <c ca="center">
                           <p>tissue morphogenesis (1.12E-05)</p>
                        </c>
                        <c>
                           <p/>
                        </c>
                     </r>
                     <r>
                        <c>
                           <p/>
                        </c>
                        <c>
                           <p/>
                        </c>
                        <c ca="center">
                           <p>epidermis morphogenesis (2.00E-06)</p>
                        </c>
                        <c>
                           <p/>
                        </c>
                     </r>
                     <r>
                        <c>
                           <p/>
                        </c>
                        <c>
                           <p/>
                        </c>
                        <c ca="center">
                           <p>hair cell differentiation (5.26E-06)</p>
                        </c>
                        <c>
                           <p/>
                        </c>
                     </r>
                     <r>
                        <c>
                           <p/>
                        </c>
                        <c>
                           <p/>
                        </c>
                        <c ca="center">
                           <p>mechanoreceptor differentiation (7.56E-06)</p>
                        </c>
                        <c>
                           <p/>
                        </c>
                     </r>
                     <r>
                        <c>
                           <p/>
                        </c>
                        <c>
                           <p/>
                        </c>
                        <c ca="center">
                           <p>negative regulation of neuron differentiation (3.49E-06)</p>
                        </c>
                        <c>
                           <p/>
                        </c>
                     </r>
                     <r>
                        <c>
                           <p/>
                        </c>
                        <c>
                           <p/>
                        </c>
                        <c ca="center">
                           <p>regulation of neuron differentiation (3.93E-05)</p>
                        </c>
                        <c>
                           <p/>
                        </c>
                     </r>
                     <r>
                        <c>
                           <p/>
                        </c>
                        <c>
                           <p/>
                        </c>
                        <c ca="center">
                           <p>cell fate determination (9.65E-06)</p>
                        </c>
                        <c>
                           <p/>
                        </c>
                     </r>
                     <r>
                        <c>
                           <p/>
                        </c>
                        <c>
                           <p/>
                        </c>
                        <c ca="center">
                           <p>auditory receptor cell fate commitment (3.78E-08)</p>
                        </c>
                        <c>
                           <p/>
                        </c>
                     </r>
                  </tblbdy>
                  <tblfn>
                     <p>The third column contains biological functions significantly enriched in the WNT and NOTCH pathways, and the fourth column contains transcription factors significantly associated with the WNT and NOTCH pathways. The analysis was performed using the web server of the Segal lab <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>.</p>
                  </tblfn>
               </tbl>
            </sec>
            <sec>
               <st>
                  <p>Stability of clusters against perturbation of prior knowledge</p>
               </st>
               <p>Our approach predicts new pathway genes based on the available prior knowledge; it is therefore critical to investigate how sensitive the predictions are to a modest perturbation of that prior knowledge. Because the ground truth for this data set is not known, as it was in the cell cycle data analysis, we performed a sensitivity analysis using leave-one-out and leave-two-out jackknife approaches (see the Methods section for technical details). A narrower jackknife confidence interval of the enrichment indicates better stability of the enrichment estimate against perturbation of prior knowledge. In Figure <figr fid="F7">7a</figr>, where the leave-one-out approach was applied, the enrichment estimate is perfectly stable (zero variance) and increases until the ninth iteration. Recall that we traced back and retrieved the "best" NOTCH gene cluster at the ninth iteration. This indicates that our cluster analysis is very robust against moderate perturbation of prior knowledge. Figure <figr fid="F7">7b</figr>, where the leave-two-out approach was used, follows a similar trend but with better stability (a narrower confidence interval). This is because a larger number of jackknife samples is available in the leave-two-out approach.</p>
               <fig id="F7">
                  <title>
                     <p>Figure 7</p>
                  </title>
                  <caption>
                     <p>Leave-one-out and Leave-two-out Jackknife estimations and confidence intervals of the enrichment</p>
                  </caption>
                  <text>
                     <p><b>Leave-one-out and Leave-two-out Jackknife estimations and confidence intervals of the enrichment</b>. (a) Assessing cluster sensitivity to perturbation of prior knowledge using the leave-one-out approach. (b) Assessing cluster sensitivity to perturbation of prior knowledge using the leave-two-out approach.</p>
                  </text>
                  <graphic file="1471-2105-10-S1-S54-7"/>
               </fig>
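               <p>As an illustration only, the delete-<it>d</it> jackknife behind this kind of sensitivity analysis can be sketched in Python as follows. The function names, the toy gene identifiers and the overlap-fraction statistic are assumptions made for this sketch; they are not the enrichment statistic or the implementation used in this work, and the cluster is held fixed here rather than re-estimated for each jackknife sample.</p>
               <p>
```python
from itertools import combinations
from math import comb, sqrt


def delete_d_jackknife(prior_genes, statistic, d=1):
    """Delete-d jackknife mean and standard error of a statistic.

    prior_genes: prior-knowledge gene identifiers (hypothetical input).
    statistic:   callable mapping a list of kept genes to a number.
    d:           number of genes left out per sample (1 or 2 in the text).
    """
    genes = list(prior_genes)
    n = len(genes)
    estimates = []
    # Recompute the statistic once for every subset of d left-out genes.
    for left_out in combinations(genes, d):
        kept = [g for g in genes if g not in left_out]
        estimates.append(statistic(kept))
    mean = sum(estimates) / len(estimates)
    sum_sq = sum((e - mean) ** 2 for e in estimates)
    # Standard delete-d jackknife variance; reduces to (n - 1) / n for d = 1.
    std_err = sqrt((n - d) / (d * comb(n, d)) * sum_sq)
    return mean, std_err


# Toy data (assumed): a fixed cluster and four prior-knowledge genes.
cluster = {"g1", "g2", "g3", "g7"}
prior = ["g1", "g2", "g3", "g4"]


def overlap_fraction(kept):
    """Stand-in statistic: fraction of kept prior genes found in the cluster."""
    return sum(g in cluster for g in kept) / len(kept)


mean1, se1 = delete_d_jackknife(prior, overlap_fraction, d=1)
mean2, se2 = delete_d_jackknife(prior, overlap_fraction, d=2)
```
               </p>
               <p>A confidence interval can then be formed around the mean from the standard error. In a fuller analysis the statistic would rerun the whole cluster search on the reduced prior knowledge, which is why it is passed in as a function here.</p>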
            </sec>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Discussion</p>
         </st>
         <p>With the exception of a few recent works <abbrgrp><abbr bid="B26">26</abbr><abbr bid="B27">27</abbr><abbr bid="B28">28</abbr></abbrgrp>, most current clustering algorithms are unsupervised in the sense that prior knowledge is not used to guide the learning process. Instead, prior knowledge is often used in the post-learning phase, where researchers predict the functions of unknown genes from genes of known function lying in the same cluster. The traditional gene shaving method focuses on the leading PC, which accounts for most of the variation in the data. On the one hand, it is useful for discovering high variation pathway genes <abbrgrp><abbr bid="B4">4</abbr><abbr bid="B29">29</abbr></abbrgrp>; on the other hand, it tends to overlook essential pathway genes that have modest expression variation. We hypothesized that the highly concerted expression behavior of these genes, albeit modest in variation, may allow their pattern to be extracted from noisy microarray data using appropriate analysis techniques, i.e., SVD.</p>
         <p>The main contribution of this work is an optimization algorithm that combines the strengths of gene set based analysis and iterative gene selection. Together, the iterative scheme inspired by the gene shaving algorithm and the use of prior knowledge allow the desired gene cluster to be distilled and enable us to discover gene clusters with modest and concerted expression change. The PC's that define gene clusters group series of tightly regulated genes ranked by variance over samples. The orthogonality specified in the SVD analysis indicates that gene clusters of different variation are regulated by orthogonal genomic influences, whether defined or undefined (Table 1 of <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>).</p>
         <p>Our method is particularly suitable for identifying gene clusters with modest and concerted expression change, so it is not limited to identifying periodically expressed gene clusters. When no prior knowledge is available, the optimization can instead maximize the enrichment of a Gene Ontology (GO) term of interest, for example, somitogenesis [GO:0001756]. The technique for testing the enrichment of a GO term is very similar to the one used here; see also the review in <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>. A recursive dendrogram can be constructed as a foundation for generating overlapping gene clusters, from which the optimal clusters can be identified and retrieved according to the enrichment of the GO term(s) of interest <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>.</p>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>Our algorithm is one of the first clustering algorithms specifically designed to identify signaling pathways with low and concordant gene expression variation. The discriminating power is achieved by constructing a principal component enriched by the prior knowledge.</p>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <sec>
            <st>
               <p>Singular Value Decomposition</p>
            </st>
            <p>Assume the gene expression data are arranged in a matrix <it>X</it><sub><it>p </it>&#215; <it>n</it></sub>, where rows (<it>p</it>) correspond to genes and columns (<it>n</it>) correspond to the conditions under which gene expression abundance was interrogated. The singular value decomposition (SVD) of the rectangular matrix <it>X </it>can be expressed as follows:</p>
            <p>
               <display-formula id="M1">
                  <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-10-S1-S54-i1">
                     <m:semantics>
                        <m:mrow>
                           <m:msub>
                              <m:mi>X</m:mi>
                              <m:mrow>
                                 <m:mi>p</m:mi>
                                 <m:mo>&#215;</m:mo>
                                 <m:mi>n</m:mi>
                              </m:mrow>
                           </m:msub>
                           <m:mo>=</m:mo>
                           <m:msub>
                              <m:mi>U</m:mi>
                              <m:mrow>
                                 <m:mi>p</m:mi>
                                 <m:mo>&#215;</m:mo>
                                 <m:mi>n</m:mi>
                              </m:mrow>
                           </m:msub>
                           <m:msub>
                              <m:mi>S</m:mi>
                              <m:mrow>
                                 <m:mi>n</m:mi>
                                 <m:mo>&#215;</m:mo>
                                 <m:mi>n</m:mi>
                              </m:mrow>
                           </m:msub>
                           <m:msubsup>
                              <m:mi>V</m:mi>
                              <m:mrow>
                                 <m:mi>n</m:mi>
                                 <m:mo>&#215;</m:mo>
                                 <m:mi>n</m:mi>
                              </m:mrow>
                              <m:mi>T</m:mi>
                           </m:msubsup>
                           <m:mo>,</m:mo>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemiwaG1aaSbaaSqaaiabdchaWjabgEna0kabd6gaUbqabaGccqGH9aqpcqWGvbqvdaWgaaWcbaGaemiCaaNaey41aqRaemOBa4gabeaakiabdofatnaaBaaaleaacqWGUbGBcqGHxdaTcqWGUbGBaeqaaOGaemOvay1aa0baaSqaaiabd6gaUjabgEna0kabd6gaUbqaaiabdsfaubaakiabcYcaSaaa@486F@</m:annotation>
                     </m:semantics>
                  </m:math>
               </display-formula>
            </p>
            <p>where <it>U</it><sub><it>p </it>&#215; <it>n </it></sub>is the gene coefficient matrix, and <it>U</it><sub><it>ij </it></sub>is the contribution of the <it>i</it><sub><it>th </it></sub>gene, <it>i </it>= 1, ..., <it>p</it>, to the <it>j</it><sub><it>th </it></sub>PC, <it>j </it>= 1, ..., <it>n</it>. If we associate each column <it>U</it><sub><it>j </it></sub>with a genomic influence <it>j</it>, then <it>U</it><sub><it>ij </it></sub>defines how strongly gene <it>i </it>is regulated by genomic influence <it>j</it>. <it>S</it><sub><it>n </it>&#215; <it>n </it></sub>is the singular value matrix, whose diagonal contains the singular values; the magnitude of each singular value corresponds to the percentage of variation explained by the corresponding PC. <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-10-S1-S54-i2"><m:semantics><m:mrow><m:msubsup><m:mi>V</m:mi><m:mrow><m:mi>n</m:mi><m:mo>&#215;</m:mo><m:mi>n</m:mi></m:mrow><m:mi>T</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemOvay1aa0baaSqaaiabd6gaUjabgEna0kabd6gaUbqaaiabdsfaubaaaaa@3349@</m:annotation></m:semantics></m:math></inline-formula> stores PC's <abbrgrp><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr></abbrgrp>. We then separated positive PC's from negative PC's according to the signs of entries in <it>U</it><sub><it>p </it>&#215; <it>n</it></sub>, i.e.,</p>
            <p>
               <display-formula id="M2">
                  <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-10-S1-S54-i3">
                     <m:semantics>
                        <m:mrow>
                           <m:msub>
                              <m:mi>X</m:mi>
                              <m:mrow>
                                 <m:mi>p</m:mi>
                                 <m:mo>&#215;</m:mo>
                                 <m:mi>n</m:mi>
                              </m:mrow>
                           </m:msub>
                           <m:mo>=</m:mo>
                           <m:msubsup>
                              <m:mi>X</m:mi>
                              <m:mrow>
                                 <m:mi>p</m:mi>
                                 <m:mo>&#215;</m:mo>
                                 <m:mi>n</m:mi>
                              </m:mrow>
                              <m:mo>+</m:mo>
                           </m:msubsup>
                           <m:mo>+</m:mo>
                           <m:msubsup>
                              <m:mi>X</m:mi>
                              <m:mrow>
                                 <m:mi>p</m:mi>
                                 <m:mo>&#215;</m:mo>
                                 <m:mi>n</m:mi>
                              </m:mrow>
                              <m:mo>&#8722;</m:mo>
                           </m:msubsup>
                           <m:mo>.</m:mo>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemiwaG1aaSbaaSqaaiabdchaWjabgEna0kabd6gaUbqabaGccqGH9aqpcqWGybawdaqhaaWcbaGaemiCaaNaey41aqRaemOBa4gabaGaey4kaScaaOGaey4kaSIaemiwaG1aa0baaSqaaiabdchaWjabgEna0kabd6gaUbqaaiabgkHiTaaakiabc6caUaaa@43BC@</m:annotation>
                     </m:semantics>
                  </m:math>
               </display-formula>
            </p>
            <p>Refer to supplemental figure <figr fid="F1">1</figr> for a schematic illustration of the procedure. As shown in the data analysis examples, the separation operation is key to enhancing the level of prior knowledge enrichment and to differentiating between the antiphased WNT and NOTCH clusters.</p>
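            <p>A minimal NumPy sketch of the decomposition and the sign separation is given below. It assumes one particular reading of the separation, namely that the positive and negative entries of <it>U</it> are split into two matrices that are each multiplied back by <it>S</it> and the transpose of <it>V</it>; the function name and the random toy matrix are hypothetical and are not taken from the original implementation.</p>
            <p>
```python
import numpy as np


def svd_sign_split(X):
    """Thin SVD of a genes-by-conditions matrix and a sign split of U.

    Assumed reading of the separation step: U = U_pos + U_neg, where U_pos
    keeps the positive entries of U and U_neg keeps the remaining ones, so
    that X = X_pos + X_neg holds exactly.
    """
    # full_matrices=False gives U (p x n), s (n,), Vt (n x n) when p >= n.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    U_pos = np.where(U > 0, U, 0.0)   # positive gene coefficients only
    U_neg = U - U_pos                 # negative gene coefficients only
    X_pos = U_pos @ np.diag(s) @ Vt
    X_neg = U_neg @ np.diag(s) @ Vt
    return U, s, Vt, X_pos, X_neg


# Toy data (assumed): 50 genes measured under 6 conditions.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 6))
U, s, Vt, X_pos, X_neg = svd_sign_split(X)
```
            </p>
            <p>Because the split is applied entry by entry to <it>U</it>, the two parts always add back to the original matrix, whatever sign convention the SVD routine returns for each PC.</p>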
         </sec>
         <sec>
            <st>
               <p>Testing gene coefficients</p>
            </st>
            <p>A small value of <it>U</it><sub><it>ij </it></sub>may indicate that the contribution of the <it>i</it><sub><it>th </it></sub>gene to the <it>j</it><sub><it>th </it></sub>PC is negligible. We used the cut-off value originally proposed in <abbrgrp><abbr bid="B10">10</abbr></abbrgrp> to test whether <it>U</it><sub><it>ij </it></sub>vanishes (similar to a 3<it>&#963; </it>statistical significance):</p>
            <p>
               <display-formula id="M3">
                  <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-10-S1-S54-i4">
                     <m:semantics>
                        <m:mrow>
                           <m:msub>
                              <m:mi>U</m:mi>
                              <m:mrow>
                                 <m:mi>i</m:mi>
                                 <m:mi>j</m:mi>
                              </m:mrow>
                           </m:msub>
                           <m:mo>=</m:mo>
                           <m:mrow>
                              <m:mo>{</m:mo>
                              <m:mrow>
                                 <m:mtable columnalign="right">
                                    <m:mtr columnalign="right">
                                       <m:mtd columnalign="right">
                                          <m:mrow>
                                             <m:msub>
                                                <m:mi>U</m:mi>
                                                <m:mrow>
                                                   <m:mi>i</m:mi>
                                                   <m:mi>j</m:mi>
                                                </m:mrow>
                                             </m:msub>
                                          </m:mrow>
                                       </m:mtd>
                                       <m:mtd columnalign="right">
                                          <m:mrow>
                                             <m:mtext>for&#160;</m:mtext>
                                             <m:mo>|</m:mo>
                                             <m:msub>
                                                <m:mi>U</m:mi>
                                                <m:mrow>
                                                   <m:mi>i</m:mi>
                                                   <m:mi>j</m:mi>
                                                </m:mrow>
                                             </m:msub>
                                             <m:mo>|</m:mo>
                                             <m:mo>&gt;</m:mo>
                                             <m:mfrac>
                                                <m:mi>p</m:mi>
                                                <m:mrow>
                                                   <m:msqrt>
                                                      <m:mi>n</m:mi>
                                                   </m:msqrt>
                                                </m:mrow>
                                             </m:mfrac>
                                             <m:mo>,</m:mo>
                                          </m:mrow>
                                       </m:mtd>
                                    </m:mtr>
                                    <m:mtr columnalign="right">
                                       <m:mtd columnalign="right">
                                          <m:mn>0</m:mn>
                                       </m:mtd>
                                       <m:mtd columnalign="right">
                                          <m:mrow>
                                             <m:mtext>for&#160;</m:mtext>
                                             <m:mo>|</m:mo>
                                             <m:msub>
                                                <m:mi>U</m:mi>
                                                <m:mrow>
                                                   <m:mi>i</m:mi>
                                                   <m:mi>j</m:mi>
                                                </m:mrow>
                                             </m:msub>
                                             <m:mo>|</m:mo>
                                             <m:mo>&#8804;</m:mo>
                                             <m:mfrac>
                                                <m:mi>p</m:mi>
                                                <m:mrow>
                                                   <m:msqrt>
                                                      <m:mi>n</m:mi>
                                                   </m:msqrt>
                                                </m:mrow>
                                             </m:mfrac>
                                             <m:mo>.</m:mo>
                                          </m:mrow>
                                       </m:mtd>
                                    </m:mtr>
                                 </m:mtable>
                              </m:mrow>
                           </m:mrow>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemyvau1aaSbaaSqaaiabdMgaPjabdQgaQbqabaGccqGH9aqpdaGabaqaauaabiqaciaaaeaacqWGvbqvdaWgaaWcbaGaemyAaKMaemOAaOgabeaaaOqaaiabbAgaMjabb+gaVjabbkhaYjabbccaGiabcYha8jabdwfavnaaBaaaleaacqWGPbqAcqWGQbGAaeqaaOGaeiiFaWNaeyOpa4tcfa4aaSaaaeaacqWGWbaCaeaadaGcaaqaaiabd6gaUbqabaaaaOGaeiilaWcabaGaeGimaadabaGaeeOzayMaee4Ba8MaeeOCaiNaeeiiaaIaeiiFaWNaemyvau1aaSbaaSqaaiabdMgaPjabdQgaQbqabaGccqGG8baFcqGHKjYOjuaGdaWcaaqaaiabdchaWbqaamaakaaabaGaemOBa4gabeaaaaGaeiOla4caaaGccaGL7baaaaa@5B27@</m:annotation>
                     </m:semantics>
                  </m:math>
               </display-formula>
            </p>
            <p>Each element in <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-10-S1-S54-i5"><m:semantics><m:mrow><m:msubsup><m:mi>X</m:mi><m:mrow><m:mi>p</m:mi><m:mo>&#215;</m:mo><m:mi>n</m:mi></m:mrow><m:mo>+</m:mo></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemiwaG1aa0baaSqaaiabdchaWjabgEna0kabd6gaUbqaaiabgUcaRaaaaaa@3302@</m:annotation></m:semantics></m:math></inline-formula> and <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-10-S1-S54-i6"><m:semantics><m:mrow><m:msubsup><m:mi>X</m:mi><m:mrow><m:mi>p</m:mi><m:mo>&#215;</m:mo><m:mi>n</m:mi></m:mrow><m:mo>&#8722;</m:mo></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemiwaG1aa0baaSqaaiabdchaWjabgEna0kabd6gaUbqaaiabgkHiTaaaaaa@330D@</m:annotation></m:semantics></m:math></inline-formula> is compared to the value <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-10-S1-S54-i7"><m:semantics><m:mrow><m:mfrac><m:mi>p</m:mi><m:mrow><m:msqrt><m:mi>n</m:mi></m:msqrt></m:mrow></m:mfrac></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaqcfa4aaSaaaeaacqWGWbaCaeaadaGcaaqaaiabd6gaUbqabaaaaaaa@2F51@</m:annotation></m:semantics></m:math></inline-formula>, where, in this cut-off, <it>n </it>is the number of genes and <it>p </it>is a weight factor whose recommended value is 3 (note that these symbols are used differently from the matrix dimensions defined above). If the magnitude of the element in <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-10-S1-S54-i5"><m:semantics><m:mrow><m:msubsup><m:mi>X</m:mi><m:mrow><m:mi>p</m:mi><m:mo>&#215;</m:mo><m:mi>n</m:mi></m:mrow><m:mo>+</m:mo></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemiwaG1aa0baaSqaaiabdchaWjabgEna0kabd6gaUbqaaiabgUcaRaaaaaa@3302@</m:annotation></m:semantics></m:math></inline-formula> and <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-10-S1-S54-i6"><m:semantics><m:mrow><m:msubsup><m:mi>X</m:mi><m:mrow><m:mi>p</m:mi><m:mo>&#215;</m:mo><m:mi>n</m:mi></m:mrow><m:mo>&#8722;</m:mo></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemiwaG1aa0baaSqaaiabdchaWjabgEna0kabd6gaUbqaaiabgkHiTaaaaaa@330D@</m:annotation></m:semantics></m:math></inline-formula> is greater than <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-10-S1-S54-i7"><m:semantics><m:mrow><m:mfrac><m:mi>p</m:mi><m:mrow><m:msqrt><m:mi>n</m:mi></m:msqrt></m:mrow></m:mfrac></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaqcfa4aaSaaaeaacqWGWbaCaeaadaGcaaqaaiabd6gaUbqabaaaaaaa@2F51@</m:annotation></m:semantics></m:math></inline-formula>, the corresponding gene is deemed to contribute significantly to the PC. In other words, each PC yields a list of genes that are significantly up-regulated or down-regulated by the underlying genomic influence corresponding to that PC.</p>
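            <p>A minimal Python sketch of this thresholding rule is given below as an illustration only; it is not the implementation used in this study. The function name select_genes, the use of a plain list of per-gene coefficients for one PC, and the reading of positive and negative coefficients as up-regulated and down-regulated genes are assumptions made for the example.</p>

```python
import math


def select_genes(loadings, p=3.0):
    """Split genes into up- and down-regulated lists for one PC.

    loadings: per-gene coefficients of a single PC (hypothetical input).
    p: weight factor; the text recommends 3.
    """
    n = len(loadings)            # number of genes
    cutoff = p / math.sqrt(n)    # threshold p / sqrt(n)
    # Assumption: a positive coefficient marks an up-regulated gene,
    # a negative coefficient a down-regulated gene.
    up = [i for i, u in enumerate(loadings) if u > cutoff]
    down = [i for i, u in enumerate(loadings) if -u > cutoff]
    return up, down


if __name__ == "__main__":
    # 16 genes, so the cut-off is 3 / sqrt(16) = 0.75.
    example = [0.9, -0.8] + [0.01] * 14
    print(select_genes(example))
```

            <p>With 16 genes the cut-off is 0.75, so only the first two genes of the example are retained, one in each list.</p>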
         </sec>
         <sec>
            <st>
               <p>Enrichment test</p>
            </st>
            <p>For each PC <it>j</it>, let <it>K </it>be the set of <it>k </it>genes whose coefficients <it>U</it><sub><it>ij </it></sub>are nonzero, and for a given biological pathway, let <it>M </it>be the prior knowledge set of <it>m </it>genes known to be in the pathway. Also assume there are <it>n </it>genes that are not in the pathway, and let <it>x </it>be the number of genes shared by <it>K </it>and <it>M</it>. The probability of observing exactly <it>x </it>shared genes is:</p>
            <p>
               <display-formula id="M4">
                  <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-10-S1-S54-i8">
                     <m:semantics>
                        <m:mrow>
                           <m:mi>P</m:mi>
                           <m:mo stretchy="false">(</m:mo>
                           <m:mi>X</m:mi>
                           <m:mo>=</m:mo>
                           <m:mi>x</m:mi>
                           <m:mo stretchy="false">)</m:mo>
                           <m:mo>=</m:mo>
                           <m:mfrac>
                              <m:mrow>
                                 <m:mrow>
                                    <m:mo>(</m:mo>
                                    <m:mrow>
                                       <m:mtable>
                                          <m:mtr>
                                             <m:mtd>
                                                <m:mi>m</m:mi>
                                             </m:mtd>
                                          </m:mtr>
                                          <m:mtr>
                                             <m:mtd>
                                                <m:mi>x</m:mi>
                                             </m:mtd>
                                          </m:mtr>
                                       </m:mtable>
                                    </m:mrow>
                                    <m:mo>)</m:mo>
                                 </m:mrow>
                                 <m:mrow>
                                    <m:mo>(</m:mo>
                                    <m:mrow>
                                       <m:mtable>
                                          <m:mtr>
                                             <m:mtd>
                                                <m:mi>n</m:mi>
                                             </m:mtd>
                                          </m:mtr>
                                          <m:mtr>
                                             <m:mtd>
                                                <m:mrow>
                                                   <m:mi>k</m:mi>
                                                   <m:mo>&#8722;</m:mo>
                                                   <m:mi>x</m:mi>
                                                </m:mrow>
                                             </m:mtd>
                                          </m:mtr>
                                       </m:mtable>
                                    </m:mrow>
                                    <m:mo>)</m:mo>
                                 </m:mrow>
                              </m:mrow>
                              <m:mrow>
                                 <m:mrow>
                                    <m:mo>(</m:mo>
                                    <m:mrow>
                                       <m:mtable>
                                          <m:mtr>
                                             <m:mtd>
                                                <m:mrow>
                                                   <m:mi>m</m:mi>
                                                   <m:mo>+</m:mo>
                                                   <m:mi>n</m:mi>
                                                </m:mrow>
                                             </m:mtd>
                                          </m:mtr>
                                          <m:mtr>
                                             <m:mtd>
                                                <m:mi>k</m:mi>
                                             </m:mtd>
                                          </m:mtr>
                                       </m:mtable>
                                    </m:mrow>
                                    <m:mo>)</m:mo>
                                 </m:mrow>
                              </m:mrow>
                           </m:mfrac>
                           <m:mo>.</m:mo>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemiuaaLaeiikaGIaemiwaGLaeyypa0JaemiEaGNaeiykaKIaeyypa0tcfa4aaSaaaeaadaqadaqaauaabeqaceaaaeaacqWGTbqBaeaacqWG4baEaaaacaGLOaGaayzkaaWaaeWaaeaafaqabeGabaaabaGaemOBa4gabaGaem4AaSMaeyOeI0IaemiEaGhaaaGaayjkaiaawMcaaaqaamaabmaabaqbaeqabiqaaaqaaiabd2gaTjabgUcaRiabd6gaUbqaaiabdUgaRbaaaiaawIcacaGLPaaaaaGccqGGUaGlaaa@4719@</m:annotation>
                     </m:semantics>
                  </m:math>
               </display-formula>
            </p>
            <p>To assess whether observing <it>x </it>or more common genes could be purely due to chance, we test the following one-sided hypothesis:</p>
            <p>
               <display-formula id="M5">
                  <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-10-S1-S54-i9">
                     <m:semantics>
                        <m:mrow>
                           <m:msub>
                              <m:mi>H</m:mi>
                              <m:mn>0</m:mn>
                           </m:msub>
                           <m:mo>:</m:mo>
                           <m:msub>
                              <m:mi mathvariant="script">O</m:mi>
                              <m:mn>1</m:mn>
                           </m:msub>
                           <m:mo>=</m:mo>
                           <m:msub>
                              <m:mi mathvariant="script">O</m:mi>
                              <m:mn>2</m:mn>
                           </m:msub>
                           <m:mtext>&#160;versus&#160;</m:mtext>
                           <m:msub>
                              <m:mi mathvariant="script">O</m:mi>
                              <m:mn>1</m:mn>
                           </m:msub>
                           <m:mo>&gt;</m:mo>
                           <m:msub>
                              <m:mi mathvariant="script">O</m:mi>
                              <m:mn>2</m:mn>
                           </m:msub>
                           <m:mo>,</m:mo>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemisaG0aaSbaaSqaaiabicdaWaqabaGccqGG6aGot0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFoe=tdaWgaaWcbaGaeGymaedabeaakiabg2da9iab=5q8pnaaBaaaleaacqaIYaGmaeqaaOGaeeiiaaIaeeODayNaeeyzauMaeeOCaiNaee4CamNaeeyDauNaee4CamNaeeiiaaIae8NdX=0aaSbaaSqaaiabigdaXaqabaGccqGHLjYScqWFoe=tdaWgaaWcbaGaeGOmaidabeaakiabcYcaSaaa@52B5@</m:annotation>
                     </m:semantics>
                  </m:math>
               </display-formula>
            </p>
            <p>where <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-10-S1-S54-i10"><m:semantics><m:mrow><m:msub><m:mi mathvariant="script">O</m:mi><m:mn>1</m:mn></m:msub></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaWenfgDOvwBHrxAJfwnHbqeg0uy0HwzTfgDPnwy1aaceaGae8NdX=0aaSbaaSqaaiabigdaXaqabaaaaa@3881@</m:annotation></m:semantics></m:math></inline-formula> is a parameter corresponding to the probability of genes in the prior knowledge belonging to the PC, and <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-10-S1-S54-i11"><m:semantics><m:mrow><m:msub><m:mi mathvariant="script">O</m:mi><m:mn>2</m:mn></m:msub></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaWenfgDOvwBHrxAJfwnHbqeg0uy0HwzTfgDPnwy1aaceaGae8NdX=0aaSbaaSqaaiabikdaYaqabaaaaa@3883@</m:annotation></m:semantics></m:math></inline-formula> is a parameter corresponding to the probability of genes not in prior knowledge belonging to the PC. Under <it>H</it><sub>0</sub>, the test statistic <it>x </it>follows a hypergeometric distribution with known parameters <it>m</it>, <it>n </it>and <it>k</it>.</p>
            <p>The <it>P</it>-value is then defined as the probability of observing <it>x </it>or more overlaps given <it>H</it><sub>0 </sub>is true. Therefore, it is calculated as follows:</p>
            <p>
               <display-formula id="M6">
                  <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-10-S1-S54-i12">
                     <m:semantics>
                        <m:mrow>
                           <m:mtable columnalign="left">
                              <m:mtr columnalign="left">
                                 <m:mtd columnalign="left">
                                    <m:mrow>
                                       <m:msub>
                                          <m:mi>P</m:mi>
                                          <m:mi>V</m:mi>
                                       </m:msub>
                                    </m:mrow>
                                 </m:mtd>
                                 <m:mtd columnalign="left">
                                    <m:mo>=</m:mo>
                                 </m:mtd>
                                 <m:mtd columnalign="left">
                                    <m:mrow>
                                       <m:mi>P</m:mi>
                                       <m:mo stretchy="false">(</m:mo>
                                       <m:mi>X</m:mi>
                                       <m:mo>&#8805;</m:mo>
                                       <m:mi>x</m:mi>
                                       <m:mo stretchy="false">)</m:mo>
                                    </m:mrow>
                                 </m:mtd>
                              </m:mtr>
                              <m:mtr columnalign="left">
                                 <m:mtd columnalign="left">
                                    <m:mrow/>
                                 </m:mtd>
                                 <m:mtd columnalign="left">
                                    <m:mo>=</m:mo>
                                 </m:mtd>
                                 <m:mtd columnalign="left">
                                    <m:mrow>
                                       <m:mn>1</m:mn>
                                       <m:mo>&#8722;</m:mo>
                                       <m:mi>P</m:mi>
                                       <m:mo stretchy="false">(</m:mo>
                                       <m:mi>X</m:mi>
                                       <m:mo>&lt;</m:mo>
                                       <m:mi>x</m:mi>
                                       <m:mo stretchy="false">)</m:mo>
                                    </m:mrow>
                                 </m:mtd>
                              </m:mtr>
                              <m:mtr columnalign="left">
                                 <m:mtd columnalign="left">
                                    <m:mrow/>
                                 </m:mtd>
                                 <m:mtd columnalign="left">
                                    <m:mo>=</m:mo>
                                 </m:mtd>
                                 <m:mtd columnalign="left">
                                    <m:mrow>
                                       <m:mn>1</m:mn>
                                       <m:mo>&#8722;</m:mo>
                                       <m:mstyle displaystyle="true">
                                          <m:munderover>
                                             <m:mo>&#8721;</m:mo>
                                             <m:mrow>
                                                <m:mi>o</m:mi>
                                                <m:mo>=</m:mo>
                                                <m:mn>0</m:mn>
                                             </m:mrow>
                                             <m:mrow>
                                                <m:mi>x</m:mi>
                                                <m:mo>&#8722;</m:mo>
                                                <m:mn>1</m:mn>
                                             </m:mrow>
                                          </m:munderover>
                                          <m:mrow>
                                             <m:mi>P</m:mi>
                                             <m:mo stretchy="false">(</m:mo>
                                             <m:mi>X</m:mi>
                                             <m:mo>=</m:mo>
                                             <m:mi>o</m:mi>
                                             <m:mo stretchy="false">)</m:mo>
                                          </m:mrow>
                                       </m:mstyle>
                                    </m:mrow>
                                 </m:mtd>
                              </m:mtr>
                              <m:mtr columnalign="left">
                                 <m:mtd columnalign="left">
                                    <m:mrow/>
                                 </m:mtd>
                                 <m:mtd columnalign="left">
                                    <m:mo>=</m:mo>
                                 </m:mtd>
                                 <m:mtd columnalign="left">
                                    <m:mrow>
                                       <m:mn>1</m:mn>
                                       <m:mo>&#8722;</m:mo>
                                       <m:mstyle displaystyle="true">
                                          <m:munderover>
                                             <m:mo>&#8721;</m:mo>
                                             <m:mrow>
                                                <m:mi>o</m:mi>
                                                <m:mo>=</m:mo>
                                                <m:mn>0</m:mn>
                                             </m:mrow>
                                             <m:mrow>
                                                <m:mi>x</m:mi>
                                                <m:mo>&#8722;</m:mo>
                                                <m:mn>1</m:mn>
                                             </m:mrow>
                                          </m:munderover>
                                          <m:mrow>
                                             <m:mfrac>
                                                <m:mrow>
                                                   <m:mrow>
                                                      <m:mo>(</m:mo>
                                                      <m:mrow>
                                                         <m:mtable>
                                                            <m:mtr>
                                                               <m:mtd>
                                                                  <m:mi>m</m:mi>
                                                               </m:mtd>
                                                            </m:mtr>
                                                            <m:mtr>
                                                               <m:mtd>
                                                                  <m:mi>o</m:mi>
                                                               </m:mtd>
                                                            </m:mtr>
                                                         </m:mtable>
                                                      </m:mrow>
                                                      <m:mo>)</m:mo>
                                                   </m:mrow>
                                                   <m:mrow>
                                                      <m:mo>(</m:mo>
                                                      <m:mrow>
                                                         <m:mtable>
                                                            <m:mtr>
                                                               <m:mtd>
                                                                  <m:mi>n</m:mi>
                                                               </m:mtd>
                                                            </m:mtr>
                                                            <m:mtr>
                                                               <m:mtd>
                                                                  <m:mrow>
                                                                     <m:mi>k</m:mi>
                                                                     <m:mo>&#8722;</m:mo>
                                                                     <m:mi>o</m:mi>
                                                                  </m:mrow>
                                                               </m:mtd>
                                                            </m:mtr>
                                                         </m:mtable>
                                                      </m:mrow>
                                                      <m:mo>)</m:mo>
                                                   </m:mrow>
                                                </m:mrow>
                                                <m:mrow>
                                                   <m:mrow>
                                                      <m:mo>(</m:mo>
                                                      <m:mrow>
                                                         <m:mtable>
                                                            <m:mtr>
                                                               <m:mtd>
                                                                  <m:mrow>
                                                                     <m:mi>m</m:mi>
                                                                     <m:mo>+</m:mo>
                                                                     <m:mi>n</m:mi>
                                                                  </m:mrow>
                                                               </m:mtd>
                                                            </m:mtr>
                                                            <m:mtr>
                                                               <m:mtd>
                                                                  <m:mi>k</m:mi>
                                                               </m:mtd>
                                                            </m:mtr>
                                                         </m:mtable>
                                                      </m:mrow>
                                                      <m:mo>)</m:mo>
                                                   </m:mrow>
                                                </m:mrow>
                                             </m:mfrac>
                                          </m:mrow>
                                       </m:mstyle>
                                    </m:mrow>
                                 </m:mtd>
                              </m:mtr>
                           </m:mtable>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaqbaeaabqWaaaaabaGaemiuaa1aaSbaaSqaaiabdAfawbqabaaakeaacqGH9aqpaeaacqWGqbaucqGGOaakcqWGybawcqGHKjYOcqWG4baEcqGGPaqkaeaaaeaacqGH9aqpaeaacqaIXaqmcqGHsislcqWGqbaucqGGOaakcqWGybawcqGH8aapcqWG4baEcqGGPaqkaeaaaeaacqGH9aqpaeaacqaIXaqmcqGHsisldaaeWbqaaiabdcfaqjabcIcaOiabdIfayjabg2da9iabd+gaVjabcMcaPaWcbaGaem4Ba8Maeyypa0JaeGymaedabaGaemiEaGNaeyOeI0IaeGymaedaniabggHiLdaakeaaaeaacqGH9aqpaeaacqaIXaqmcqGHsisldaaeWbqcfayaamaalaaabaWaaeWaaeaafaqabeGabaaabaGaemyBa0gabaGaem4Ba8gaaaGaayjkaiaawMcaamaabmaabaqbaeqabiqaaaqaaiabd6gaUbqaaiabdUgaRjabgkHiTiabd+gaVbaaaiaawIcacaGLPaaaaeaadaqadaqaauaabeqaceaaaeaacqWGTbqBcqGHRaWkcqWGUbGBaeaacqWGRbWAaaaacaGLOaGaayzkaaaaaaWcbaGaem4Ba8Maeyypa0JaeGymaedabaGaemiEaGNaeyOeI0IaeGymaedaniabggHiLdaaaaaa@7113@</m:annotation>
                     </m:semantics>
                  </m:math>
               </display-formula>
            </p>
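            <p>The enrichment test can be sketched in a few lines of Python using only the standard library. This is an illustrative sketch, not the code used in this study; the function names hypergeom_pmf and enrichment_p_value are hypothetical. The upper tail is computed as one minus the sum of P(X = o) over o = 0, ..., x − 1, so that the zero-overlap term is included in the complement.</p>

```python
from math import comb


def hypergeom_pmf(x, m, n, k):
    """P(X = x): exactly x of the k selected genes are among the m pathway genes.

    m: genes in the prior knowledge set, n: genes not in the pathway,
    k: genes selected for the PC.
    """
    return comb(m, x) * comb(n, k - x) / comb(m + n, k)


def enrichment_p_value(x, m, n, k):
    """One-sided P-value P(X >= x) under the hypergeometric null."""
    # Complement of the lower tail: o runs over 0, 1, ..., x - 1.
    return 1.0 - sum(hypergeom_pmf(o, m, n, k) for o in range(x))


if __name__ == "__main__":
    # Toy numbers: 5 pathway genes, 5 other genes, 5 selected, all 5 overlap.
    print(enrichment_p_value(5, 5, 5, 5))
```

            <p>For the toy input above, only one of the 252 possible selections of five genes gives a complete overlap, so the P-value equals 1/252 up to floating-point rounding.</p>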
         </sec>
         <sec>
            <st>
               <p>Semi-supervised gene shaving algorithm</p>
            </st>
            <p>1: Start with the centered data matrix <it>X</it>, in which each row has zero mean</p>
            <p>2: <b>while </b>TRUE <b>do</b></p>
            <p>3: &#160;&#160;&#160;Singular value decomposition <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-10-S1-S54-i13"><m:semantics><m:mrow><m:msub><m:mi>X</m:mi><m:mrow><m:mi>p</m:mi><m:mo>&#215;</m:mo><m:mi>n</m:mi></m:mrow></m:msub><m:mo>=</m:mo><m:msubsup><m:mi>U</m:mi><m:mrow><m:mi>p</m:mi><m:mo>&#215;</m:mo><m:mi>n</m:mi></m:mrow><m:mo>+</m:mo></m:msubsup><m:msub><m:mi>S</m:mi><m:mrow><m:mi>n</m:mi><m:mo>&#215;</m:mo><m:mi>n</m:mi></m:mrow></m:msub><m:msubsup><m:mi>V</m:mi><m:mrow><m:mi>n</m:mi><m:mo>&#215;</m:mo><m:mi>n</m:mi></m:mrow><m:mi>T</m:mi></m:msubsup><m:mo>+</m:mo><m:msubsup><m:mi>U</m:mi><m:mrow><m:mi>p</m:mi><m:mo>&#215;</m:mo><m:mi>n</m:mi></m:mrow><m:mo>&#8722;</m:mo></m:msubsup><m:msub><m:mi>S</m:mi><m:mrow><m:mi>n</m:mi><m:mo>&#215;</m:mo><m:mi>n</m:mi></m:mrow></m:msub><m:msubsup><m:mi>V</m:mi><m:mrow><m:mi>n</m:mi><m:mo>&#215;</m:mo><m:mi>n</m:mi></m:mrow><m:mi>T</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemiwaG1aaSbaaSqaaiabdchaWjabgEna0kabd6gaUbqabaGccqGH9aqpcqWGvbqvdaqhaaWcbaGaemiCaaNaey41aqRaemOBa4gabaGaey4kaScaaOGaem4uam1aaSbaaSqaaiabd6gaUjabgEna0kabd6gaUbqabaGccqWGwbGvdaqhaaWcbaGaemOBa4Maey41aqRaemOBa4gabaGaemivaqfaaOGaey4kaSIaemyvau1aa0baaSqaaiabdchaWjabgEna0kabd6gaUbqaaiabgkHiTaaakiabdofatnaaBaaaleaacqWGUbGBcqGHxdaTcqWGUbGBaeqaaOGaemOvay1aa0baaSqaaiabd6gaUjabgEna0kabd6gaUbqaaiabdsfaubaaaaa@5DFC@</m:annotation></m:semantics></m:math></inline-formula></p>
            <p>4: &#160;&#160;&#160;<b>for all </b>columns of <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-10-S1-S54-i14"><m:semantics><m:mrow><m:msubsup><m:mi>U</m:mi><m:mrow><m:mi>p</m:mi><m:mo>&#215;</m:mo><m:mi>n</m:mi></m:mrow><m:mo>+</m:mo></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemyvau1aa0baaSqaaiabdchaWjabgEna0kabd6gaUbqaaiabgUcaRaaaaaa@32FC@</m:annotation></m:semantics></m:math></inline-formula> and <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-10-S1-S54-i15"><m:semantics><m:mrow><m:msubsup><m:mi>U</m:mi><m:mrow><m:mi>p</m:mi><m:mo>&#215;</m:mo><m:mi>n</m:mi></m:mrow><m:mo>&#8722;</m:mo></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemyvau1aa0baaSqaaiabdchaWjabgEna0kabd6gaUbqaaiabgkHiTaaaaaa@3307@</m:annotation></m:semantics></m:math></inline-formula><b>do</b></p>
            <p>5: &#160;&#160;&#160;&#160;&#160;&#160;<b>if </b>the magnitude of a column element is greater than the cut-off <b>then</b></p>
            <p>6: &#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;Keep the element unchanged</p>
            <p>7: &#160;&#160;&#160;&#160;&#160;&#160;<b>else</b></p>
            <p>8: &#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;Set the element to 0</p>
            <p>9: &#160;&#160;&#160;&#160;&#160;&#160;<b>end if</b></p>
            <p>10: &#160;&#160;&#160;<b>end for</b></p>
            <p>11: &#160;&#160;&#160;<b>for all </b>gene sets corresponding to the columns <b>do</b></p>
            <p>12: &#160;&#160;&#160;&#160;&#160;&#160;Test enrichment of prior knowledge in the gene set</p>
            <p>13: &#160;&#160;&#160;<b>end for</b></p>
            <p>14: &#160;&#160;&#160;<b>if </b>two or more columns coexist as the most enriched with prior knowledge <b>then</b></p>
            <p>15: &#160;&#160;&#160;&#160;&#160;&#160;Break</p>
            <p>16: &#160;&#160;&#160;<b>else</b></p>
            <p>17: &#160;&#160;&#160;&#160;&#160;&#160;Retrieve the best PC, i.e. the one most enriched with prior knowledge</p>
            <p>18: &#160;&#160;&#160;<b>end if</b></p>
            <p>19: &#160;&#160;&#160;Sort genes according to absolute correlation with the best PC</p>
            <p>20: &#160;&#160;&#160;Discard the <it>&#945;</it>% least correlated genes (<it>&#945; </it>= 10, following <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>)</p>
            <p>21: &#160;&#160;&#160;Assign the reduced data matrix to <it>X</it></p>
            <p>22: <b>end while</b></p>
            <p>23: Trace-back to retrieve the best gene cluster</p>
            <p>As shown in the algorithm above and in Figure <figr fid="F1">1</figr>, the algorithm iterates until two or more PCs coexist as the most enriched with prior knowledge. The iterations stop at this point because we do not yet know a principled way to further reduce the size of the cluster, and an ill-considered reduction might discard important genes. There are two ways of tracing back to retrieve the best gene cluster: one is to find the smallest cluster containing all prior knowledge genes; the other is to find the cluster in which the enrichment of prior knowledge is optimized. We chose the latter because it does not rely on the assumption that all prior knowledge is accurate. In fact, each gene coefficient can be used to measure the relative importance of that gene in forming the cluster pattern: prior knowledge genes that help shape the pattern receive higher weights, whereas the others receive lower weights.</p>
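            <p>The shaving step of the listing (sort genes by absolute correlation with the best PC, discard the least correlated fraction, keep the rest) can be sketched in Python as follows. This is a hedged illustration rather than the implementation used in this study. The names pearson and shave, the dictionary of per-gene expression profiles, the representation of the best PC as one value per sample, and rounding the number of discarded genes up are all assumptions made for the example.</p>

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    ma = sum(a) / len(a)
    mb = sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va * vb == 0:
        return 0.0
    return cov / math.sqrt(va * vb)


def shave(expr, pc_profile, alpha=10.0):
    """Drop the alpha percent of genes least correlated with the PC profile.

    expr: dict mapping gene name to its expression profile (hypothetical layout).
    pc_profile: the best PC expressed as one value per sample (assumption).
    """
    # Rank genes from most to least correlated in absolute value.
    ranked = sorted(expr, key=lambda g: abs(pearson(expr[g], pc_profile)),
                    reverse=True)
    # Assumption: round up so that at least one gene is removed per pass.
    n_drop = math.ceil(len(ranked) * alpha / 100.0)
    keep = ranked[:len(ranked) - n_drop]
    return {g: expr[g] for g in keep}


if __name__ == "__main__":
    pc = [1.0, 2.0, 3.0, 4.0]
    genes = {"g%d" % i: [(i + 1) * v for v in pc] for i in range(9)}
    genes["noise"] = [1.0, -1.0, -1.0, 1.0]   # uncorrelated with pc
    print(sorted(shave(genes, pc)))
```

            <p>In the full procedure this step would be repeated on the reduced matrix after each enrichment test, and the sequence of nested clusters would be kept for the trace-back.</p>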
         </sec>
         <sec>
            <st>
               <p>Stability analysis of gene clusters &#8211; a jackknife approach</p>
            </st>
            <p>The jackknife approach, e.g. "leave-one-out", is a resampling approach that is frequently used to assess the stability of an estimator, such as the enrichment studied here. Suppose we wish to estimate the enrichment parameter (<it>&#951;</it>) as a complicated statistic (<it>T</it>) of the <it>n </it>genes in prior knowledge as well as <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-10-S1-S54-i16"><m:semantics><m:mi mathvariant="script">D</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaWenfgDOvwBHrxAJfwnHbqeg0uy0HwzTfgDPnwy1aaceaGae83aXteaaa@374F@</m:annotation></m:semantics></m:math></inline-formula>,</p>
            <p>
               <display-formula id="M7">
                  <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-10-S1-S54-i17">
                     <m:semantics>
                        <m:mrow>
                           <m:mover accent="true">
                              <m:mi>&#951;</m:mi>
                              <m:mo>^</m:mo>
                           </m:mover>
                           <m:mo>=</m:mo>
                           <m:mi>T</m:mi>
                           <m:mo stretchy="false">(</m:mo>
                           <m:msub>
                              <m:mi>g</m:mi>
                              <m:mn>1</m:mn>
                           </m:msub>
                           <m:mo>,</m:mo>
                           <m:msub>
                              <m:mi>g</m:mi>
                              <m:mn>2</m:mn>
                           </m:msub>
                           <m:mo>,</m:mo>
                           <m:mn>...</m:mn>
                           <m:mo>,</m:mo>
                           <m:msub>
                              <m:mi>g</m:mi>
                              <m:mrow>
                                 <m:mi>i</m:mi>
                                 <m:mo>&#8722;</m:mo>
                                 <m:mn>1</m:mn>
                              </m:mrow>
                           </m:msub>
                           <m:mo>,</m:mo>
                           <m:msub>
                              <m:mi>g</m:mi>
                              <m:mi>i</m:mi>
                           </m:msub>
                           <m:mo>,</m:mo>
                           <m:msub>
                              <m:mi>g</m:mi>
                              <m:mrow>
                                 <m:mi>i</m:mi>
                                 <m:mo>+</m:mo>
                                 <m:mn>1</m:mn>
                              </m:mrow>
                           </m:msub>
                           <m:mo>,</m:mo>
                           <m:mn>...</m:mn>
                           <m:mo>,</m:mo>
                           <m:msub>
                              <m:mi>g</m:mi>
                              <m:mi>n</m:mi>
                           </m:msub>
                           <m:mo>,</m:mo>
                           <m:mi mathvariant="script">D</m:mi>
                           <m:mo stretchy="false">)</m:mo>
                           <m:mo>.</m:mo>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafq4TdGMbaKaacqGH9aqpcqWGubavcqGGOaakcqWGNbWzdaWgaaWcbaGaeGymaedabeaakiabcYcaSiabdEgaNnaaBaaaleaacqaIYaGmaeqaaOGaeiilaWIaeiOla4IaeiOla4IaeiOla4IaeiilaWIaem4zaC2aaSbaaSqaaiabdMgaPjabgkHiTiabigdaXaqabaGccqGGSaalcqWGNbWzdaWgaaWcbaGaemyAaKgabeaakiabcYcaSiabdEgaNnaaBaaaleaacqWGPbqAcqGHRaWkcqaIXaqmaeqaaOGaeiilaWIaeiOla4IaeiOla4IaeiOla4IaeiilaWIaem4zaC2aaSbaaSqaaiabd6gaUbqabaGccqGGSaalt0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFdeprcqGGPaqkcqGGUaGlaaa@5ED3@</m:annotation>
                     </m:semantics>
                  </m:math>
               </display-formula>
            </p>
            <p>Let the <it>j</it>th partial estimate of <it>&#951; </it>be given by the estimate computed with gene <it>j </it>removed,</p>
            <p>
               <display-formula id="M8">
                  <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-10-S1-S54-i18">
                     <m:semantics>
                        <m:mrow>
                           <m:msub>
                              <m:mover accent="true">
                                 <m:mi>&#951;</m:mi>
                                 <m:mo>^</m:mo>
                              </m:mover>
                              <m:mi>j</m:mi>
                           </m:msub>
                           <m:mo>=</m:mo>
                           <m:mi>T</m:mi>
                           <m:mo stretchy="false">(</m:mo>
                           <m:msub>
                              <m:mi>g</m:mi>
                              <m:mn>1</m:mn>
                           </m:msub>
                           <m:mo>,</m:mo>
                           <m:msub>
                              <m:mi>g</m:mi>
                              <m:mn>2</m:mn>
                           </m:msub>
                           <m:mo>,</m:mo>
                           <m:mn>...</m:mn>
                           <m:mo>,</m:mo>
                           <m:msub>
                              <m:mi>g</m:mi>
                              <m:mrow>
                                 <m:mi>j</m:mi>
                                 <m:mo>&#8722;</m:mo>
                                 <m:mn>1</m:mn>
                              </m:mrow>
                           </m:msub>
                           <m:mo>,</m:mo>
                           <m:msub>
                              <m:mi>g</m:mi>
                              <m:mrow>
                                 <m:mi>j</m:mi>
                                 <m:mo>+</m:mo>
                                 <m:mn>1</m:mn>
                              </m:mrow>
                           </m:msub>
                           <m:mo>,</m:mo>
                           <m:mn>...</m:mn>
                           <m:mo>,</m:mo>
                           <m:msub>
                              <m:mi>g</m:mi>
                              <m:mi>n</m:mi>
                           </m:msub>
                           <m:mo>,</m:mo>
                           <m:mi mathvariant="script">D</m:mi>
                           <m:mo stretchy="false">)</m:mo>
                           <m:mo>.</m:mo>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafq4TdGMbaKaadaWgaaWcbaGaemOAaOgabeaakiabg2da9iabdsfaujabcIcaOiabdEgaNnaaBaaaleaacqaIXaqmaeqaaOGaeiilaWIaem4zaC2aaSbaaSqaaiabikdaYaqabaGccqGGSaalcqGGUaGlcqGGUaGlcqGGUaGlcqGGSaalcqWGNbWzdaWgaaWcbaGaemyAaKMaeyOeI0IaeGymaedabeaakiabcYcaSiabdEgaNnaaBaaaleaacqWGPbqAcqGHRaWkcqaIXaqmaeqaaOGaeiilaWIaeiOla4IaeiOla4IaeiOla4IaeiilaWIaem4zaC2aaSbaaSqaaiabd6gaUbqabaGccqGGSaalt0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFdeprcqGGPaqkcqGGUaGlaaa@5C9E@</m:annotation>
                     </m:semantics>
                  </m:math>
               </display-formula>
            </p>
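            <p>The jackknife estimate of <it>&#951; </it>is given by the average of the pseudovalues <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>, where the <it>j</it>th pseudovalue <it>&#951;</it><sup>&#8727;</sup><sub><it>j</it></sub> is <it>n </it>times the full estimate minus (<it>n </it>- 1) times the <it>j</it>th partial estimate,</p>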
            <p>
               <display-formula id="M9">
                  <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-10-S1-S54-i19">
                     <m:semantics>
                        <m:mrow>
                           <m:msup>
                              <m:mi>&#951;</m:mi>
                              <m:mo>&#8727;</m:mo>
                           </m:msup>
                           <m:mo>=</m:mo>
                           <m:mfrac>
                              <m:mn>1</m:mn>
                              <m:mi>n</m:mi>
                           </m:mfrac>
                           <m:mstyle displaystyle="true">
                              <m:munderover>
                                 <m:mo>&#8721;</m:mo>
                                 <m:mrow>
                                    <m:mi>j</m:mi>
                                    <m:mo>=</m:mo>
                                    <m:mn>1</m:mn>
                                 </m:mrow>
                                 <m:mi>n</m:mi>
                              </m:munderover>
                              <m:mrow>
                                 <m:mo stretchy="false">(</m:mo>
                                 <m:mi>n</m:mi>
                                 <m:mover accent="true">
                                    <m:mi>&#951;</m:mi>
                                    <m:mo>^</m:mo>
                                 </m:mover>
                                 <m:mo>&#8722;</m:mo>
                                 <m:mo stretchy="false">(</m:mo>
                                 <m:mi>n</m:mi>
                                 <m:mo>&#8722;</m:mo>
                                 <m:mn>1</m:mn>
                                 <m:mo stretchy="false">)</m:mo>
                                 <m:msub>
                                    <m:mover accent="true">
                                       <m:mi>&#951;</m:mi>
                                       <m:mo>^</m:mo>
                                    </m:mover>
                                    <m:mi>j</m:mi>
                                 </m:msub>
                                 <m:mo stretchy="false">)</m:mo>
                              </m:mrow>
                           </m:mstyle>
                           <m:mo>.</m:mo>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaeq4TdG2aaWbaaSqabeaacqGHxiIkaaGccqGH9aqpjuaGdaWcaaqaaiabigdaXaqaaiabeE7aObaakmaaqahabaGaeiikaGIaemOBa4Mafq4TdGMbaKaacqGHsislcqGGOaakcqWGUbGBcqGHsislcqaIXaqmcqGGPaqkcuaH3oaAgaqcamaaBaaaleaacqWGQbGAaeqaaOGaeiykaKcaleaacqWGPbqAcqGH9aqpcqaIXaqmaeaacqWGUbGBa0GaeyyeIuoakiabc6caUaaa@4928@</m:annotation>
                     </m:semantics>
                  </m:math>
               </display-formula>
            </p>
            <p>An approximate sampling error for <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-10-S1-S54-i20"><m:semantics><m:mrow><m:msup><m:mover accent="true"><m:mi>&#951;</m:mi><m:mo>^</m:mo></m:mover><m:mo>&#8727;</m:mo></m:msup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafq4TdGMbaKaadaahaaWcbeqaaiabgEHiQaaaaaa@2EAD@</m:annotation></m:semantics></m:math></inline-formula> can be obtained as follows <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>:</p>
            <p>
               <display-formula id="M10">
                  <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-10-S1-S54-i21">
                     <m:semantics>
                        <m:mrow>
                           <m:mi>V</m:mi>
                           <m:mi>a</m:mi>
                           <m:mi>r</m:mi>
                           <m:mo stretchy="false">(</m:mo>
                           <m:msup>
                              <m:mi>&#951;</m:mi>
                              <m:mo>&#8727;</m:mo>
                           </m:msup>
                           <m:mo stretchy="false">)</m:mo>
                           <m:mo>=</m:mo>
                           <m:mfrac>
                              <m:mrow>
                                 <m:mi>V</m:mi>
                                 <m:mi>a</m:mi>
                                 <m:mi>r</m:mi>
                                 <m:mo stretchy="false">(</m:mo>
                                 <m:msubsup>
                                    <m:mi>&#951;</m:mi>
                                    <m:mi>j</m:mi>
                                    <m:mo>&#8727;</m:mo>
                                 </m:msubsup>
                                 <m:mo stretchy="false">)</m:mo>
                              </m:mrow>
                              <m:mi>n</m:mi>
                           </m:mfrac>
                           <m:mo>=</m:mo>
                           <m:mfrac>
                              <m:mrow>
                                 <m:mstyle displaystyle="true">
                                    <m:msubsup>
                                       <m:mo>&#8721;</m:mo>
                                       <m:mrow>
                                          <m:mi>j</m:mi>
                                          <m:mo>=</m:mo>
                                          <m:mn>1</m:mn>
                                       </m:mrow>
                                       <m:mi>n</m:mi>
                                    </m:msubsup>
                                    <m:mrow>
                                       <m:msup>
                                          <m:mrow>
                                             <m:mo stretchy="false">(</m:mo>
                                             <m:msubsup>
                                                <m:mi>&#951;</m:mi>
                                                <m:mi>j</m:mi>
                                                <m:mo>&#8727;</m:mo>
                                             </m:msubsup>
                                             <m:mo>&#8722;</m:mo>
                                             <m:msup>
                                                <m:mi>&#951;</m:mi>
                                                <m:mo>&#8727;</m:mo>
                                             </m:msup>
                                             <m:mo stretchy="false">)</m:mo>
                                          </m:mrow>
                                          <m:mn>2</m:mn>
                                       </m:msup>
                                    </m:mrow>
                                 </m:mstyle>
                              </m:mrow>
                              <m:mrow>
                                 <m:mi>n</m:mi>
                                 <m:mo stretchy="false">(</m:mo>
                                 <m:mi>n</m:mi>
                                 <m:mo>&#8722;</m:mo>
                                 <m:mn>1</m:mn>
                                 <m:mo stretchy="false">)</m:mo>
                              </m:mrow>
                           </m:mfrac>
                           <m:mo>.</m:mo>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemOvayLaemyyaeMaemOCaiNaeiikaGIaeq4TdG2aaWbaaSqabeaacqGHxiIkaaGccqGGPaqkcqGH9aqpjuaGdaWcaaqaaiabdAfawjabdggaHjabdkhaYjabcIcaOiabeE7aOnaaDaaabaGaemOAaOgabaGaey4fIOcaaiabcMcaPaqaaiabd6gaUbaakiabg2da9KqbaoaalaaabaWaaabmaeaacqGGOaakcqaH3oaAdaqhaaqaaiabdQgaQbqaaiabgEHiQaaacqGHsislcqaH3oaAdaahaaqabeaacqGHxiIkaaGaeiykaKYaaWbaaeqabaGaeGOmaidaaaqaaiabdQgaQjabg2da9iabigdaXaqaaiabd6gaUbGaeyyeIuoaaeaacqWGUbGBcqGGOaakcqWGUbGBcqGHsislcqaIXaqmcqGGPaqkaaGaeiOla4caaa@5B6D@</m:annotation>
                     </m:semantics>
                  </m:math>
               </display-formula>
            </p>
            <p>Likewise, an approximate 100(1 - <it>&#945;</it>)% confidence interval is given by <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>,</p>
            <p>
               <display-formula id="M11">
                  <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-10-S1-S54-i22">
                     <m:semantics>
                        <m:mrow>
                           <m:msup>
                              <m:mi>&#951;</m:mi>
                              <m:mo>&#8727;</m:mo>
                           </m:msup>
                           <m:mo>&#177;</m:mo>
                           <m:msub>
                              <m:mi>t</m:mi>
                              <m:mrow>
                                 <m:mi>&#945;</m:mi>
                                 <m:mo>/</m:mo>
                                 <m:mn>2</m:mn>
                                 <m:mo>,</m:mo>
                                 <m:mi>n</m:mi>
                                 <m:mo>&#8722;</m:mo>
                                 <m:mn>1</m:mn>
                              </m:mrow>
                           </m:msub>
                           <m:msqrt>
                              <m:mrow>
                                 <m:mfrac>
                                    <m:mrow>
                                       <m:mstyle displaystyle="true">
                                          <m:msubsup>
                                             <m:mo>&#8721;</m:mo>
                                             <m:mrow>
                                                <m:mi>j</m:mi>
                                                <m:mo>=</m:mo>
                                                <m:mn>1</m:mn>
                                             </m:mrow>
                                             <m:mi>n</m:mi>
                                          </m:msubsup>
                                          <m:mrow>
                                             <m:msup>
                                                <m:mrow>
                                                   <m:mo stretchy="false">(</m:mo>
                                                   <m:msubsup>
                                                      <m:mi>&#951;</m:mi>
                                                      <m:mi>j</m:mi>
                                                      <m:mo>&#8727;</m:mo>
                                                   </m:msubsup>
                                                   <m:mo>&#8722;</m:mo>
                                                   <m:msup>
                                                      <m:mi>&#951;</m:mi>
                                                      <m:mo>&#8727;</m:mo>
                                                   </m:msup>
                                                   <m:mo stretchy="false">)</m:mo>
                                                </m:mrow>
                                                <m:mn>2</m:mn>
                                             </m:msup>
                                          </m:mrow>
                                       </m:mstyle>
                                    </m:mrow>
                                    <m:mrow>
                                       <m:mi>n</m:mi>
                                       <m:mo stretchy="false">(</m:mo>
                                       <m:mi>n</m:mi>
                                       <m:mo>&#8722;</m:mo>
                                       <m:mn>1</m:mn>
                                       <m:mo stretchy="false">)</m:mo>
                                    </m:mrow>
                                 </m:mfrac>
                              </m:mrow>
                           </m:msqrt>
                           <m:mo>,</m:mo>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaeq4TdG2aaWbaaSqabeaacqGHxiIkaaGccqGHXcqScqWG0baDdaWgaaWcbaGaeqySdeMaei4la8IaeGOmaiJaeiilaWIaemOBa4MaeyOeI0IaeGymaedabeaakmaakaaajuaGbaWaaSaaaeaadaaeWaqaaiabcIcaOiabeE7aOnaaDaaabaGaemOAaOgabaGaey4fIOcaaiabgkHiTiabeE7aOnaaCaaabeqaaiabgEHiQaaacqGGPaqkdaahaaqabeaacqaIYaGmaaaabaGaemOAaOMaeyypa0JaeGymaedabaGaemOBa4gacqGHris5aaqaaiabd6gaUjabcIcaOiabd6gaUjabgkHiTiabigdaXiabcMcaPaaaaSqabaGccqGGSaalaaa@534B@</m:annotation>
                     </m:semantics>
                  </m:math>
               </display-formula>
            </p>
            <p>where <it>t</it><sub><it>&#945;</it>/2, <it>n</it>-1 </sub>satisfies <it>Pr</it>(<it>t</it><sub><it>n</it>-1 </sub>&#8805; <it>t</it><sub><it>&#945;</it>/2, <it>n</it>-1</sub>) = <it>&#945;</it>/2, with <it>t</it><sub><it>n</it>-1 </sub>denoting a <it>t</it>-distributed random variable with <it>n </it>- 1 degrees of freedom.</p>
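            <p>As an illustration only, the jackknife steps described above can be sketched in a few lines of Python. The function names <it>jackknife</it> and <it>confidence_interval</it>, the use of a generic callable in place of <it>T</it>, and the requirement that the caller supplies the <it>t</it> critical value (for example from a standard <it>t</it> table) are assumptions of this sketch and are not part of the method as published.</p>
            <p>
```python
from statistics import mean
from typing import Callable, List, Sequence, Tuple


def jackknife(genes: Sequence[float],
              statistic: Callable[[List[float]], float]) -> Tuple[float, float]:
    """Return (jackknife estimate, approximate variance) of `statistic`.

    `statistic` is a hypothetical stand-in for T: any function that can be
    evaluated on the full gene list and on the list with one gene removed.
    """
    items = list(genes)
    n = len(items)
    if n in (0, 1):
        raise ValueError("at least two genes are required")

    # Full estimate computed from all n genes.
    full = statistic(items)

    # Partial estimates: recompute the statistic with gene j left out.
    partial = [statistic(items[:j] + items[j + 1:]) for j in range(n)]

    # Pseudovalues: n * full - (n - 1) * partial_j.
    pseudo = [n * full - (n - 1) * p for p in partial]

    # Jackknife estimate: the average of the pseudovalues.
    estimate = mean(pseudo)

    # Approximate variance: sum of squared deviations over n * (n - 1).
    variance = sum((v - estimate) ** 2 for v in pseudo) / (n * (n - 1))
    return estimate, variance


def confidence_interval(estimate: float, variance: float,
                        t_crit: float) -> Tuple[float, float]:
    """Return estimate -/+ t_crit * sqrt(variance).

    `t_crit` is the upper alpha/2 critical value of the t distribution
    with n - 1 degrees of freedom, supplied by the caller.
    """
    half_width = t_crit * variance ** 0.5
    return estimate - half_width, estimate + half_width
```
            </p>
            <p>A quick sanity check of such a sketch is to pass the arithmetic mean as the statistic: the pseudovalues then reduce to the observations themselves, so the jackknife estimate equals the sample mean and the approximate variance equals the usual sample variance divided by <it>n</it>.</p>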
         </sec>
      </sec>
      <sec>
         <st>
            <p>Competing interests</p>
         </st>
         <p>The author declares that they have no competing interests.</p>
      </sec>
      <sec>
         <st>
            <p>Author's contributions</p>
         </st>
         <p>DZ conceived and designed the method, analyzed data and drafted the manuscript.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>DZ is supported by Research Start-up Grants from the University of New Orleans and Research Institute for Children of Children's Hospital New Orleans.</p>
            <p>This article has been published as part of <it>BMC Bioinformatics </it>Volume 10 Supplement 1, 2009: Proceedings of The Seventh Asia Pacific Bioinformatics Conference (APBC) 2009. The full contents of the supplement are available online at <url>http://www.biomedcentral.com/1471-2105/10?issue=S1</url></p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Cluster analysis and display of genome-wide expression patterns</p>
            </title>
            <aug>
               <au>
                  <snm>Eisen</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Spellman</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Botstein</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1998</pubdate>
            <volume>95</volume>
            <fpage>14863</fpage>
            <lpage>14868</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">33923</pubid>
                  <pubid idtype="pmpid" link="fulltext">9843931</pubid>
                  <pubid idtype="doi">10.1073/pnas.95.25.14863</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Interpreting patterns of gene expression with self-organizing maps: Methods and application to hematopoietic differentiation</p>
            </title>
            <aug>
               <au>
                  <snm>Tamayo</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Slonim</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Mesirov</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Zhu</snm>
                  <fnm>Q</fnm>
               </au>
               <au>
                  <snm>Kitareewan</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Dmitrovsky</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Lander</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Golub</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1999</pubdate>
            <volume>96</volume>
            <issue>6</issue>
            <fpage>2907</fpage>
            <lpage>2912</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">15868</pubid>
                  <pubid idtype="pmpid" link="fulltext">10077610</pubid>
                  <pubid idtype="doi">10.1073/pnas.96.6.2907</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Network constrained clustering for gene microarray data</p>
            </title>
            <aug>
               <au>
                  <snm>Zhu</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Hero</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Cheng</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Khanna</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Swaroop</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2005</pubdate>
            <volume>21</volume>
            <issue>21</issue>
            <fpage>4014</fpage>
            <lpage>4020</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/bti655</pubid>
                  <pubid idtype="pmpid" link="fulltext">16141248</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>'Gene shaving' as a method for identifying distinct sets of genes with similar expression patterns</p>
            </title>
            <aug>
               <au>
                  <snm>Hastie</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Tibshirani</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Eisen</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Alizadeh</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Levy</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Staudt</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Chan</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Botstein</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Genome Biology</source>
            <pubdate>2000</pubdate>
            <volume>1</volume>
            <issue>2</issue>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">15015</pubid>
                  <pubid idtype="pmpid" link="fulltext">11178228</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Model-based clustering and data transformations for gene expression data</p>
            </title>
            <aug>
               <au>
                  <snm>Yeung</snm>
                  <fnm>K</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2001</pubdate>
            <volume>17</volume>
            <issue>10</issue>
            <fpage>977</fpage>
            <lpage>987</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1093/bioinformatics/17.10.977</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Exploring the conditional coregulation of yeast gene expression through fuzzy k-means clustering</p>
            </title>
            <aug>
               <au>
                  <snm>Gasch</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Eisen</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Genome Biology</source>
            <pubdate>2002</pubdate>
            <volume>3</volume>
            <issue>11</issue>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">133443</pubid>
                  <pubid idtype="pmpid" link="fulltext">12429058</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Clustering microarray gene expression data using weighted Chinese restaurant process</p>
            </title>
            <aug>
               <au>
                  <snm>Qin</snm>
                  <fnm>Z</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2006</pubdate>
            <volume>22</volume>
            <issue>16</issue>
            <fpage>1988</fpage>
            <lpage>1997</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btl284</pubid>
                  <pubid idtype="pmpid" link="fulltext">16766561</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Applications of gene shaving and mixture models to cluster microarray gene expression data</p>
            </title>
            <aug>
               <au>
                  <snm>Do</snm>
                  <fnm>K</fnm>
               </au>
            </aug>
            <source>Cancer Informatics</source>
            <pubdate>2007</pubdate>
            <volume>2</volume>
            <fpage>25</fpage>
            <lpage>43</lpage>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Singular value decomposition for genome-wide expression data processing and modeling</p>
            </title>
            <aug>
               <au>
                  <snm>Alter</snm>
                  <fnm>O</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2000</pubdate>
            <volume>97</volume>
            <issue>18</issue>
            <fpage>10101</fpage>
            <lpage>10106</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">27718</pubid>
                  <pubid idtype="pmpid" link="fulltext">10963673</pubid>
                  <pubid idtype="doi">10.1073/pnas.97.18.10101</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>SVDMAN &#8211; singular value decomposition analysis of microarray data</p>
            </title>
            <aug>
               <au>
                  <snm>Wall</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Dyck</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Brettin</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2001</pubdate>
            <volume>17</volume>
            <issue>6</issue>
            <fpage>566</fpage>
            <lpage>568</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/17.6.566</pubid>
                  <pubid idtype="pmpid" link="fulltext">11395437</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Disentangling information flow in the Ras-cAMP signaling network</p>
            </title>
            <aug>
               <au>
                  <snm>Carter</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Rupp</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Fink</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Galitski</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Genome Research</source>
            <pubdate>2006</pubdate>
            <volume>16</volume>
            <fpage>520</fpage>
            <lpage>526</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1457029</pubid>
                  <pubid idtype="pmpid" link="fulltext">16533914</pubid>
                  <pubid idtype="doi">10.1101/gr.4473506</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Plaid models for gene expression data</p>
            </title>
            <aug>
               <au>
                  <snm>Lazzeroni</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>Statistica Sinica</source>
            <pubdate>2002</pubdate>
            <volume>12</volume>
            <fpage>61</fpage>
            <lpage>86</lpage>
         </bibl>
         <bibl id="B13">
            <title>
               <p>MCM-test: a fuzzy-set-theory-based approach to differential analysis of gene pathways</p>
            </title>
            <aug>
               <au>
                  <snm>Liang</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Mandal</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Lu</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Kumar</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2008</pubdate>
            <volume>9</volume>
            <issue>Suppl 6</issue>
            <fpage>S16</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">2423439</pubid>
                  <pubid idtype="pmpid" link="fulltext">18541051</pubid>
                  <pubid idtype="doi">10.1186/1471-2105-9-S6-S16</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>PGC-1<it>&#945;</it>-responsive genes involved in oxidative phosphorylation are coordinately downregulated in human diabetes</p>
            </title>
            <aug>
               <au>
                  <snm>Mootha</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Lindgren</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Eriksson</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Subramanian</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Sihag</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Lehar</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Puigserver</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Carlsson</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Ridderstrale</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Laurila</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Houstis</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Daly</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Patterson</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Mesirov</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Golub</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Tamayo</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Spiegelman</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Lander</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Hirschhorn</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Altshuler</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Groop</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>Nature Genetics</source>
            <pubdate>2003</pubdate>
            <volume>34</volume>
            <issue>3</issue>
            <fpage>267</fpage>
            <lpage>273</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng1180</pubid>
                  <pubid idtype="pmpid" link="fulltext">12808457</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Discovering statistically significant pathways in expression profiling studies</p>
            </title>
            <aug>
               <au>
                  <snm>Tian</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Greenberg</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kong</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Altschuler</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Kohane</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Park</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2005</pubdate>
            <volume>102</volume>
            <issue>38</issue>
            <fpage>13544</fpage>
            <lpage>13549</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1200092</pubid>
                  <pubid idtype="pmpid" link="fulltext">16174746</pubid>
                  <pubid idtype="doi">10.1073/pnas.0506577102</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles</p>
            </title>
            <aug>
               <au>
                  <snm>Subramanian</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Tamayo</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Mootha</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Mukherjee</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Ebert</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Gillette</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Paulovich</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Pomeroy</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Golub</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Lander</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Mesirov</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2005</pubdate>
            <volume>102</volume>
            <issue>43</issue>
            <fpage>15545</fpage>
            <lpage>15550</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1239896</pubid>
                  <pubid idtype="pmpid" link="fulltext">16199517</pubid>
                  <pubid idtype="doi">10.1073/pnas.0506580102</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Periodic gene expression program of the fission yeast cell cycle</p>
            </title>
            <aug>
               <au>
                  <snm>Rustici</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Mata</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Kivinen</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Lio</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Penkett</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Burns</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Hayles</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Brazma</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Nurse</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Bahler</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Nature Genetics</source>
            <pubdate>2004</pubdate>
            <volume>36</volume>
            <issue>8</issue>
            <fpage>809</fpage>
            <lpage>817</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng1377</pubid>
                  <pubid idtype="pmpid" link="fulltext">15195092</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization</p>
            </title>
            <aug>
               <au>
                  <snm>Spellman</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Sherlock</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Iyer</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Anders</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Eisen</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Botstein</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Futcher</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Mol Biol Cell</source>
            <pubdate>1998</pubdate>
            <volume>9</volume>
            <issue>12</issue>
            <fpage>3273</fpage>
            <lpage>3297</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">25624</pubid>
                  <pubid idtype="pmpid" link="fulltext">9843569</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>PTC versus paired normal thyroid tissue</p>
            </title>
            <url>http://www.sanger.ac.uk/PostGenomics/S_pombe/</url>
         </bibl>
         <bibl id="B20">
            <title>
               <p>A Complex Oscillating Network of Signaling Genes Underlies the Mouse Segmentation Clock</p>
            </title>
            <aug>
               <au>
                  <snm>Dequeant</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Glynn</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Gaudenz</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Wahl</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Mushegian</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Pourquie</snm>
                  <fnm>O</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2006</pubdate>
            <volume>314</volume>
            <issue>5805</issue>
            <fpage>1595</fpage>
            <lpage>1598</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1133141</pubid>
                  <pubid idtype="pmpid" link="fulltext">17095659</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Detecting periodic patterns in unevenly spaced gene expression time series using Lomb-Scargle periodograms</p>
            </title>
            <aug>
               <au>
                  <snm>Glynn</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Mushegian</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2006</pubdate>
            <volume>22</volume>
            <fpage>310</fpage>
            <lpage>316</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/bti789</pubid>
                  <pubid idtype="pmpid" link="fulltext">16303799</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>A Complex Oscillating Network of Signaling Genes Underlies the Mouse Segmentation Clock</p>
            </title>
            <url>http://www.ebi.ac.uk/microarray-as/aer/#ae-browse/q=E-TABM-163[2]</url>
         </bibl>
         <bibl id="B23">
            <title>
               <p>Segal lab website</p>
            </title>
            <url>http://genie.weizmann.ac.il/genomicaweb/enrichment/genesets.jsp</url>
         </bibl>
         <bibl id="B24">
            <title>
               <p>An initial blueprint for myogenic differentiation</p>
            </title>
            <aug>
               <au>
                  <snm>Blais</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Tsikitis</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Acosta-Alvear</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Sharan</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Kluger</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Dynlacht</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Genes &amp; Development</source>
            <pubdate>2005</pubdate>
            <volume>19</volume>
            <issue>5</issue>
            <fpage>553</fpage>
            <lpage>569</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gad.1281105</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Global and gene-specific analyses show distinct roles for Myod and Myog at a common set of promoters</p>
            </title>
            <aug>
               <au>
                  <snm>Cao</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Kumar</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Penn</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Berkes</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Kooperberg</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Boyer</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Young</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Tapscott</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>The EMBO Journal</source>
            <pubdate>2006</pubdate>
            <volume>25</volume>
            <fpage>502</fpage>
            <lpage>511</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1383539</pubid>
                  <pubid idtype="pmpid" link="fulltext">16437161</pubid>
                  <pubid idtype="doi">10.1038/sj.emboj.7600958</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>Incorporating gene functions as priors in model-based clustering of microarray gene expression data</p>
            </title>
            <aug>
               <au>
                  <snm>Pan</snm>
                  <fnm>W</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2006</pubdate>
            <volume>22</volume>
            <issue>7</issue>
            <fpage>795</fpage>
            <lpage>801</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btl011</pubid>
                  <pubid idtype="pmpid" link="fulltext">16434443</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>A statistical method to incorporate biological knowledge for generating testable novel gene regulatory interactions from microarray experiments</p>
            </title>
            <aug>
               <au>
                  <snm>Larson</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Almasri</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Dai</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2007</pubdate>
            <volume>8</volume>
            <fpage>317</fpage>
         </bibl>
         <bibl id="B28">
            <title>
               <p>Penalized and weighted K-means for clustering with scattered objects and prior information in high-throughput biological data</p>
            </title>
            <aug>
               <au>
                  <snm>Tseng</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2007</pubdate>
            <volume>23</volume>
            <issue>17</issue>
            <fpage>2247</fpage>
            <lpage>2255</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btm320</pubid>
                  <pubid idtype="pmpid" link="fulltext">17597097</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>Pathway level analysis of gene expression using singular value decomposition</p>
            </title>
            <aug>
               <au>
                  <snm>Tomfohr</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Lu</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Kepler</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2005</pubdate>
            <volume>6</volume>
            <fpage>225</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1261155</pubid>
                  <pubid idtype="pmpid" link="fulltext">16156896</pubid>
                  <pubid idtype="doi">10.1186/1471-2105-6-225</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>Enrichment or depletion of a GO category within a class of genes: which test?</p>
            </title>
            <aug>
               <au>
                  <snm>Rivals</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Personnaz</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2006</pubdate>
            <volume>23</volume>
            <issue>4</issue>
            <fpage>401</fpage>
            <lpage>407</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btl633</pubid>
                  <pubid idtype="pmpid" link="fulltext">17182697</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <aug>
               <au>
                  <snm>Manly</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Randomization, Bootstrap and Monte Carlo Methods in Biology</source>
            <publisher>Boca Raton: Chapman and Hall</publisher>
            <pubdate>1997</pubdate>
         </bibl>
      </refgrp>
   </bm>
</art>