<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2105-9-380</ui>
   <ji>1471-2105</ji>
   <fm>
      <dochead>Methodology article</dochead>
      <bibl>
         <title>
            <p>A Population Proportion approach for ranking differentially expressed genes</p>
         </title>
         <aug>
            <au id="A1" ca="yes">
               <snm>Gadgil</snm>
               <fnm>Mugdha</fnm>
               <insr iid="I1"/>
               <email>mc.gadgil@ncl.res.in</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Chemical Engineering and Process Development, National Chemical Laboratory, Pune, 411008, India </p>
            </ins>
         </insg>
         <source>BMC Bioinformatics</source>
         <issn>1471-2105</issn>
         <pubdate>2008</pubdate>
         <volume>9</volume>
         <issue>1</issue>
         <fpage>380</fpage>
         <url>http://www.biomedcentral.com/1471-2105/9/380</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">18801167</pubid>
               <pubid idtype="doi">10.1186/1471-2105-9-380</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>12</day>
               <month>6</month>
               <year>2008</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>18</day>
               <month>9</month>
               <year>2008</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>18</day>
               <month>9</month>
               <year>2008</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2008</year>
         <collab>Gadgil; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>DNA microarrays are used to investigate differences in gene expression between two or more classes of samples. Most currently used approaches compare mean expression levels between classes and are not geared to find genes whose expression is significantly different in only a subset of samples in a class. However, biological variability can lead to situations where key genes are differentially expressed in only a subset of samples. To facilitate the identification of such genes, a new method is reported.</p>
            </sec>
            <sec>
               <st>
                  <p>Methods</p>
               </st>
               <p>The key difference between the Population Proportion Ranking Method (PPRM) presented here and almost all other methods currently in use lies in the quantification of variability. PPRM quantifies variability in terms of inter-sample ratios. It can be used to calculate the relative merit of differentially expressed genes that show a specified difference in expression level between at least some samples in the two classes and, at the same time, have less than a specified variability within each class.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>PPRM is tested on simulated data and on three publicly available cancer data sets. It is compared to the t test, PPST, COPA, OS, ORT and MOST using the simulated data. Under the conditions tested, it performs as well as or better than the other methods under low intra-class variability, and better than the t test, PPST, COPA and OS when a gene is differentially expressed in only a subset of samples. It performs better than ORT and MOST in recognizing non-differentially expressed genes with high variability in expression levels across all samples. For biological data, the success of the identified predictor genes in correctly classifying an independent sample is reported.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>DNA microarrays are used to monitor the expression level of thousands of genes simultaneously, and are extensively used in various areas of biological research <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr></abbrgrp>. The reader is referred to Schena <abbrgrp><abbr bid="B5">5</abbr></abbrgrp> and Bowtell and Sambrook <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> for a detailed introduction to microarray technology. A biological problem which is being increasingly addressed through the use of microarray assays is the identification of differences in gene expression between two or more classes of samples e.g. between disease and normal tissue <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr><abbr bid="B14">14</abbr><abbr bid="B15">15</abbr><abbr bid="B16">16</abbr><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr></abbrgrp>. The methods for identifying differentially expressed genes vary greatly <abbrgrp><abbr bid="B19">19</abbr><abbr bid="B20">20</abbr><abbr bid="B21">21</abbr><abbr bid="B22">22</abbr><abbr bid="B23">23</abbr><abbr bid="B24">24</abbr><abbr bid="B25">25</abbr><abbr bid="B26">26</abbr><abbr bid="B27">27</abbr></abbrgrp>, but all have a goal of identifying genes with a significant difference in expression level between samples in the two classes. A simple method to analyze such data is to compare the sample means of the expression level of each gene in the two classes to obtain a 'fold-change' <abbrgrp><abbr bid="B28">28</abbr></abbrgrp> in the expression level of the gene between the two classes. However, fold change calculations fail to account for variability in expression levels between samples within a class. 
As aptly pointed out by Simon <it>et al </it><abbrgrp><abbr bid="B29">29</abbr></abbrgrp>, "some twofold average effects represent statistically significant differences and some do not". Statistical methods such as the t-test <abbrgrp><abbr bid="B30">30</abbr><abbr bid="B31">31</abbr></abbrgrp> and ANOVA <abbrgrp><abbr bid="B32">32</abbr><abbr bid="B33">33</abbr><abbr bid="B34">34</abbr></abbrgrp> are used to assess the significance of differential expression by incorporating data on variability between samples. Many alternative approaches for incorporating data on variability have also been developed <abbrgrp><abbr bid="B19">19</abbr><abbr bid="B20">20</abbr><abbr bid="B21">21</abbr><abbr bid="B26">26</abbr><abbr bid="B27">27</abbr><abbr bid="B35">35</abbr></abbrgrp>.</p>
         <p>Unlike replicate <it>in vitro </it>data, which are expected to have extremely low intra-class variability under ideal conditions, the expression level of a gene can vary significantly among samples obtained from different individuals in one class due to biological variation <abbrgrp><abbr bid="B36">36</abbr></abbrgrp>. Also, clinically similar phenotypes can be caused by different molecular mechanisms <abbrgrp><abbr bid="B37">37</abbr></abbrgrp>. Genes which are differentially expressed in only a subset of samples in a class can be important in such cases <abbrgrp><abbr bid="B38">38</abbr><abbr bid="B39">39</abbr><abbr bid="B40">40</abbr></abbrgrp>. Most analysis methods compare the means of intra-class expression levels and are not likely to find genes whose expression is significantly different in only a subset of samples in a class, or which have high intra-class variability.</p>
         <p>A few approaches have previously been proposed to identify such genes <abbrgrp><abbr bid="B38">38</abbr><abbr bid="B39">39</abbr><abbr bid="B41">41</abbr><abbr bid="B42">42</abbr><abbr bid="B43">43</abbr></abbrgrp>. One of these, proposed by Lyons-Weiler <it>et al </it><abbrgrp><abbr bid="B39">39</abbr></abbrgrp>, is the Permutation Percentile Separability Test (PPST). This test identifies genes for which a statistically significant number of samples in group A exhibit expression intensities beyond a particular percentile of the observed expression intensities of that gene in group B. In another approach, Bijlani <it>et al </it><abbrgrp><abbr bid="B38">38</abbr></abbrgrp> compare the expression level of a gene in every sample in one class to the mean of the expression level in the other class. The proposed application of this method is to select genes which can be used for class distinction. Tomlins <it>et al </it><abbrgrp><abbr bid="B42">42</abbr></abbrgrp>, Tibshirani <it>et al </it><abbrgrp><abbr bid="B41">41</abbr></abbrgrp>, Wu <it>et al </it><abbrgrp><abbr bid="B43">43</abbr></abbrgrp> and Lian <it>et al </it><abbrgrp><abbr bid="B44">44</abbr></abbrgrp> use variations of a transformation of gene expression values based on the sample median and median absolute deviation in the Cancer Outlier Profile Analysis (COPA), Outlier Sums (OS), Outlier Robust <it>t</it>-statistics (ORT) and Maximum Ordered Subset <it>t</it>-statistics (MOST) methods, respectively. The performance of COPA and OS has been shown to deteriorate as the number of outliers increases <abbrgrp><abbr bid="B43">43</abbr></abbrgrp>.</p>
         <p>All the methods listed above except PPST use some normalized form of the algebraic difference between expression levels as a measure of heterogeneity to identify 'outliers'. These methods might not be suitable for cases where a subset of samples in a class is responsible for significantly increasing the variability in the class and is spread over a large range. Consider the following hypothetical example: a group of 10 samples has expression levels of a gene of [50, 50, 75, 80, 100, 120, 120, 300, 500, 700]. If an outlier is defined as a value more than one interquartile range above the third quartile, as used by some researchers <abbrgrp><abbr bid="B43">43</abbr></abbrgrp>, only one sample (700) is identified as an outlier. However, a closer look at the data indicates that the last three samples are responsible for the increased variability in the class. This motivated the need to explore alternative ways to quantify variability.</p>
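         <p>The outlier rule in the example above can be checked with a minimal Python sketch. It assumes, for illustration only, that the quartiles are taken as the medians of the lower and upper halves of the sorted data; the helper name quartiles_by_halves is hypothetical, and other quartile conventions can give different thresholds for a sample this small.</p>

```python
from statistics import median

# Hypothetical expression levels from the example in the text.
levels = [50, 50, 75, 80, 100, 120, 120, 300, 500, 700]


def quartiles_by_halves(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumption: this quartile convention is chosen for illustration;
    other conventions give different quartiles for small samples.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]    # smallest half of the values
    upper = ordered[-half:]   # largest half of the values
    return median(lower), median(upper)


q1, q3 = quartiles_by_halves(levels)
threshold = q3 + (q3 - q1)  # third quartile plus one interquartile range
outliers = [x for x in levels if x > threshold]
print(q1, q3, threshold, outliers)
```

         <p>Under this convention the threshold is 525, so only the value 700 is flagged, even though 300 and 500 also lie well above the bulk of the data.</p>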
         <p>This paper presents a Population Proportion Ranking Method (henceforth referred to as PPRM) to qualitatively rank differentially expressed genes. This method uses inter-sample ratios to quantify variability in expression levels. To my knowledge, this is the first reported method using this approach. The method allows the user to pre-define the required magnitude of difference in expression level of a gene between samples in the two classes and the allowable level of intra-class variability, and has the ability to identify genes which might be differentially expressed in only a subset of the samples in a class and have high variability within a class. The basic steps in the method are outlined in Figure <figr fid="F1">1</figr>. Briefly, the inter-class variability is quantified by calculating the ratio of the expression level of a gene in a sample in class T (Treated) to its expression level in a sample in class N (Normal), for all possible combinations of samples in the two classes (referred to henceforth as inter-class ratios). Depending on the desired relative difference between the classes to identify a gene as differentially expressed, an inter-class ratio cutoff is chosen. The higher the inter-class ratio cutoff, the greater the required difference between classes. The fraction of these inter-class ratios that exceed the inter-class ratio cutoff is then calculated (f<sub>TN</sub>). A higher value of f<sub>TN </sub>implies that a larger proportion of sample pairs have the required difference between the two classes.</p>
         <fig id="F1">
            <title>
               <p>Figure 1</p>
            </title>
            <caption>
               <p>Summary of the Population Proportion Ranking Method</p>
            </caption>
            <text>
               <p><b>Summary of the Population Proportion Ranking Method</b>. The inter-class variability is quantified by calculating the inter-class ratio of expression level of a sample in class T to its expression level in a sample in class N, for all possible combinations of samples in the two classes. Depending on the desired relative difference between the classes to identify a gene as differentially expressed, an inter-class ratio cutoff is chosen. The fraction of inter-class ratios calculated above, which are greater than this inter-class ratio cutoff, is calculated (f<sub>TN</sub>). Intra-class variability for a class is similarly quantified by calculating the intra-class ratios of expression level of a sample in the class to its expression level in every other sample in the same class. Analogous to the inter-class ratio cutoff, an intra-class ratio cutoff is chosen based on acceptable level of variability within a class. The fraction of intra-class ratios calculated above which are greater than the cutoff is calculated (f<sub>TT</sub>, f<sub>NN</sub>). Genes in which f<sub>TT </sub>and/or f<sub>NN </sub>fraction is significantly smaller than f<sub>TN </sub>are ranked based on an established statistical method of comparing population proportions.</p>
            </text>
            <graphic file="1471-2105-9-380-1"/>
         </fig>
         <p>Intra-class variability for a class is similarly quantified by calculating the ratios of the expression level of a gene in a sample in the class to its expression level in every other sample in the same class (referred to henceforth as intra-class ratios). Analogous to the inter-class ratio cutoff, an intra-class ratio cutoff is chosen based on the acceptable level of variability within a class. The fraction of these intra-class ratios that exceed the cutoff is calculated (f<sub>TT </sub>and f<sub>NN</sub>). Genes in which these fractions are significantly smaller than f<sub>TN </sub>are ranked based on an established statistical method of comparing population proportions <abbrgrp><abbr bid="B45">45</abbr></abbrgrp>.</p>
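         <p>As a rough illustration of the three fractions, the following Python sketch computes f<sub>TN</sub>, f<sub>TT </sub>and f<sub>NN </sub>for a single gene. The expression values, the cutoffs of 3 and 1.5, and the helper name fraction_beyond are assumptions made for the example. A ratio is counted when it is greater than the cutoff or smaller than the reciprocal of the cutoff, following the definition given with the method description in the Results section. The statistical comparison of proportions used for ranking is not sketched here.</p>

```python
from itertools import combinations, product


def fraction_beyond(ratios, cutoff):
    """Fraction of ratios greater than cutoff or smaller than 1/cutoff."""
    ratios = list(ratios)
    hits = sum(1 for r in ratios if r > cutoff or 1.0 / cutoff > r)
    return hits / len(ratios)


# Hypothetical expression levels of one gene (illustrative values only).
treated = [100.0, 110.0, 400.0, 450.0]
normal = [90.0, 100.0, 105.0, 95.0]

inter_cutoff = 3.0  # assumed inter-class ratio cutoff
intra_cutoff = 1.5  # assumed intra-class ratio cutoff

# Inter-class ratios use every (T sample, N sample) pair;
# intra-class ratios use every unordered pair within a class.
f_tn = fraction_beyond((t / n for t, n in product(treated, normal)), inter_cutoff)
f_tt = fraction_beyond((a / b for a, b in combinations(treated, 2)), intra_cutoff)
f_nn = fraction_beyond((a / b for a, b in combinations(normal, 2)), intra_cutoff)
print(f_tn, f_tt, f_nn)
```

         <p>In this made-up gene, two of the four treated samples are elevated, so half of the inter-class ratios pass the cutoff, while class N shows no intra-class ratio beyond its cutoff.</p>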
         <p>Simulated data sets in which the truly differentially expressed genes are known are used to test the ability of PPRM to identify differentially expressed genes. The performance of PPRM is compared to the t test, PPST <abbrgrp><abbr bid="B39">39</abbr></abbrgrp>, COPA <abbrgrp><abbr bid="B42">42</abbr></abbrgrp>, OS <abbrgrp><abbr bid="B41">41</abbr></abbrgrp>, ORT <abbrgrp><abbr bid="B43">43</abbr></abbrgrp> and MOST <abbrgrp><abbr bid="B44">44</abbr></abbrgrp> for the simulated data, and is found to be comparable or better under the conditions tested. Thus, PPRM could be a valuable addition to the repertoire of existing methods for detecting genes differentially expressed in a subset of samples in a class. However, simulated data sets do not necessarily mimic the variability in real biological data sets. Hence, this method is also applied to three publicly available cancer data sets to identify differentially expressed genes.</p>
         <p>Since there is no gold standard of true differentially expressed genes in an experimental study, the method is validated on real-world data by using the differentially expressed genes it identifies as predictors and testing their ability to successfully classify independent sample(s). This approach was also used by Jeffery <it>et al </it>to evaluate lists of identified differentially expressed genes <abbrgrp><abbr bid="B46">46</abbr></abbrgrp>. The method proposed in this paper is tested on three publicly available cancer data sets: leukemia <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>, colon cancer <abbrgrp><abbr bid="B47">47</abbr></abbrgrp> and prostate cancer <abbrgrp><abbr bid="B48">48</abbr></abbrgrp>. In the case of the leukemia data set, an independent sample set is available to test whether the top differentially expressed genes identified can correctly classify independent samples. For the other two data sets, leave-one-out cross-validation (LOOCV) is implemented to test the accuracy of classification.</p>
         <p>The particular method of choice for identifying differentially expressed genes depends on the biological question, and PPRM provides an additional tool to rank genes complying with a given set of constraints.</p>
      </sec>
      <sec>
         <st>
            <p>Results</p>
         </st>
         <p>In this section, the Population Proportion Ranking Method (PPRM) is described, followed by a discussion of the assumptions used in PPRM and the results of testing this method on simulated and experimental data.</p>
         <sec>
            <st>
               <p>Population Proportion Ranking Method</p>
            </st>
            <p>Let the number of samples in class T (for 'Treated') be m<sub>T </sub>and the number of samples in class N (for 'Normal') be m<sub>N</sub>. T<sub>i</sub>, for i = 1 to m<sub>T</sub>, are the expression levels of a gene in the m<sub>T </sub>samples of class T and N<sub>j</sub>, for j = 1 to m<sub>N</sub>, are the expression levels of the gene in the m<sub>N </sub>samples of class N.</p>
            <p>The inter-class variability is quantified using ratio <inline-formula><m:math name="1471-2105-9-380-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mtext>R</m:mtext><m:mrow><m:mtext>TN</m:mtext></m:mrow><m:mrow><m:mtext>i</m:mtext><m:mo>,</m:mo><m:mtext>j</m:mtext></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaeeOuai1aa0baaSqaaiabbsfaujabb6eaobqaaiabbMgaPjabcYcaSiabbQgaQbaaaaa@3313@</m:annotation></m:semantics></m:math></inline-formula>, defined below:</p>
            <p>
               <display-formula>
                  <m:math name="1471-2105-9-380-i2" xmlns:m="http://www.w3.org/1998/Math/MathML">
                     <m:semantics>
                        <m:mrow>
                           <m:msubsup>
                              <m:mtext>R</m:mtext>
                              <m:mrow>
                                 <m:mtext>TN</m:mtext>
                              </m:mrow>
                              <m:mrow>
                                 <m:mtext>i</m:mtext>
                                 <m:mo>,</m:mo>
                                 <m:mtext>j</m:mtext>
                              </m:mrow>
                           </m:msubsup>
                           <m:mo>=</m:mo>
                           <m:mfrac>
                              <m:mrow>
                                 <m:msub>
                                    <m:mtext>T</m:mtext>
                                    <m:mtext>i</m:mtext>
                                 </m:msub>
                              </m:mrow>
                              <m:mrow>
                                 <m:msub>
                                    <m:mtext>N</m:mtext>
                                    <m:mtext>j</m:mtext>
                                 </m:msub>
                              </m:mrow>
                           </m:mfrac>
                           <m:mi>f</m:mi>
                           <m:mi>o</m:mi>
                           <m:mi>r</m:mi>
                           <m:mtext>&#160;</m:mtext>
                           <m:mi>i</m:mi>
                           <m:mo>=</m:mo>
                           <m:mn>1</m:mn>
                           <m:mo>:</m:mo>
                           <m:msub>
                              <m:mi>m</m:mi>
                              <m:mi>T</m:mi>
                           </m:msub>
                           <m:mo>,</m:mo>
                           <m:mi>j</m:mi>
                           <m:mo>=</m:mo>
                           <m:mn>1</m:mn>
                           <m:mo>:</m:mo>
                           <m:msub>
                              <m:mi>m</m:mi>
                              <m:mi>N</m:mi>
                           </m:msub>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaeeOuai1aa0baaSqaaiabbsfaujabb6eaobqaaiabbMgaPjabcYcaSiabbQgaQbaakiabg2da9KqbaoaalaaabaGaeeivaq1aaSbaaeaacqqGPbqAaeqaaaqaaiabb6eaonaaBaaabaGaeeOAaOgabeaaaaGccqWGMbGzcqWGVbWBcqWGYbGCcqqGGaaicqWGPbqAcqGH9aqpcqaIXaqmcqGG6aGocqWGTbqBdaWgaaWcbaGaemivaqfabeaakiabcYcaSiabdQgaQjabg2da9iabigdaXiabcQda6iabd2gaTnaaBaaaleaacqWGobGtaeqaaaaa@4E4B@</m:annotation>
                     </m:semantics>
                  </m:math>
               </display-formula>
            </p>
            <p>The intra-class variability is quantified using ratios <inline-formula><m:math name="1471-2105-9-380-i3" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mtext>R</m:mtext><m:mrow><m:mtext>TT</m:mtext></m:mrow><m:mrow><m:mtext>i</m:mtext><m:mo>,</m:mo><m:mtext>k</m:mtext></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaeeOuai1aa0baaSqaaiabbsfaujabbsfaubqaaiabbMgaPjabcYcaSiabbUgaRbaaaaa@3321@</m:annotation></m:semantics></m:math></inline-formula> and <inline-formula><m:math name="1471-2105-9-380-i4" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mtext>R</m:mtext><m:mrow><m:mtext>NN</m:mtext></m:mrow><m:mrow><m:mtext>j</m:mtext><m:mo>,</m:mo><m:mtext>l</m:mtext></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaeeOuai1aa0baaSqaaiabb6eaojabb6eaobqaaiabbQgaQjabcYcaSiabbgdaXaaaaaa@3297@</m:annotation></m:semantics></m:math></inline-formula> defined below:</p>
            <p>
               <display-formula>
                  <m:math name="1471-2105-9-380-i5" xmlns:m="http://www.w3.org/1998/Math/MathML">
                     <m:semantics>
                        <m:mrow>
                           <m:msubsup>
                              <m:mtext>R</m:mtext>
                              <m:mrow>
                                 <m:mtext>TT</m:mtext>
                              </m:mrow>
                              <m:mrow>
                                 <m:mi>i</m:mi>
                                 <m:mo>,</m:mo>
                                 <m:mi>k</m:mi>
                              </m:mrow>
                           </m:msubsup>
                           <m:mo>=</m:mo>
                           <m:mfrac>
                              <m:mrow>
                                 <m:msub>
                                    <m:mtext>T</m:mtext>
                                    <m:mtext>i</m:mtext>
                                 </m:msub>
                              </m:mrow>
                              <m:mrow>
                                 <m:msub>
                                    <m:mtext>T</m:mtext>
                                    <m:mtext>k</m:mtext>
                                 </m:msub>
                              </m:mrow>
                           </m:mfrac>
                           <m:mi>f</m:mi>
                           <m:mi>o</m:mi>
                           <m:mi>r</m:mi>
                           <m:mtext>&#160;</m:mtext>
                           <m:mi>i</m:mi>
                           <m:mo>=</m:mo>
                           <m:mn>1</m:mn>
                           <m:mo>:</m:mo>
                           <m:msub>
                              <m:mi>m</m:mi>
                              <m:mi>T</m:mi>
                           </m:msub>
                           <m:mo>,</m:mo>
                           <m:mi>k</m:mi>
                           <m:mo>=</m:mo>
                           <m:mi>i</m:mi>
                           <m:mo>+</m:mo>
                           <m:mn>1</m:mn>
                           <m:mo>:</m:mo>
                           <m:msub>
                              <m:mi>m</m:mi>
                              <m:mi>T</m:mi>
                           </m:msub>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaeeOuai1aa0baaSqaaiabbsfaujabbsfaubqaaiabdMgaPjabcYcaSiabdUgaRbaakiabg2da9KqbaoaalaaabaGaeeivaq1aaSbaaeaacqqGPbqAaeqaaaqaaiabbsfaunaaBaaabaGaee4AaSgabeaaaaGccqWGMbGzcqWGVbWBcqWGYbGCcqqGGaaicqWGPbqAcqGH9aqpcqaIXaqmcqGG6aGocqWGTbqBdaWgaaWcbaGaemivaqfabeaakiabcYcaSiabdUgaRjabg2da9iabdMgaPjabgUcaRiabigdaXiabcQda6iabd2gaTnaaBaaaleaacqWGubavaeqaaaaa@50B6@</m:annotation>
                     </m:semantics>
                  </m:math>
               </display-formula>
            </p>
            <p>and</p>
            <p>
               <display-formula>
                  <m:math name="1471-2105-9-380-i6" xmlns:m="http://www.w3.org/1998/Math/MathML">
                     <m:semantics>
                        <m:mrow>
                           <m:msubsup>
                              <m:mtext>R</m:mtext>
                              <m:mrow>
                                 <m:mtext>NN</m:mtext>
                              </m:mrow>
                              <m:mrow>
                                 <m:mtext>j</m:mtext>
                                 <m:mo>,</m:mo>
                                 <m:mi>l</m:mi>
                              </m:mrow>
                           </m:msubsup>
                           <m:mo>=</m:mo>
                           <m:mfrac>
                              <m:mrow>
                                 <m:msub>
                                    <m:mtext>N</m:mtext>
                                    <m:mtext>j</m:mtext>
                                 </m:msub>
                              </m:mrow>
                              <m:mrow>
                                 <m:msub>
                                    <m:mtext>N</m:mtext>
                                    <m:mtext>l</m:mtext>
                                 </m:msub>
                              </m:mrow>
                           </m:mfrac>
                           <m:mi>f</m:mi>
                           <m:mi>o</m:mi>
                           <m:mi>r</m:mi>
                           <m:mtext>&#160;</m:mtext>
                           <m:mi>j</m:mi>
                           <m:mo>=</m:mo>
                           <m:mn>1</m:mn>
                           <m:mo>:</m:mo>
                           <m:msub>
                              <m:mi>m</m:mi>
                              <m:mi>N</m:mi>
                           </m:msub>
                           <m:mo>,</m:mo>
                           <m:mi>l</m:mi>
                           <m:mo>=</m:mo>
                           <m:mi>j</m:mi>
                           <m:mo>+</m:mo>
                           <m:mn>1</m:mn>
                           <m:mo>:</m:mo>
                           <m:msub>
                              <m:mi>m</m:mi>
                              <m:mi>N</m:mi>
                           </m:msub>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaeeOuai1aa0baaSqaaiabb6eaojabb6eaobqaaiabbQgaQjabcYcaSiabigdaXaaakiabg2da9KqbaoaalaaabaGaeeOta40aaSbaaeaacqqGPbqAaeqaaaqaaiabb6eaonaaBaaabaGaeeymaedabeaaaaGccqWGMbGzcqWGVbWBcqWGYbGCcqqGGaaicqWGQbGAcqGH9aqpcqaIXaqmcqGG6aGocqWGTbqBdaWgaaWcbaGaemOta4eabeaakiabcYcaSiabigdaXiabg2da9iabdQgaQjabgUcaRiabigdaXiabcQda6iabd2gaTnaaBaaaleaacqWGobGtaeqaaaaa@4F20@</m:annotation>
                     </m:semantics>
                  </m:math>
               </display-formula>
            </p>
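            <p>The ratio definitions above can be enumerated directly. The Python sketch below is only an illustration under assumed inputs: the expression values and the function names inter_class_ratios and intra_class_ratios are hypothetical. It forms one inter-class ratio for every pairing of a T sample with an N sample, and one intra-class ratio for every unordered pair of samples within a class.</p>

```python
from itertools import combinations, product


def inter_class_ratios(t_levels, n_levels):
    """T_i / N_j for every pairing of a T sample with an N sample."""
    return [t / n for t, n in product(t_levels, n_levels)]


def intra_class_ratios(levels):
    """x_i / x_k for every unordered pair of samples with k > i."""
    return [a / b for a, b in combinations(levels, 2)]


# Hypothetical expression levels of one gene (illustrative values only).
T = [200.0, 300.0, 600.0]  # m_T = 3 samples in class T
N = [100.0, 150.0]         # m_N = 2 samples in class N

r_tn = inter_class_ratios(T, N)
r_tt = intra_class_ratios(T)
r_nn = intra_class_ratios(N)

# Number of ratios: m_T * m_N inter-class, m(m - 1)/2 within each class.
print(len(r_tn), len(r_tt), len(r_nn))
```

            <p>With three T samples and two N samples this gives six inter-class ratios, three intra-class ratios for class T and one for class N, so the number of ratios, and hence the resolution of the fractions derived from them, grows quickly with the number of samples.</p>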
            <p>A ratio-cutoff is chosen based on biological knowledge of the magnitude of difference in expression level required between groups (C<sub>TN</sub>) and the amount of variability that is acceptable within groups (C<sub>TT </sub>and C<sub>NN</sub>). For example, an inter-class ratio cutoff of 3 implies that there should be at least a 3-fold difference in expression between a sample in class T and a sample in class N for the gene to be identified as differentially expressed for that pair of samples. An intra-class ratio cutoff of 1.5 means that the maximum acceptable difference in expression between any two samples in a class is 1.5-fold. Increasing C<sub>TN </sub>will lead to identification of genes which have a larger magnitude of difference between the two classes, while changing the intra-class ratio cutoffs (C<sub>TT </sub>and C<sub>NN</sub>) allows the user to change the magnitude of variability acceptable within a given class. Since increasing C<sub>TN </sub>or decreasing C<sub>TT </sub>or C<sub>NN </sub>leads to a decrease in the number of genes identified as differentially expressed, these parameters can be used to identify a tractable number of differentially expressed genes of a certain nature for further analysis.</p>
            <p>To identify differentially expressed genes, the fraction of the inter-class ratios <inline-formula><m:math name="1471-2105-9-380-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mtext>R</m:mtext><m:mrow><m:mtext>TN</m:mtext></m:mrow><m:mrow><m:mtext>i</m:mtext><m:mo>,</m:mo><m:mtext>j</m:mtext></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaeeOuai1aa0baaSqaaiabbsfaujabb6eaobqaaiabbMgaPjabcYcaSiabbQgaQbaaaaa@3313@</m:annotation></m:semantics></m:math></inline-formula> which are either greater than the ratio-cutoff C<sub>TN </sub>or smaller than 1/C<sub>TN </sub>is calculated as f<sub>TN</sub>. Similarly the fraction of intra-class ratios <inline-formula><m:math name="1471-2105-9-380-i7" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mtext>R</m:mtext><m:mrow><m:mtext>TT</m:mtext></m:mrow><m:mrow><m:mi>i</m:mi><m:mo>,</m:mo><m:mi>k</m:mi></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaeeOuai1aa0baaSqaaiabbsfaujabbsfaubqaaiabdMgaPjabcYcaSiabdUgaRbaaaaa@3325@</m:annotation></m:semantics></m:math></inline-formula> and <inline-formula><m:math name="1471-2105-9-380-i4" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mtext>R</m:mtext><m:mrow><m:mtext>NN</m:mtext></m:mrow><m:mrow><m:mtext>j</m:mtext><m:mo>,</m:mo><m:mtext>1</m:mtext></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaeeOuai1aa0baaSqaaiabb6eaojabb6eaobqaaiabbQgaQjabcYcaSiabbgdaXaaaaaa@3297@</m:annotation></m:semantics></m:math></inline-formula> which are either greater than the ratio-cutoff C<sub>TT </sub>and C<sub>NN </sub>respectively or smaller than 1/C<sub>TT </sub>and 1/C<sub>NN </sub>respectively are calculated as f<sub>TT </sub>and f<sub>NN</sub>.</p>
            <p>Thus,</p>
            <p>
               <display-formula>
                  <m:math name="1471-2105-9-380-i8" xmlns:m="http://www.w3.org/1998/Math/MathML">
                     <m:semantics>
                        <m:mrow>
                           <m:msub>
                              <m:mtext>f</m:mtext>
                              <m:mrow>
                                 <m:mtext>TN</m:mtext>
                              </m:mrow>
                           </m:msub>
                           <m:mo>=</m:mo>
                           <m:mfrac>
                              <m:mrow>
                                 <m:mrow>
                                    <m:mo>(</m:mo>
                                    <m:mtable columnalign="left">
                                       <m:mtr>
                                          <m:mtd>
                                             <m:msub>
                                                <m:mtext>Number&#160;of&#160;inter&#160;group&#160;ratios&#160;greater&#160;than&#160;C</m:mtext>
                                                <m:mrow>
                                                   <m:mtext>TN</m:mtext>
                                                </m:mrow>
                                             </m:msub>
                                          </m:mtd>
                                       </m:mtr>
                                       <m:mtr>
                                          <m:mtd>
                                             <m:mo>+</m:mo>
                                             <m:mtext>Number&#160;of&#160;inter&#160;group&#160;ratios&#160;smaller&#160;than&#160;</m:mtext>
                                             <m:mn>1</m:mn>
                                             <m:msub>
                                                <m:mtext>/C</m:mtext>
                                                <m:mrow>
                                                   <m:mtext>TN</m:mtext>
                                                </m:mrow>
                                             </m:msub>
                                          </m:mtd>
                                       </m:mtr>
                                    </m:mtable>
                                    <m:mo>)</m:mo>
                                 </m:mrow>
                              </m:mrow>
                              <m:mrow>
                                 <m:msub>
                                    <m:mtext>m</m:mtext>
                                    <m:mtext>T</m:mtext>
                                 </m:msub>
                                 <m:mo>&#8727;</m:mo>
                                 <m:msub>
                                    <m:mtext>m</m:mtext>
                                    <m:mtext>N</m:mtext>
                                 </m:msub>
                              </m:mrow>
                           </m:mfrac>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaeeOzay2aaSbaaSqaaiabbsfaujabb6eaobqabaGccqGH9aqpjuaGdaWcaaqaamaabmaaeaqabeaacqqGobGtcqqG1bqDcqqGTbqBcqqGIbGycqqGLbqzcqqGYbGCcqqGGaaicqqGVbWBcqqGMbGzcqqGGaaicqqGPbqAcqqGUbGBcqqG0baDcqqGLbqzcqqGYbGCcqqGGaaicqqGNbWzcqqGYbGCcqqGVbWBcqqG1bqDcqqGWbaCcqqGGaaicqqGYbGCcqqGHbqycqqG0baDcqqGPbqAcqqGVbWBcqqGZbWCcqqGGaaicqqGNbWzcqqGYbGCcqqGLbqzcqqGHbqycqqG0baDcqqGLbqzcqqGYbGCcqqGGaaicqqG0baDcqqGObaAcqqGHbqycqqGUbGBcqqGGaaicqqGdbWqdaWgaaqaaiabbsfaujabb6eaobqabaaabaGaey4kaSIaeeOta4KaeeyDauNaeeyBa0MaeeOyaiMaeeyzauMaeeOCaiNaeeiiaaIaee4Ba8MaeeOzayMaeeiiaaIaeeyAaKMaeeOBa4MaeeiDaqNaeeyzauMaeeOCaiNaeeiiaaIaee4zaCMaeeOCaiNaee4Ba8MaeeyDauNaeeiCaaNaeeiiaaIaeeOCaiNaeeyyaeMaeeiDaqNaeeyAaKMaee4Ba8Maee4CamNaeeiiaaIaee4CamNaeeyBa0MaeeyyaeMaeeiBaWMaeeiBaWMaeeyzauMaeeOCaiNaeeiiaaIaeeiDaqNaeeiAaGMaeeyyaeMaeeOBa4MaeeiiaaIaeGymaeJaee4la8Iaee4qam0aaSbaaeaacqqGubavcqqGobGtaeqaaaaacaGLOaGaayzkaaaabaGaeeyBa02aaSbaaeaacqqGubavaeqaaiabgEHiQiabb2gaTnaaBaaabaGaeeOta4eabeaaaaaaaa@ADD6@</m:annotation>
                     </m:semantics>
                  </m:math>
               </display-formula>
            </p>
            <p>
               <display-formula>
                  <m:math name="1471-2105-9-380-i9" xmlns:m="http://www.w3.org/1998/Math/MathML">
                     <m:semantics>
                        <m:mrow>
                           <m:msub>
                              <m:mtext>f</m:mtext>
                              <m:mrow>
                                 <m:mtext>TT</m:mtext>
                              </m:mrow>
                           </m:msub>
                           <m:mo>=</m:mo>
                           <m:mfrac>
                              <m:mrow>
                                 <m:mrow>
                                    <m:mo>(</m:mo>
                                    <m:mtable columnalign="left">
                                       <m:mtr>
                                          <m:mtd>
                                             <m:msub>
                                                <m:mtext>Number&#160;of&#160;intra&#160;T&#160;group&#160;ratios&#160;greater&#160;than&#160;C</m:mtext>
                                                <m:mrow>
                                                   <m:mtext>TT</m:mtext>
                                                </m:mrow>
                                             </m:msub>
                                          </m:mtd>
                                       </m:mtr>
                                       <m:mtr>
                                          <m:mtd>
                                             <m:mo>+</m:mo>
                                             <m:mtext>Number&#160;of&#160;intra&#160;T&#160;group&#160;ratios&#160;smaller&#160;than&#160;</m:mtext>
                                             <m:mn>1</m:mn>
                                             <m:msub>
                                                <m:mtext>/C</m:mtext>
                                                <m:mrow>
                                                   <m:mtext>TT</m:mtext>
                                                </m:mrow>
                                             </m:msub>
                                          </m:mtd>
                                       </m:mtr>
                                    </m:mtable>
                                    <m:mo>)</m:mo>
                                 </m:mrow>
                              </m:mrow>
                              <m:mrow>
                                 <m:msub>
                                    <m:mtext>m</m:mtext>
                                    <m:mtext>T</m:mtext>
                                 </m:msub>
                                 <m:mo>&#8727;</m:mo>
                                 <m:mo stretchy="false">(</m:mo>
                                 <m:msub>
                                    <m:mtext>m</m:mtext>
                                    <m:mtext>T</m:mtext>
                                 </m:msub>
                                 <m:mo>&#8722;</m:mo>
                                 <m:mn>1</m:mn>
                                 <m:mo stretchy="false">)</m:mo>
                                 <m:mo>/</m:mo>
                                 <m:mn>2</m:mn>
                              </m:mrow>
                           </m:mfrac>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaeeOzay2aaSbaaSqaaiabbsfaujabbsfaubqabaGccqGH9aqpjuaGdaWcaaqaamaabmaaeaqabeaacqqGobGtcqqG1bqDcqqGTbqBcqqGIbGycqqGLbqzcqqGYbGCcqqGGaaicqqGVbWBcqqGMbGzcqqGGaaicqqGPbqAcqqGUbGBcqqG0baDcqqGYbGCcqqGHbqycqqGGaaicqqGubavcqqGGaaicqqGNbWzcqqGYbGCcqqGVbWBcqqG1bqDcqqGWbaCcqqGGaaicqqGYbGCcqqGHbqycqqG0baDcqqGPbqAcqqGVbWBcqqGZbWCcqqGGaaicqqGNbWzcqqGYbGCcqqGLbqzcqqGHbqycqqG0baDcqqGLbqzcqqGYbGCcqqGGaaicqqG0baDcqqGObaAcqqGHbqycqqGUbGBcqqGGaaicqqGdbWqdaWgaaqaaiabbsfaujabbsfaubqabaaabaGaey4kaSIaeeOta4KaeeyDauNaeeyBa0MaeeOyaiMaeeyzauMaeeOCaiNaeeiiaaIaee4Ba8MaeeOzayMaeeiiaaIaeeyAaKMaeeOBa4MaeeiDaqNaeeOCaiNaeeyyaeMaeeiiaaIaeeivaqLaeeiiaaIaee4zaCMaeeOCaiNaee4Ba8MaeeyDauNaeeiCaaNaeeiiaaIaeeOCaiNaeeyyaeMaeeiDaqNaeeyAaKMaee4Ba8Maee4CamNaeeiiaaIaee4CamNaeeyBa0MaeeyyaeMaeeiBaWMaeeiBaWMaeeyzauMaeeOCaiNaeeiiaaIaeeiDaqNaeeiAaGMaeeyyaeMaeeOBa4MaeeiiaaIaeGymaeJaee4la8Iaee4qam0aaSbaaeaacqqGubavcqqGubavaeqaaaaacaGLOaGaayzkaaaabaGaeeyBa02aaSbaaeaacqqGubavaeqaaiabgEHiQiabcIcaOiabb2gaTnaaBaaabaGaeeivaqfabeaacqGHsislcqaIXaqmcqGGPaqkcqGGVaWlcqaIYaGmaaaaaa@B749@</m:annotation>
                     </m:semantics>
                  </m:math>
               </display-formula>
            </p>
            <p>
               <display-formula>
                  <m:math name="1471-2105-9-380-i10" xmlns:m="http://www.w3.org/1998/Math/MathML">
                     <m:semantics>
                        <m:mrow>
                           <m:msub>
                              <m:mtext>f</m:mtext>
                              <m:mrow>
                                 <m:mtext>NN</m:mtext>
                              </m:mrow>
                           </m:msub>
                           <m:mo>=</m:mo>
                           <m:mfrac>
                              <m:mrow>
                                 <m:mrow>
                                    <m:mo>(</m:mo>
                                    <m:mtable columnalign="left">
                                       <m:mtr>
                                          <m:mtd>
                                             <m:msub>
                                                <m:mtext>Number&#160;of&#160;intra&#160;N&#160;group&#160;ratios&#160;greater&#160;than&#160;C</m:mtext>
                                                <m:mrow>
                                                   <m:mtext>NN</m:mtext>
                                                </m:mrow>
                                             </m:msub>
                                          </m:mtd>
                                       </m:mtr>
                                       <m:mtr>
                                          <m:mtd>
                                             <m:mo>+</m:mo>
                                             <m:mtext>Number&#160;of&#160;intra&#160;N&#160;group&#160;ratios&#160;smaller&#160;than&#160;</m:mtext>
                                             <m:mn>1</m:mn>
                                             <m:msub>
                                                <m:mtext>/C</m:mtext>
                                                <m:mrow>
                                                   <m:mtext>NN</m:mtext>
                                                </m:mrow>
                                             </m:msub>
                                          </m:mtd>
                                       </m:mtr>
                                    </m:mtable>
                                    <m:mo>)</m:mo>
                                 </m:mrow>
                              </m:mrow>
                              <m:mrow>
                                 <m:msub>
                                    <m:mtext>m</m:mtext>
                                    <m:mtext>N</m:mtext>
                                 </m:msub>
                                 <m:mo>&#8727;</m:mo>
                                 <m:mo stretchy="false">(</m:mo>
                                 <m:msub>
                                    <m:mtext>m</m:mtext>
                                    <m:mtext>N</m:mtext>
                                 </m:msub>
                                 <m:mo>&#8722;</m:mo>
                                 <m:mn>1</m:mn>
                                 <m:mo stretchy="false">)</m:mo>
                                 <m:mo>/</m:mo>
                                 <m:mn>2</m:mn>
                              </m:mrow>
                           </m:mfrac>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaeeOzay2aaSbaaSqaaiabb6eaojabb6eaobqabaGccqGH9aqpjuaGdaWcaaqaamaabmaaeaqabeaacqqGobGtcqqG1bqDcqqGTbqBcqqGIbGycqqGLbqzcqqGYbGCcqqGGaaicqqGVbWBcqqGMbGzcqqGGaaicqqGPbqAcqqGUbGBcqqG0baDcqqGYbGCcqqGHbqycqqGGaaicqqGobGtcqqGGaaicqqGNbWzcqqGYbGCcqqGVbWBcqqG1bqDcqqGWbaCcqqGGaaicqqGYbGCcqqGHbqycqqG0baDcqqGPbqAcqqGVbWBcqqGZbWCcqqGGaaicqqGNbWzcqqGYbGCcqqGLbqzcqqGHbqycqqG0baDcqqGLbqzcqqGYbGCcqqGGaaicqqG0baDcqqGObaAcqqGHbqycqqGUbGBcqqGGaaicqqGdbWqdaWgaaqaaiabb6eaojabb6eaobqabaaabaGaey4kaSIaeeOta4KaeeyDauNaeeyBa0MaeeOyaiMaeeyzauMaeeOCaiNaeeiiaaIaee4Ba8MaeeOzayMaeeiiaaIaeeyAaKMaeeOBa4MaeeiDaqNaeeOCaiNaeeyyaeMaeeiiaaIaeeOta4KaeeiiaaIaee4zaCMaeeOCaiNaee4Ba8MaeeyDauNaeeiCaaNaeeiiaaIaeeOCaiNaeeyyaeMaeeiDaqNaeeyAaKMaee4Ba8Maee4CamNaeeiiaaIaee4CamNaeeyBa0MaeeyyaeMaeeiBaWMaeeiBaWMaeeyzauMaeeOCaiNaeeiiaaIaeeiDaqNaeeiAaGMaeeyyaeMaeeOBa4MaeeiiaaIaeGymaeJaee4la8Iaee4qam0aaSbaaeaacqqGobGtcqqGobGtaeqaaaaacaGLOaGaayzkaaaabaGaeeyBa02aaSbaaeaacqqGobGtaeqaaiabgEHiQiabcIcaOiabb2gaTnaaBaaabaGaeeOta4eabeaacqGHsislcqaIXaqmcqGGPaqkcqGGVaWlcqaIYaGmaaaaaa@B6D1@</m:annotation>
                     </m:semantics>
                  </m:math>
               </display-formula>
            </p>
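            <p>The three fractions above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the implementation used in this work: the function names fraction_exceeding and pp_fractions and the expression values are hypothetical, expression values are assumed to be positive, each class is assumed to contain at least two samples, and every cutoff is assumed to be greater than 1.</p>

```python
import math
from itertools import combinations, product


def fraction_exceeding(ratios, cutoff):
    """Fraction of ratios greater than cutoff or smaller than 1 / cutoff."""
    log_c = math.log(cutoff)
    hits = sum(1 for r in ratios if abs(math.log(r)) > log_c)
    return hits / len(ratios)


def pp_fractions(t_values, n_values, c_tn, c_tt, c_nn):
    """Return (f_TN, f_TT, f_NN) for one gene.

    t_values and n_values hold the positive expression values of the gene
    in the m_T samples of class T and the m_N samples of class N.
    """
    # m_T * m_N inter-class ratios, one per (T sample, N sample) pair.
    r_tn = [t / n for t, n in product(t_values, n_values)]
    # m_T * (m_T - 1) / 2 intra-class ratios, one per unordered pair.
    r_tt = [a / b for a, b in combinations(t_values, 2)]
    # m_N * (m_N - 1) / 2 intra-class ratios for class N.
    r_nn = [a / b for a, b in combinations(n_values, 2)]
    return (
        fraction_exceeding(r_tn, c_tn),
        fraction_exceeding(r_tt, c_tt),
        fraction_exceeding(r_nn, c_nn),
    )


# Hypothetical gene with one low-expressing sample in class T.
t = [500.0, 900.0, 1000.0]
n = [200.0, 220.0, 250.0]
f_tn, f_tt, f_nn = pp_fractions(t, n, c_tn=3.0, c_tt=1.5, c_nn=1.5)
print(f_tn, f_tt, f_nn)
```

            <p>For this hypothetical gene, six of the nine inter-class ratios exceed C<sub>TN </sub>= 3 and two of the three intra-T ratios exceed C<sub>TT </sub>= 1.5, so f<sub>TN </sub>= 6/9 and f<sub>TT </sub>= 2/3, while f<sub>NN </sub>= 0.</p>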
            <p>Genes for which f<sub>TN </sub>is significantly greater than f<sub>TT </sub>and f<sub>NN </sub>are identified using a standard statistical test for comparing two population proportions <abbrgrp><abbr bid="B45">45</abbr></abbrgrp>. The null hypothesis tested is f<sub>TN </sub>&#8804; f<sub>TT </sub>and/or f<sub>TN </sub>&#8804; f<sub>NN</sub>. In biological terms, the null hypothesis is that the inter-class variability is less than or equal to the intra-class variability, where the allowable inter- and intra-class variability is quantified by the respective ratio cutoffs. The test statistics are calculated using the formulae <abbrgrp><abbr bid="B45">45</abbr></abbrgrp>:</p>
            <p>
               <display-formula>
                  <m:math name="1471-2105-9-380-i11" xmlns:m="http://www.w3.org/1998/Math/MathML">
                     <m:semantics>
                        <m:mrow>
                           <m:msub>
                              <m:mtext>z</m:mtext>
                              <m:mrow>
                                 <m:mtext>TT</m:mtext>
                              </m:mrow>
                           </m:msub>
                           <m:mo>=</m:mo>
                           <m:mfrac>
                              <m:mrow>
                                 <m:msub>
                                    <m:mtext>f</m:mtext>
                                    <m:mrow>
                                       <m:mtext>TN</m:mtext>
                                    </m:mrow>
                                 </m:msub>
                                 <m:mo>&#8722;</m:mo>
                                 <m:msub>
                                    <m:mtext>f</m:mtext>
                                    <m:mrow>
                                       <m:mtext>TT</m:mtext>
                                    </m:mrow>
                                 </m:msub>
                              </m:mrow>
                              <m:mrow>
                                 <m:msqrt>
                                    <m:mrow>
                                       <m:msub>
                                          <m:mtext>q</m:mtext>
                                          <m:mrow>
                                             <m:mtext>TT</m:mtext>
                                          </m:mrow>
                                       </m:msub>
                                       <m:mo stretchy="false">(</m:mo>
                                       <m:mn>1</m:mn>
                                       <m:mo>&#8722;</m:mo>
                                       <m:msub>
                                          <m:mtext>q</m:mtext>
                                          <m:mrow>
                                             <m:mtext>TT</m:mtext>
                                          </m:mrow>
                                       </m:msub>
                                       <m:mo stretchy="false">)</m:mo>
                                       <m:mo stretchy="false">(</m:mo>
                                       <m:mfrac>
                                          <m:mn>1</m:mn>
                                          <m:mrow>
                                             <m:msub>
                                                <m:mtext>m</m:mtext>
                                                <m:mtext>T</m:mtext>
                                             </m:msub>
                                             <m:msub>
                                                <m:mtext>m</m:mtext>
                                                <m:mtext>N</m:mtext>
                                             </m:msub>
                                          </m:mrow>
                                       </m:mfrac>
                                       <m:mo>+</m:mo>
                                       <m:mfrac>
                                          <m:mn>1</m:mn>
                                          <m:mrow>
                                             <m:msub>
                                                <m:mtext>m</m:mtext>
                                                <m:mtext>T</m:mtext>
                                             </m:msub>
                                             <m:mo stretchy="false">(</m:mo>
                                             <m:msub>
                                                <m:mtext>m</m:mtext>
                                                <m:mtext>T</m:mtext>
                                             </m:msub>
                                             <m:mo>&#8722;</m:mo>
                                             <m:mn>1</m:mn>
                                             <m:mo stretchy="false">)</m:mo>
                                             <m:mo>/</m:mo>
                                             <m:mn>2</m:mn>
                                          </m:mrow>
                                       </m:mfrac>
                                       <m:mo stretchy="false">)</m:mo>
                                    </m:mrow>
                                 </m:msqrt>
                              </m:mrow>
                           </m:mfrac>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaeeOEaO3aaSbaaSqaaiabbsfaujabbsfaubqabaGccqGH9aqpjuaGdaWcaaqaaiabbAgaMnaaBaaabaGaeeivaqLaeeOta4eabeaacqGHsislcqqGMbGzdaWgaaqaaiabbsfaujabbsfaubqabaaabaWaaOaaaeaacqqGXbqCdaWgaaqaaiabbsfaujabbsfaubqabaGaeiikaGIaeGymaeJaeyOeI0IaeeyCae3aaSbaaeaacqqGubavcqqGubavaeqaaiabcMcaPiabcIcaOmaalaaabaGaeGymaedabaGaeeyBa02aaSbaaeaacqqGubavaeqaaiabb2gaTnaaBaaabaGaeeOta4eabeaaaaGaey4kaSYaaSaaaeaacqaIXaqmaeaacqqGTbqBdaWgaaqaaiabbsfaubqabaGaeiikaGIaeeyBa02aaSbaaeaacqqGubavaeqaaiabgkHiTiabigdaXiabcMcaPiabc+caViabikdaYaaacqGGPaqkaeqaaaaaaaa@5A7D@</m:annotation>
                     </m:semantics>
                  </m:math>
               </display-formula>
            </p>
            <p>
               <display-formula>
                  <m:math name="1471-2105-9-380-i12" xmlns:m="http://www.w3.org/1998/Math/MathML">
                     <m:semantics>
                        <m:mrow>
                           <m:msub>
                              <m:mtext>z</m:mtext>
                              <m:mrow>
                                 <m:mtext>NN</m:mtext>
                              </m:mrow>
                           </m:msub>
                           <m:mo>=</m:mo>
                           <m:mfrac>
                              <m:mrow>
                                 <m:msub>
                                    <m:mtext>f</m:mtext>
                                    <m:mrow>
                                       <m:mtext>TN</m:mtext>
                                    </m:mrow>
                                 </m:msub>
                                 <m:mo>&#8722;</m:mo>
                                 <m:msub>
                                    <m:mtext>f</m:mtext>
                                    <m:mrow>
                                       <m:mtext>NN</m:mtext>
                                    </m:mrow>
                                 </m:msub>
                              </m:mrow>
                              <m:mrow>
                                 <m:msqrt>
                                    <m:mrow>
                                       <m:msub>
                                          <m:mtext>q</m:mtext>
                                          <m:mrow>
                                             <m:mtext>NN</m:mtext>
                                          </m:mrow>
                                       </m:msub>
                                       <m:mo stretchy="false">(</m:mo>
                                       <m:mn>1</m:mn>
                                       <m:mo>&#8722;</m:mo>
                                       <m:msub>
                                          <m:mtext>q</m:mtext>
                                          <m:mrow>
                                             <m:mtext>NN</m:mtext>
                                          </m:mrow>
                                       </m:msub>
                                       <m:mo stretchy="false">)</m:mo>
                                       <m:mo stretchy="false">(</m:mo>
                                       <m:mfrac>
                                          <m:mn>1</m:mn>
                                          <m:mrow>
                                             <m:msub>
                                                <m:mtext>m</m:mtext>
                                                <m:mtext>T</m:mtext>
                                             </m:msub>
                                             <m:msub>
                                                <m:mtext>m</m:mtext>
                                                <m:mtext>N</m:mtext>
                                             </m:msub>
                                          </m:mrow>
                                       </m:mfrac>
                                       <m:mo>+</m:mo>
                                       <m:mfrac>
                                          <m:mn>1</m:mn>
                                          <m:mrow>
                                             <m:msub>
                                                <m:mtext>m</m:mtext>
                                                <m:mtext>N</m:mtext>
                                             </m:msub>
                                             <m:mo stretchy="false">(</m:mo>
                                             <m:msub>
                                                <m:mtext>m</m:mtext>
                                                <m:mtext>N</m:mtext>
                                             </m:msub>
                                             <m:mo>&#8722;</m:mo>
                                             <m:mn>1</m:mn>
                                             <m:mo stretchy="false">)</m:mo>
                                             <m:mo>/</m:mo>
                                             <m:mn>2</m:mn>
                                          </m:mrow>
                                       </m:mfrac>
                                       <m:mo stretchy="false">)</m:mo>
                                    </m:mrow>
                                 </m:msqrt>
                              </m:mrow>
                           </m:mfrac>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaeeOEaO3aaSbaaSqaaiabb6eaojabb6eaobqabaGccqGH9aqpjuaGdaWcaaqaaiabbAgaMnaaBaaabaGaeeivaqLaeeOta4eabeaacqGHsislcqqGMbGzdaWgaaqaaiabb6eaojabb6eaobqabaaabaWaaOaaaeaacqqGXbqCdaWgaaqaaiabb6eaojabb6eaobqabaGaeiikaGIaeGymaeJaeyOeI0IaeeyCae3aaSbaaeaacqqGobGtcqqGobGtaeqaaiabcMcaPiabcIcaOmaalaaabaGaeGymaedabaGaeeyBa02aaSbaaeaacqqGubavaeqaaiabb2gaTnaaBaaabaGaeeOta4eabeaaaaGaey4kaSYaaSaaaeaacqaIXaqmaeaacqqGTbqBdaWgaaqaaiabb6eaobqabaGaeiikaGIaeeyBa02aaSbaaeaacqqGobGtaeqaaiabgkHiTiabigdaXiabcMcaPiabc+caViabikdaYaaacqGGPaqkaeqaaaaaaaa@5A05@</m:annotation>
                     </m:semantics>
                  </m:math>
               </display-formula>
            </p>
            <p>where,</p>
            <p>
               <display-formula>
                  <m:math name="1471-2105-9-380-i13" xmlns:m="http://www.w3.org/1998/Math/MathML">
                     <m:semantics>
                        <m:mrow>
                           <m:msub>
                              <m:mtext>q</m:mtext>
                              <m:mrow>
                                 <m:mtext>TT</m:mtext>
                              </m:mrow>
                           </m:msub>
                           <m:mo>=</m:mo>
                           <m:mfrac>
                              <m:mrow>
                                 <m:msub>
                                    <m:mtext>N</m:mtext>
                                    <m:mrow>
                                       <m:mtext>TN</m:mtext>
                                    </m:mrow>
                                 </m:msub>
                                 <m:mo>+</m:mo>
                                 <m:msub>
                                    <m:mtext>N</m:mtext>
                                    <m:mrow>
                                       <m:mtext>TT</m:mtext>
                                    </m:mrow>
                                 </m:msub>
                              </m:mrow>
                              <m:mrow>
                                 <m:msub>
                                    <m:mtext>m</m:mtext>
                                    <m:mtext>T</m:mtext>
                                 </m:msub>
                                 <m:msub>
                                    <m:mtext>m</m:mtext>
                                    <m:mtext>N</m:mtext>
                                 </m:msub>
                                 <m:mo>+</m:mo>
                                 <m:msub>
                                    <m:mtext>m</m:mtext>
                                    <m:mtext>T</m:mtext>
                                 </m:msub>
                                 <m:mo stretchy="false">(</m:mo>
                                 <m:msub>
                                    <m:mtext>m</m:mtext>
                                    <m:mtext>T</m:mtext>
                                 </m:msub>
                                 <m:mo>&#8722;</m:mo>
                                 <m:mn>1</m:mn>
                                 <m:mo stretchy="false">)</m:mo>
                                 <m:mo>/</m:mo>
                                 <m:mn>2</m:mn>
                              </m:mrow>
                           </m:mfrac>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaeeyCae3aaSbaaSqaaiabbsfaujabbsfaubqabaGccqGH9aqpjuaGdaWcaaqaaiabb6eaonaaBaaabaGaeeivaqLaeeOta4eabeaacqGHRaWkcqqGobGtdaWgaaqaaiabbsfaujabbsfaubqabaaabaGaeeyBa02aaSbaaeaacqqGubavaeqaaiabb2gaTnaaBaaabaGaeeOta4eabeaacqGHRaWkcqqGTbqBdaWgaaqaaiabbsfaubqabaGaeiikaGIaeeyBa02aaSbaaeaacqqGubavaeqaaiabgkHiTiabigdaXiabcMcaPiabc+caViabikdaYaaaaaa@4ADF@</m:annotation>
                     </m:semantics>
                  </m:math>
               </display-formula>
            </p>
            <p>
               <display-formula>
                  <m:math name="1471-2105-9-380-i14" xmlns:m="http://www.w3.org/1998/Math/MathML">
                     <m:semantics>
                        <m:mrow>
                           <m:msub>
                              <m:mtext>q</m:mtext>
                              <m:mrow>
                                 <m:mtext>NN</m:mtext>
                              </m:mrow>
                           </m:msub>
                           <m:mo>=</m:mo>
                           <m:mfrac>
                              <m:mrow>
                                 <m:msub>
                                    <m:mtext>N</m:mtext>
                                    <m:mrow>
                                       <m:mtext>TN</m:mtext>
                                    </m:mrow>
                                 </m:msub>
                                 <m:mo>+</m:mo>
                                 <m:msub>
                                    <m:mtext>N</m:mtext>
                                    <m:mrow>
                                       <m:mtext>NN</m:mtext>
                                    </m:mrow>
                                 </m:msub>
                              </m:mrow>
                              <m:mrow>
                                 <m:msub>
                                    <m:mtext>m</m:mtext>
                                    <m:mtext>T</m:mtext>
                                 </m:msub>
                                 <m:msub>
                                    <m:mtext>m</m:mtext>
                                    <m:mtext>N</m:mtext>
                                 </m:msub>
                                 <m:mo>+</m:mo>
                                 <m:msub>
                                    <m:mtext>m</m:mtext>
                                    <m:mtext>N</m:mtext>
                                 </m:msub>
                                 <m:mo stretchy="false">(</m:mo>
                                 <m:msub>
                                    <m:mtext>m</m:mtext>
                                    <m:mtext>N</m:mtext>
                                 </m:msub>
                                 <m:mo>&#8722;</m:mo>
                                 <m:mn>1</m:mn>
                                 <m:mo stretchy="false">)</m:mo>
                                 <m:mo>/</m:mo>
                                 <m:mn>2</m:mn>
                              </m:mrow>
                           </m:mfrac>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaeeyCae3aaSbaaSqaaiabb6eaojabb6eaobqabaGccqGH9aqpjuaGdaWcaaqaaiabb6eaonaaBaaabaGaeeivaqLaeeOta4eabeaacqGHRaWkcqqGobGtdaWgaaqaaiabb6eaojabb6eaobqabaaabaGaeeyBa02aaSbaaeaacqqGubavaeqaaiabb2gaTnaaBaaabaGaeeOta4eabeaacqGHRaWkcqqGTbqBdaWgaaqaaiabb6eaobqabaGaeiikaGIaeeyBa02aaSbaaeaacqqGobGtaeqaaiabgkHiTiabigdaXiabcMcaPiabc+caViabikdaYaaaaaa@4A97@</m:annotation>
                     </m:semantics>
                  </m:math>
               </display-formula>
            </p>
            <p>m<sub>T </sub>is the number of samples in class T</p>
            <p>m<sub>N </sub>is the number of samples in class N</p>
            <p>N<sub>TN </sub>is the number of ratios <inline-formula><m:math name="1471-2105-9-380-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mtext>R</m:mtext><m:mrow><m:mtext>TN</m:mtext></m:mrow><m:mrow><m:mtext>i</m:mtext><m:mo>,</m:mo><m:mtext>j</m:mtext></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaeeOuai1aa0baaSqaaiabbsfaujabb6eaobqaaiabbMgaPjabcYcaSiabbQgaQbaaaaa@3313@</m:annotation></m:semantics></m:math></inline-formula> which are greater than the ratio-cutoff C<sub>TN </sub>or smaller than 1/C<sub>TN</sub></p>
            <p>N<sub>TT </sub>is the number of ratios <inline-formula><m:math name="1471-2105-9-380-i3" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mtext>R</m:mtext><m:mrow><m:mtext>TT</m:mtext></m:mrow><m:mrow><m:mtext>i</m:mtext><m:mo>,</m:mo><m:mtext>k</m:mtext></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaeeOuai1aa0baaSqaaiabbsfaujabbsfaubqaaiabbMgaPjabcYcaSiabbUgaRbaaaaa@3321@</m:annotation></m:semantics></m:math></inline-formula> which are greater than the ratio-cutoff C<sub>TT </sub>or smaller than 1/C<sub>TT</sub></p>
            <p>N<sub>NN </sub>is the number of ratios <inline-formula><m:math name="1471-2105-9-380-i4" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mtext>R</m:mtext><m:mrow><m:mtext>NN</m:mtext></m:mrow><m:mrow><m:mtext>j</m:mtext><m:mo>,</m:mo><m:mtext>1</m:mtext></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaeeOuai1aa0baaSqaaiabb6eaojabb6eaobqaaiabbQgaQjabcYcaSiabbgdaXaaaaaa@3297@</m:annotation></m:semantics></m:math></inline-formula> which are greater than the ratio-cutoff C<sub>NN </sub>or smaller than 1/C<sub>NN</sub></p>
            <p>The significance values p<sub>TT </sub>and p<sub>NN </sub>corresponding to z<sub>TT </sub>and z<sub>NN </sub>are then calculated. These values indicate the significance of the difference between the proportion of inter-class ratios greater than the inter-class cutoff and the proportion of the respective intra-class ratios greater than the intra-class cutoff. A p-value cutoff (p<sub>cutoff</sub>) is chosen to identify genes for which this difference in proportions is significant. Thus, genes with p<sub>TT </sub>&lt; p<sub>cutoff </sub>and p<sub>NN </sub>&lt; p<sub>cutoff </sub>are selected as differentially expressed. Note that the test allows intra-class variability to be controlled in only one class or in both classes. For example, differentially expressed genes with low variability in N only can be ranked by using the condition p<sub>NN </sub>&lt; p<sub>cutoff </sub>alone, together with a relatively stringent value of C<sub>NN</sub>. In summary, the three parameters which need to be chosen to rank differentially expressed genes are listed in Table <tblr tid="T1">1</tblr>.</p>
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>Parameters used in the Population Proportion Ranking Method</p>
               </caption>
               <tblbdy cols="3">
                  <r>
                     <c ca="left">
                        <p>
                           <b>Parameter</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Description</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Remark</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="3">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>C<sub>TN</sub></p>
                     </c>
                     <c ca="left">
                        <p>Ratio cutoff for inter-class ratios</p>
                     </c>
                     <c ca="left">
                        <p>Chosen based on the required magnitude of difference in expression between the two classes</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>C<sub>TT</sub>, C<sub>NN</sub></p>
                     </c>
                     <c ca="left">
                        <p>Ratio cutoff for intra-class ratios</p>
                     </c>
                     <c ca="left">
                        <p>Chosen based on allowable heterogeneity in expression within a class</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>p<sub>cutoff</sub></p>
                     </c>
                     <c ca="left">
                        <p>Significance value cutoff for significance of difference between the proportions of inter-class and intra-class populations greater than respective ratio cutoffs</p>
                     </c>
                     <c ca="left">
                        <p>Chosen based on required stringency in difference between the proportions of inter-class and intra-class populations greater than respective ratio cutoffs</p>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
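            <p>As an illustration only, the calculation described above can be sketched for a single gene in Python. The sketch assumes that z<sub>TT </sub>and z<sub>NN </sub>take the standard pooled two-proportion form, with q<sub>TT </sub>and q<sub>NN </sub>as the pooled proportions, and that p<sub>TT </sub>and p<sub>NN </sub>are upper-tail probabilities of the standard normal distribution. The function names (count_beyond, pooled_z, upper_tail_p, pprm_gene) are hypothetical and are not taken from the published implementation.</p>

```python
import math
from itertools import combinations


def count_beyond(ratios, cutoff):
    """Count ratios greater than cutoff or smaller than 1/cutoff."""
    return sum(1 for r in ratios if r > cutoff or 1.0 / cutoff > r)


def pooled_z(inter_hits, n_inter, intra_hits, n_intra):
    """Pooled two-proportion z statistic (assumed form of z_TT and z_NN)."""
    q = (inter_hits + intra_hits) / (n_inter + n_intra)  # pooled proportion
    se = math.sqrt(q * (1.0 - q) * (1.0 / n_inter + 1.0 / n_intra))
    if se == 0.0:
        return 0.0  # both proportions are 0 or both are 1: no difference
    return (inter_hits / n_inter - intra_hits / n_intra) / se


def upper_tail_p(z):
    """P(Z > z) for a standard normal Z (assumed one-sided test)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def pprm_gene(x_t, x_n, c_tn, c_tt, c_nn):
    """Return the counts, z statistics and p-values for one gene.

    x_t and x_n hold positive expression values for classes T and N.
    """
    # All m_T * m_N inter-class ratios.
    r_tn = [a / b for a in x_t for b in x_n]
    # All m(m - 1)/2 intra-class ratios; the orientation of each pair does
    # not matter because the cutoff test is symmetric (C and 1/C).
    r_tt = [a / b for a, b in combinations(x_t, 2)]
    r_nn = [a / b for a, b in combinations(x_n, 2)]

    n_tn = count_beyond(r_tn, c_tn)
    n_tt = count_beyond(r_tt, c_tt)
    n_nn = count_beyond(r_nn, c_nn)

    z_tt = pooled_z(n_tn, len(r_tn), n_tt, len(r_tt))
    z_nn = pooled_z(n_tn, len(r_tn), n_nn, len(r_nn))
    return {
        "n_tn": n_tn, "n_tt": n_tt, "n_nn": n_nn,
        "z_tt": z_tt, "z_nn": z_nn,
        "p_tt": upper_tail_p(z_tt), "p_nn": upper_tail_p(z_nn),
    }
```

            <p>Under these assumptions, a gene would be selected when both returned p-values fall below p<sub>cutoff</sub>, or when only one of them does if variability is to be controlled in a single class.</p>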
            <sec>
               <st>
                  <p>Assumptions</p>
               </st>
               <p>The test assumes 1) random and independent selection of inter-class and intra-class ratios and 2) a large sample size of the inter-class and intra-class ratios, so that the sampling distributions of differences of proportions are approximately normal. Though the samples within each class are reasonably expected to be selected randomly and independently, the inter- and intra-group ratios are not all independent. Specifically, there are only (m<sub>T </sub>+ m<sub>N </sub>-1) independent inter-class ratios and (m<sub>T </sub>-1) or (m<sub>N </sub>-1) independent intra-class ratios. Hence the effective sample size is smaller than the number of ratios used, which leads to smaller reported significance values. However, in order to capture the true variability between all samples in a group or between groups, it is essential to use all inter-class and intra-class ratios. The reported significance values are therefore not exact and should only be used to assess the relative merit of genes, and not the actual distance between them.</p>
            </sec>
         </sec>
         <sec>
            <st>
               <p>Testing</p>
            </st>
            <p>PPRM is tested on 5 sets of simulated data representing various intra- and inter-class variability situations and compared to the t test, PPST, COPA, OS, ORT and MOST. PPST is implemented through the online implementation provided by Lyons-Weiler <it>et al</it>. <abbrgrp><abbr bid="B39">39</abbr></abbrgrp> available at <url>http://bioinformatics.upmc.edu/GE2/GEDA.html</url>. COPA, OS, ORT and MOST were implemented using the R code by Lian <abbrgrp><abbr bid="B44">44</abbr></abbrgrp> available at <url>http://www.ntu.edu.sg/home/henglian/most.htm</url>. PPRM is also tested on three publicly available cancer data sets and used to identify predictor genes that can be used for classification. The classification accuracy using predictors identified by the PPRM is comparable to other reported classification accuracies.</p>
         </sec>
         <sec>
            <st>
               <p>Simulated data</p>
            </st>
            <p>PPRM is tested on a simulated data set of 10000 genes measured in 20 samples belonging to two classes: 10 samples in class T and 10 samples in class N. Of the 10000 genes, 1000 were modeled as differentially expressed. Simulated data sets were generated from normal distributions using the random number generator function in Matlab (The Mathworks, Inc., Natick, MA, USA). To simulate the scenario where only a subset of samples within a class are differentially expressed, in cases 3, 4 and 5 it is assumed that ~30% of the samples for the 1000 genes show differential expression. Table <tblr tid="T2">2</tblr> indicates the parameters (mean and standard deviation) of the normal distributions that were used to simulate the data. Figure <figr fid="F2">2A</figr> shows a representative example of the distribution of expression levels across samples in the two classes for all 5 cases, using the parameters in Table <tblr tid="T2">2</tblr>. Figure <figr fid="F2">2B</figr> shows the distribution of inter-class and both intra-class ratios for all 5 cases. Data for non-differentially expressed genes are simulated using a mean of 100 and a standard deviation of 30 (not indicated in Table <tblr tid="T2">2</tblr>). The inter-class ratio cutoff is chosen equal to the ratio of mean expression level in the two classes. PPST, COPA, OS, ORT, MOST and the t test were also used to analyze the simulated data. For COPA, OS, ORT and MOST, p-values for each gene were calculated from the test statistics as the proportion of the 9000 genes (with an identical distribution in both classes) whose absolute test statistics are larger than that of the gene <abbrgrp><abbr bid="B43">43</abbr></abbrgrp>. A significance value cutoff of 0.01 is used for all methods.</p>
            <tbl id="T2">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p>Parameters used to generate simulated data for the 5 cases tested</p>
               </caption>
               <tblbdy cols="7">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c cspan="3" ca="center">
                        <p>
                           <b>Class T</b>
                        </p>
                     </c>
                     <c cspan="3" ca="center">
                        <p>
                           <b>Class N</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Number of Samples</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Mean</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Stdev*</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Number of Samples</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Mean</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Stdev*</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Case 1</p>
                     </c>
                     <c ca="center">
                        <p>10</p>
                     </c>
                     <c ca="center">
                        <p>250</p>
                     </c>
                     <c ca="center">
                        <p>50</p>
                     </c>
                     <c ca="center">
                        <p>10</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                     <c ca="center">
                        <p>30</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Case 2</p>
                     </c>
                     <c ca="center">
                        <p>10</p>
                     </c>
                     <c ca="center">
                        <p>250</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                     <c ca="center">
                        <p>10</p>
                     </c>
                     <c ca="center">
                        <p>200</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Case 3</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>900</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                     <c ca="center">
                        <p>10</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                     <c ca="center">
                        <p>30</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                     <c ca="center">
                        <p>50</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Case 4</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>400</p>
                     </c>
                     <c ca="center">
                        <p>50</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                     <c ca="center">
                        <p>30</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>300</p>
                     </c>
                     <c ca="center">
                        <p>50</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>130</p>
                     </c>
                     <c ca="center">
                        <p>30</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                     <c ca="center">
                        <p>30</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Case 5</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>900</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                     <c ca="center">
                        <p>30</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                     <c ca="center">
                        <p>50</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>400</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>* Standard deviation</p>
                  <p>Parameters for the normal distributions used are indicated only for the 1000 differentially expressed genes. Data for the 9000 non-differentially expressed genes are simulated using a mean of 100 and standard deviation of 30. Expression levels of samples for each case (for a representative example) are indicated in Figure 2A in the form of a heatmap.</p>
               </tblfn>
            </tbl>
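            <p>The simulation design in Table 2 can be sketched as follows. This is a minimal Python stand-in for the Matlab generator used in the study, not the original code: the names simulate_gene, simulate_case and CASE_3 are hypothetical, the seed is arbitrary, and the sketch does nothing about non-positive draws, since the text does not say how such values were handled.</p>

```python
import random


def simulate_gene(rng, groups):
    """Draw one gene's values for one class.

    groups is a list of (n_samples, mean, sd) tuples, one per sample subset.
    """
    values = []
    for n_samples, mean, sd in groups:
        values.extend(rng.gauss(mean, sd) for _ in range(n_samples))
    return values


def simulate_case(t_groups, n_groups, n_genes=10000, n_de=1000, seed=0):
    """Return (data_t, data_n): one row per gene, one column per sample.

    The first n_de genes use the Table 2 parameters; the remaining genes
    are drawn from the null distribution (mean 100, standard deviation 30).
    """
    rng = random.Random(seed)
    m_t = sum(group[0] for group in t_groups)
    m_n = sum(group[0] for group in n_groups)
    null_t = [(m_t, 100.0, 30.0)]
    null_n = [(m_n, 100.0, 30.0)]

    data_t, data_n = [], []
    for gene_index in range(n_genes):
        is_de = n_de > gene_index
        data_t.append(simulate_gene(rng, t_groups if is_de else null_t))
        data_n.append(simulate_gene(rng, n_groups if is_de else null_n))
    return data_t, data_n


# Case 3 of Table 2: 3 of the 10 class T samples are strongly elevated.
CASE_3 = {
    "t_groups": [(3, 900.0, 100.0), (7, 100.0, 50.0)],
    "n_groups": [(10, 100.0, 30.0)],
}
```

            <p>The other cases follow by changing the two group lists, for example two entries in the class N list for case 5.</p>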
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>Distribution of expression values and inter- and intra-class ratios for all 5 cases listed in Table 2 (for a representative example) (A) Heatmap of expression levels across samples in the two classes for all 5 cases, using the parameters in Table 2</p>
               </caption>
               <text>
                  <p><b>Distribution of expression values and inter- and intra-class ratios for all 5 cases listed in Table 2 (for a representative example) (A) Heatmap of expression levels across samples in the two classes for all 5 cases, using the parameters in Table 2. </b>Values above 900 are indicated by the maximum intensity. (B) Heatmap of absolute values of log<sub>2 </sub>transformed inter-class and intra-class ratios for all 5 cases. Values above 3 are indicated by the maximum intensity.</p>
               </text>
               <graphic file="1471-2105-9-380-2"/>
            </fig>
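            <p>The empirical p-values used for COPA, OS, ORT and MOST on the simulated data can be illustrated with a short Python sketch. It only restates the definition given above (the proportion of null genes whose absolute test statistic is larger than that of the gene in question); the function name empirical_p_values is hypothetical and the test statistics themselves are assumed to have been computed elsewhere.</p>

```python
from bisect import bisect_right


def empirical_p_values(stats, null_stats):
    """Empirical p-value for each statistic in stats.

    null_stats holds the test statistics of the genes known to have an
    identical distribution in both classes (9000 genes in the simulation).
    """
    null_sorted = sorted(abs(s) for s in null_stats)
    n_null = len(null_sorted)
    p_values = []
    for s in stats:
        # bisect_right counts the null values that are not larger than |s|,
        # so the remainder are strictly larger.
        n_larger = n_null - bisect_right(null_sorted, abs(s))
        p_values.append(n_larger / n_null)
    return p_values
```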
            <p>For all the methods, the following metrics were used to evaluate the performance of the method.</p>
            <p>
               <display-formula>
                  <m:math name="1471-2105-9-380-i15" xmlns:m="http://www.w3.org/1998/Math/MathML">
                     <m:semantics>
                        <m:mrow>
                           <m:mtext>Recall</m:mtext>
                           <m:mo>=</m:mo>
                           <m:mfrac>
                              <m:mrow>
                                 <m:mtext>True&#160;positives</m:mtext>
                              </m:mrow>
                              <m:mrow>
                                 <m:mtext>True&#160;positives</m:mtext>
                                 <m:mo>+</m:mo>
                                 <m:mtext>False&#160;negatives</m:mtext>
                              </m:mrow>
                           </m:mfrac>
                           <m:mo>&#215;</m:mo>
                           <m:mn>100</m:mn>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaeeOuaiLaeeyzauMaee4yamMaeeyyaeMaeeiBaWMaeeiBaWMaeyypa0tcfa4aaSaaaeaacqqGubavcqqGYbGCcqqG1bqDcqqGLbqzcqqGGaaicqqGWbaCcqqGVbWBcqqGZbWCcqqGPbqAcqqG0baDcqqGPbqAcqqG2bGDcqqGLbqzcqqGZbWCaeaacqqGubavcqqGYbGCcqqG1bqDcqqGLbqzcqqGGaaicqqGWbaCcqqGVbWBcqqGZbWCcqqGPbqAcqqG0baDcqqGPbqAcqqG2bGDcqqGLbqzcqqGZbWCcqGHRaWkcqqGgbGrcqqGHbqycqqGSbaBcqqGZbWCcqqGLbqzcqqGGaaicqqGUbGBcqqGLbqzcqqGNbWzcqqGHbqycqqG0baDcqqGPbqAcqqG2bGDcqqGLbqzcqqGZbWCaaGccqGHxdaTcqaIXaqmcqaIWaamcqaIWaamaaa@744D@</m:annotation>
                     </m:semantics>
                  </m:math>
               </display-formula>
            </p>
            <p>
               <display-formula>
                  <m:math name="1471-2105-9-380-i16" xmlns:m="http://www.w3.org/1998/Math/MathML">
                     <m:semantics>
                        <m:mrow>
                           <m:mtext>FPR</m:mtext>
                           <m:mo>=</m:mo>
                           <m:mfrac>
                              <m:mrow>
                                 <m:mtext>False&#160;positives</m:mtext>
                              </m:mrow>
                              <m:mrow>
                                 <m:mtext>False&#160;positives</m:mtext>
                                 <m:mo>+</m:mo>
                                 <m:mtext>True&#160;negatives</m:mtext>
                              </m:mrow>
                           </m:mfrac>
                           <m:mo>&#215;</m:mo>
                           <m:mn>100</m:mn>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaeeOrayKaeeiuaaLaeeOuaiLaeyypa0tcfa4aaSaaaeaacqqGgbGrcqqGHbqycqqGSbaBcqqGZbWCcqqGLbqzcqqGGaaicqqGWbaCcqqGVbWBcqqGZbWCcqqGPbqAcqqG0baDcqqGPbqAcqqG2bGDcqqGLbqzcqqGZbWCaeaacqqGgbGrcqqGHbqycqqGSbaBcqqGZbWCcqqGLbqzcqqGGaaicqqGWbaCcqqGVbWBcqqGZbWCcqqGPbqAcqqG0baDcqqGPbqAcqqG2bGDcqqGLbqzcqqGZbWCcqGHRaWkcqqGubavcqqGYbGCcqqG1bqDcqqGLbqzcqqGGaaicqqGUbGBcqqGLbqzcqqGNbWzcqqGHbqycqqG0baDcqqGPbqAcqqG2bGDcqqGLbqzcqqGZbWCaaGccqGHxdaTcqaIXaqmcqaIWaamcqaIWaamaaa@70FF@</m:annotation>
                     </m:semantics>
                  </m:math>
               </display-formula>
            </p>
            <p>where:</p>
            <p>True positives = Number of truly differentially expressed genes identified</p>
            <p>False positives = Number of genes identified which are not differentially expressed</p>
            <p>False negatives = Number of truly differentially expressed genes not identified</p>
            <p>True negatives = Number of genes which are not differentially expressed, which are correctly not identified</p>
            <p>FPR = False positive rate</p>
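            <p>For concreteness, the two metrics can be computed from sets of gene identifiers as in the following Python sketch. The function name recall_and_fpr and its arguments are hypothetical; both values are returned as percentages, as in the formulas above.</p>

```python
def recall_and_fpr(selected, truly_de, n_genes):
    """Return (recall, fpr) in percent.

    selected: identifiers of genes called differentially expressed
    truly_de: identifiers of genes simulated as differentially expressed
    n_genes:  total number of genes analyzed
    """
    selected = set(selected)
    truly_de = set(truly_de)

    true_pos = len(selected.intersection(truly_de))
    false_pos = len(selected.difference(truly_de))
    false_neg = len(truly_de.difference(selected))
    true_neg = n_genes - true_pos - false_pos - false_neg

    recall = 100.0 * true_pos / (true_pos + false_neg)
    fpr = 100.0 * false_pos / (false_pos + true_neg)
    return recall, fpr
```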
            <p>In order to assess the effect of violating the assumption of independence, the distributions of z<sub>TT </sub>and z<sub>NN </sub>were analyzed for the simulated data. The mean and standard deviation of z<sub>NN </sub>for all 5 cases analyzed are shown in Figure <figr fid="F3">3</figr>. The average of the means for the 9000 non-differentially expressed genes across all 5 cases is 0.07, and that of the standard deviations is 1.3, while the corresponding values for all 10000 genes across all 5 cases are 0.4 and 1.8 respectively. The reader is reminded that, because the inter- and intra-group ratios are not all independent, the p-values calculated are not exact and are to be used only for the purpose of prioritizing and ranking genes. In all the following discussion, the p-value cutoff is used for selecting a subset of the top ranked differentially expressed genes. An alternate approach would be to select a fixed number of top ranking genes. However, in cases where more than one gene has the same significance value, selecting a fixed number of top ranking genes involves randomly disregarding some genes. To avoid this, the p-value cutoff approach is used.</p>
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>Mean and standard deviation of z<sub>NN </sub>for all 5 cases of simulated data indicated in Table 2</p>
               </caption>
               <text>
                  <p><b>Mean and standard deviation of z<sub>NN </sub>for all 5 cases of simulated data indicated in Table 2.</b> The mean and standard deviation for only the 9000 non-differentially expressed genes are indicated separately from the mean and standard deviation for all 10000 genes. Case 1: Solid bars, Case 2: Dotted fill, Case 3: Vertical lines, Case 4: Horizontal lines, Case 5: Diagonal lines.</p>
               </text>
               <graphic file="1471-2105-9-380-3"/>
            </fig>
            <p>An ideal method would have 100% Recall and a 0% False Positive Rate (FPR). Figure <figr fid="F4">4</figr> summarizes the Recall and FPR for all methods for the 5 cases described in Table <tblr tid="T2">2</tblr>. The inter-class ratio cutoff (C<sub>TN</sub>) is chosen based on the known ratio of the means of all samples in the two classes. The intra-class ratio cutoffs (C<sub>TT </sub>and C<sub>NN</sub>) are chosen to be equal to the inter-class cutoff in all cases, with exceptions as described below. The C<sub>TN</sub>, C<sub>TT </sub>and C<sub>NN </sub>values used for PPRM for all 5 cases are listed in Table <tblr tid="T3">3</tblr>. A significance value cutoff of 0.01 is used for all methods.</p>
            <tbl id="T3">
               <title>
                  <p>Table 3</p>
               </title>
               <caption>
                  <p>Inter-class and intra-class ratio cutoffs used in the analysis of simulated data using PPRM</p>
               </caption>
               <tblbdy cols="4">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <b>C<sub>TN</sub></b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>C<sub>TT</sub></b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>C<sub>NN</sub></b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="4">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Case 1</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Case 2</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Case 3</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Case 4</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Case 5</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
            <fig id="F4">
               <title>
                  <p>Figure 4</p>
               </title>
               <caption>
                  <p>Results on the analysis of simulated data using PPRM, t test, PPST, COPA, OS, ORT and MOST</p>
               </caption>
               <text>
                  <p><b>Results on the analysis of simulated data using PPRM, t test, PPST, COPA, OS, ORT and MOST.</b> (A) Percentage Recall for all 5 cases listed in Table 2. (B) Percentage FPR for all 5 cases listed in Table 2. PPRM: Blue, t test: Orange, PPST: Green, COPA: Brown, OS: Yellow, ORT: Red, MOST: Pink.</p>
               </text>
               <graphic file="1471-2105-9-380-4"/>
            </fig>
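            <p>As a minimal sketch of how Recall and FPR values of the kind summarized in Figure <figr fid="F4">4</figr> can be computed for one method and one case, the following Python fragment counts true and false positives among the genes called differentially expressed. The function name, gene identifiers and toy counts are hypothetical and are not taken from the simulation. When a case contains no truly differentially expressed genes, this sketch returns NaN for Recall, and only the FPR is informative.</p>
            <p><![CDATA[
```python
def recall_and_fpr(called, truly_de, all_genes):
    """Return (Recall %, FPR %) for a set of genes called differentially expressed.

    called    -- genes a method reports as differentially expressed
    truly_de  -- genes known to be differentially expressed (simulation truth)
    all_genes -- every gene that was tested
    """
    called, truly_de, all_genes = set(called), set(truly_de), set(all_genes)
    negatives = all_genes - truly_de                 # genes that are not truly DE
    true_pos = len(called.intersection(truly_de))    # correctly called genes
    false_pos = len(called.intersection(negatives))  # wrongly called genes
    recall = 100.0 * true_pos / len(truly_de) if truly_de else float("nan")
    fpr = 100.0 * false_pos / len(negatives) if negatives else float("nan")
    return recall, fpr


# Toy example (hypothetical): 10 genes, 4 truly DE, 3 of them found plus 1 false call.
all_genes = [f"g{i}" for i in range(10)]
truly_de = ["g0", "g1", "g2", "g3"]
called = ["g0", "g1", "g2", "g7"]
recall, fpr = recall_and_fpr(called, truly_de, all_genes)
print(f"Recall = {recall:.1f}%, FPR = {fpr:.1f}%")
```
]]></p>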
            <p>Case 1 represents differential expression with low variability among samples. As seen in Figure <figr fid="F2">2B</figr>, all intra-class ratios have small values while the inter-class ratios are higher. PPRM, t test, ORT and MOST identify most differentially expressed genes, with PPRM having the lowest FPR.</p>
            <p>Case 2 is an example of genes which do not have a significant difference in expression level in the two classes and have larger variability as compared to case 1. Here again, PPRM has the lowest FPR among all methods tested.</p>
            <p>Case 3 is an example of genes which have a low variability in one class, but very high variability in the other due to a subset of samples. Here, the intra-class ratios for class T are small, while those for class N are high (Figure <figr fid="F2">2B</figr>). In this case, COPA, OS, ORT and MOST have a 100% Recall. PPRM does not identify any differentially expressed gene when heterogeneity in both classes is controlled (i.e. both conditions p<sub>TT </sub>&lt; p<sub>cutoff </sub>and p<sub>NN </sub>&lt; p<sub>cutoff </sub>are used; data not shown). However, if heterogeneity in class T is allowed by using only the condition p<sub>NN </sub>&lt; p<sub>cutoff</sub>, PPRM has a 96% Recall and 1% FPR, which is similar to the other methods. This illustrates how PPRM can be applied to control heterogeneity in only one class.</p>
            <p>Case 4 is an example of genes which have moderate variability in one class and high variability in the other. It differs from case 3 in that the difference in expression level between the two classes is smaller (average 2-fold) than in case 3 (average 3-fold). Again, the t test, PPST and OS have a poor Recall. ORT and MOST have a Recall of 99% and 94%, respectively, with an FPR of 1%. PPRM does not identify any differentially expressed gene when heterogeneity in both classes is controlled (data not shown), but when heterogeneity in class T is allowed (p<sub>NN </sub>&lt; p<sub>cutoff </sub>is the only condition used), a 98% Recall is obtained, at the cost of a 6% FPR. There is thus a trade-off between identifying all truly differentially expressed genes and obtaining false positives. Increasing the stringency of the parameters (e.g. an increase in C<sub>TN</sub> or a decrease in p<sub>cutoff</sub>) can reduce FPR at the expense of Recall (data not shown).</p>
            <p>Case 5 is an example of a gene with high variability in both classes, which should ideally not be identified as differentially expressed. Here, there does not appear to be a significant difference in the distribution of inter-class and intra-class ratios, as seen in Figure <figr fid="F2">2B</figr>. PPRM has the lowest FPR at 0.02%, followed by the t test and PPST at 1%. COPA, OS, ORT and MOST have an FPR of 11%. (Note: when PPRM does not account for variability in class N, its FPR is 7%. This FPR decreases as the values of C<sub>TN </sub>and C<sub>TT </sub>are increased.)</p>
            <p>In summary, in cases where the heterogeneity in the sample population is low, as exemplified by Case 1, all tests except COPA and OS perform reasonably well in identifying true positives. The t test, PPST, COPA and OS fail to identify differentially expressed genes in most cases, whereas PPRM, ORT and MOST can identify most differentially expressed genes in all cases. However, although ORT and MOST give a lower FPR for Case 4, they give higher FPRs than PPRM in Cases 2 and 5, which represent non-differentially expressed genes.</p>
            <p>In the case of simulated data, the inter-class and intra-class ratios were chosen based on knowledge of expression levels of truly differentially expressed genes, which will clearly not be the case in real-world data. For real-world data, these parameters will instead be chosen based on the types of genes required. More than one set of parameters can be used for an analysis to obtain different groups of differentially expressed genes. For example, using low intra-class cutoffs allows the identification of differentially expressed genes with low intra-class variability, whereas using a higher value of one intra-class cutoff (C<sub>TT </sub>or C<sub>NN</sub>) also identifies genes with higher heterogeneity in that group (T or N, respectively).</p>
         </sec>
         <sec>
            <st>
               <p>Experimental Data</p>
            </st>
            <p>Variability in simulated data cannot mimic the heterogeneity in real biological data, and hence PPRM is also tested on the following three publicly available experimental data sets. Since there is no gold standard list of differentially expressed genes in real-world data, simply identifying differentially expressed genes in a data set is not adequate to test the method. Though the distinguishing feature of PPRM lies in its ability to identify differentially expressed genes with greater variability between samples in a class, the method is also able to identify differentially expressed genes with low variability within groups, depending on the choice of parameters used for the test. Hence, in analyzing real biological data, a relatively small number of 'predictor' genes is identified and their accuracy in predicting the class of an unknown sample is tested. This approach to validating new methods for the identification of differentially expressed genes has also been used by other researchers <abbrgrp><abbr bid="B38">38</abbr></abbrgrp>. The classification accuracy is expected to be similar to other reported values, but not necessarily better, since the primary goal of this report is not to identify genes for classification.</p>
            <p>In order to identify biomarkers, stringent conditions are used (i.e. higher values of inter-class ratio cut-off, lower value of intra-class ratio cutoff and lower values of cutoff of the p-value) to select a small number of genes with low heterogeneity in expression within a class. For the biological data sets used below, misclassification rates reported using some other methods are included for the sake of general comparison. For the leukemia data set, the independent data set available is used to test the prediction power of selected genes. For all other data sets, a LOOCV technique is used. To avoid bias in gene selection from the sample which is left out, the list of differentially expressed genes is calculated separately every time with the same parameters, and this list is used to predict the class of the sample that is left out. Classification is performed using Discriminant Analysis in Matlab (The Mathworks, Inc., Natick, MA, USA).</p>
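            <p>The leave-one-out procedure described above, in which the gene list is recomputed without the held-out sample in every run, can be sketched as follows. This is only an illustration under stated assumptions: the gene selector is a simple ratio-of-class-means filter standing in for PPRM, the classifier is a nearest-centroid rule standing in for the Discriminant Analysis used in this study, and all function names and the toy expression values are hypothetical. Expression values are assumed to be strictly positive.</p>
            <p><![CDATA[
```python
from statistics import mean


def select_genes(samples, labels, min_ratio):
    """Stand-in selector (NOT PPRM): keep genes whose ratio of class means
    is at least min_ratio in either direction."""
    class_t = [s for s, y in zip(samples, labels) if y == "T"]
    class_n = [s for s, y in zip(samples, labels) if y == "N"]
    chosen = []
    for g in range(len(samples[0])):
        mean_t = mean(s[g] for s in class_t)
        mean_n = mean(s[g] for s in class_n)
        if max(mean_t, mean_n) / min(mean_t, mean_n) >= min_ratio:
            chosen.append(g)
    return chosen


def nearest_centroid(train, labels, genes, query):
    """Stand-in classifier (not discriminant analysis): assign the class
    whose centroid over the selected genes is closest to the query sample."""
    best_label, best_dist = None, float("inf")
    for cls in sorted(set(labels)):
        members = [s for s, y in zip(train, labels) if y == cls]
        dist = sum((query[g] - mean(m[g] for m in members)) ** 2 for g in genes)
        if best_dist > dist:
            best_label, best_dist = cls, dist
    return best_label


def loocv_error(samples, labels, min_ratio=2.0):
    """Leave-one-out error rate with gene selection repeated in every run."""
    errors = 0
    for i in range(len(samples)):
        train = samples[:i] + samples[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        # The held-out sample never influences which genes are selected.
        genes = select_genes(train, train_labels, min_ratio)
        predicted = nearest_centroid(train, train_labels, genes, samples[i])
        errors += predicted != labels[i]
    return errors / len(samples)


# Toy data (hypothetical): rows are samples, columns are genes.
samples = [
    [8.0, 1.0, 5.0], [9.0, 1.2, 4.0], [10.0, 0.9, 6.0],   # class T
    [2.0, 1.1, 5.5], [2.5, 1.0, 4.5], [3.0, 0.8, 5.0],    # class N
]
labels = ["T", "T", "T", "N", "N", "N"]
error_rate = loocv_error(samples, labels)
print(f"LOOCV error rate: {error_rate:.2f}")
```
]]></p>
            <p>Selecting genes once on the full data set and only then cross-validating the classifier would let the held-out sample influence the gene list, which is the bias the per-run selection is meant to avoid.</p>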
            <sec>
               <st>
                  <p>Leukemia data</p>
               </st>
               <p>Gene expression profiles of two types of leukemia samples were derived from 47 patients with acute lymphoblastic leukemia (ALL) and 25 patients with acute myeloblastic leukemia by Golub <it>et al </it><abbrgrp><abbr bid="B18">18</abbr></abbrgrp>. Data is obtained from the Broad Institute website at <url>http://www.broad.mit.edu/cgi-bin/cancer/datasets.cgi</url>.</p>
               <p>The training data consists of gene expression data from 27 patients with acute lymphoblastic leukemia (ALL) and 11 patients with acute myeloblastic leukemia (AML), while the independent data set consists of 20 ALL samples and 14 AML samples. Genes for which fewer than 5 samples had a "Present" call were not used in the analysis. The values of the three parameters for PPRM are listed in Table <tblr tid="T4">4</tblr>. In the original publication by Golub <it>et al </it><abbrgrp><abbr bid="B18">18</abbr></abbrgrp>, the authors identified 50 genes as biomarkers based on their method of neighborhood analysis, and tested the use of these genes to predict the class of samples in the independent data set. They correctly classified all samples on which a prediction was made, 29 out of 34, declining to predict the other five. Using a support vector machine method, Furey <it>et al </it><abbrgrp><abbr bid="B49">49</abbr></abbrgrp> could correctly classify 30 to 32 out of the 34 samples. Using the parameters listed in Table <tblr tid="T4">4</tblr>, six differentially expressed genes were identified using PPRM. These genes were used as biomarkers to test the accuracy of class prediction for samples in the independent data set. Out of the 34 samples, 33 were accurately classified using the 6 genes identified by PPRM.</p>
               <tbl id="T4">
                  <title>
                     <p>Table 4</p>
                  </title>
                  <caption>
                     <p>Parameters used for the analysis of the three cancer data sets</p>
                  </caption>
                  <tblbdy cols="4">
                     <r>
                        <c ca="center">
                           <p>
                              <b>Parameter</b>
                           </p>
                        </c>
                        <c ca="center">
                           <p>
                              <b>Leukemia</b>
                           </p>
                        </c>
                        <c ca="center">
                           <p>
                              <b>Prostate Cancer</b>
                           </p>
                        </c>
                        <c ca="center">
                           <p>
                              <b>Colon cancer</b>
                           </p>
                        </c>
                     </r>
                     <r>
                        <c cspan="4">
                           <hr/>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>C<sub>TN</sub></p>
                        </c>
                        <c ca="center">
                           <p>2</p>
                        </c>
                        <c ca="center">
                           <p>3.5</p>
                        </c>
                        <c ca="center">
                           <p>3</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>C<sub>TT</sub></p>
                        </c>
                        <c ca="center">
                           <p>1.5</p>
                        </c>
                        <c ca="center">
                           <p>2</p>
                        </c>
                        <c ca="center">
                           <p>3</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>C<sub>NN</sub></p>
                        </c>
                        <c ca="center">
                           <p>1.5</p>
                        </c>
                        <c ca="center">
                           <p>2</p>
                        </c>
                        <c ca="center">
                           <p>-</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>p<sub>cutoff</sub></p>
                        </c>
                        <c ca="center">
                           <p>0.0001</p>
                        </c>
                        <c ca="center">
                           <p>0.001</p>
                        </c>
                        <c ca="center">
                           <p>1e-10</p>
                        </c>
                     </r>
                  </tblbdy>
               </tbl>
            </sec>
            <sec>
               <st>
                  <p>Prostate cancer data</p>
               </st>
               <p>The prostate cancer data set generated by Singh <it>et al </it><abbrgrp><abbr bid="B48">48</abbr></abbrgrp> consists of 92 samples, 45 of which were non-tumor prostate samples and 47 of which were prostate tumor samples. The data set is publicly available and is obtained from the Broad Institute website <url>http://www.broad.mit.edu/cgi-bin/cancer/datasets.cgi</url>. Genes for which fewer than 20 samples had a "Present" call were not used in the analysis. A LOOCV technique is used for this data set. In the original paper, a 10% error rate in sample classification using LOOCV is obtained, while Dettling <it>et al </it><abbrgrp><abbr bid="B50">50</abbr></abbrgrp> reported misclassification rates of 5%&#8211;14% using supervised clustering. In this study, using the parameters listed in Table <tblr tid="T4">4</tblr>, an 8% error rate in sample classification using LOOCV is obtained. The number of biomarker genes identified in all LOOCV runs is between 9 and 18.</p>
            </sec>
            <sec>
               <st>
                  <p>Colon cancer data</p>
               </st>
               <p>The colon cancer data set generated by Alon <it>et al</it>. <abbrgrp><abbr bid="B47">47</abbr></abbrgrp> consists of 62 samples, 40 tumor samples and 22 normal controls. The gene expression data were downloaded from <url>http://microarray.princeton.edu/oncology/affydata/index.html</url>. LOOCV is also used for this data set. Other researchers have obtained misclassification rates (including unclassified samples) between 8% and 34% <abbrgrp><abbr bid="B50">50</abbr><abbr bid="B51">51</abbr><abbr bid="B52">52</abbr><abbr bid="B53">53</abbr><abbr bid="B54">54</abbr></abbrgrp> using various methods such as nearest neighbor classifiers, SVM, boosting, 'Minimum Redundancy- Maximum Relevancy', Bayes error filter for gene selection and supervised clustering.</p>
               <p>In this study, using the parameters listed in Table <tblr tid="T4">4</tblr>, a 16% error rate in sample classification using LOOCV is obtained. The number of biomarker genes identified in all LOOCV runs is between 7 and 13, with one exception where 23 genes were identified.</p>
            </sec>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Discussion</p>
         </st>
         <p>DNA microarray analysis is being increasingly used to identify differences between two or more classes, such as diseased and healthy tissue. Most methods used for the identification of differentially expressed genes between two classes identify genes where the variability between samples in a class is low. However, there can be significant variability among samples in a class due to differences between individual subjects and their environment <abbrgrp><abbr bid="B36">36</abbr></abbrgrp>. PPRM uses inter-sample ratios to quantify variability in expression. This method allows for the identification of genes where the user can define the allowable heterogeneity within one or both classes and the required difference in expression between samples in the two classes. Since not all inter-class and intra-class ratios used in this method are independent, the significance values calculated by PPRM are not exact and should be used only for ranking and prioritizing genes. The mean and standard deviation of the test statistic are reported for the simulated data sets to facilitate the assessment of the impact of violation of the assumptions for the sample size of 10 samples in each class (i.e. 100 inter-class ratios and 45 intra-class ratios for each class, out of which 19 and 9 respectively are independent).</p>
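         <p>The ratio counts quoted above follow from simple counting, which the short Python sketch below reproduces. The function name and dictionary keys are hypothetical. The counts of independent ratios rest on the following reasoning, stated here as an assumption rather than taken from the method description: on a log scale each ratio is a difference between two sample values, so all inter-class ratios can be generated from n<sub>T </sub>+ n<sub>N </sub>- 1 of them, and all intra-class ratios within a class of n samples from n - 1 of them.</p>
         <p><![CDATA[
```python
from math import comb


def ratio_counts(n_t, n_n):
    """Count pairwise ratios, and independent ratios, for class sizes n_t and n_n."""
    return {
        "inter_class": n_t * n_n,            # every T sample paired with every N sample
        "intra_class_T": comb(n_t, 2),       # unordered pairs within class T
        "intra_class_N": comb(n_n, 2),       # unordered pairs within class N
        "independent_inter": n_t + n_n - 1,  # assumption: log-ratios are differences
        "independent_intra_T": n_t - 1,
        "independent_intra_N": n_n - 1,
    }


# Sample size used for the simulated data: 10 samples in each class.
counts = ratio_counts(10, 10)
for name, value in counts.items():
    print(f"{name}: {value}")
```
]]></p>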
         <p>PPRM works as well as or better than all other methods tested in data sets where the heterogeneity in samples is low. In simulated cases tested where variability is high, ORT, MOST and PPRM successfully identify most differentially expressed genes. In addition to a high Recall, it is necessary for any method to minimize the number of false positives identified. Genes with high variability in expression levels among samples in both classes should not be identified as differentially expressed simply because the expression level in some samples in one class is different from the expression level of some samples in the other class. This is tested in cases 2 and 5 in the simulated data, where reassuringly a very low FPR of 0.2% and 0.02%, respectively, is obtained using PPRM. However, for these cases ORT and MOST consistently resulted in higher values of the test statistic for the 1000 non-differentially expressed genes, resulting in high FPRs. This is likely because these methods lack the additional constraint on relative difference that is available in PPRM.</p>
         <p>PPRM is also able to identify differentially expressed genes with low variability within groups, depending on the choice of parameters used for the test. Hence, it is possible to test it on publicly available cancer data sets by assessing the success of the genes identified in correctly classifying samples in the two groups. The classification accuracies obtained for the three publicly available cancer data sets used for testing are similar to those reported using other methods.</p>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>The Population Proportion Ranking Method (PPRM) presented here quantifies variability in terms of inter-sample ratios and allows for the identification of genes where the user can define the allowable heterogeneity within one or both classes and required difference in expression between samples in the two classes for ranking differentially expressed genes.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>The motivation for this problem was obtained while MG was a postdoctoral research associate in Wei-Shou Hu's laboratory in the Chemical Engineering and Materials Science Department at the University of Minnesota. This research was supported by a start-up grant MLP011026 from the National Chemical Laboratory.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>A concise guide to cDNA microarray analysis</p>
            </title>
            <aug>
               <au>
                  <snm>Hegde</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Qi</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Abernathy</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Gay</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Dharap</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Gaspard</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Hughes</snm>
                  <fnm>JE</fnm>
               </au>
               <au>
                  <snm>Snesrud</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Lee</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Quackenbush</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>BioTechniques</source>
            <pubdate>2000</pubdate>
            <volume>29</volume>
            <issue>3</issue>
            <fpage>548</fpage>
            <lpage>562</lpage>
            <xrefbib>
               <pubid idtype="pmpid">10997270</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Expression profiling using cDNA microarrays</p>
            </title>
            <aug>
               <au>
                  <snm>Duggan</snm>
                  <fnm>DJ</fnm>
               </au>
               <au>
                  <snm>Bittner</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Meltzer</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Trent</snm>
                  <fnm>JM</fnm>
               </au>
            </aug>
            <source>Nature Genetics</source>
            <pubdate>1999</pubdate>
            <volume>21</volume>
            <issue>1 suppl</issue>
            <fpage>10</fpage>
            <lpage>14</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/4434</pubid>
                  <pubid idtype="pmpid" link="fulltext">9915494</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Microarrays: Biotechnology's discovery platform for functional genomics</p>
            </title>
            <aug>
               <au>
                  <snm>Schena</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Heller</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Theriault</snm>
                  <fnm>TP</fnm>
               </au>
               <au>
                  <snm>Konrad</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Lachenmeier</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Davis</snm>
                  <fnm>RW</fnm>
               </au>
            </aug>
            <source>Trends in Biotechnology</source>
            <pubdate>1998</pubdate>
            <volume>16</volume>
            <issue>7</issue>
            <fpage>301</fpage>
            <lpage>306</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0167-7799(98)01219-0</pubid>
                  <pubid idtype="pmpid" link="fulltext">9675914</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>DNA chips: State-of-the art</p>
            </title>
            <aug>
               <au>
                  <snm>Ramsay</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>Nature Biotechnology</source>
            <pubdate>1998</pubdate>
            <volume>16</volume>
            <issue>1</issue>
            <fpage>40</fpage>
            <lpage>44</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nbt0198-40</pubid>
                  <pubid idtype="pmpid" link="fulltext">9447591</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Microarray Biochip Technology</p>
            </title>
            <aug>
               <au>
                  <snm>Schena</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <cnm>ed</cnm>
               </au>
            </aug>
            <publisher>Natick, MA: Eaton Publishing</publisher>
            <pubdate>2000</pubdate>
         </bibl>
         <bibl id="B6">
            <title>
               <p>DNA Microarrays: A molecular cloning manual</p>
            </title>
            <aug>
               <au>
                  <snm>Bowtell</snm>
                  <fnm>DDL</fnm>
               </au>
               <au>
                  <snm>Sambrook</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <cnm>eds</cnm>
               </au>
            </aug>
            <publisher>Cold Spring Harbor, NY: Cold Spring Harbor Press</publisher>
            <pubdate>2003</pubdate>
         </bibl>
         <bibl id="B7">
            <title>
               <p>From signatures to models: Understanding cancer using microarrays</p>
            </title>
            <aug>
               <au>
                  <snm>Segal</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Friedman</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Kaminski</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Regev</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Koller</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Nature Genetics</source>
            <pubdate>2005</pubdate>
            <volume>37</volume>
            <issue>6 suppl</issue>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">15920529</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>DNA-microarray analysis of brain cancer: Molecular classification for therapy</p>
            </title>
            <aug>
               <au>
                  <snm>Mischel</snm>
                  <fnm>PS</fnm>
               </au>
               <au>
                  <snm>Cloughesy</snm>
                  <fnm>TF</fnm>
               </au>
               <au>
                  <snm>Nelson</snm>
                  <fnm>SF</fnm>
               </au>
            </aug>
            <source>Nature Reviews Neuroscience</source>
            <pubdate>2004</pubdate>
            <volume>5</volume>
            <issue>10</issue>
            <fpage>782</fpage>
            <lpage>792</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nrn1518</pubid>
                  <pubid idtype="pmpid" link="fulltext">15378038</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Towards integrated clinico-genomic models for personalized medicine: Combining gene expression signatures and clinical factors in breast cancer outcomes prediction</p>
            </title>
            <aug>
               <au>
                  <snm>Nevins</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Huang</snm>
                  <fnm>ES</fnm>
               </au>
               <au>
                  <snm>Dressman</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Pittman</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Huang</snm>
                  <fnm>AT</fnm>
               </au>
               <au>
                  <snm>West</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Human Molecular Genetics</source>
            <pubdate>2003</pubdate>
            <volume>12</volume>
            <issue>2</issue>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12928487</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Gene-expression profiling in human cutaneous melanoma</p>
            </title>
            <aug>
               <au>
                  <snm>Carr</snm>
                  <fnm>KM</fnm>
               </au>
               <au>
                  <snm>Bittner</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Trent</snm>
                  <fnm>JM</fnm>
               </au>
            </aug>
            <source>Oncogene</source>
            <pubdate>2003</pubdate>
            <volume>22</volume>
            <issue>20</issue>
            <fpage>3076</fpage>
            <lpage>3080</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/sj.onc.1206448</pubid>
                  <pubid idtype="pmpid" link="fulltext">12789283</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Gene expression profiling of lymphoid malignancies</p>
            </title>
            <aug>
               <au>
                  <snm>Staudt</snm>
                  <fnm>LM</fnm>
               </au>
            </aug>
            <source>Annual Review of Medicine</source>
            <pubdate>2002</pubdate>
            <volume>53</volume>
            <fpage>303</fpage>
            <lpage>318</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1146/annurev.med.53.082901.103941</pubid>
                  <pubid idtype="pmpid" link="fulltext">11818476</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Applications of microarray technology in breast cancer research</p>
            </title>
            <aug>
               <au>
                  <snm>Cooper</snm>
                  <fnm>CS</fnm>
               </au>
            </aug>
            <source>Breast Cancer Research</source>
            <pubdate>2001</pubdate>
            <volume>3</volume>
            <issue>3</issue>
            <fpage>158</fpage>
            <lpage>175</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">138681</pubid>
                  <pubid idtype="pmpid" link="fulltext">11305951</pubid>
                  <pubid idtype="doi">10.1186/bcr291</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Towards a novel classification of human malignancies based on gene expression patterns</p>
            </title>
            <aug>
               <au>
                  <snm>Alizadeh</snm>
                  <fnm>AA</fnm>
               </au>
               <au>
                  <snm>Ross</snm>
                  <fnm>DT</fnm>
               </au>
               <au>
                  <snm>Perou</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Rijn</snm>
                  <mnm>Van De</mnm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Journal of Pathology</source>
            <pubdate>2001</pubdate>
            <volume>195</volume>
            <issue>1</issue>
            <fpage>41</fpage>
            <lpage>52</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/path.889</pubid>
                  <pubid idtype="pmpid" link="fulltext">11568890</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Molecular classification of head and neck squamous cell carcinoma using cDNA microarrays</p>
            </title>
            <aug>
               <au>
                  <snm>Belbin</snm>
                  <fnm>TJ</fnm>
               </au>
               <au>
                  <snm>Singh</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Barber</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Socci</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Wenig</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Prystowsky</snm>
                  <fnm>MB</fnm>
               </au>
               <au>
                  <snm>Childs</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>Cancer Research</source>
            <pubdate>2002</pubdate>
            <volume>62</volume>
            <issue>4</issue>
            <fpage>1184</fpage>
            <lpage>1190</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11861402</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Predicting the clinical status of human breast cancer by using gene expression profiles</p>
            </title>
            <aug>
               <au>
                  <snm>West</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Blanchette</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Dressman</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Huang</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Ishida</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Spang</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Zuzan</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Olson</snm>
                  <fnm>JA</fnm>
                  <suf>Jr</suf>
               </au>
               <au>
                  <snm>Marks</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Nevins</snm>
                  <fnm>JR</fnm>
               </au>
            </aug>
            <source>Proceedings of the National Academy of Sciences of the United States of America</source>
            <pubdate>2001</pubdate>
            <volume>98</volume>
            <issue>20</issue>
            <fpage>11462</fpage>
            <lpage>11467</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">58752</pubid>
                  <pubid idtype="pmpid" link="fulltext">11562467</pubid>
                  <pubid idtype="doi">10.1073/pnas.201162998</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Transcriptional gene expression profiles of colorectal adenoma, adenocarcinoma, and normal tissue examined by oligonucleotide arrays</p>
            </title>
            <aug>
               <au>
                  <snm>Notterman</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Alon</snm>
                  <fnm>U</fnm>
               </au>
               <au>
                  <snm>Sierk</snm>
                  <fnm>AJ</fnm>
               </au>
               <au>
                  <snm>Levine</snm>
                  <fnm>AJ</fnm>
               </au>
            </aug>
            <source>Cancer Research</source>
            <pubdate>2001</pubdate>
            <volume>61</volume>
            <issue>7</issue>
            <fpage>3124</fpage>
            <lpage>3130</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11306497</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling</p>
            </title>
            <aug>
               <au>
                  <snm>Alizadeh</snm>
                  <fnm>AA</fnm>
               </au>
               <au>
                  <snm>Eisen</snm>
                  <fnm>MB</fnm>
               </au>
               <au>
                  <snm>Davis</snm>
                  <fnm>RE</fnm>
               </au>
               <au>
                  <snm>Ma</snm>
                  <fnm>CL</fnm>
               </au>
               <au>
                  <snm>Lossos</snm>
                  <fnm>IS</fnm>
               </au>
               <au>
                  <snm>Rosenwald</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Boldrick</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Sabet</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Tran</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Yu</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Powell</snm>
                  <fnm>JI</fnm>
               </au>
               <au>
                  <snm>Yang</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Marti</snm>
                  <fnm>GE</fnm>
               </au>
               <au>
                  <snm>Moore</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Hudson</snm>
                  <fnm>J</fnm>
                  <suf>Jr</suf>
               </au>
               <au>
                  <snm>Lu</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Lewis</snm>
                  <fnm>DB</fnm>
               </au>
               <au>
                  <snm>Tibshirani</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Sherlock</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Chan</snm>
                  <fnm>WC</fnm>
               </au>
               <au>
                  <snm>Greiner</snm>
                  <fnm>TC</fnm>
               </au>
               <au>
                  <snm>Weisenburger</snm>
                  <fnm>DD</fnm>
               </au>
               <au>
                  <snm>Armitage</snm>
                  <fnm>JO</fnm>
               </au>
               <au>
                  <snm>Warnke</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Levy</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Wilson</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Grever</snm>
                  <fnm>MR</fnm>
               </au>
               <au>
                  <snm>Byrd</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Botstein</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>PO</fnm>
               </au>
               <au>
                  <snm>Staudt</snm>
                  <fnm>LM</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2000</pubdate>
            <volume>403</volume>
            <issue>6769</issue>
            <fpage>503</fpage>
            <lpage>511</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/35000501</pubid>
                  <pubid idtype="pmpid" link="fulltext">10676951</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Molecular classification of cancer: Class discovery and class prediction by gene expression monitoring</p>
            </title>
            <aug>
               <au>
                  <snm>Golub</snm>
                  <fnm>TR</fnm>
               </au>
               <au>
                  <snm>Slonim</snm>
                  <fnm>DK</fnm>
               </au>
               <au>
                  <snm>Tamayo</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Huard</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Gaasenbeek</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Mesirov</snm>
                  <fnm>JP</fnm>
               </au>
               <au>
                  <snm>Coller</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Loh</snm>
                  <fnm>ML</fnm>
               </au>
               <au>
                  <snm>Downing</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Caligiuri</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Bloomfield</snm>
                  <fnm>CD</fnm>
               </au>
               <au>
                  <snm>Lander</snm>
                  <fnm>ES</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1999</pubdate>
            <volume>286</volume>
            <issue>5439</issue>
            <fpage>531</fpage>
            <lpage>537</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.286.5439.531</pubid>
                  <pubid idtype="pmpid" link="fulltext">10521349</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>Variance stabilization applied to microarray data calibration and to the quantification of differential expression</p>
            </title>
            <aug>
               <au>
                  <snm>Huber</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Von Heydebreck</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Sültmann</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Poustka</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Vingron</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2002</pubdate>
            <volume>18</volume>
            <issue>suppl 1</issue>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12169536</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>Local-pooled-error test for identifying differentially expressed genes with a small number of replicated microarrays</p>
            </title>
            <aug>
               <au>
                  <snm>Jain</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Thatte</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Braciale</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Ley</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>O'Connell</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Lee</snm>
                  <fnm>JK</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <issue>15</issue>
            <fpage>1945</fpage>
            <lpage>1951</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btg264</pubid>
                  <pubid idtype="pmpid" link="fulltext">14555628</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>A comparative review of statistical methods for discovering differentially expressed genes in replicated microarray experiments</p>
            </title>
            <aug>
               <au>
                  <snm>Pan</snm>
                  <fnm>W</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2002</pubdate>
            <volume>18</volume>
            <issue>4</issue>
            <fpage>546</fpage>
            <lpage>554</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/18.4.546</pubid>
                  <pubid idtype="pmpid" link="fulltext">12016052</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>On the use of permutation in and the performance of a class of nonparametric methods to detect differential gene expression</p>
            </title>
            <aug>
               <au>
                  <snm>Pan</snm>
                  <fnm>W</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <issue>11</issue>
            <fpage>1333</fpage>
            <lpage>1340</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btg167</pubid>
                  <pubid idtype="pmpid" link="fulltext">12874044</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>A comparison of statistical methods for analysis of high density oligonucleotide array data</p>
            </title>
            <aug>
               <au>
                  <snm>Rajagopalan</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <issue>12</issue>
            <fpage>1469</fpage>
            <lpage>1476</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btg202</pubid>
                  <pubid idtype="pmpid" link="fulltext">12912826</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>A model for measurement error for gene expression arrays</p>
            </title>
            <aug>
               <au>
                  <snm>Rocke</snm>
                  <fnm>DM</fnm>
               </au>
               <au>
                  <snm>Durbin</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Journal of Computational Biology</source>
            <pubdate>2001</pubdate>
            <volume>8</volume>
            <issue>6</issue>
            <fpage>557</fpage>
            <lpage>569</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1089/106652701753307485</pubid>
                  <pubid idtype="pmpid" link="fulltext">11747612</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Nonparametric methods for identifying differentially expressed genes in microarray data</p>
            </title>
            <aug>
               <au>
                  <snm>Troyanskaya</snm>
                  <fnm>OG</fnm>
               </au>
               <au>
                  <snm>Garber</snm>
                  <fnm>ME</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>PO</fnm>
               </au>
               <au>
                  <snm>Botstein</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Altman</snm>
                  <fnm>RB</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2002</pubdate>
            <volume>18</volume>
            <issue>11</issue>
            <fpage>1454</fpage>
            <lpage>1461</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/18.11.1454</pubid>
                  <pubid idtype="pmpid" link="fulltext">12424116</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>Significance analysis of microarrays applied to the ionizing radiation response</p>
            </title>
            <aug>
               <au>
                  <snm>Tusher</snm>
                  <fnm>VG</fnm>
               </au>
               <au>
                  <snm>Tibshirani</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Chu</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>Proceedings of the National Academy of Sciences of the United States of America</source>
            <pubdate>2001</pubdate>
            <volume>98</volume>
            <issue>9</issue>
            <fpage>5116</fpage>
            <lpage>5121</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">33173</pubid>
                  <pubid idtype="pmpid" link="fulltext">11309499</pubid>
                  <pubid idtype="doi">10.1073/pnas.091062498</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>EVE (external variance estimation) increases statistical power for detecting differentially expressed genes</p>
            </title>
            <aug>
               <au>
                  <snm>Wille</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Gruissem</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Buhlmann</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Hennig</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>The Plant Journal</source>
            <pubdate>2007</pubdate>
            <volume>52</volume>
            <issue>3</issue>
            <fpage>561</fpage>
            <lpage>569</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1111/j.1365-313X.2007.03227.x</pubid>
                  <pubid idtype="pmpid" link="fulltext">17680783</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>Exploring the metabolic and genetic control of gene expression on a genomic scale</p>
            </title>
            <aug>
               <au>
                  <snm>DeRisi</snm>
                  <fnm>JL</fnm>
               </au>
               <au>
                  <snm>Iyer</snm>
                  <fnm>VR</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>PO</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1997</pubdate>
            <volume>278</volume>
            <issue>5338</issue>
            <fpage>680</fpage>
            <lpage>686</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.278.5338.680</pubid>
                  <pubid idtype="pmpid" link="fulltext">9381177</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>Pitfalls in the use of DNA microarray data for diagnostic and prognostic classification</p>
            </title>
            <aug>
               <au>
                  <snm>Simon</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Radmacher</snm>
                  <fnm>MD</fnm>
               </au>
               <au>
                  <snm>Dobbin</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>McShane</snm>
                  <fnm>LM</fnm>
               </au>
            </aug>
            <source>Journal of the National Cancer Institute</source>
            <pubdate>2003</pubdate>
            <volume>95</volume>
            <issue>1</issue>
            <fpage>14</fpage>
            <lpage>18</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12509396</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>A Bayesian framework for the analysis of microarray expression data: Regularized t-test and statistical inferences of gene changes</p>
            </title>
            <aug>
               <au>
                  <snm>Baldi</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Long</snm>
                  <fnm>AD</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2001</pubdate>
            <volume>17</volume>
            <issue>6</issue>
            <fpage>509</fpage>
            <lpage>519</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/17.6.509</pubid>
                  <pubid idtype="pmpid" link="fulltext">11395427</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>Comparison of discrimination methods for the classification of tumors using gene expression data</p>
            </title>
            <aug>
               <au>
                  <snm>Dudoit</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Fridlyand</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Speed</snm>
                  <fnm>TP</fnm>
               </au>
            </aug>
            <source>Journal of the American Statistical Association</source>
            <pubdate>2002</pubdate>
            <volume>97</volume>
            <issue>457</issue>
            <fpage>77</fpage>
            <lpage>86</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1198/016214502753479248</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>Using ANOVA to analyze microarray data</p>
            </title>
            <aug>
               <au>
                  <snm>Churchill</snm>
                  <fnm>GA</fnm>
               </au>
            </aug>
            <source>Biotechniques</source>
            <pubdate>2004</pubdate>
            <volume>37</volume>
            <issue>2</issue>
            <fpage>173</fpage>
            <lpage>175</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid">15335204</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B33">
            <title>
               <p>Noise sampling method: an ANOVA approach allowing robust selection of differentially regulated genes measured by DNA microarrays</p>
            </title>
            <aug>
               <au>
                  <snm>Draghici</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kulaeva</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Hoff</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Petrov</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Shams</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Tainsky</snm>
                  <fnm>MA</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <issue>11</issue>
            <fpage>1348</fpage>
            <lpage>1359</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btg165</pubid>
                  <pubid idtype="pmpid" link="fulltext">12874046</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B34">
            <title>
               <p>Assessing gene significance from cDNA microarray expression data via mixed models</p>
            </title>
            <aug>
               <au>
                  <snm>Wolfinger</snm>
                  <fnm>RD</fnm>
               </au>
               <au>
                  <snm>Gibson</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Wolfinger</snm>
                  <fnm>ED</fnm>
               </au>
               <au>
                  <snm>Bennett</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Hamadeh</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Bushel</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Afshari</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Paules</snm>
                  <fnm>RS</fnm>
               </au>
            </aug>
            <source>Journal of Computational Biology</source>
            <pubdate>2001</pubdate>
            <volume>8</volume>
            <issue>6</issue>
            <fpage>625</fpage>
            <lpage>637</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1089/106652701753307520</pubid>
                  <pubid idtype="pmpid" link="fulltext">11747616</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B35">
            <title>
               <p>CLEAR-test: Combining inference for differential expression and variability in microarray data analysis</p>
            </title>
            <aug>
               <au>
                  <snm>Valls</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Grau</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Sole</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Hernández</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Montaner</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Dopazo</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Peinado</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Capella</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Moreno</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Pujana</snm>
                  <fnm>MA</fnm>
               </au>
            </aug>
            <source>Journal of Biomedical Informatics</source>
            <pubdate>2008</pubdate>
            <volume>41</volume>
            <issue>1</issue>
            <fpage>33</fpage>
            <lpage>45</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.jbi.2007.05.005</pubid>
                  <pubid idtype="pmpid" link="fulltext">17597009</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B36">
            <title>
               <p>Variation in gene expression within and among natural populations</p>
            </title>
            <aug>
               <au>
                  <snm>Oleksiak</snm>
                  <fnm>MF</fnm>
               </au>
               <au>
                  <snm>Churchill</snm>
                  <fnm>GA</fnm>
               </au>
               <au>
                  <snm>Crawford</snm>
                  <fnm>DL</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2002</pubdate>
            <volume>32</volume>
            <issue>2</issue>
            <fpage>261</fpage>
            <lpage>266</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng983</pubid>
                  <pubid idtype="pmpid" link="fulltext">12219088</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B37">
            <title>
               <p>Heterogeneity in motoneuron disease</p>
            </title>
            <aug>
               <au>
                  <snm>Lambrechts</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Robberecht</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Carmeliet</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Trends in Neurosciences</source>
            <pubdate>2007</pubdate>
            <volume>30</volume>
            <issue>10</issue>
            <fpage>536</fpage>
            <lpage>544</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.tins.2007.07.002</pubid>
                  <pubid idtype="pmpid" link="fulltext">17825438</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B38">
            <title>
               <p>Prediction of biologically significant components from microarray data: Independently Consistent Expression Discriminator (ICED)</p>
            </title>
            <aug>
               <au>
                  <snm>Bijlani</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Cheng</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Pearce</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Brooks</snm>
                  <fnm>AI</fnm>
               </au>
               <au>
                  <snm>Ogihara</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <issue>1</issue>
            <fpage>62</fpage>
            <lpage>70</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/19.1.62</pubid>
                  <pubid idtype="pmpid" link="fulltext">12499294</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B39">
            <title>
               <p>Tests for finding complex patterns of differential expression in cancers: towards individualized medicine</p>
            </title>
            <aug>
               <au>
                  <snm>Lyons-Weiler</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Patel</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Becich</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Godfrey</snm>
                  <fnm>TE</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2004</pubdate>
            <volume>5</volume>
            <fpage>110</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">514539</pubid>
                  <pubid idtype="pmpid" link="fulltext">15307894</pubid>
                  <pubid idtype="doi">10.1186/1471-2105-5-110</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B40">
            <title>
               <p>Molecular portrait of high productivity in recombinant NS0 cells</p>
            </title>
            <aug>
               <au>
                  <snm>Seth</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Philp</snm>
                  <fnm>RJ</fnm>
               </au>
               <au>
                  <snm>Lau</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Kok</snm>
                  <fnm>YJ</fnm>
               </au>
               <au>
                  <snm>Yap</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Hu</snm>
                  <fnm>WS</fnm>
               </au>
            </aug>
            <source>Biotechnol Bioeng</source>
            <pubdate>2007</pubdate>
            <volume>97</volume>
            <issue>4</issue>
            <fpage>933</fpage>
            <lpage>951</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/bit.21234</pubid>
                  <pubid idtype="pmpid" link="fulltext">17149768</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B41">
            <title>
               <p>Outlier sums for differential gene expression analysis</p>
            </title>
            <aug>
               <au>
                  <snm>Tibshirani</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Hastie</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Biostatistics</source>
            <pubdate>2007</pubdate>
            <volume>8</volume>
            <issue>1</issue>
            <fpage>2</fpage>
            <lpage>8</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/biostatistics/kxl005</pubid>
                  <pubid idtype="pmpid" link="fulltext">16702229</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B42">
            <title>
               <p>Recurrent fusion of TMPRSS2 and ETS transcription factor genes in prostate cancer</p>
            </title>
            <aug>
               <au>
                  <snm>Tomlins</snm>
                  <fnm>SA</fnm>
               </au>
               <au>
                  <snm>Rhodes</snm>
                  <fnm>DR</fnm>
               </au>
               <au>
                  <snm>Perner</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Dhanasekaran</snm>
                  <fnm>SM</fnm>
               </au>
               <au>
                  <snm>Mehra</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Sun</snm>
                  <fnm>XW</fnm>
               </au>
               <au>
                  <snm>Varambally</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Cao</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Tchinda</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Kuefer</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Lee</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Montie</snm>
                  <fnm>JE</fnm>
               </au>
               <au>
                  <snm>Shah</snm>
                  <fnm>RB</fnm>
               </au>
               <au>
                  <snm>Pienta</snm>
                  <fnm>KJ</fnm>
               </au>
               <au>
                  <snm>Rubin</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Chinnaiyan</snm>
                  <fnm>AM</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2005</pubdate>
            <volume>310</volume>
            <issue>5748</issue>
            <fpage>644</fpage>
            <lpage>648</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1117679</pubid>
                  <pubid idtype="pmpid" link="fulltext">16254181</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B43">
            <title>
               <p>Cancer outlier differential gene expression detection</p>
            </title>
            <aug>
               <au>
                  <snm>Wu</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Biostatistics</source>
            <pubdate>2007</pubdate>
            <volume>8</volume>
            <issue>3</issue>
            <fpage>566</fpage>
            <lpage>575</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">17021278</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B44">
            <title>
               <p>MOST: detecting cancer differential gene expression</p>
            </title>
            <aug>
               <au>
                  <snm>Lian</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Biostatistics</source>
            <pubdate>2007</pubdate>
            <note>kxm042</note>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pubmed">Lian H: MOST: detecting cancer differential gene expression.</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B45">
            <title>
               <p>Statistics</p>
            </title>
            <aug>
               <au>
                  <snm>McClave</snm>
                  <fnm>JT</fnm>
               </au>
               <au>
                  <snm>Sincich</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <publisher>Prentice Hall</publisher>
            <edition>8</edition>
            <pubdate>1999</pubdate>
         </bibl>
         <bibl id="B46">
            <title>
               <p>Comparison and evaluation of methods for generating differentially expressed gene lists from microarray data</p>
            </title>
            <aug>
               <au>
                  <snm>Jeffery</snm>
                  <fnm>IB</fnm>
               </au>
               <au>
                  <snm>Higgins</snm>
                  <fnm>DG</fnm>
               </au>
               <au>
                  <snm>Culhane</snm>
                  <fnm>AC</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2006</pubdate>
            <volume>7</volume>
            <fpage>359</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1544358</pubid>
                  <pubid idtype="pmpid" link="fulltext">16872483</pubid>
                  <pubid idtype="doi">10.1186/1471-2105-7-359</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B47">
            <title>
               <p>Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays</p>
            </title>
            <aug>
               <au>
                  <snm>Alon</snm>
                  <fnm>U</fnm>
               </au>
               <au>
                  <snm>Barka</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Notterman</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Gish</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Ybarra</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Mack</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Levine</snm>
                  <fnm>AJ</fnm>
               </au>
            </aug>
            <source>Proceedings of the National Academy of Sciences of the United States of America</source>
            <pubdate>1999</pubdate>
            <volume>96</volume>
            <issue>12</issue>
            <fpage>6745</fpage>
            <lpage>6750</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">21986</pubid>
                  <pubid idtype="pmpid" link="fulltext">10359783</pubid>
                  <pubid idtype="doi">10.1073/pnas.96.12.6745</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B48">
            <title>
               <p>Gene expression correlates of clinical prostate cancer behavior</p>
            </title>
            <aug>
               <au>
                  <snm>Singh</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Febbo</snm>
                  <fnm>PG</fnm>
               </au>
               <au>
                  <snm>Ross</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Jackson</snm>
                  <fnm>DG</fnm>
               </au>
               <au>
                  <snm>Manola</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Ladd</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Tamayo</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Renshaw</snm>
                  <fnm>AA</fnm>
               </au>
               <au>
                  <snm>D'Amico</snm>
                  <fnm>AV</fnm>
               </au>
               <au>
                  <snm>Richie</snm>
                  <fnm>JP</fnm>
               </au>
               <au>
                  <snm>Lander</snm>
                  <fnm>ES</fnm>
               </au>
               <au>
                  <snm>Loda</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Kantoff</snm>
                  <fnm>PW</fnm>
               </au>
               <au>
                  <snm>Golub</snm>
                  <fnm>TR</fnm>
               </au>
               <au>
                  <snm>Sellers</snm>
                  <fnm>WR</fnm>
               </au>
            </aug>
            <source>Cancer Cell</source>
            <pubdate>2002</pubdate>
            <volume>1</volume>
            <issue>2</issue>
            <fpage>203</fpage>
            <lpage>209</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S1535-6108(02)00030-2</pubid>
                  <pubid idtype="pmpid" link="fulltext">12086878</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B49">
            <title>
               <p>Support vector machine classification and validation of cancer tissue samples using microarray expression data</p>
            </title>
            <aug>
               <au>
                  <snm>Furey</snm>
                  <fnm>TS</fnm>
               </au>
               <au>
                  <snm>Cristianini</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Duffy</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Bednarski</snm>
                  <fnm>DW</fnm>
               </au>
               <au>
                  <snm>Schummer</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Haussler</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2000</pubdate>
            <volume>16</volume>
            <issue>10</issue>
            <fpage>906</fpage>
            <lpage>914</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/16.10.906</pubid>
                  <pubid idtype="pmpid" link="fulltext">11120680</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B50">
            <title>
               <p>Supervised clustering of genes</p>
            </title>
            <aug>
               <au>
                  <snm>Dettling</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>B&#252;hlmann</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Genome Biology</source>
            <pubdate>2002</pubdate>
            <volume>3</volume>
            <issue>12</issue>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">151171</pubid>
                  <pubid idtype="pmpid" link="fulltext">12537558</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B51">
            <title>
               <p>Tissue classification with gene expression profiles</p>
            </title>
            <aug>
               <au>
                  <snm>Ben-Dor</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Bruhn</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Friedman</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Nachman</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Schummer</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Yakhini</snm>
                  <fnm>Z</fnm>
               </au>
            </aug>
            <source>Journal of Computational Biology</source>
            <pubdate>2000</pubdate>
            <volume>7</volume>
            <issue>3&#8211;4</issue>
            <fpage>559</fpage>
            <lpage>583</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1089/106652700750050943</pubid>
                  <pubid idtype="pmpid" link="fulltext">11108479</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B52">
            <title>
               <p>Minimum redundancy feature selection from microarray gene expression data</p>
            </title>
            <aug>
               <au>
                  <snm>Ding</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Peng</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Journal of Bioinformatics and Computational Biology</source>
            <pubdate>2005</pubdate>
            <volume>3</volume>
            <issue>2</issue>
            <fpage>185</fpage>
            <lpage>205</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pubmed">15852500</pubid>
                  <pubid idtype="doi">10.1142/S0219720005001004</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B53">
            <title>
               <p>An entropy-based gene selection method for cancer classification using microarray data</p>
            </title>
            <aug>
               <au>
                  <snm>Liu</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Krishnan</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Mondry</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2005</pubdate>
            <volume>6</volume>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pubmed">15790388</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B54">
            <title>
               <p>Gene selection for classification of microarray data based on the Bayes error</p>
            </title>
            <aug>
               <au>
                  <snm>Zhang</snm>
                  <fnm>JG</fnm>
               </au>
               <au>
                  <snm>Deng</snm>
                  <fnm>HW</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2007</pubdate>
            <volume>8</volume>
            <issue>1</issue>
            <fpage>370</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">2089123</pubid>
                  <pubid idtype="pmpid" link="fulltext">17915022</pubid>
                  <pubid idtype="doi">10.1186/1471-2105-8-370</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>
