<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2105-7-50</ui>
   <ji>1471-2105</ji>
   <fm>
      <dochead>Methodology article</dochead>
      <bibl>
         <title>
            <p>Assessing stability of gene selection in microarray data analysis</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Qiu</snm>
               <fnm>Xing</fnm>
               <insr iid="I1"/>
               <email>xqiu@bst.rochester.edu</email>
            </au>
            <au id="A2">
               <snm>Xiao</snm>
               <fnm>Yuanhui</fnm>
               <insr iid="I1"/>
               <email>yxiao@bst.rochester.edu</email>
            </au>
            <au id="A3">
               <snm>Gordon</snm>
               <fnm>Alexander</fnm>
               <insr iid="I1"/>
               <email>Alexander_Gordon@urmc.rochester.edu</email>
            </au>
            <au id="A4" ca="yes">
               <snm>Yakovlev</snm>
               <fnm>Andrei</fnm>
               <insr iid="I1"/>
               <email>Andrei_Yakovlev@urmc.rochester.edu</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Department of Biostatistics and Computational Biology, University of Rochester, 601 Elmwood Avenue, Rochester, New York 14642, USA</p>
            </ins>
         </insg>
         <source>BMC Bioinformatics</source>
         <issn>1471-2105</issn>
         <pubdate>2006</pubdate>
         <volume>7</volume>
         <issue>1</issue>
         <fpage>50</fpage>
         <url>http://www.biomedcentral.com/1471-2105/7/50</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">16451725</pubid>
               <pubid idtype="doi">10.1186/1471-2105-7-50</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>17</day>
               <month>8</month>
               <year>2005</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>01</day>
               <month>2</month>
               <year>2006</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>01</day>
               <month>2</month>
               <year>2006</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2006</year>
         <collab>Qiu et al; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>The number of genes declared differentially expressed is a random variable and its variability can be assessed by resampling techniques. Another important stability indicator is the frequency with which a given gene is selected across subsamples. We have conducted studies to assess stability and some other properties of several gene selection procedures with biological and simulated data.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>Using resampling techniques we have found that some genes are selected much less frequently (across sub-samples) than other genes with the same adjusted <it>p</it>-values. The extent to which this type of instability manifests itself can be assessed by a method introduced in this paper. The effect of correlation between gene expression levels on the performance of multiple testing procedures is studied by computer simulations.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>Resampling represents a tool for reducing the set of initially selected genes to those with a sufficiently high selection frequency. Using resampling techniques it is also possible to assess variability of different performance indicators. Stability properties of several multiple testing procedures are described at length in the present paper.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>The result of every analysis of microarray data is an outcome of a random experiment. For example, the number of genes declared differentially expressed and the estimated false discovery rate (FDR) should be treated as random variables and their variability has to be assessed in the same fashion that the population variance is estimated in the usual statistical inference. The variance of the number of differentially expressed genes (as well as other outcomes of a given selection procedure) may depend on the chosen statistical test, method of multiple testing adjustment, effect sizes for different genes, and the correlation structure of the data. The latter factor deserves special attention. Although some normalization procedures may lead to a significant reduction in the correlation between gene expression levels, and thus between the associated test statistics, the remaining correlation may be strong enough to have a disastrous effect on the statistical inference from microarray data <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. The effect of correlations in microarray data on the variability of the most basic performance indicators of various testing procedures calls for further investigation.</p>
         <p>There is another facet of the problem to consider. Every specific analysis of microarray data results in a list of candidate genes that are deemed differentially expressed across the two conditions under study. The composition of this list is subject to random fluctuations and this effect also needs to be quantitatively assessed. Even if one concentrates on the selection of individual genes rather than gene combinations, the situation here is similar to that in the regression analysis aimed at selecting significant explanatory variables (covariates). When focusing on a specific variable, one can observe a certain degree of instability of this variable selection inherent in any pertinent statistical procedure. The term "stability" means "replication stability" for the selection of significant variables. This kind of stability is easy to assess and interpret in simulation studies where the "true" set of differentially expressed genes is known. When analyzing biological data, one can resort to resampling techniques for this purpose <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>. In particular, one can apply a subsampling counterpart of the "delete-<it>d</it>-jackknife" procedure to the sample at hand and estimate the frequency with which a given gene has been selected across all sub-samples. Then an additional selection criterion can be imposed by finally selecting only those genes with a frequency of selection greater than, say, 80%.</p>
         <p>The above discussion suggests the following two ways of using resampling techniques in microarray data analysis. These techniques can be used to assess stability characteristics of a given selection procedure and compare different procedures. In this case, one is usually interested in certain characteristics of the whole set of selected genes rather than its individual members. In their very interesting paper, Pavlidis et al. <abbrgrp><abbr bid="B3">3</abbr></abbrgrp> used leave-one-out resampling to study the stability of gene selection in conjunction with the required number of replicates in the analysis of differential expression of genes. The authors proposed two stability measures (metrics) to compare the ranked list of the genes originally selected to the ranking obtained when one replicate is removed. Then the stability measures are averaged over the subsamples. The first measure refers to the fraction of genes among the originally selected ones that are recovered in a given subsample. Stolovitzky <abbrgrp><abbr bid="B4">4</abbr></abbrgrp> proposed a similar measure which is not conditioned on the set of genes originally selected from the data prior to resampling. However, statistical properties of the robustness index introduced by Stolovitzky remain unclear. The second measure by Pavlidis et al. <abbrgrp><abbr bid="B3">3</abbr></abbrgrp> is more subtle; it has to do with the degree to which the ordering is preserved and can be used whenever the number of selected genes does not show strong variations among subsamples. We propose the delete-<it>d</it>-jackknife variance of the number of selected genes (which is the primary endpoint to be assessed when comparing different methodologies) across subsamples as a pertinent measure of stability of a chosen testing procedure. This measure has clear statistical properties and is easier to interpret.
The distribution of the number of selected genes can also be estimated using the delete-<it>d</it>-jackknife method <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>. Another way of using resampling techniques is to assess the stability of selection for individual genes in line with the currently practiced methodology of significance testing in microarray analysis. This can be accomplished by estimating the frequency of selection of each gene given that it has been selected in at least one subsample. As we show in the present paper, this measure also provides valuable information on the performance of each selection procedure when its dependence on adjusted <it>p</it>-values is included in the analysis.</p>
         <p>We have conducted a simulation study to evaluate the effect of correlation between gene expression levels on the performance of several selection procedures in terms of the variability of such important indicators as the number of selected genes and the proportion of falsely rejected among all rejected null hypotheses. All these indicators are directly accessible in computer simulations, thereby providing an explanatory insight into the performance of different procedures. From this perspective, the Bonferroni and Westfall-Young multiple testing procedures are explored in conjunction with the Student <it>t</it>, Kolmogorov-Smirnov, and Cram&#233;r-von Mises two-sample tests. The latter two tests are distribution free. The Bonferroni and Westfall-Young step-down procedures <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> are designed to control the familywise error rate (FWER). The FWER is defined loosely as the probability of at least one Type 1 error in the context of multiple testing. The FDR-based procedures are also explored; these are represented by the empirical Bayes method <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr></abbrgrp> as well as the Benjamini-Hochberg and Benjamini-Yekutieli procedures <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr></abbrgrp>. The FDR is defined as the expected fraction of falsely rejected among all rejected hypotheses. It should be noted that our simulation studies do not attempt to model the actual correlation structure of microarray data; their only purpose is to see which specific performance indicators may be sensitive to the presence of correlation in the data. The quantitative characteristics we report from the simulated data cannot be extrapolated to biological data and can only be viewed as proof of principle.</p>
         <p>Another set of experiments was concerned with actual biological data. We assessed probabilistic characteristics of the number of selected genes by resampling from a large set of data on two types of childhood leukemia available from the St. Jude Children's Research Hospital Database <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>. Using this data set, we also assessed the replication stability of gene selection and its dependence on adjusted <it>p</it>-values.</p>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <sec>
            <st>
               <p>Biological data</p>
            </st>
            <p>For the purposes of this study, use was made of the St. Jude Children's Research Hospital (SJCRH) Database on childhood leukemia, which is publicly available on the hospital's website under the Supplemental Data section <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>. The whole SJCRH Database contains gene expression data on 335 subjects, each represented by a separate array (Affymetrix, Santa Clara, CA) reporting measurements on the same set of 12558 genes. We selected two groups of patients with hyperdiploid (Hyperdip) and T-cell acute lymphoblastic leukemia (TALL), respectively. The groups were balanced to include 43 patients in each group. Since the nature of our study was purely methodological, the choice of the data set was quite arbitrary; it was dictated solely by sample size considerations. The microarray data thus chosen were background corrected and normalized using the Bioconductor RMA software. This software implements the quantile normalization procedure <abbrgrp><abbr bid="B13">13</abbr><abbr bid="B14">14</abbr></abbrgrp> carried out at the probe feature level. After the normalization, each gene is represented in the final data set by the logarithm (base 2) of its expression level.</p>
         </sec>
         <sec>
            <st>
               <p>Simulated data</p>
            </st>
            <p>Our simulation study was designed to illustrate the effect of correlation on the performance of gene selection procedures. We simulated 2<it>n </it>independent multi-variate normal random vectors with exchangeable correlation structure, each representing log-intensities of 1255 genes of which the first 125 genes were designated to be differentially expressed. Two sets of simulations were conducted with the sample size chosen to be <it>n </it>= 15 and <it>n </it>= 43, respectively. We use the following self-explanatory notation for the four sets of simulated data: SIM15, SIM15CORR, SIM43, SIM43CORR. In total, 200 independent data sets, each consisting of 2<it>n </it>simulated vectors, were generated for each sample size. The marginal distributions of the log-intensities of "Not Different" genes were standard normal, while the log-intensities of "Different" genes followed the normal distribution with mean two and unit variance.</p>
            <p>The exchangeable pairwise correlation structure was superimposed on the normal vectors with independent components as discussed in <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. Briefly, we first generate a 1255 &#215; 2<it>n </it>matrix with each entry being an independent realization of a standard normal random variable. To model a set of "Different" genes, we add a value of 2 to the first 125 rows in the first group and denote the resultant matrix by <it>X </it>= {<it>x</it><sub><it>ij</it></sub>}, <it>i </it>= 1, ..., 1255; <it>j </it>= 1, ..., 2<it>n</it>. All the elements <it>x</it><sub><it>ij </it></sub>of this matrix are stochastically independent, but those with <it>i </it>= 1, 2, ..., 125 and <it>j </it>= 1, 2, ..., <it>n </it>are normally distributed with mean 2 and unit variance. Expression levels of the genes outside this special set of 125 genes follow the standard normal distribution. Next we generate a 2<it>n</it>-dimensional random vector with independent and identically distributed components, each component having a standard normal distribution. Denote this vector by <it>A </it>= {<it>a</it><sub><it>j</it></sub>}, <it>j </it>= 1, ..., 2<it>n</it>. Define <m:math name="1471-2105-7-50-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:mi>y</m:mi><m:mrow><m:mi>i</m:mi><m:mi>j</m:mi></m:mrow></m:msub><m:mo>=</m:mo><m:msqrt><m:mi>&#961;</m:mi></m:msqrt><m:msub><m:mi>a</m:mi><m:mi>j</m:mi></m:msub><m:mo>+</m:mo><m:msqrt><m:mrow><m:mn>1</m:mn><m:mo>&#8722;</m:mo><m:mi>&#961;</m:mi></m:mrow></m:msqrt><m:msub><m:mi>x</m:mi><m:mrow><m:mi>i</m:mi><m:mi>j</m:mi></m:mrow></m:msub></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWG5bqEdaWgaaWcbaGaemyAaKMaemOAaOgabeaakiabg2da9maakaaabaacciGae8xWdihaleqaaOGaemyyae2aaSbaaSqaaiabdQgaQbqabaGccqGHRaWkdaGcaaqaaiabigdaXiabgkHiTiab=f8aYbWcbeaakiabdIha4naaBaaaleaacqWGPbqAcqWGQbGAaeqaaaaa@3FE1@</m:annotation></m:semantics></m:math>, <it>i </it>= 1, ..., 1255; <it>j </it>= 1, ..., 2<it>n</it>, so that for any <it>i</it><sub>1 </sub>&#8800; <it>i</it><sub>2 </sub>and <it>j </it>we have corr (<m:math name="1471-2105-7-50-i2" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:mi>y</m:mi><m:mrow><m:msub><m:mi>i</m:mi><m:mn>1</m:mn></m:msub><m:mi>j</m:mi></m:mrow></m:msub></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWG5bqEdaWgaaWcbaGaemyAaK2aaSbaaWqaaiabigdaXaqabaWccqWGQbGAaeqaaaaa@3233@</m:annotation></m:semantics></m:math>, <m:math name="1471-2105-7-50-i3" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:mi>y</m:mi><m:mrow><m:msub><m:mi>i</m:mi><m:mn>2</m:mn></m:msub><m:mi>j</m:mi></m:mrow></m:msub></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWG5bqEdaWgaaWcbaGaemyAaK2aaSbaaWqaaiabikdaYaqabaWccqWGQbGAaeqaaaaa@3235@</m:annotation></m:semantics></m:math>) = <it>&#961;</it>. In the present study, the correlated data were generated for a single value of the correlation coefficient <it>&#961; </it>= 0.6. This high correlation coefficient was chosen to demonstrate the effects of correlation more clearly. However, this value is not overly unrealistic because the mean (over all gene pairs) correlation coefficient estimated from the raw data described in the "Biological data" section is equal to 0.72.</p>
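            <p>The construction above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in the study: the function names <it>simulate_exchangeable</it> and <it>pearson</it>, the seed argument, and the use of the Python standard library are all hypothetical choices made for this example.</p>
            <p><![CDATA[
```python
import math
import random


def simulate_exchangeable(n, rho, n_genes=1255, n_diff=125, shift=2.0, seed=0):
    """Return n_genes rows, each holding 2n values y_ij.

    y_ij = sqrt(rho) * a_j + sqrt(1 - rho) * x_ij, following the text.
    Function name, argument names, and seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Shared standard normal component a_j, one per array (column).
    a = [rng.gauss(0.0, 1.0) for _ in range(2 * n)]
    s_rho, s_rest = math.sqrt(rho), math.sqrt(1.0 - rho)
    y = []
    for i in range(n_genes):
        row = []
        for j in range(2 * n):
            x_ij = rng.gauss(0.0, 1.0)
            # "Different" genes: add the shift in the first group only.
            # After the transform the shift in y is sqrt(1 - rho) * shift.
            if i < n_diff and j < n:
                x_ij += shift
            row.append(s_rho * a[j] + s_rest * x_ij)
        y.append(row)
    return y


def pearson(u, v):
    """Sample Pearson correlation of two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    suv = sum((p - mu) * (q - mv) for p, q in zip(u, v))
    suu = sum((p - mu) ** 2 for p in u)
    svv = sum((q - mv) ** 2 for q in v)
    return suv / math.sqrt(suu * svv)


if __name__ == "__main__":
    y = simulate_exchangeable(n=43, rho=0.6)
    # Correlation between two "Not Different" genes across the 2n arrays.
    print(round(pearson(y[200], y[300]), 2))
```
]]></p>
            <p>With only 2<it>n </it>arrays the sample correlation between any two genes fluctuates noticeably around <it>&#961;</it>; it approaches <it>&#961; </it>as the number of arrays grows.</p>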
            <p>In order to see whether or not the stability of gene selection is related to the power, our explanatory simulations were conducted under two different scenarios. Under the first scenario, the sample size was small (<it>n </it>= 15) so that the power was lower than 100%. Under the second scenario, the sample size was sufficiently large (<it>n </it>= 43) to attain 100% power.</p>
         </sec>
         <sec>
            <st>
               <p>Resampling techniques</p>
            </st>
            <p>When analyzing biological data, we used a subsampling version of the delete-<it>d </it>jackknife method <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B15">15</abbr><abbr bid="B16">16</abbr></abbrgrp>, which is technically equivalent to the leave-<it>d</it>-out cross-validation. It can be proven that if <m:math name="1471-2105-7-50-i4" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msqrt><m:mi>n</m:mi></m:msqrt></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaadaGcaaqaaiabd6gaUbWcbeaaaaa@2E2C@</m:annotation></m:semantics></m:math>/<it>d </it>&#8594; 0 and <it>n </it>- <it>d </it>&#8594; &#8734;, then the delete-<it>d</it>-jackknife is consistent for the median <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>. Therefore, the general recommendation is to leave out more than <it>d </it>= <m:math name="1471-2105-7-50-i4" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msqrt><m:mi>n</m:mi></m:msqrt></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaadaGcaaqaaiabd6gaUbWcbeaaaaa@2E2C@</m:annotation></m:semantics></m:math> but much fewer than <it>n </it>observations. A similar recommendation holds for the variance <abbrgrp><abbr bid="B15">15</abbr><abbr bid="B16">16</abbr></abbrgrp>. We used <it>d </it>= 7 to perturb the data set, which is only slightly greater than <m:math name="1471-2105-7-50-i4" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msqrt><m:mi>n</m:mi></m:msqrt></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaadaGcaaqaaiabd6gaUbWcbeaaaaa@2E2C@</m:annotation></m:semantics></m:math>, to be as close as possible to the most widely used delete-one version of the jackknife. It should be noted that the delete-1-jackknife method may be inconsistent for some estimators (sample quantiles representing a typical example) and the delete-<it>d</it>-jackknife was proposed to remedy this problem <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. When implementing the delete-<it>d</it>-jackknife method, we resorted to sampling without replacement because the empirical Bayes method is very sensitive to ties. As far as subsampling versions of the delete-<it>d</it>-jackknife method are concerned, the schemes with and without replacement are essentially identical when the number of subsamples is large <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>.</p>
            <p>The total number of subsamples was typically equal to 200. In a separate study, we ascertained that the results for 1000 subsamples were largely similar. Let <it>Z </it>be the number of selected genes. The variance of <it>Z </it>is estimated by a resampling counterpart of the jackknife sample variance <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>:</p>
            <p>
               <m:math name="1471-2105-7-50-i5" xmlns:m="http://www.w3.org/1998/Math/MathML">
                  <m:semantics>
                     <m:mrow>
                        <m:mi>V</m:mi>
                        <m:mo>=</m:mo>
                        <m:mfrac>
                           <m:mrow>
                              <m:mi>n</m:mi>
                              <m:mo>&#8722;</m:mo>
                              <m:mi>d</m:mi>
                           </m:mrow>
                           <m:mrow>
                              <m:mi>d</m:mi>
                              <m:mi>B</m:mi>
                           </m:mrow>
                        </m:mfrac>
                        <m:msup>
                           <m:mrow>
                              <m:mstyle displaystyle="true">
                                 <m:munderover>
                                    <m:mo>&#8721;</m:mo>
                                    <m:mrow>
                                       <m:mi>l</m:mi>
                                       <m:mo>=</m:mo>
                                       <m:mn>1</m:mn>
                                    </m:mrow>
                                    <m:mi>B</m:mi>
                                 </m:munderover>
                                 <m:mrow>
                                    <m:mrow>
                                       <m:mo>(</m:mo>
                                       <m:mrow>
                                          <m:msub>
                                             <m:mi>Z</m:mi>
                                             <m:mrow>
                                                <m:mi>n</m:mi>
                                                <m:mo>&#8722;</m:mo>
                                                <m:mi>d</m:mi>
                                                <m:mo>,</m:mo>
                                                <m:mi>l</m:mi>
                                             </m:mrow>
                                          </m:msub>
                                          <m:mo>&#8722;</m:mo>
                                          <m:mfrac>
                                             <m:mn>1</m:mn>
                                             <m:mi>B</m:mi>
                                          </m:mfrac>
                                          <m:mstyle displaystyle="true">
                                             <m:munderover>
                                                <m:mo>&#8721;</m:mo>
                                                <m:mrow>
                                                   <m:mi>k</m:mi>
                                                   <m:mo>=</m:mo>
                                                   <m:mn>1</m:mn>
                                                </m:mrow>
                                                <m:mi>B</m:mi>
                                             </m:munderover>
                                             <m:mrow>
                                                <m:msub>
                                                   <m:mi>Z</m:mi>
                                                   <m:mrow>
                                                      <m:mi>n</m:mi>
                                                      <m:mo>&#8722;</m:mo>
                                                      <m:mi>d</m:mi>
                                                      <m:mo>,</m:mo>
                                                      <m:mi>k</m:mi>
                                                   </m:mrow>
                                                </m:msub>
                                             </m:mrow>
                                          </m:mstyle>
                                       </m:mrow>
                                       <m:mo>)</m:mo>
                                    </m:mrow>
                                 </m:mrow>
                              </m:mstyle>
                           </m:mrow>
                           <m:mn>2</m:mn>
                        </m:msup>
                        <m:mo>,</m:mo>
                     </m:mrow>
                     <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGwbGvcqGH9aqpdaWcaaqaaiabd6gaUjabgkHiTiabdsgaKbqaaiabdsgaKjabdkeacbaadaaeWbqaamaabmaabaGaemOwaO1aaSbaaSqaaiabd6gaUjabgkHiTiabdsgaKjabcYcaSiabdYgaSbqabaGccqGHsisldaWcaaqaaiabigdaXaqaaiabdkeacbaadaaeWbqaaiabdQfaAnaaBaaaleaacqWGUbGBcqGHsislcqWGKbazcqGGSaalcqWGRbWAaeqaaaqaaiabdUgaRjabg2da9iabigdaXaqaaiabdkeacbqdcqGHris5aaGccaGLOaGaayzkaaaaleaacqWGSbaBcqGH9aqpcqaIXaqmaeaacqWGcbGqa0GaeyyeIuoakmaaCaaaleqabaGaeGOmaidaaOGaeiilaWcaaa@5779@</m:annotation>
                  </m:semantics>
               </m:math>
            </p>
            <p>where <it>B </it>is the total number of subsamples (<it>B </it>= 200) and <it>Z</it><sub><it>n</it>-<it>d</it>, <it>l </it></sub>is the statistic <it>Z </it>evaluated at the <it>l</it>th delete-<it>d </it>jackknife subsample. The variance of the number of selected genes was used as a criterion of stability of the testing procedures under study. The corresponding distributions were also estimated. Another criterion was the selection stability for each individual gene measured by the frequency of selection conditional on the event of selection in at least one of the subsamples.</p>
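            <p>The two stability criteria just described can be illustrated with a minimal Python sketch. It assumes that the number of selected genes, and the set of selected genes, have already been computed for each subsample by some selection procedure; the function names, the seed argument, and the toy numbers are hypothetical and are not taken from the study.</p>
            <p><![CDATA[
```python
import random


def delete_d_subsamples(n, d, B, seed=0):
    """Return B sorted index lists of size n - d, each drawn without replacement."""
    rng = random.Random(seed)
    return [sorted(rng.sample(range(n), n - d)) for _ in range(B)]


def jackknife_variance(z, n, d):
    """V = (n - d) / (d * B) * sum over l of (Z_l - mean(Z))^2.

    z holds the number of selected genes in each of the B subsamples.
    """
    B = len(z)
    z_bar = sum(z) / B
    return (n - d) / (d * B) * sum((z_l - z_bar) ** 2 for z_l in z)


def selection_frequency(selected_sets):
    """Selection frequency of every gene selected in at least one subsample."""
    B = len(selected_sets)
    counts = {}
    for genes in selected_sets:
        for g in set(genes):
            counts[g] = counts.get(g, 0) + 1
    return {g: c / B for g, c in counts.items()}


if __name__ == "__main__":
    # Toy input: numbers of selected genes in four subsamples (made-up values).
    print(jackknife_variance([10, 12, 14, 12], n=43, d=7))
    # Toy input: gene identifiers selected in each of three subsamples.
    freq = selection_frequency([{"g1", "g2"}, {"g1"}, {"g1", "g3"}])
    # Keep only genes above an illustrative frequency cut-off of 80%.
    print(sorted(g for g, f in freq.items() if f > 0.8))
```
]]></p>
            <p>Genes never selected in any subsample do not appear in the returned dictionary, which matches the conditioning on selection in at least one subsample.</p>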
         </sec>
         <sec>
            <st>
               <p>Selection of differentially expressed genes</p>
            </st>
            <p>When resorting to the Bonferroni adjustment, one needs to compute unadjusted <it>p</it>-values from the sampling distribution of the test statistic under consideration. For the <it>t</it>-test we used quantiles of the Student distribution. Among the distribution-free methods, the Cram&#233;r-von Mises test <abbrgrp><abbr bid="B17">17</abbr></abbrgrp> represents an appealing alternative to the Kolmogorov-Smirnov test. The reason is that the granularity of the Cram&#233;r-von Mises statistic (which causes granularity of the corresponding <it>p</it>-values) is much smaller than that for the Kolmogorov-Smirnov test. As a result, the <it>p</it>-values corresponding to the critical region increase much more steeply for the Kolmogorov-Smirnov test than for the Cram&#233;r-von Mises test, thereby making the Kolmogorov-Smirnov test less powerful.</p>
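            <p>As a small illustration of the Bonferroni adjustment mentioned above, the following Python sketch multiplies each unadjusted <it>p</it>-value by the number of tests and caps the result at one. The function names and the significance level of 0.05 are assumptions made for this example only; the unadjusted <it>p</it>-values are taken as given.</p>
            <p><![CDATA[
```python
def bonferroni_adjust(p_values):
    """Bonferroni-adjusted p-values: min(1, m * p) for m tests."""
    m = len(p_values)
    return [min(1.0, m * p) for p in p_values]


def select_genes(p_values, alpha=0.05):
    """Indices of genes whose adjusted p-value does not exceed alpha.

    alpha = 0.05 is an illustrative level, not a value taken from the paper.
    """
    adjusted = bonferroni_adjust(p_values)
    return [i for i, p in enumerate(adjusted) if p <= alpha]


if __name__ == "__main__":
    unadjusted = [0.001, 0.02, 0.5]  # made-up unadjusted p-values
    print(bonferroni_adjust(unadjusted))
    print(select_genes(unadjusted))
```
]]></p>
            <p>Because the adjusted value is a multiple of the unadjusted one, any granularity in the unadjusted <it>p</it>-values of a discrete test statistic carries over directly to the adjusted <it>p</it>-values.</p>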
            <p>To describe the Cram&#233;r-von Mises test, consider two independent samples <it>x</it><sub>1</sub>, <it>x</it><sub>2</sub>, ..., <it>x</it><sub><it>m </it></sub>and <it>y</it><sub>1</sub>, <it>y</it><sub>2</sub>, ..., <it>y</it><sub><it>n </it></sub>from distributions <it>F</it>(<it>x</it>) and <it>G</it>(<it>x</it>), respectively, and let <it>F</it><sub><it>m </it></sub>and <it>G</it><sub><it>n </it></sub>be their respective empirical distribution functions. We wish to test the null hypothesis <b>H</b><sub>0</sub>: <it>F</it>(<it>x</it>) = <it>G</it>(<it>x</it>) for all <it>x </it>versus the alternative <it>F </it>&#8800; <it>G</it>. The Cram&#233;r-von Mises test statistic for the hypothesis <b>H</b><sub>0 </sub>is defined as the (squared) <it>L</it><sub>2 </sub>distance between <it>F</it><sub><it>m</it></sub>(<it>x</it>) and <it>G</it><sub><it>n</it></sub>(<it>x</it>):</p>
            <p>
               <m:math name="1471-2105-7-50-i6" xmlns:m="http://www.w3.org/1998/Math/MathML">
                  <m:semantics>
                     <m:mrow>
                        <m:msup>
                           <m:mi>W</m:mi>
                           <m:mn>2</m:mn>
                        </m:msup>
                        <m:mo>=</m:mo>
                        <m:mfrac>
                           <m:mrow>
                              <m:mi>m</m:mi>
                              <m:mi>n</m:mi>
                           </m:mrow>
                           <m:mrow>
                              <m:msup>
                                 <m:mrow>
                                    <m:mo stretchy="false">(</m:mo>
                                    <m:mi>m</m:mi>
                                    <m:mo>+</m:mo>
                                    <m:mi>n</m:mi>
                                    <m:mo stretchy="false">)</m:mo>
                                 </m:mrow>
                                 <m:mn>2</m:mn>
                              </m:msup>
                           </m:mrow>
                        </m:mfrac>
                        <m:mrow>
                           <m:mo>{</m:mo>
                           <m:mrow>
                              <m:mstyle displaystyle="true">
                                 <m:munderover>
                                    <m:mo>&#8721;</m:mo>
                                    <m:mrow>
                                       <m:mi>i</m:mi>
                                       <m:mo>=</m:mo>
                                       <m:mn>1</m:mn>
                                    </m:mrow>
                                    <m:mi>m</m:mi>
                                 </m:munderover>
                                 <m:mrow>
                                    <m:msup>
                                       <m:mrow>
                                          <m:mo stretchy="false">[</m:mo>
                                          <m:msub>
                                             <m:mi>F</m:mi>
                                             <m:mi>m</m:mi>
                                          </m:msub>
                                          <m:mo stretchy="false">(</m:mo>
                                          <m:msub>
                                             <m:mi>x</m:mi>
                                             <m:mi>i</m:mi>
                                          </m:msub>
                                          <m:mo stretchy="false">)</m:mo>
                                          <m:mo>&#8722;</m:mo>
                                          <m:msub>
                                             <m:mi>G</m:mi>
                                             <m:mi>n</m:mi>
                                          </m:msub>
                                          <m:mo stretchy="false">(</m:mo>
                                          <m:msub>
                                             <m:mi>x</m:mi>
                                             <m:mi>i</m:mi>
                                          </m:msub>
                                          <m:mo stretchy="false">)</m:mo>
                                          <m:mo stretchy="false">]</m:mo>
                                       </m:mrow>
                                       <m:mn>2</m:mn>
                                    </m:msup>
                                    <m:mo>+</m:mo>
                                    <m:mstyle displaystyle="true">
                                       <m:munderover>
                                          <m:mo>&#8721;</m:mo>
                                          <m:mrow>
                                             <m:mi>j</m:mi>
                                             <m:mo>=</m:mo>
                                             <m:mn>1</m:mn>
                                          </m:mrow>
                                          <m:mi>n</m:mi>
                                       </m:munderover>
                                       <m:mrow>
                                          <m:msup>
                                             <m:mrow>
                                                <m:mo stretchy="false">[</m:mo>
                                                <m:msub>
                                                   <m:mi>F</m:mi>
                                                   <m:mi>m</m:mi>
                                                </m:msub>
                                                <m:mo stretchy="false">(</m:mo>
                                                <m:msub>
                                                   <m:mi>y</m:mi>
                                                   <m:mi>j</m:mi>
                                                </m:msub>
                                                <m:mo stretchy="false">)</m:mo>
                                                <m:mo>&#8722;</m:mo>
                                                <m:msub>
                                                   <m:mi>G</m:mi>
                                                   <m:mi>n</m:mi>
                                                </m:msub>
                                                <m:mo stretchy="false">(</m:mo>
                                                <m:msub>
                                                   <m:mi>y</m:mi>
                                                   <m:mi>j</m:mi>
                                                </m:msub>
                                                <m:mo stretchy="false">)</m:mo>
                                                <m:mo stretchy="false">]</m:mo>
                                             </m:mrow>
                                             <m:mn>2</m:mn>
                                          </m:msup>
                                       </m:mrow>
                                    </m:mstyle>
                                 </m:mrow>
                              </m:mstyle>
                           </m:mrow>
                           <m:mo>}</m:mo>
                        </m:mrow>
                        <m:mo>.</m:mo>
                     </m:mrow>
                     <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGxbWvdaahaaWcbeqaaiabikdaYaaakiabg2da9maalaaabaGaemyBa0MaemOBa4gabaGaeiikaGIaemyBa0Maey4kaSIaemOBa4MaeiykaKYaaWbaaSqabeaacqaIYaGmaaaaaOWaaiWaaeaadaaeWbqaaiabcUfaBjabdAeagnaaBaaaleaacqWGTbqBaeqaaOGaeiikaGIaemiEaG3aaSbaaSqaaiabdMgaPbqabaGccqGGPaqkcqGHsislcqWGhbWrdaWgaaWcbaGaemOBa4gabeaakiabcIcaOiabdIha4naaBaaaleaacqWGPbqAaeqaaOGaeiykaKIaeiyxa01aaWbaaSqabeaacqaIYaGmaaGccqGHRaWkdaaeWbqaaiabcUfaBjabdAeagnaaBaaaleaacqWGTbqBaeqaaOGaeiikaGIaemyEaK3aaSbaaSqaaiabdQgaQbqabaGccqGGPaqkcqGHsislcqWGhbWrdaWgaaWcbaGaemOBa4gabeaakiabcIcaOiabdMha5naaBaaaleaacqWGQbGAaeqaaOGaeiykaKIaeiyxa01aaWbaaSqabeaacqaIYaGmaaaabaGaemOAaOMaeyypa0JaeGymaedabaGaemOBa4ganiabggHiLdaaleaacqWGPbqAcqGH9aqpcqaIXaqmaeaacqWGTbqBa0GaeyyeIuoaaOGaay5Eaiaaw2haaiabc6caUaaa@722F@</m:annotation>
                  </m:semantics>
               </m:math>
            </p>
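            <p>As an illustration only, the statistic defined above can be computed directly from the two empirical distribution functions. The following is a minimal sketch in Python, not the software used in this study; the function names <it>ecdf</it> and <it>cramer_von_mises_w2</it> and the toy samples are assumptions introduced for the example.</p>
            <p><![CDATA[
```python
from bisect import bisect_right


def ecdf(sample):
    """Return the empirical distribution function of `sample` as a callable."""
    ordered = sorted(sample)
    size = len(ordered)

    def distribution(t):
        # Fraction of observations that are less than or equal to t.
        return bisect_right(ordered, t) / size

    return distribution


def cramer_von_mises_w2(x, y):
    """Two-sample W^2 as written in the text:
    mn/(m+n)^2 * { sum_i [F_m(x_i) - G_n(x_i)]^2 + sum_j [F_m(y_j) - G_n(y_j)]^2 }.
    """
    m, n = len(x), len(y)
    f_m, g_n = ecdf(x), ecdf(y)
    # Both sums run over the pooled observations, so they can be merged.
    total = sum((f_m(t) - g_n(t)) ** 2 for t in list(x) + list(y))
    return m * n / (m + n) ** 2 * total


# Toy samples (hypothetical values, chosen only to exercise the formula).
w2_same = cramer_von_mises_w2([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
w2_apart = cramer_von_mises_w2([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```
]]></p>
            <p>Identical samples give a statistic of zero, and the statistic grows as the two empirical distribution functions separate. Obtaining exact <it>p</it>-values from the statistic requires the null distribution, which this sketch does not attempt.</p>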
            <p>The asymptotic theory for the Cram&#233;r-von Mises test was developed by Anderson and Darling <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>, Rosenblatt <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>, Anderson <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>, and Csorgo and Faraway <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>. The asymptotic approximation of the distribution of <it>W</it><sup>2 </sup>under <b>H</b><sub>0 </sub>is of little utility in microarray data analysis because it is very inaccurate for the extremely small <it>p</it>-values required by FWER-controlling multiple testing procedures. For example, when <it>n </it>= <it>m </it>= 43 and the exact <it>p</it>-value for the Cram&#233;r-von Mises test is equal to 3.928 &#215; 10<sup>-6</sup>, its asymptotic approximation gives 6.897 &#215; 10<sup>-6</sup>, a much larger <it>p</it>-value. The above-mentioned exact <it>p</it>-value of 3.928 &#215; 10<sup>-6</sup> corresponds to a Bonferroni-adjusted <it>p</it>-value of about 0.049 when testing 12558 hypotheses, as in our application described in Section 3.1. Therefore, one needs an algorithm for computing exact quantiles of the Cram&#233;r-von Mises sampling distribution. We used the method proposed by Burr <abbrgrp><abbr bid="B22">22</abbr></abbrgrp> for this purpose. It should be noted that the needed small <it>p</it>-values for the Cram&#233;r-von Mises test cannot be estimated with sufficient accuracy by permuting the test-statistics, because the required number of permutations is astronomical <abbrgrp><abbr bid="B23">23</abbr></abbrgrp> and cannot be accomplished with present-day hardware.</p>
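            <p>The arithmetic behind this example can be checked in a few lines. The sketch below assumes the standard Bonferroni adjustment (the unadjusted <it>p</it>-value multiplied by the number of hypotheses and capped at one); the function name <it>bonferroni_adjust</it> is introduced only for illustration.</p>
            <p><![CDATA[
```python
def bonferroni_adjust(p_value, n_tests):
    """Bonferroni-adjusted p-value: p times the number of tests, capped at 1."""
    return min(1.0, p_value * n_tests)


N_HYPOTHESES = 12558      # number of hypotheses quoted in the text
EXACT_P = 3.928e-6        # exact p-value quoted in the text
ASYMPTOTIC_P = 6.897e-6   # asymptotic approximation quoted in the text

adjusted_exact = bonferroni_adjust(EXACT_P, N_HYPOTHESES)
adjusted_asymptotic = bonferroni_adjust(ASYMPTOTIC_P, N_HYPOTHESES)

# The unadjusted threshold that corresponds to FWER = 0.05.
per_test_threshold = 0.05 / N_HYPOTHESES
```
]]></p>
            <p>Under these assumptions the exact <it>p</it>-value yields an adjusted value just below 0.05, whereas the asymptotic approximation yields an adjusted value of roughly 0.087, so the two would lead to opposite decisions at the 0.05 level.</p>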
            <p>The Westfall-Young step-down algorithm <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> bypasses the stage of computing unadjusted <it>p</it>-values and goes directly to the estimation of adjusted <it>p</it>-values at a given level of the FWER. We carried out 10,000 permutations to model the null distribution of each test statistic. We also used the multiple testing adjustment proposed by Benjamini and Hochberg <abbrgrp><abbr bid="B10">10</abbr></abbrgrp> and its modification by Benjamini and Yekutieli <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>. The more conservative Benjamini and Yekutieli procedure is warranted with normalized data because the quantile normalization is known to induce negative correlations in microarray data <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. The nonparametric empirical Bayes method by Efron et al. <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr></abbrgrp> was also used in the present paper. We used kernel smoothing (with the Gaussian kernel) for density estimation to implement the empirical Bayes method. The threshold level of the posterior probability was set at 0.95.</p>
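            <p>To make the difference between the Benjamini-Hochberg and Benjamini-Yekutieli adjustments concrete, the following sketch computes adjusted <it>p</it>-values under the usual textbook formulation of the two step-up procedures. It is not the code used in this study; the function name <it>step_up_adjust</it>, its <it>dependency_correction</it> flag and the toy <it>p</it>-values are assumptions made for the example.</p>
            <p><![CDATA[
```python
def step_up_adjust(p_values, dependency_correction=False):
    """Adjusted p-values for the Benjamini-Hochberg step-up procedure.

    With dependency_correction=True the Benjamini-Yekutieli factor
    c(m) = 1 + 1/2 + ... + 1/m is applied, which makes the procedure
    more conservative.
    """
    m = len(p_values)
    c_m = sum(1.0 / k for k in range(1, m + 1)) if dependency_correction else 1.0
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values are monotone in the original ordering and never exceed 1.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m * c_m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical unadjusted p-values, for illustration only.
toy_p = [0.001, 0.008, 0.039, 0.041, 0.6]
bh = step_up_adjust(toy_p)
by = step_up_adjust(toy_p, dependency_correction=True)
```
]]></p>
            <p>Every Benjamini-Yekutieli adjusted value is at least as large as the corresponding Benjamini-Hochberg value, which is the sense in which the former procedure is more conservative.</p>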
            <p>To distinguish between different statistical procedures, we use the following notation:</p>
            <p>B/t &#8211; <it>t</it>-test with Bonferroni adjustment;</p>
            <p>B/KS &#8211; Kolmogorov-Smirnov test with Bonferroni adjustment;</p>
            <p>B/CVM &#8211; Cram&#233;r-von Mises test with Bonferroni adjustment;</p>
            <p>WY/t &#8211; <it>t</it>-test with Westfall-Young algorithm;</p>
            <p>WY/KS &#8211; Kolmogorov-Smirnov test with Westfall-Young algorithm;</p>
            <p>WY/CVM &#8211; Cram&#233;r-von Mises test with Westfall-Young algorithm;</p>
            <p>BH/t &#8211; <it>t</it>-test with Benjamini-Hochberg adjustment;</p>
            <p>BY/t &#8211; <it>t</it>-test with Benjamini-Yekutieli adjustment;</p>
            <p>EB/t &#8211; <it>t</it>-test with gene selection by nonparametric empirical Bayes method.</p>
         </sec>
         <sec>
            <st>
               <p>False discovery rate and power</p>
            </st>
            <p>We provide estimates of the FDR only for simulations. We do not report FDR estimates for biological data because only indirect methods <abbrgrp><abbr bid="B24">24</abbr><abbr bid="B25">25</abbr></abbrgrp> are available in this case. Such methods introduce an additional variation in the estimates which is impossible to distinguish from that caused by a given selection procedure. In our simulation studies, the true FDR was estimated directly as the proportion of false discoveries among all discoveries. Then the sample mean (across the 200 samples) of this nonparametric estimate is reported together with the corresponding standard deviation. It happened only once (when applying the Kolmogorov-Smirnov test with Bonferroni adjustment to a sample of size <it>n </it>= 15) that we set the estimated FDR at zero (see <abbrgrp><abbr bid="B26">26</abbr></abbrgrp> for the definition of the positive FDR). Since the expression levels of the 125 differentially expressed genes are identically distributed, the power can be defined as the expected proportion of correct discoveries among the 125 true alternative hypotheses. We provide the usual nonparametric estimates of the power thus defined and its standard deviation.</p>
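            <p>The estimates described above can be sketched as follows, assuming that each simulation run yields a set of selected gene indices and that the indices of the 125 differentially expressed genes are known. The function name <it>fdr_and_power</it>, the toy runs and the use of the sample standard deviation are assumptions made for this illustration.</p>
            <p><![CDATA[
```python
from statistics import mean, stdev

N_TRUE = 125  # number of differentially expressed genes (from the text)


def fdr_and_power(selected, true_alternatives):
    """Per-run estimates: proportion of false discoveries among all
    discoveries, and proportion of true alternatives that were selected."""
    selected = set(selected)
    true_alternatives = set(true_alternatives)
    false_discoveries = len(selected - true_alternatives)
    # When nothing is selected the proportion is undefined; it is set to zero.
    fdr = false_discoveries / len(selected) if selected else 0.0
    power = len(selected & true_alternatives) / len(true_alternatives)
    return fdr, power


# Hypothetical runs: genes 0..124 play the role of the true alternatives.
true_alternatives = set(range(N_TRUE))
runs = [
    set(range(90)) | {500},  # 90 correct discoveries and 1 false discovery
    set(range(100)),         # 100 correct discoveries, none false
    set(),                   # no gene selected
]
results = [fdr_and_power(run, true_alternatives) for run in runs]
fdr_values = [r[0] for r in results]
power_values = [r[1] for r in results]

# Summaries across runs, analogous to the entries reported for simulations.
fdr_mean, fdr_sd = mean(fdr_values), stdev(fdr_values)
power_mean, power_sd = mean(power_values), stdev(power_values)
```
]]></p>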
         </sec>
         <sec>
            <st>
               <p>Software</p>
            </st>
            <p>The relevant software is included in the Additional Material Files [see <supplr sid="S1">additional file 1</supplr>].</p>
            <suppl id="S1">
               <title>
                  <p>Additional File 1</p>
               </title>
               <text>
                  <p>includes the executable programs employed in this study</p>
               </text>
               <file name="1471-2105-7-50-S1.zip">
                  <p>Click here for file</p>
               </file>
            </suppl>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Results</p>
         </st>
         <sec>
            <st>
               <p>Analysis of biological data</p>
            </st>
            <p>Table <tblr tid="T1">1</tblr> presents the results of the delete-7-jackknife subsampling applied to the selected set of biological data. In this study, the FWER is controlled at the level of 0.05. Shown in parentheses is the percentage of "stable" genes, that is, genes occurring in the set of selected genes in at least 80% of the subsamples, relative to the mean (over the 200 subsamples) number of selected genes. This percentage remains practically unchanged when changing the FWER control level. The standard deviation of the number of selected genes is quite high for all the procedures studied. The proportion of highly stable (with at least 80% occurrence) genes appears to be virtually the same for all the tests and multiple testing procedures. However, the situation is not the same when looking at less frequent genes as discussed below.</p>
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>Delete-<it>d</it>-jackknife subsampling for the biological data with <it>d </it>= 7.</p>
               </caption>
               <tblbdy cols="4">
                  <r>
                     <c ca="left">
                        <p>Method</p>
                     </c>
                     <c cspan="3" ca="left">
                        <p>Leave-seven-out Jackknife</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c cspan="3">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Mean number of selected genes</p>
                     </c>
                     <c ca="left">
                        <p>Standard deviation</p>
                     </c>
                     <c ca="left">
                        <p>Mean number of stable genes and its proportion to the mean of selected genes</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="4">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>B/KS</p>
                     </c>
                     <c ca="left">
                        <p>622</p>
                     </c>
                     <c ca="left">
                        <p>71</p>
                     </c>
                     <c ca="left">
                        <p>504(80.99%)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>B/CVM</p>
                     </c>
                     <c ca="left">
                        <p>1096</p>
                     </c>
                     <c ca="left">
                        <p>123</p>
                     </c>
                     <c ca="left">
                        <p>853(77.80%)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>B/t</p>
                     </c>
                     <c ca="left">
                        <p>775</p>
                     </c>
                     <c ca="left">
                        <p>103</p>
                     </c>
                     <c ca="left">
                        <p>644(83.05%)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>WY/KS</p>
                     </c>
                     <c ca="left">
                        <p>685</p>
                     </c>
                     <c ca="left">
                        <p>153</p>
                     </c>
                     <c ca="left">
                        <p>533(77.82%)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>WY/CVM</p>
                     </c>
                     <c ca="left">
                        <p>889</p>
                     </c>
                     <c ca="left">
                        <p>124</p>
                     </c>
                     <c ca="left">
                        <p>711(79.98%)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>WY/t</p>
                     </c>
                     <c ca="left">
                        <p>876</p>
                     </c>
                     <c ca="left">
                        <p>110</p>
                     </c>
                     <c ca="left">
                        <p>726(82.89%)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>EB/t</p>
                     </c>
                     <c ca="left">
                        <p>1867</p>
                     </c>
                     <c ca="left">
                        <p>438</p>
                     </c>
                     <c ca="left">
                        <p>1481(79%)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>BH/t</p>
                     </c>
                     <c ca="left">
                        <p>2726</p>
                     </c>
                     <c ca="left">
                        <p>445</p>
                     </c>
                     <c ca="left">
                        <p>2176(80%)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>BY/t</p>
                     </c>
                     <c ca="left">
                        <p>1599</p>
                     </c>
                     <c ca="left">
                        <p>222</p>
                     </c>
                     <c ca="left">
                        <p>1282(80%)</p>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
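            <p>One way to read the quantities in Table <tblr tid="T1">1</tblr> is sketched below: given the set of genes selected in each subsample, a gene's frequency of occurrence is the fraction of subsamples in which it is selected, and a gene is called stable when this frequency is at least 0.8. This is an illustrative interpretation rather than the program used in this study; the function names and the toy gene labels are assumptions.</p>
            <p><![CDATA[
```python
from collections import Counter


def selection_frequencies(selected_sets):
    """Fraction of subsamples in which each gene was selected."""
    n_subsamples = len(selected_sets)
    counts = Counter(gene for s in selected_sets for gene in set(s))
    return {gene: count / n_subsamples for gene, count in counts.items()}


def stability_summary(selected_sets, threshold=0.8):
    """Mean number of selected genes, the list of stable genes, and the
    ratio of the number of stable genes to the mean number selected."""
    frequencies = selection_frequencies(selected_sets)
    mean_selected = sum(len(set(s)) for s in selected_sets) / len(selected_sets)
    stable = sorted(g for g, f in frequencies.items() if f >= threshold)
    proportion = len(stable) / mean_selected if mean_selected else 0.0
    return mean_selected, stable, proportion


# Five hypothetical subsamples over four hypothetical genes.
toy_sets = [{"a", "b", "c"}, {"a", "b"}, {"a", "b", "d"}, {"a", "c"}, {"a", "b"}]
mean_selected, stable_genes, stable_proportion = stability_summary(toy_sets)
```
]]></p>
            <p>In the toy input, gene "a" is selected in all five subsamples and gene "b" in four of them, so both meet the 80% requirement, while "c" and "d" do not.</p>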
            <p>Shown in Figure <figr fid="F1">1</figr> are the proportions of genes with different frequencies of selection among those genes that have been selected at least once in the course of delete-seven subsampling. It is seen from this figure that the histograms are <it>U</it>-shaped so that one can distinguish two extreme groups of genes characterized by high and low stability, respectively. The proportions of genes in each "intermediate-frequency" category are relatively small. This phenomenon persists for all the statistical tests under study when the FWER is controlled either by the Bonferroni adjustment or by the Westfall-Young permutation algorithm. It is clear from Figure <figr fid="F1">1</figr> that the population of genes selected at least once across all subsamples is heterogeneous with respect to their stability characterized by the frequency of selection.</p>
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>Histograms of the frequency of occurrence in the set of selected genes obtained by delete-7-jackknife subsampling from the SJCRH data</p>
               </caption>
               <text>
                  <p>Histograms of the frequency of occurrence in the set of selected genes obtained by delete-7-jackknife subsampling from the SJCRH data.</p>
               </text>
               <graphic file="1471-2105-7-50-1"/>
            </fig>
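            <p>The histograms in Figure <figr fid="F1">1</figr> summarize, for the genes selected at least once, how often each gene was selected. A minimal sketch of such a summary is given below; the bin layout (ten equal-width bins), the function name <it>frequency_histogram</it> and the toy counts are assumptions and need not match the binning used for the figure.</p>
            <p><![CDATA[
```python
def frequency_histogram(selection_counts, n_subsamples, n_bins=10):
    """Proportion of genes falling into each equal-width frequency bin.

    selection_counts holds, for every gene selected at least once, the
    number of subsamples in which it was selected (1 to n_subsamples).
    """
    bins = [0] * n_bins
    for count in selection_counts:
        # Integer arithmetic avoids floating-point edge effects; a gene
        # selected in every subsample is placed in the last bin.
        index = min(count * n_bins // n_subsamples, n_bins - 1)
        bins[index] += 1
    total = len(selection_counts)
    return [b / total for b in bins]


# Hypothetical counts out of 200 subsamples, chosen to give a U shape.
toy_counts = [1, 3, 5, 10, 100, 190, 199, 200, 200]
proportions = frequency_histogram(toy_counts, n_subsamples=200)
```
]]></p>
            <p>With these toy counts most of the mass sits in the first and last bins, which is the <it>U</it>-shaped pattern described in the text.</p>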
            <p>To gain better insight into this heterogeneity, it makes sense to look at the relationship between the frequency of occurrence and the corresponding <it>p</it>-values. To this end, we produced scatter plots for the frequency of occurrence in the set of selected genes across the subsamples and the original adjusted <it>p</it>-values determined by the application of each testing procedure to the whole set of arrays. The results for the <it>t</it>-test with Bonferroni adjustment are given in Figure <figr fid="F2">2</figr>. The leave-seven-out resampling reveals a non-linear (but still monotonic) pattern showing that the relationship in question may be quite complex. For comparison, we also present the result for the leave-one-out procedure, in which case the dependence appears to be almost linear but the scatter of points is wide because this procedure does not perturb the data sufficiently. In what follows, we will discuss only the observations resulting from the delete-7-jackknife subsampling.</p>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>Frequency of occurrence in the set of selected genes versus adjusted <it>p</it>-values for the <it>t</it>-test with Bonferroni adjustment</p>
               </caption>
               <text>
                  <p>Frequency of occurrence in the set of selected genes versus adjusted <it>p</it>-values for the <it>t</it>-test with Bonferroni adjustment. Left panel: delete-1-jackknife subsampling, right panel: delete-7-jackknife subsampling.</p>
               </text>
               <graphic file="1471-2105-7-50-2"/>
            </fig>
            <p>The results for the <it>t</it>-test and the Cram&#233;r-von Mises test with Bonferroni adjustment are compared in Figure <figr fid="F3">3</figr>. It is clear that the genes selected by the Cram&#233;r-von Mises test are uniformly more stable than those selected by the <it>t</it>-test. The difference is much less pronounced with the Westfall-Young algorithm as evidenced by Figure <figr fid="F4">4</figr>. Both multiple testing procedures yield similar scatter plots for the <it>t</it>-test showing its overall poor stability in comparison to the Cram&#233;r-von Mises test (Figure <figr fid="F5">5</figr>). In contrast, the stability of the Cram&#233;r-von Mises test can be increased substantially when using the more conservative Bonferroni adjustment in place of the Westfall-Young procedure (Figure <figr fid="F6">6</figr>). These results show that the stability of gene selection provides important additional information on each selected gene, and that this information can be extracted from real data by resorting to resampling techniques.</p>
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>Frequency of occurrence in the set of selected genes versus adjusted <it>p</it>-values for the <it>t</it>- and Cram&#233;r-von Mises test with Bonferroni adjustment</p>
               </caption>
               <text>
                  <p>Frequency of occurrence in the set of selected genes versus adjusted <it>p</it>-values for the <it>t</it>- and Cram&#233;r-von Mises test with Bonferroni adjustment.</p>
               </text>
               <graphic file="1471-2105-7-50-3"/>
            </fig>
            <fig id="F4">
               <title>
                  <p>Figure 4</p>
               </title>
               <caption>
                  <p>Frequency of occurrence in the set of selected genes versus adjusted <it>p</it>-values for the <it>t</it>- and Cram&#233;r-von Mises test with Westfall-Young algorithm</p>
               </caption>
               <text>
                  <p>Frequency of occurrence in the set of selected genes versus adjusted <it>p</it>-values for the <it>t</it>- and Cram&#233;r-von Mises test with Westfall-Young algorithm.</p>
               </text>
               <graphic file="1471-2105-7-50-4"/>
            </fig>
            <fig id="F5">
               <title>
                  <p>Figure 5</p>
               </title>
               <caption>
                  <p>Frequency of occurrence in the set of selected genes versus adjusted <it>p</it>-value for the <it>t</it>-test with Bonferroni adjustment and Westfall-Young algorithm</p>
               </caption>
               <text>
                  <p>Frequency of occurrence in the set of selected genes versus adjusted <it>p</it>-value for the <it>t</it>-test with Bonferroni adjustment and Westfall-Young algorithm.</p>
               </text>
               <graphic file="1471-2105-7-50-5"/>
            </fig>
            <fig id="F6">
               <title>
                  <p>Figure 6</p>
               </title>
               <caption>
                  <p>Frequency of occurrence in the set of selected genes versus adjusted <it>p</it>-values for the Cram&#233;r-von Mises test with Bonferroni adjustment and Westfall-Young algorithm</p>
               </caption>
               <text>
                  <p>Frequency of occurrence in the set of selected genes versus adjusted <it>p</it>-values for the Cram&#233;r-von Mises test with Bonferroni adjustment and Westfall-Young algorithm.</p>
               </text>
               <graphic file="1471-2105-7-50-6"/>
            </fig>
            <p>The mean values and standard deviations of the number of genes selected by different multiple testing procedures are reported in Table <tblr tid="T1">1</tblr>. It is also interesting to look at the shape of the corresponding distribution. Figure <figr fid="F7">7</figr> shows that this shape varies widely for different procedures. The nearly symmetric form of this distribution in combination with a relatively small variance is an appealing feature of the Cram&#233;r-von Mises test.</p>
            <fig id="F7">
               <title>
                  <p>Figure 7</p>
               </title>
               <caption>
                  <p>Histograms of the number of selected genes across 200 subsamples for different methods applied to the SJCRH data</p>
               </caption>
               <text>
                  <p>Histograms of the number of selected genes across 200 subsamples for different methods applied to the SJCRH data.</p>
               </text>
               <graphic file="1471-2105-7-50-7"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>Analysis of simulated data</p>
            </st>
            <p>To demonstrate the effect of correlation between gene expression levels on the performance of gene selection procedures, we carried out simulation studies as described in Section 2.2. Table <tblr tid="T2">2</tblr> presents the most basic performance indicators for the sample size <it>n </it>= <it>m </it>= 15. Since the simulated data are normally distributed, it comes as no surprise that the <it>t</it>-test proves to be the most powerful among the tests under study. With this small sample size, however, even the <it>t</it>-test tends to be underpowered when used in combination with the Bonferroni or Westfall-Young adjustment. The power of the <it>t</it>-test is much higher with the Benjamini-Hochberg and nonparametric empirical Bayes procedures. The variance of the estimated power as well as the number of selected genes increases dramatically with increasing correlation between gene expression signals.</p>
            <tbl id="T2">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p>Simulating the basic characteristics of gene selection procedures, 125 differentially expressed genes, 200 simulation runs, <it>n </it>= 15. The table presents mean values over simulation runs. Standard deviations are given in parentheses.</p>
               </caption>
               <tblbdy cols="7">
                  <r>
                     <c ca="left">
                        <p>Method</p>
                     </c>
                     <c cspan="2" ca="left">
                        <p>Number of Selected Genes</p>
                     </c>
                     <c cspan="2" ca="left">
                        <p>FDR</p>
                     </c>
                     <c cspan="2" ca="left">
                        <p>Power</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>S15</p>
                     </c>
                     <c ca="left">
                        <p>S15COR</p>
                     </c>
                     <c ca="left">
                        <p>S15</p>
                     </c>
                     <c ca="left">
                        <p>S15COR</p>
                     </c>
                     <c ca="left">
                        <p>S15</p>
                     </c>
                     <c ca="left">
                        <p>S15COR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>B/KS</p>
                     </c>
                     <c ca="left">
                        <p>36(5.3)</p>
                     </c>
                     <c ca="left">
                        <p>34(18.8)</p>
                     </c>
                     <c ca="left">
                        <p>&lt;0.0006</p>
                     </c>
                     <c ca="left">
                        <p>&lt;0.0001</p>
                     </c>
                     <c ca="left">
                        <p>0.28(0.04)</p>
                     </c>
                     <c ca="left">
                        <p>0.27(0.15)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>B/CVM</p>
                     </c>
                     <c ca="left">
                        <p>80(4.9)</p>
                     </c>
                     <c ca="left">
                        <p>80(25.0)</p>
                     </c>
                     <c ca="left">
                        <p>&lt;0.0008</p>
                     </c>
                     <c ca="left">
                        <p>&lt;0.0004</p>
                     </c>
                     <c ca="left">
                        <p>0.64(0.04)</p>
                     </c>
                     <c ca="left">
                        <p>0.64(0.20)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>B/t</p>
                     </c>
                     <c ca="left">
                        <p>89(5.5)</p>
                     </c>
                     <c ca="left">
                        <p>88(24.7)</p>
                     </c>
                     <c ca="left">
                        <p>&lt;0.0007</p>
                     </c>
                     <c ca="left">
                        <p>&lt;0.0008</p>
                     </c>
                     <c ca="left">
                        <p>0.71(0.04)</p>
                     </c>
                     <c ca="left">
                        <p>0.70(0.20)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>WY/KS</p>
                     </c>
                     <c ca="left">
                        <p>36(4.7)</p>
                     </c>
                     <c ca="left">
                        <p>53(26.3)</p>
                     </c>
                     <c ca="left">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>&lt;0.0008</p>
                     </c>
                     <c ca="left">
                        <p>0.29(0.04)</p>
                     </c>
                     <c ca="left">
                        <p>0.43(0.21)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>WY/CVM</p>
                     </c>
                     <c ca="left">
                        <p>81(5.6)</p>
                     </c>
                     <c ca="left">
                        <p>90(21.25)</p>
                     </c>
                     <c ca="left">
                        <p>&lt;0.0003</p>
                     </c>
                     <c ca="left">
                        <p>&lt;0.0008</p>
                     </c>
                     <c ca="left">
                        <p>0.65(0.04)</p>
                     </c>
                     <c ca="left">
                        <p>0.72(0.17)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>WY/t</p>
                     </c>
                     <c ca="left">
                        <p>90(5.4)</p>
                     </c>
                     <c ca="left">
                        <p>98(19.66)</p>
                     </c>
                     <c ca="left">
                        <p>&lt;0.0007</p>
                     </c>
                     <c ca="left">
                        <p>0.0009</p>
                     </c>
                     <c ca="left">
                        <p>0.72(0.04)</p>
                     </c>
                     <c ca="left">
                        <p>0.79(0.16)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>BH/t</p>
                     </c>
                     <c ca="left">
                        <p>130(2.8)</p>
                     </c>
                     <c ca="left">
                        <p>139(73.5)</p>
                     </c>
                     <c ca="left">
                        <p>0.048(0.019)</p>
                     </c>
                     <c ca="left">
                        <p>0.051(0.135)</p>
                     </c>
                     <c ca="left">
                        <p>0.99(0.01)</p>
                     </c>
                     <c ca="left">
                        <p>0.99(0.03)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>EB/t</p>
                     </c>
                     <c ca="left">
                        <p>116(3.0)</p>
                     </c>
                     <c ca="left">
                        <p>141(100.0)</p>
                     </c>
                     <c ca="left">
                        <p>0.012(0.006)</p>
                     </c>
                     <c ca="left">
                        <p>0.052(0.157)</p>
                     </c>
                     <c ca="left">
                        <p>0.92(0.02)</p>
                     </c>
                     <c ca="left">
                        <p>0.96(0.07)</p>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
            <p>Table <tblr tid="T3">3</tblr> shows the results for a larger sample size (<it>n </it>= <it>m </it>= 43). In this case, all the methods attain 100% power. For all the FWER-controlling procedures, the mean number of selected genes equals 125, the number of truly differentially expressed genes, and the corresponding standard deviation is small (at most 0.5) irrespective of the presence or absence of correlation between gene expression levels. The FDR estimates are also uniformly small for these procedures, as indicated by Table <tblr tid="T3">3</tblr>. However, for the Benjamini-Hochberg and nonparametric empirical Bayes procedures the effect of correlation on the standard deviation of the number of selected genes is still very strong (compare with Table <tblr tid="T2">2</tblr>): it rises from 2.7 to 90.0 for BH/t and from 0.2 to 60.0 for EB/t, indicating the inherent instability of these procedures. Correlation also has a dramatic effect on the standard deviation of the FDR observed for the latter two procedures (Table <tblr tid="T3">3</tblr>). The results for 1000 simulation runs were largely similar.</p>
            <tbl id="T3">
               <title>
                  <p>Table 3</p>
               </title>
               <caption>
                  <p>Simulating the basic characteristics of gene selection procedures, 125 differentially expressed genes, 200 simulation runs, <it>n </it>= 43. The table presents mean values over simulation runs. Standard deviations are given in parentheses.</p>
               </caption>
               <tblbdy cols="7">
                  <r>
                     <c ca="left">
                        <p>Method</p>
                     </c>
                     <c cspan="2" ca="left">
                        <p>Number of Selected Genes</p>
                     </c>
                     <c cspan="2" ca="left">
                        <p>FDR</p>
                     </c>
                     <c cspan="2" ca="left">
                        <p>Power</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>SIM43</p>
                     </c>
                     <c ca="left">
                        <p>SIM43CORR</p>
                     </c>
                     <c ca="left">
                        <p>SIM43</p>
                     </c>
                     <c ca="left">
                        <p>SIM43CORR</p>
                     </c>
                     <c ca="left">
                        <p>SIM43</p>
                     </c>
                     <c ca="left">
                        <p>SIM43CORR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>B/KS</p>
                     </c>
                     <c ca="left">
                        <p>125(0.3)</p>
                     </c>
                     <c ca="left">
                        <p>125(0.5)</p>
                     </c>
                     <c ca="left">
                        <p>&lt;0.0003</p>
                     </c>
                     <c ca="left">
                        <p>&lt;0.0001</p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>B/CVM</p>
                     </c>
                     <c ca="left">
                        <p>125(0.3)</p>
                     </c>
                     <c ca="left">
                        <p>125(0.3)</p>
                     </c>
                     <c ca="left">
                        <p>&lt;0.0005</p>
                     </c>
                     <c ca="left">
                        <p>&lt;0.0005</p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>B/t</p>
                     </c>
                     <c ca="left">
                        <p>125(0.2)</p>
                     </c>
                     <c ca="left">
                        <p>125(0.4)</p>
                     </c>
                     <c ca="left">
                        <p>&lt;0.0003</p>
                     </c>
                     <c ca="left">
                        <p>&lt;0.0006</p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>WY/KS</p>
                     </c>
                     <c ca="left">
                        <p>125(0.4)</p>
                     </c>
                     <c ca="left">
                        <p>125(0.4)</p>
                     </c>
                     <c ca="left">
                        <p>&lt;0.0002</p>
                     </c>
                     <c ca="left">
                        <p>&lt;0.0003</p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>WY/CVM</p>
                     </c>
                     <c ca="left">
                        <p>125(0.3)</p>
                     </c>
                     <c ca="left">
                        <p>125(0.4)</p>
                     </c>
                     <c ca="left">
                        <p>&lt;0.0006</p>
                     </c>
                     <c ca="left">
                        <p>&lt;0.0010</p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>WY/t</p>
                     </c>
                     <c ca="left">
                        <p>125(0.2)</p>
                     </c>
                     <c ca="left">
                        <p>125(0.3)</p>
                     </c>
                     <c ca="left">
                        <p>&lt;0.0003</p>
                     </c>
                     <c ca="left">
                        <p>&lt;0.0008</p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>BH/t</p>
                     </c>
                     <c ca="left">
                        <p>131(2.7)</p>
                     </c>
                     <c ca="left">
                        <p>140(90.0)</p>
                     </c>
                     <c ca="left">
                        <p>0.0427(0.0192)</p>
                     </c>
                     <c ca="left">
                        <p>0.0356(0.1219)</p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>EB/t</p>
                     </c>
                     <c ca="left">
                        <p>125(0.2)</p>
                     </c>
                     <c ca="left">
                        <p>133(60.0)</p>
                     </c>
                     <c ca="left">
                        <p>0.0082(0.0012)</p>
                     </c>
                     <c ca="left">
                        <p>0.0246(0.1024)</p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
            <p>We also include histograms of the number of selected genes resulting from our simulation studies [see <supplr sid="S2">additional file 2</supplr>]. Note that the high variance observed for the BH/t and EB/t procedures (Figures <figr fid="F2">2</figr> and <figr fid="F4">4</figr> in <supplr sid="S2">Additional File 2</supplr>) is mainly attributable to outliers.</p>
            <suppl id="S2">
               <title>
                  <p>Additional File 2</p>
               </title>
               <text>
                  <p>Four figures showing histograms of the number of selected genes in the simulation studies reported in Section 2.2.</p>
               </text>
               <file name="1471-2105-7-50-S2.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Discussion</p>
         </st>
         <p>Numerous publications have considered the utility of multiple testing procedures in the context of microarray data analysis (see <abbrgrp><abbr bid="B27">27</abbr></abbrgrp> for a review). However, little attention has been given to the replication stability of such procedures, which is related to the reproducibility of scientific results. Our study shows that the variance of the number of genes declared differentially expressed can be very high for multiple testing procedures even with reasonably large sample sizes. Whenever this is the case, the stability of membership in the list of candidate genes should be expected to be low. The converse, however, does not hold: even if the variance of the total number of selected genes is low, there can still be tangible variations in the stability of selection for individual genes, which affect the composition of the resultant list of candidate genes. This can have a strong effect on the ranking of candidate genes based on purely statistical criteria.</p>
         <p>The present study demonstrates that the proportion of highly stable genes (those selected with a frequency of more than 80%) appears to be almost the same for all the selection procedures under study. At the same time, the overall stability of gene selection varies among different methods. The Cram&#233;r-von Mises test seems to be superior to other methods in this respect. It is difficult to control the stability of gene selection by an additional adjustment of <it>p</it>-values. Indeed, for the FWER-controlling procedures, the relationship between the original (adjusted for multiple testing) <it>p</it>-values and the selection frequency appears to be non-linear. However, resampling techniques represent a universal tool for assessing the stability in question with the data at hand. As was emphasized in Section 1, our simulation studies were designed to demonstrate that the correlation between gene expression levels can affect the results of testing two-sample marginal hypotheses. The FDR-controlling procedures appear to be especially sensitive to this effect. Our recent study <abbrgrp><abbr bid="B28">28</abbr></abbrgrp> pinpoints specific components of the empirical Bayes methodology where this effect manifests itself. The quantitative contribution of the correlation between gene expression levels to the outcomes of microarray data analysis is difficult to estimate because no tools are available to model the actual correlation structure of such a large number of variables in computer simulations.</p>
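         <p>The idea of using resampling to assess the stability of selection can be illustrated with a minimal sketch. It is not the implementation used in this study: the function names (welch_t, select_genes, selection_frequencies, toy_data), the subsample size, the Bonferroni adjustment with a normal approximation to the t-statistic, and the synthetic Gaussian data are all assumptions made for illustration. The sketch repeatedly draws subsamples of arrays without replacement, reruns the selection on each subsample, and reports how often each gene is selected.</p>

```python
import random
import statistics
from statistics import NormalDist


def welch_t(x, y):
    """Welch two-sample t-statistic for two lists of expression values."""
    se2 = statistics.variance(x) / len(x) + statistics.variance(y) / len(y)
    if se2 == 0.0:
        return 0.0  # degenerate gene without variation: ignored in this sketch
    return (statistics.mean(x) - statistics.mean(y)) / se2 ** 0.5


def select_genes(a, b, alpha=0.05):
    """Indices of genes whose Bonferroni-adjusted p-value does not exceed alpha.

    a[g] and b[g] hold the values of gene g in the two groups. The p-value is
    based on a normal approximation to the t-statistic, which is an assumption
    of this sketch and not one of the tests compared in the paper.
    """
    n_genes = len(a)
    norm = NormalDist()
    selected = []
    for g in range(n_genes):
        p = 2.0 * (1.0 - norm.cdf(abs(welch_t(a[g], b[g]))))
        if alpha >= min(1.0, p * n_genes):
            selected.append(g)
    return selected


def selection_frequencies(a, b, subsample_size, n_subsamples=200,
                          alpha=0.05, seed=0):
    """Fraction of random subsamples in which each gene is selected."""
    rng = random.Random(seed)
    counts = [0] * len(a)
    for _ in range(n_subsamples):
        # Draw arrays without replacement, independently in each group.
        ia = rng.sample(range(len(a[0])), subsample_size)
        ib = rng.sample(range(len(b[0])), subsample_size)
        sub_a = [[row[i] for i in ia] for row in a]
        sub_b = [[row[i] for i in ib] for row in b]
        for g in select_genes(sub_a, sub_b, alpha):
            counts[g] += 1
    return [c / n_subsamples for c in counts]


def toy_data(n_genes=20, n_true=3, n_arrays=30, shift=3.0, seed=1):
    """Synthetic Gaussian data; the first n_true genes are shifted in group b."""
    rng = random.Random(seed)
    a = [[rng.gauss(0.0, 1.0) for _ in range(n_arrays)]
         for _ in range(n_genes)]
    b = [[rng.gauss(shift if n_true > g else 0.0, 1.0)
          for _ in range(n_arrays)]
         for g in range(n_genes)]
    return a, b


a, b = toy_data()
freqs = selection_frequencies(a, b, subsample_size=20)
print([round(f, 2) for f in freqs])
```

         <p>In such a sketch, genes can then be ranked by selection frequency rather than by adjusted <it>p</it>-value alone; the 80% threshold mentioned above would correspond to keeping the genes whose frequency exceeds 0.8.</p>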
         <p>Tables <tblr tid="T2">2</tblr> and <tblr tid="T3">3</tblr> also illustrate the importance of sample size. Even so, the number of genes selected by the Benjamini-Hochberg and nonparametric empirical Bayes procedures remains very sensitive to correlations even when the power of these methods reaches 100%. The variance of the true FDR is also quite high for such procedures. Our simulations show that the FWER-controlling procedures are more stable with respect to the effect of correlation and that this stability increases with increasing sample size. Pavlidis et al. <abbrgrp><abbr bid="B3">3</abbr></abbrgrp> proposed to approach the sample size problem from the stability perspective and we find their idea very promising and deserving of further exploration.</p>
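         <p>The sensitivity of the Benjamini-Hochberg procedure to correlation can be reproduced qualitatively in a toy setting. The sketch below is not the simulation design behind Tables <tblr tid="T2">2</tblr> and <tblr tid="T3">3</tblr>: it assumes one z-score per gene, a single shared Gaussian factor that induces equal pairwise correlation rho, and hypothetical function names (bh_reject_count, simulate_counts) and parameter values. It compares the run-to-run spread of the number of rejections with and without correlation.</p>

```python
import random
import statistics
from statistics import NormalDist


def bh_reject_count(pvals, alpha=0.05):
    """Number of hypotheses rejected by the Benjamini-Hochberg step-up rule."""
    m = len(pvals)
    count = 0
    for k, p in enumerate(sorted(pvals), start=1):
        # Keep the largest rank k whose ordered p-value is at most alpha*k/m.
        if alpha * k / m >= p:
            count = k
    return count


def simulate_counts(n_genes, n_true, effect, rho, n_runs, alpha=0.05, seed=0):
    """Rejection counts over n_runs simulated data sets.

    Each gene gets a z-score with unit variance; a shared factor makes every
    pair of genes correlated with coefficient rho (an assumed toy structure).
    The first n_true genes have their mean shifted by effect.
    """
    rng = random.Random(seed)
    norm = NormalDist()
    counts = []
    for _ in range(n_runs):
        shared = rng.gauss(0.0, 1.0)
        pvals = []
        for g in range(n_genes):
            z = rho ** 0.5 * shared + (1.0 - rho) ** 0.5 * rng.gauss(0.0, 1.0)
            if n_true > g:
                z += effect
            pvals.append(2.0 * (1.0 - norm.cdf(abs(z))))
        counts.append(bh_reject_count(pvals, alpha))
    return counts


indep = simulate_counts(n_genes=500, n_true=50, effect=6.0, rho=0.0, n_runs=200)
corr = simulate_counts(n_genes=500, n_true=50, effect=6.0, rho=0.5, n_runs=200)
print("independent:", statistics.mean(indep), statistics.stdev(indep))
print("correlated: ", statistics.mean(corr), statistics.stdev(corr))
```

         <p>With an effect this large the power is close to 100% in both settings, so any difference in the standard deviation of the count reflects how the number of false rejections reacts to the shared factor.</p>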
         <p>The distribution-free methods are generally more stable than the <it>t</it>-test. It is our firm belief that such methods will play an increasingly important role and gradually replace the <it>t</it>-test in microarray studies. Robust versions of two-sample tests in general and of the <it>t</it>-test <abbrgrp><abbr bid="B29">29</abbr></abbrgrp> in particular can be quite competitive with distribution-free methods <abbrgrp><abbr bid="B30">30</abbr></abbrgrp> and this avenue invites a special investigation.</p>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>As larger sets of microarray gene expression data become more readily available, the stability of gene selection is becoming easier to assess using resampling techniques. We have found that some genes are selected much less frequently (across subsamples) than other genes with the same adjusted <it>p</it>-values. The relationship between the stability of gene selection and the original (adjusted) <it>p</it>-values may be rather complex, but resampling techniques can advantageously be used to select the most stable genes. Using these techniques, it is also possible to assess the variability of the number of selected genes. With respect to the latter indicator, all the selection procedures studied in the present paper appear to be highly unstable. For the FWER-controlling procedures, this property correlates well with the level of random fluctuations in the estimated power of a given procedure. The more conservative FWER-controlling procedures appear to be more stable with respect to the effect of correlation than the FDR-based procedures. The stability characteristics discussed in this paper provide additional information that should be utilized in gene selection procedures. We suggest that resampling techniques be routinely used for selection of individual genes whenever the sample size is not prohibitively small.</p>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>The basic idea behind this study emerged from discussions between AY and AG. The detailed study design was developed by all the members of the research team. YX and XQ carried out the needed computations and simulations.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>We are grateful to the three anonymous reviewers for their insightful comments. This research is supported by NIH grant GM075299 (Yakovlev).</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>The effects of normalization on the correlation structure of microarray data</p>
            </title>
            <aug>
               <au>
                  <snm>Qiu</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Brooks</snm>
                  <fnm>AI</fnm>
               </au>
               <au>
                  <snm>Klebanov</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Yakovlev</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2005</pubdate>
            <volume>6</volume>
            <fpage>120</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1156869</pubid>
                  <pubid idtype="pmpid" link="fulltext">15904488</pubid>
                  <pubid idtype="doi">10.1186/1471-2105-6-120</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>A bootstrapping resampling procedure for model building: application to the Cox regression model</p>
            </title>
            <aug>
               <au>
                  <snm>Sauerbrei</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Schumacher</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Statistics in Medicine</source>
            <pubdate>1992</pubdate>
            <volume>11</volume>
            <fpage>2093</fpage>
            <lpage>2109</lpage>
         </bibl>
         <bibl id="B3">
            <title>
               <p>The effect of replication on gene expression microarray experiments</p>
            </title>
            <aug>
               <au>
                  <snm>Pavlidis</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>Q</fnm>
               </au>
               <au>
                  <snm>Noble</snm>
                  <fnm>WS</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <fpage>1620</fpage>
            <lpage>1627</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btg227</pubid>
                  <pubid idtype="pmpid" link="fulltext">12967957</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Gene selection in microarray data: the elephant, the blind men and our algorithms</p>
            </title>
            <aug>
               <au>
                  <snm>Stolovitzky</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>Current Opinion in Structural Biology</source>
            <pubdate>2003</pubdate>
            <volume>13</volume>
            <fpage>370</fpage>
            <lpage>376</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0959-440X(03)00078-2</pubid>
                  <pubid idtype="pmpid" link="fulltext">12831889</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Large sample confidence regions based on subsamples under minimal assumptions</p>
            </title>
            <aug>
               <au>
                  <snm>Politis</snm>
                  <fnm>DN</fnm>
               </au>
               <au>
                  <snm>Romano</snm>
                  <fnm>JP</fnm>
               </au>
            </aug>
            <source>The Annals of Statistics</source>
            <pubdate>1994</pubdate>
            <volume>22</volume>
            <fpage>2031</fpage>
            <lpage>2050</lpage>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Resampling-Based Multiple Testing</p>
            </title>
            <aug>
               <au>
                  <snm>Westfall</snm>
                  <fnm>PH</fnm>
               </au>
               <au>
                  <snm>Young</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <publisher>Wiley, New York</publisher>
            <pubdate>1993</pubdate>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Empirical Bayes analysis of a microarray experiment</p>
            </title>
            <aug>
               <au>
                  <snm>Efron</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Tibshirani</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Storey</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Tusher</snm>
                  <fnm>V</fnm>
               </au>
            </aug>
            <source>J Amer Statist Assoc</source>
            <pubdate>2001</pubdate>
            <volume>96</volume>
            <fpage>1151</fpage>
            <lpage>1160</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1198/016214501753382129</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Robbins, empirical Bayes and microarrays</p>
            </title>
            <aug>
               <au>
                  <snm>Efron</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Ann Statist</source>
            <pubdate>2003</pubdate>
            <volume>31</volume>
            <fpage>366</fpage>
            <lpage>378</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1214/aos/1051027871</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Large-scale simultaneous hypothesis testing: The choice of a null hypothesis</p>
            </title>
            <aug>
               <au>
                  <snm>Efron</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>J Amer Statist Assoc</source>
            <pubdate>2004</pubdate>
            <volume>99</volume>
            <fpage>96</fpage>
            <lpage>104</lpage>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Controlling the false discovery rate: A practical and powerful approach to multiple testing</p>
            </title>
            <aug>
               <au>
                  <snm>Benjamini</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Hochberg</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>J Roy Statist Soc Ser B</source>
            <pubdate>1995</pubdate>
            <volume>57</volume>
            <fpage>289</fpage>
            <lpage>300</lpage>
         </bibl>
         <bibl id="B11">
            <title>
               <p>The control of the false discovery rate in multiple testing under dependency</p>
            </title>
            <aug>
               <au>
                  <snm>Benjamini</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Yekutieli</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Ann Statist</source>
            <pubdate>2001</pubdate>
            <volume>29</volume>
            <fpage>1165</fpage>
            <lpage>1188</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1214/aos/1013699998</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>St. Jude Children's Research Hospital (SJCRH) Database on childhood leukemia</p>
            </title>
            <url>http://www.stjuderesearch.org/data/ALL1/</url>
         </bibl>
         <bibl id="B13">
            <title>
               <p>A comparison of normalization methods for high density oligonucleotide array data based on variance and bias</p>
            </title>
            <aug>
               <au>
                  <snm>Bolstad</snm>
                  <fnm>BM</fnm>
               </au>
               <au>
                  <snm>Irizarry</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Astrand</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Speed</snm>
                  <fnm>TP</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <issue>2</issue>
            <fpage>185</fpage>
            <lpage>193</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/19.2.185</pubid>
                  <pubid idtype="pmpid" link="fulltext">12538238</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>An R package for analyses of Affymetrix oligonucleotide arrays</p>
            </title>
            <aug>
               <au>
                  <snm>Irizarry</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Gautier</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Cope</snm>
                  <fnm>LM</fnm>
               </au>
            </aug>
            <source>The Analysis of Gene Expression Data</source>
            <publisher>Springer, New York</publisher>
            <editor>Parmigiani G, Garrett ES, Irizarry RA, Zeger SL</editor>
            <pubdate>2003</pubdate>
            <fpage>102</fpage>
            <lpage>119</lpage>
         </bibl>
         <bibl id="B15">
            <aug>
               <au>
                  <snm>Efron</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Tibshirani</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>An Introduction to the Bootstrap</source>
            <publisher>Chapman &amp; Hall/CRC, New York</publisher>
            <pubdate>1993</pubdate>
         </bibl>
         <bibl id="B16">
            <aug>
               <au>
                  <snm>Shao</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Tu</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>The Jackknife and Bootstrap</source>
            <publisher>Springer Series in Statistics, Springer, New York</publisher>
            <pubdate>1995</pubdate>
         </bibl>
         <bibl id="B17">
            <aug>
               <au>
                  <snm>Conover</snm>
                  <fnm>WJ</fnm>
               </au>
            </aug>
            <source>Practical Nonparametric Statistics</source>
            <publisher>Wiley, New York</publisher>
            <edition>3</edition>
            <pubdate>1999</pubdate>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Asymptotic theory of certain "goodness of fit" criteria based on stochastic processes</p>
            </title>
            <aug>
               <au>
                  <snm>Anderson</snm>
                  <fnm>TW</fnm>
               </au>
               <au>
                  <snm>Darling</snm>
                  <fnm>DA</fnm>
               </au>
            </aug>
            <source>The Annals of Mathematical Statistics</source>
            <pubdate>1952</pubdate>
            <volume>23</volume>
            <fpage>193</fpage>
            <lpage>212</lpage>
         </bibl>
         <bibl id="B19">
            <title>
               <p>Limit theorems associated with variants of the von Mises statistic</p>
            </title>
            <aug>
               <au>
                  <snm>Rosenblatt</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>The Annals of Mathematical Statistics</source>
            <pubdate>1952</pubdate>
            <volume>23</volume>
            <fpage>617</fpage>
            <lpage>623</lpage>
         </bibl>
         <bibl id="B20">
            <title>
               <p>On the distribution of the two-sample Cram&#233;r-von Mises criterion</p>
            </title>
            <aug>
               <au>
                  <snm>Anderson</snm>
                  <fnm>TW</fnm>
               </au>
            </aug>
            <source>The Annals of Mathematical Statistics</source>
            <pubdate>1962</pubdate>
            <volume>33</volume>
            <fpage>1148</fpage>
            <lpage>1159</lpage>
         </bibl>
         <bibl id="B21">
            <title>
               <p>The exact and asymptotic distributions of Cram&#233;r-von Mises statistics</p>
            </title>
            <aug>
               <au>
                  <snm>Csorgo</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Faraway</snm>
                  <fnm>JJ</fnm>
               </au>
            </aug>
            <source>Journal of the Royal Statistical Society. Series B (Methodological)</source>
            <pubdate>1996</pubdate>
            <volume>58</volume>
            <fpage>221</fpage>
            <lpage>234</lpage>
         </bibl>
         <bibl id="B22">
            <title>
               <p>Small-sample distribution of the two-sample Cram&#233;r-von Mises criterion for small equal samples</p>
            </title>
            <aug>
               <au>
                  <snm>Burr</snm>
                  <fnm>EJ</fnm>
               </au>
            </aug>
            <source>The Annals of Mathematical Statistics</source>
            <pubdate>1963</pubdate>
            <volume>34</volume>
            <fpage>95</fpage>
            <lpage>101</lpage>
         </bibl>
         <bibl id="B23">
            <title>
               <p>A permutation test motivated by microarray data analysis</p>
            </title>
            <aug>
               <au>
                  <snm>Klebanov</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Gordon</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Xiao</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Land</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Yakovlev</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Comp Stat Data Anal</source>
            <inpress/>
         </bibl>
         <bibl id="B24">
            <title>
               <p>Statistical significance for genomewide studies</p>
            </title>
            <aug>
               <au>
                  <snm>Storey</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Tibshirani</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2003</pubdate>
            <volume>100</volume>
            <fpage>9440</fpage>
            <lpage>9445</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">170937</pubid>
                  <pubid idtype="pmpid" link="fulltext">12883005</pubid>
                  <pubid idtype="doi">10.1073/pnas.1530509100</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Identifying differentially expressed genes using false discovery rate controlling procedures</p>
            </title>
            <aug>
               <au>
                  <snm>Reiner</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Yekutieli</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Benjamini</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <fpage>368</fpage>
            <lpage>375</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btf877</pubid>
                  <pubid idtype="pmpid" link="fulltext">12584122</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>The positive false discovery rate: a Bayesian interpretation and the <it>q</it>-value</p>
            </title>
            <aug>
               <au>
                  <snm>Storey</snm>
                  <fnm>JD</fnm>
               </au>
            </aug>
            <source>Ann Statist</source>
            <pubdate>2003</pubdate>
            <volume>31</volume>
            <fpage>2013</fpage>
            <lpage>2035</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1214/aos/1074290335</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>Multiple hypothesis testing in microarray experiments</p>
            </title>
            <aug>
               <au>
                  <snm>Dudoit</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Shaffer</snm>
                  <fnm>JP</fnm>
               </au>
               <au>
                  <snm>Boldrick</snm>
                  <fnm>JC</fnm>
               </au>
            </aug>
            <source>Statistical Science</source>
            <pubdate>2003</pubdate>
            <volume>18</volume>
            <fpage>71</fpage>
            <lpage>103</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1214/ss/1056397487</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>Correlation between gene expression levels and limitations of the empirical Bayes methodology for finding differentially expressed genes</p>
            </title>
            <aug>
               <au>
                  <snm>Qiu</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Klebanov</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Yakovlev</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Statistical Applications in Genetics and Molecular Biology</source>
            <pubdate>2005</pubdate>
            <volume>4</volume>
            <fpage>Article 34</fpage>
         </bibl>
         <bibl id="B29">
            <title>
               <p>Use of within-array replicate spots for assessing differential expression in microarray experiments</p>
            </title>
            <aug>
               <au>
                  <snm>Smyth</snm>
                  <fnm>GK</fnm>
               </au>
               <au>
                  <snm>Michaud</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Scott</snm>
                  <fnm>HS</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2005</pubdate>
            <volume>21</volume>
            <fpage>2067</fpage>
            <lpage>2075</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/bti270</pubid>
                  <pubid idtype="pmpid" link="fulltext">15657102</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <aug>
               <au>
                  <snm>Wilcox</snm>
                  <fnm>RR</fnm>
               </au>
            </aug>
            <source>Fundamentals of Modern Statistical Methods</source>
            <publisher>Springer, New York</publisher>
            <pubdate>2001</pubdate>
         </bibl>
      </refgrp>
   </bm>
</art>
