Open Access Methodology article

A benchmark for statistical microarray data analysis that preserves actual biological and technical variance

Benoît De Hertogh, Bertrand De Meulder, Fabrice Berger, Michael Pierre, Eric Bareke, Anthoula Gaigneaux and Eric Depiereux*

Author Affiliations

Unité de Recherche en Biologie Moléculaire, Facultés Universitaires Notre-Dame de la Paix (F.U.N.D.P.), Rue de Bruxelles, 61, B-5000 Namur, Belgium


BMC Bioinformatics 2010, 11:17  doi:10.1186/1471-2105-11-17

Published: 11 January 2010



Recent reanalysis of spike-in datasets underscored the need for new and more accurate benchmark datasets for statistical microarray analysis. We present a new method that uses biologically relevant data to evaluate the performance of statistical methods.


Our method ranks the probesets of a dataset composed of publicly available biological microarray data and extracts subset matrices with precise information/noise ratios. It can be used to assess how well different statistical methods estimate variance for a given number of replicates. The mean-variance and mean-fold change relationships of the matrices revealed a closer approximation of biological reality.
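The idea of ranking probesets and extracting a subset with a chosen information/noise ratio can be illustrated with a minimal Python sketch. It is not the authors' procedure, which is implemented in their R scripts. The ranking criterion (absolute Welch t statistic), the function names `welch_t` and `build_subset`, the parameter `info_fraction`, and the toy data are all assumptions made for illustration.

```python
import statistics


def welch_t(control, test):
    """Welch t statistic for two lists of replicate values (assumed ranking score)."""
    var_c = statistics.variance(control)
    var_t = statistics.variance(test)
    se = (var_c / len(control) + var_t / len(test)) ** 0.5
    if se == 0:
        return 0.0
    return (statistics.mean(control) - statistics.mean(test)) / se


def build_subset(data, n_total, info_fraction):
    """Pick n_total probesets with a fixed share of top-ranked ("information") ones.

    data maps a probeset id to a (control values, test values) pair.
    The remaining share is taken from the bottom of the ranking ("noise").
    """
    if n_total > len(data):
        raise ValueError("n_total exceeds the number of probesets")
    # Rank probesets from strongest to weakest evidence of differential expression.
    ranked = sorted(data, key=lambda p: abs(welch_t(*data[p])), reverse=True)
    n_info = round(n_total * info_fraction)
    n_noise = n_total - n_info
    informative = ranked[:n_info]
    noise = ranked[len(ranked) - n_noise:]
    return informative, noise


# Hypothetical toy data: two clearly shifted probesets and two unchanged ones.
toy = {
    "p1": ([1.0, 1.1, 0.9], [5.0, 5.1, 4.9]),
    "p2": ([2.0, 2.2, 1.8], [4.0, 4.2, 3.8]),
    "p3": ([3.0, 3.1, 2.9], [3.0, 3.2, 2.8]),
    "p4": ([1.0, 1.5, 0.5], [1.1, 1.6, 0.6]),
}
informative, noise = build_subset(toy, n_total=4, info_fraction=0.5)
```

Varying `info_fraction` changes the information/noise ratio of the extracted subset, so the same statistical test can be scored against subsets of known composition.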


Performance analysis refined the results of previously published benchmarks.

We show that the Shrinkage t test (close to Limma) was the best of the methods tested, except with only two replicates, where the Regularized t test and the Window t test performed slightly better.
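Why variance estimation matters so much with few replicates can be shown with a simplified sketch of variance shrinkage. It is not the published Shrinkage t, Regularized t, or Window t test. Here each per-probeset variance is pulled toward the median variance across probesets with a fixed weight `lam`, which is an assumption of this example; the function name `shrunken_t` and the toy data are also hypothetical.

```python
import statistics


def shrunken_t(groups_a, groups_b, lam=0.5):
    """t-like statistic per probeset, with variances shrunk toward the median.

    groups_a and groups_b are lists of replicate lists, one pair per probeset.
    lam=0 gives the ordinary Welch statistic; lam=1 uses only the median variance.
    """
    var_a = [statistics.variance(a) for a in groups_a]
    var_b = [statistics.variance(b) for b in groups_b]
    target_a = statistics.median(var_a)
    target_b = statistics.median(var_b)
    stats = []
    for a, b, va, vb in zip(groups_a, groups_b, var_a, var_b):
        # Blend the probeset's own variance with the shared target.
        sa = lam * target_a + (1 - lam) * va
        sb = lam * target_b + (1 - lam) * vb
        se = (sa / len(a) + sb / len(b)) ** 0.5
        diff = statistics.mean(a) - statistics.mean(b)
        stats.append(diff / se if se > 0 else 0.0)
    return stats


# Hypothetical data with two replicates per condition. The first probeset has
# a tiny variance by chance, which inflates its ordinary t statistic.
cond_a = [[1.0, 1.0001], [1.0, 2.0], [1.0, 3.0]]
cond_b = [[1.1, 1.1001], [1.5, 2.5], [2.0, 4.0]]
raw = shrunken_t(cond_a, cond_b, lam=0.0)
shrunk = shrunken_t(cond_a, cond_b, lam=0.5)
```

With two replicates, a small difference paired with an accidentally tiny variance yields a very large ordinary statistic; borrowing variance information across probesets damps that effect.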


The R scripts used for the analysis are available at webcite.