Open Access Methodology article

Multivariate search for differentially expressed gene combinations

Yuanhui Xiao1, Robert Frisina2, Alexander Gordon1, Lev Klebanov1,3 and Andrei Yakovlev1*

Author Affiliations

1 Department of Biostatistics and Computational Biology, University of Rochester, 601 Elmwood Avenue, Rochester, New York 14642, USA

2 Departments of Otolaryngology, Neurobiology and Anatomy, and Biomedical Engineering, University of Rochester, 601 Elmwood Avenue, Rochester, New York 14642, USA

3 Department of Probability and Statistics, Charles University, Sokolovska 83, Praha-8, CZ-18675, Czech Republic

BMC Bioinformatics 2004, 5:164  doi:10.1186/1471-2105-5-164

Published: 26 October 2004



To identify differentially expressed genes, it is standard practice to test a two-sample hypothesis for each gene with a proper adjustment for multiple testing. Such tests are essentially univariate and disregard the multidimensional structure of microarray data. A more general two-sample hypothesis is formulated in terms of the joint distribution of any sub-vector of expression signals.
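The more general hypothesis can be written compactly. The notation below is an assumption introduced here for illustration and is not taken from the article: S denotes a set of gene indices chosen from the m genes on the array, and F_S and G_S denote the joint distributions of the expression signals of the genes in S in the two groups being compared.

```latex
H_0 : F_S = G_S
\quad \text{versus} \quad
H_1 : F_S \neq G_S,
\qquad S \subseteq \{1, \dots, m\}
```

The usual gene-by-gene test corresponds to the special case in which S contains a single gene.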


By building on an earlier proposed multivariate test statistic, we propose a new algorithm for identifying differentially expressed gene combinations. The algorithm includes an improved random search procedure designed to generate candidate gene combinations of a given size. Cross-validation is used to provide replication stability of the search procedure. A permutation two-sample test is used for significance testing. We design a multiple testing procedure to control the family-wise error rate (FWER) when selecting significant combinations of genes that result from a successive selection procedure. A target set of genes is composed of all significant combinations selected via random search.
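The search-and-test idea can be illustrated with a minimal sketch. The abstract does not specify the multivariate statistic, so the code below substitutes an energy-distance statistic as a stand-in; the function names (energy_distance, permutation_pvalue, random_search) and all parameters are hypothetical. The sketch uses a plain random search rather than the improved procedure, and it omits the cross-validation and FWER steps. Note also that a permutation p-value computed for a combination after it has been selected by the search is optimistic unless the selection is accounted for.

```python
import numpy as np


def energy_distance(x, y):
    """Energy-distance statistic between samples x (n1 x d) and y (n2 x d).

    Stand-in for the article's multivariate statistic (an assumption).
    """
    def mean_dist(a, b):
        # Mean Euclidean distance over all pairs of rows from a and b.
        diff = a[:, None, :] - b[None, :, :]
        return np.sqrt((diff ** 2).sum(axis=2)).mean()

    return 2.0 * mean_dist(x, y) - mean_dist(x, x) - mean_dist(y, y)


def permutation_pvalue(x, y, n_perm=999, rng=None):
    """Two-sample permutation p-value for a fixed set of columns (genes)."""
    rng = np.random.default_rng(rng)
    pooled = np.vstack([x, y])
    n1 = x.shape[0]
    observed = energy_distance(x, y)
    count = 0
    for _ in range(n_perm):
        # Reassign group labels at random and recompute the statistic.
        idx = rng.permutation(pooled.shape[0])
        stat = energy_distance(pooled[idx[:n1]], pooled[idx[n1:]])
        if stat >= observed:
            count += 1
    # Add-one correction keeps the p-value strictly positive.
    return (count + 1) / (n_perm + 1)


def random_search(x, y, size, n_iter=200, rng=None):
    """Plain random search for a gene combination of the given size."""
    rng = np.random.default_rng(rng)
    n_genes = x.shape[1]
    best_subset, best_stat = None, -np.inf
    for _ in range(n_iter):
        # Draw a candidate combination without replacement.
        subset = np.sort(rng.choice(n_genes, size=size, replace=False))
        stat = energy_distance(x[:, subset], y[:, subset])
        if stat > best_stat:
            best_subset, best_stat = subset, stat
    return best_subset, best_stat
```

Here x and y are arrays with one row per array replicate and one column per gene. Calling random_search(x, y, size=2) returns a candidate pair of gene indices together with its statistic, and permutation_pvalue(x[:, subset], y[:, subset]) tests a combination that was fixed in advance.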


A new algorithm has been developed to identify differentially expressed gene combinations. The performance of the proposed search-and-testing procedure has been evaluated by computer simulations and analysis of replicated Affymetrix gene array data on age-related changes in gene expression in the inner ear of CBA mice.