Multiple testing for gene sets from microarray experiments
* Corresponding author: Sin-Ho Jung sinho.jung@duke.edu
1 Biostatistics and Bioinformatics Center, Samsung Cancer Research Institute, Samsung Medical Center, Seoul, 137-710, Republic of Korea
2 Department of Biostatistics and Bioinformatics, Duke University Medical Center, NC 27710, USA
3 Department of Statistics, Seoul National University, Seoul 151-747, Republic of Korea
4 Department of Medicine, Division of Medical Oncology, Duke University, NC 27710, USA
BMC Bioinformatics 2011, 12:209 doi:10.1186/1471-2105-12-209
Published: 26 May 2011
Abstract
Background
A key objective in many microarray association studies is to identify individual genes associated with a clinical outcome. It is often also of interest to identify sets of genes, known a priori to share a similar biological function, that are associated with the outcome.
Results
In this paper, we propose a general permutation-based framework for gene set testing that controls the false discovery rate (FDR) while accounting for the dependency among genes within and across gene sets. We demonstrate the proposed method on three public microarray data sets and compare its performance with that of two existing methods, Gene Set Enrichment Analysis (GSEA) and Gene Set Analysis (GSA).
Conclusions
Our simulations show that the proposed method controls the FDR at the desired level. Through simulations and case studies, we observe that our method performs better than GSEA and GSA, especially when the number of prognostic gene sets is large.
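To make the idea of permutation-based gene set testing concrete, the following is a minimal sketch in Python. It is not the procedure proposed in this paper, whose details the abstract does not give. It makes three assumptions for illustration: a two-group outcome, a gene-level statistic equal to the absolute difference in group means, and a set-level statistic equal to the average of the gene-level statistics. The function names (`gene_scores`, `set_scores`, `permutation_pvalues`, `benjamini_hochberg`) are hypothetical. The FDR step uses the standard Benjamini-Hochberg adjustment as a stand-in; unlike the framework described above, it does not itself account for dependence across gene sets.

```python
import random
from statistics import mean


def gene_scores(expr, labels):
    """Absolute difference in group means for each gene (illustrative statistic)."""
    scores = []
    for row in expr:  # one row of expression values per gene
        g1 = [x for x, y in zip(row, labels) if y == 1]
        g0 = [x for x, y in zip(row, labels) if y == 0]
        scores.append(abs(mean(g1) - mean(g0)))
    return scores


def set_scores(expr, labels, gene_sets):
    """Average gene-level score within each gene set (illustrative statistic)."""
    s = gene_scores(expr, labels)
    return [mean([s[g] for g in genes]) for genes in gene_sets]


def permutation_pvalues(expr, labels, gene_sets, n_perm=1000, seed=0):
    """Permutation p-value for each gene set.

    Only the sample labels are shuffled; the expression rows stay intact,
    so the correlation among genes is preserved under the null.
    """
    rng = random.Random(seed)
    observed = set_scores(expr, labels, gene_sets)
    exceed = [0] * len(gene_sets)
    perm = list(labels)
    for _ in range(n_perm):
        rng.shuffle(perm)
        null = set_scores(expr, perm, gene_sets)
        for k in range(len(gene_sets)):
            if null[k] >= observed[k]:
                exceed[k] += 1
    # Add-one correction keeps every p-value strictly positive.
    return [(c + 1) / (n_perm + 1) for c in exceed]


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values (stand-in FDR step, see lead-in)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Here `expr` is a list of per-gene lists of expression values, `labels` is a list of 0/1 group indicators in the same sample order, and `gene_sets` is a list of lists of row indices into `expr`. Gene sets whose adjusted p-value falls below the chosen FDR level would be declared significant under these assumptions.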