Gene set enrichment analysis for multiple continuous phenotypes
1 School of Public Health, University of Alberta, Edmonton, AB T6G 1C9, Canada
2 CR Rao Advanced Institute of Mathematics, Statistics and Computer Science, Hyderabad AP 500046, India
3 Public Health Foundation of India, Delhi, India
BMC Bioinformatics 2014, 15:260 doi:10.1186/1471-2105-15-260
Published: 3 August 2014
Gene set analysis (GSA) methods test the association of sets of genes with phenotypes in gene expression microarray studies. While GSA methods for a single binary or categorical phenotype abound, little attention has been paid to the case of a continuous phenotype, and no existing method accommodates multiple correlated continuous phenotypes.
We propose an extension of the linear combination test (LCT) to multiple continuous phenotypes. The extension incorporates correlations among the expression levels of genes in functionally related gene sets, as well as correlations among the phenotypes. We further extend the new method to a nonlinear version, referred to as the nonlinear combination test (NLCT), to test for potential nonlinear associations of gene sets with multiple phenotypes. A simulation study and a real microarray example demonstrate the practical aspects of the proposed methods.
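The general idea of testing one gene set against several phenotypes jointly can be sketched in a few lines. The sketch below is not the LCT or NLCT statistic; it swaps in a much simpler stand-in (the sum of squared Pearson correlations over all gene-phenotype pairs) and assesses it by permuting samples. Permuting whole phenotype rows keeps the correlations among phenotypes and among genes intact under the null, which is the property the abstract emphasizes. All function names, the toy data, and the choice of statistic are assumptions made for illustration only.

```python
import random
from statistics import fmean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def set_statistic(expr, pheno):
    """Stand-in gene-set statistic (an assumption, not the LCT):
    sum of squared correlations over every gene-phenotype pair.
    expr and pheno are lists of per-sample rows."""
    genes = list(zip(*expr))    # one column per gene
    traits = list(zip(*pheno))  # one column per phenotype
    return sum(pearson(g, t) ** 2 for g in genes for t in traits)


def permutation_p_value(expr, pheno, n_perm=1000, seed=0):
    """Permutation p-value: shuffle whole phenotype rows across samples so that
    the phenotype-phenotype and gene-gene correlations are preserved."""
    rng = random.Random(seed)
    observed = set_statistic(expr, pheno)
    rows = list(pheno)  # copy, so the caller's list is not reordered
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(rows)
        if set_statistic(expr, rows) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


# Toy data (hypothetical): 30 samples, 3 genes, 2 correlated phenotypes
# that both depend on the first gene.
rng = random.Random(1)
expr = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(30)]
pheno = [[row[0] + rng.gauss(0, 0.1), 2 * row[0] + rng.gauss(0, 0.1)] for row in expr]
p_value = permutation_p_value(expr, pheno, n_perm=200)
print(p_value)
```

A univariate alternative would run one such test per phenotype and then combine the results, which ignores the correlation between phenotypes; the joint row permutation above is the part of the design meant to avoid that.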
The proposed approaches are effective in controlling type I errors and powerful in testing associations between gene sets and multiple continuous phenotypes. Both are computationally efficient. Naively analyzing a group of multiple correlated phenotypes one at a time (univariately) could be dangerous. R code to perform LCT and NLCT for multiple continuous phenotypes is available at http://www.ualberta.ca/~yyasui/homepage.html.