Open Access Highly Accessed Methodology article

Gene set enrichment analysis for multiple continuous phenotypes

Xiaoming Wang1*, Saumyadipta Pyne2,3 and Irina Dinu1

Author Affiliations

1 School of Public Health, University of Alberta, Edmonton, AB T6G 1C9, Canada

2 CR Rao Advanced Institute of Mathematics, Statistics and Computer Science, Hyderabad AP 500046, India

3 Public Health Foundation of India, Delhi, India

BMC Bioinformatics 2014, 15:260  doi:10.1186/1471-2105-15-260

Published: 3 August 2014

Abstract

Background

Gene set analysis (GSA) methods test the association of sets of genes with phenotypes in gene expression microarray studies. While GSA methods for a single binary or categorical phenotype abound, little attention has been paid to the case of a continuous phenotype, and there is no method that accommodates multiple correlated continuous phenotypes.

Results

We propose an extension of the linear combination test (LCT) to multiple continuous phenotypes. The extended test incorporates correlations among the expression levels of genes in functionally related gene sets, as well as correlations among the phenotypes. We further extend the new method to a nonlinear version, referred to as the nonlinear combination test (NLCT), to test for potential nonlinear associations of gene sets with multiple phenotypes. A simulation study and a real microarray example demonstrate the practical aspects of the proposed methods.
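
To illustrate the general idea of testing one gene set against several phenotypes at once, the following Python sketch runs a permutation test in which whole samples are shuffled, so that correlations among genes and correlations among phenotypes are both left intact. It is not the LCT or the NLCT: the statistic (a sum of squared Pearson correlations), the function names and the default of 1000 permutations are assumptions made purely for illustration.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def set_statistic(genes, phenos):
    """Sum of squared gene-phenotype correlations.

    Illustrative stand-in statistic (an assumption), not the LCT statistic.
    """
    return sum(pearson(g, p) ** 2 for g in genes for p in phenos)


def permutation_pvalue(genes, phenos, n_perm=1000, seed=0):
    """Permutation p-value for association between a gene set and phenotypes.

    genes:  list of per-gene expression lists, each indexed by sample
    phenos: list of per-phenotype value lists, each indexed by sample
    """
    rng = random.Random(seed)
    n_samples = len(genes[0])
    observed = set_statistic(genes, phenos)
    hits = 0
    for _ in range(n_perm):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        # The same sample permutation is applied to every phenotype, so
        # phenotype-phenotype correlations are preserved. The genes are not
        # touched, so gene-gene correlations are preserved as well.
        permuted = [[p[i] for i in idx] for p in phenos]
        if set_statistic(genes, permuted) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)
```

Calling `permutation_pvalue(genes, phenos)` with one list per gene and one list per phenotype returns a single p-value for the whole set. Permuting all phenotypes jointly, rather than testing each phenotype separately, is what lets the null distribution reflect the dependence among phenotypes.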

Conclusion

The proposed approaches are effective in controlling type I error and powerful in testing associations between gene sets and multiple continuous phenotypes. Both are computationally efficient. Analyzing a group of correlated phenotypes naively, one phenotype at a time (univariately), could be dangerous. R code to perform LCT and NLCT for multiple continuous phenotypes is available at http://www.ualberta.ca/~yyasui/homepage.html.

Keywords:
DNA microarrays; Gene expression; Linear combination test; Nonlinear combination test; Gene-set analysis