Integrative set enrichment testing for multiple omics platforms
1 Department of Public Health Sciences, Henry Ford Hospital, 1 Ford Place, Detroit, MI, 48202, USA
2 Department of Biostatistics, University of Michigan, 1420 Washington Heights, Ann Arbor, MI, 48109-2029, USA
3 Departments of Statistics and Public Health Sciences, Penn State University, 514A Wartik Laboratory, University Park, PA, 16802, USA
BMC Bioinformatics 2011, 12:459 doi:10.1186/1471-2105-12-459
Published: 25 November 2011
Enrichment testing assesses the overall evidence of differential expression among the elements of a defined set. When several molecular platforms have been measured (e.g., gene expression, metabolites, proteins), it is desirable to assess their differential tendencies jointly across platforms using an integrated set enrichment test. In this work we explore the properties of several methods for performing a combined enrichment test, using gene expression and metabolomics as the motivating platforms.
Using two simulation models, we explored the properties of several enrichment methods, including two novel methods: the logistic regression 2-degree-of-freedom Wald test and the 2-dimensional permutation p-value for the sum-of-squared statistics test. Compared with their univariate counterparts, the joint tests can improve our ability to detect results that are marginal when each platform is tested alone. The joint tests also improve the ranking of associated pathways. However, some methods carry a risk of Type I error inflation, and self-contained methods lose specificity when the sets are not representative of the underlying association.
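To make the idea of a logistic regression 2-degree-of-freedom Wald test concrete, the following is a minimal sketch of one plausible formulation, not the authors' implementation. It assumes that set membership of each element (gene or metabolite) is regressed on that element's association statistic, with a separate slope for each platform, and that the two slopes are tested jointly. The function names `fit_logistic` and `joint_wald_test`, the platform indicator covariate, and the input layout are all hypothetical choices made for illustration.

```python
import numpy as np


def fit_logistic(X, y, n_iter=50, tol=1e-10):
    """Fit a logistic regression by Newton-Raphson.

    Returns the coefficient vector and its estimated covariance matrix
    (the inverse of the Fisher information at the fitted coefficients).
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        info = X.T @ (X * w[:, None])
        step = np.linalg.solve(info, X.T @ (y - p))
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    # Recompute the information at the final estimate.
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    w = p * (1.0 - p)
    info = X.T @ (X * w[:, None])
    return beta, np.linalg.inv(info)


def joint_wald_test(stat, is_gene, in_set):
    """Joint (2-df) Wald test for enrichment of one set across two platforms.

    stat    : per-element association statistic (e.g. an absolute t-statistic)
    is_gene : boolean array, True for genes and False for metabolites
    in_set  : boolean array, True if the element belongs to the set

    Assumed model (illustrative only):
        logit P(in_set) = b0 + b1*is_gene + b2*stat_gene + b3*stat_metab
    The test is H0: b2 = b3 = 0.
    """
    stat = np.asarray(stat, dtype=float)
    is_gene = np.asarray(is_gene, dtype=bool)
    y = np.asarray(in_set, dtype=float)

    # Platform-specific copies of the statistic (zero on the other platform).
    x_gene = np.where(is_gene, stat, 0.0)
    x_metab = np.where(is_gene, 0.0, stat)
    X = np.column_stack(
        [np.ones(len(stat)), is_gene.astype(float), x_gene, x_metab]
    )

    beta, cov = fit_logistic(X, y)
    b = beta[2:]            # the two slopes under test
    cov_b = cov[2:, 2:]     # their covariance block
    wald = float(b @ np.linalg.solve(cov_b, b))

    # The chi-square survival function with 2 df has the closed form exp(-x/2).
    p_value = float(np.exp(-wald / 2.0))
    return wald, p_value
```

Under this sketch, a set whose genes or metabolites (or both) tend to have larger statistics than elements outside the set yields a large Wald statistic. Because elements inside the set are compared with those outside it, this particular formulation behaves as a competitive rather than a self-contained test.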
In this work we show that considering data from multiple platforms, in conjunction with summarization via a priori pathway information, increases the power to detect genomic associations with phenotypes.