Pathway analysis comparison using Crohn's disease genome wide association studies
1 Robert S. Boas Center for Human Genetics and Genomics, Feinstein Institute for Medical Research, Manhasset, New York, USA
2 IBD Center, Division of Gastroenterology, Department of Medicine, Yale University, New Haven, CT, USA
3 Department of Genetics, Yale University, New Haven, CT, USA
4 Department of Epidemiology and Public Health, Yale University, New Haven, CT, USA
BMC Medical Genomics 2010, 3:25 doi:10.1186/1755-8794-3-25
Published: 28 June 2010
The use of biological annotation, such as genes and pathways, in the analysis of gene expression data has aided the identification of genes for follow-up studies and suggested functional information for uncharacterized genes. Several studies have applied similar methods to genome-wide association studies and identified a number of disease-related pathways. However, many questions remain about how best to approach this problem, such as whether a score is needed to summarize association evidence at the gene level, and whether a pathway dominated by just a few highly significant genes is of interest.
We evaluated the performance of two pathway-based methods (Random Set and the binomial approximation to the hypergeometric test) by applying them to three Crohn's disease data sets. We considered two phenotypes: disease status, and the residuals after conditioning on IL23R, a known Crohn's disease-related gene.
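The two enrichment tests named above can be illustrated with a minimal sketch. The abstract does not give formulas, so this assumes a common formulation of each test rather than the authors' exact implementation: the random-set statistic is taken as the mean gene score in the pathway, standardized by its mean and variance under random draws of same-sized gene sets, and the binomial test is an upper-tail probability with success rate equal to the overall fraction of significant genes. The function names and the toy gene scores are hypothetical.

```python
import math
from statistics import fmean, pvariance


def random_set_z(gene_scores, pathway):
    """Z-score of a pathway's mean gene score under random-set sampling.

    gene_scores: dict mapping gene name -> gene-level association score
    pathway: iterable of gene names (genes without a score are ignored)
    """
    scores = list(gene_scores.values())
    n_genes = len(scores)
    members = [g for g in set(pathway) if g in gene_scores]
    m = len(members)
    if m == 0 or m >= n_genes:
        raise ValueError("pathway must contain between 1 and G-1 scored genes")

    mu = fmean(scores)          # mean score over all genes
    sigma2 = pvariance(scores)  # population variance over all genes
    if sigma2 == 0:
        raise ValueError("gene scores have zero variance")

    pathway_mean = fmean(gene_scores[g] for g in members)
    # Variance of the mean of m genes drawn without replacement from G genes
    var_mean = (sigma2 / m) * (n_genes - m) / (n_genes - 1)
    return (pathway_mean - mu) / math.sqrt(var_mean)


def binomial_enrichment_p(n_sig_in_pathway, pathway_size, n_sig_total, n_genes_total):
    """Upper-tail binomial p-value for seeing at least n_sig_in_pathway
    significant genes among pathway_size genes, approximating the
    hypergeometric test with success probability n_sig_total / n_genes_total.
    """
    p = n_sig_total / n_genes_total
    return sum(
        math.comb(pathway_size, k) * p**k * (1 - p) ** (pathway_size - k)
        for k in range(n_sig_in_pathway, pathway_size + 1)
    )


if __name__ == "__main__":
    # Toy, made-up gene scores purely for illustration
    toy_scores = {"G1": 3.0, "G2": 2.5, "G3": 0.1, "G4": -0.2, "G5": 0.0, "G6": -0.4}
    print(random_set_z(toy_scores, ["G1", "G2"]))
    print(binomial_enrichment_p(2, 2, 2, 6))
```

The sketch also illustrates the contrast raised in the background: the random-set score uses every gene's score in the pathway, while the binomial test only counts genes that pass a significance threshold.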
Our results show that the Random Set method has the most power to identify disease-related pathways. We confirm previously reported disease-related pathways and provide evidence for IL-2 Receptor Beta Chain in T cell Activation and IL-9 signaling as Crohn's disease-associated pathways.
Our results highlight the need to apply powerful gene score methods prior to pathway enrichment tests, and show that controlling for genes that attain genome-wide significance enables further biological insight.