This article is part of the supplement: Selected articles from the IEEE International Workshop on Genomic Signal Processing and Statistics (GENSIPS) 2011
Searching joint association signals in CATIE schizophrenia genome-wide association studies through a refined integrative network approach
1 Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, TN, USA
2 Department of Psychiatry, Vanderbilt University School of Medicine, Nashville, TN, USA
3 Department of Cancer Biology, Vanderbilt University School of Medicine, Nashville, TN, USA
4 Center for Quantitative Sciences, Vanderbilt University School of Medicine, Nashville, TN, USA
BMC Genomics 2012, 13(Suppl 6):S15 doi:10.1186/1471-2164-13-S6-S15
Published: 26 October 2012
Genome-wide association studies (GWAS) have generated a wealth of genotyping data for complex diseases and traits. These data contain many weakly associated markers that traditional single-marker analyses miss but that may provide valuable insights into the genetic components of disease. Gene set analysis (GSA) augmented by protein-protein interaction network data offers a promising way to examine GWAS data by analyzing the combined effects of multiple genes or markers, each of which may individually show only weak to moderate association. A critical issue in GSA of GWAS data is how to define gene-wise P values from the multiple SNPs mapped to a gene.
In this study, we proposed an alternative restricted search approach based on our previously developed dense module search algorithm, and we demonstrated it on the CATIE GWAS dataset for schizophrenia. Specifically, we explored three ways of computing gene-wise P values and examined their effects on the resulting module genes. These methods calculate a gene-wise P value from all the SNPs, from the top-ranked SNPs, or from the most significant SNP among all the SNPs mapped to a gene. We applied the restricted search approach and identified a module gene set for each of the gene-wise P value data sets. In our evaluation using an independent method, ALIGATOR, we showed that although each of these input datasets generated a unique set of module genes, all of the module gene sets were significant in the GWAS dataset. Further functional enrichment analysis of these module genes showed that, at the pathway level, they were all consistently related to neuro- and immune-related pathways. Finally, we compared our method with a previously reported method.
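To make the three gene-wise summaries concrete, the following is a minimal Python sketch, not the procedure used in this study. It assumes that the "all SNPs" and "top-ranked SNPs" summaries combine SNP-level P values with Fisher's method, which treats the SNPs as independent and therefore ignores linkage disequilibrium. The function names `fisher_combined_p` and `gene_wise_p` and the parameter `top_k` are hypothetical, introduced only for illustration.

```python
import math


def fisher_combined_p(p_values):
    """Combine P values with Fisher's method (assumes independent tests).

    Illustrative stand-in only: SNPs within a gene are usually correlated
    through linkage disequilibrium, which this method ignores.
    """
    k = len(p_values)
    if k == 0:
        raise ValueError("need at least one P value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("P values must lie in (0, 1]")
    # Fisher statistic X = -2 * sum(ln p) is chi-square with 2k d.f. under the null.
    half_x = -sum(math.log(p) for p in p_values)
    # Closed-form chi-square survival function for even d.f. (2k):
    # P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return min(1.0, math.exp(-half_x) * total)


def gene_wise_p(snp_p_values, method, top_k=5):
    """Summarize the SNP-level P values mapped to one gene.

    method: "all" -> combine every SNP mapped to the gene
            "top" -> combine only the top_k smallest P values
                     (top_k is a hypothetical cutoff)
            "min" -> take the most significant SNP
    """
    ranked = sorted(snp_p_values)
    if method == "all":
        return fisher_combined_p(ranked)
    if method == "top":
        return fisher_combined_p(ranked[:top_k])
    if method == "min":
        return ranked[0]
    raise ValueError(f"unknown method: {method}")


if __name__ == "__main__":
    # Made-up P values for a single hypothetical gene.
    snps = [0.5, 0.01, 0.2, 0.8]
    for m in ("all", "top", "min"):
        print(m, gene_wise_p(snps, m, top_k=2))
```

The same gene can receive noticeably different P values under the three summaries, which is why the choice of summary can change which genes enter a module.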
Our results showed that the choice of approach for computing gene-wise P values in GWAS data is critical in GSA. This work is useful for evaluating key factors in GSA of GWAS data.