This article is part of the supplement: 22nd International Conference on Genome Informatics: Systems Biology
SNP-PRAGE: SNP-based parametric robust analysis of gene set enrichment
1 Department of Statistics, Seoul National University, San 56-1, Shilim-dong, Seoul, Korea
2 Medical Research Collaborating Center, Seoul National University Bundang Hospital, 166 Gumi-ro, Bundang-gu, Seongnam 463-707, Korea
3 Department of Biostatistics, University of Washington, Box 357232, Seattle, Washington 98195, USA
BMC Systems Biology 2011, 5(Suppl 2):S11 doi:10.1186/1752-0509-5-S2-S11
Published: 14 December 2011
Current genome-wide association (GWA) analysis focuses mainly on single genetic variants, and may therefore miss variants that have small individual effects but large joint effects. Considering multiple SNPs jointly in GWA analysis can increase power. When multiple SNPs are considered jointly, however, the corresponding SNP-level association measures are likely to be correlated because of linkage disequilibrium (LD) among the SNPs.
We propose the SNP-based parametric robust analysis of gene-set enrichment (SNP-PRAGE) method, which handles the correlation among SNP-level association measures adequately and minimizes computing effort through a parametric assumption. SNP-PRAGE first derives gene-level association measures from the SNP-level association measures by incorporating the size of the corresponding (or nearby) genes and the LD structure among SNPs. It then computes a gene-set-level summary for genes grouped by shared biological knowledge. This two-step summarization makes the within-set association measures independent of each other, so the central limit theorem can be applied appropriately in the parametric model.
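The two-step summarization can be illustrated with a minimal sketch. It is not the SNP-PRAGE implementation: the function names are hypothetical, the gene-level statistic (sum of SNP z-scores scaled by the LD correlation matrix) and the gene-set z-statistic (set mean compared with the mean and standard deviation of all gene scores) are assumed stand-ins chosen for illustration, and the adjustment for gene size is omitted.

```python
import math
import numpy as np


def gene_score(snp_z, ld):
    """Step 1 (assumed form): combine the SNP z-scores of one gene.

    `ld` is the SNP-by-SNP correlation (LD) matrix, assumed known.
    Under the null, Var(sum of z) equals the sum of all LD entries,
    so the scaled sum has unit variance and correlated SNPs are not
    counted as independent evidence.
    """
    snp_z = np.asarray(snp_z, dtype=float)
    ld = np.asarray(ld, dtype=float)
    return snp_z.sum() / math.sqrt(ld.sum())


def gene_set_z(gene_scores, set_index):
    """Step 2 (assumed form): parametric z-statistic for one gene set.

    Compares the mean gene score inside the set with the mean and
    standard deviation of all gene scores; normality of the set mean
    relies on the central limit theorem.
    """
    gene_scores = np.asarray(gene_scores, dtype=float)
    mu = gene_scores.mean()
    sigma = gene_scores.std(ddof=1)
    in_set = gene_scores[set_index]
    return (in_set.mean() - mu) * math.sqrt(in_set.size) / sigma


def two_sided_p(z):
    """Two-sided p-value of a z-statistic under the standard normal."""
    return math.erfc(abs(z) / math.sqrt(2.0))
```

In this sketch, two SNPs in perfect LD contribute the same gene-level evidence as a single SNP, and the p-value comes from a normal reference distribution rather than from permutations, which is where the computational saving of a parametric approach arises.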
Results & conclusions
We applied SNP-PRAGE to two GWA data sets: hypertension data of 8,842 samples from the Korean population and bipolar disorder data of 4,806 samples from the Wellcome Trust Case Control Consortium (WTCCC). We found two enriched gene sets for hypertension and three enriched gene sets for bipolar disorder. In a simulation study, we compared our method with other gene-set methods and found that SNP-PRAGE notably reduced the number of false positives while requiring much less computational effort than permutation-based gene-set approaches.