This article is part of the supplement: Genetic Analysis Workshop 17: Unraveling Human Exome Data
SNP set analysis for detecting disease association using exon sequence data
1 Department of Statistics, University of California, Davis, CA 95616, USA
2 Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, 1100 Fairview Avenue North, PO Box 19024, Seattle, WA 98109, USA
BMC Proceedings 2011, 5(Suppl 9):S91 doi:10.1186/1753-6561-5-S9-S91
Published: 29 November 2011
Rare variants are believed to play an important role in disease etiology. Recent advances in high-throughput sequencing technology enable investigators to systematically characterize the genetic effects of both common and rare variants. We introduce several approaches that simultaneously test the effects of common and rare variants within a single-nucleotide polymorphism (SNP) set, based on logistic regression models and logistic kernel machine models. Some of these models also account for gene-environment and SNP-SNP interactions. We illustrate the performance of these methods using the unrelated-individuals data set from Genetic Analysis Workshop 17. Three true disease genes (FLT1, PIK3C3, and KDR) were consistently selected by the proposed methods. In addition, the logistic kernel machine models were more powerful than the logistic regression models, presumably because they reduced the effective number of parameters through regularization. Our results also suggest that a screening step is effective in decreasing the number of false-positive findings, which are a major concern in association studies.
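As a rough illustration of what testing a whole SNP set at once can look like, the sketch below computes a kernel-style score statistic, Q = (y - mu0)' K (y - mu0), with a linear kernel K = GG' and an intercept-only null model, and obtains a p-value by permutation. This is not the authors' implementation: the function names, the choice of linear kernel, the absence of covariates, variant weights, and interaction terms, and the use of permutation rather than an asymptotic null distribution are all assumptions made for the example.

```python
import numpy as np

def linear_kernel_score_stat(genotypes, phenotype):
    """Score-type statistic Q = r' G G' r, with r = y - mean(y).

    genotypes: (n_subjects, n_snps) array of minor-allele counts (0/1/2).
    phenotype: (n_subjects,) array of 0/1 disease status.
    Assumes an intercept-only null model (no covariates), so the
    fitted null probability is simply the sample mean of y.
    """
    residuals = phenotype - phenotype.mean()
    per_snp_scores = genotypes.T @ residuals  # one score per SNP
    return float(per_snp_scores @ per_snp_scores)

def permutation_p_value(genotypes, phenotype, n_perm=999, seed=0):
    """Permutation p-value for the SNP-set statistic (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    observed = linear_kernel_score_stat(genotypes, phenotype)
    exceed = 0
    for _ in range(n_perm):
        permuted = rng.permutation(phenotype)
        if linear_kernel_score_stat(genotypes, permuted) >= observed:
            exceed += 1
    # Add-one correction keeps the estimate strictly positive.
    return (exceed + 1) / (n_perm + 1)

# Toy data: 4 subjects, 1 SNP (made up for illustration only).
toy_genotypes = np.array([[0], [0], [1], [1]])
toy_phenotype = np.array([0, 0, 1, 1])
toy_stat = linear_kernel_score_stat(toy_genotypes, toy_phenotype)
toy_p = permutation_p_value(toy_genotypes, toy_phenotype, n_perm=99)
```

Because the statistic pools squared per-SNP scores rather than fitting one coefficient per SNP, it stays well defined even when the set contains many rare variants, which is the intuition behind the regularization remark in the abstract.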