This article is part of the supplement: Genetic Analysis Workshop 17: Unraveling Human Exome Data
Genome-wide case-control study in GAW17 using coalesced rare variants
Department of Statistics, Purdue University, West Lafayette, IN 47907, USA
BMC Proceedings 2011, 5(Suppl 9):S110 doi:10.1186/1753-6561-5-S9-S110Published: 29 November 2011
Genome-wide association studies have successfully identified numerous loci at which common variants influence disease risks or quantitative traits of interest. Despite these successes, the variants identified by these studies have generally explained only a small fraction of the variations in the phenotype. One explanation may be that many rare variants that are not included in the common genotyping platforms may contribute substantially to the genetic variations of the diseases. Next-generation sequencing, which would better allow for the analysis of rare variants, is now becoming available and affordable; however, the presence of a large number of rare variants challenges the statistical endeavor to stably identify these disease-causing genetic variants. We conduct a genome-wide association study of Genetic Analysis Workshop 17 case-control data produced by the next-generation sequencing technique and propose that collapsing rare variants within each genetic region through a supervised dimension reduction algorithm leads to several macrovariants constructed for rare variants within each genetic region. A simultaneous association of the phenotype to all common variants and macrovariants is undertaken using a linear discriminant analysis using the penalized orthogonal-components regression algorithm. The results suggest that the proposed analysis strategy shows promise but needs further development.