This article is part of the supplement: Genetic Analysis Workshop 17: Unraveling Human Exome Data
Identifying rare disease variants in the Genetic Analysis Workshop 17 simulated data: a comparison of several statistical approaches
1 Department of Statistics, Columbia University, 1255 Amsterdam Avenue, MC 4690, New York, NY 10027, USA
2 Department of Biostatistics, Mailman School of Public Health, Columbia University, 722 West 168th Street, New York, NY 10032, USA
BMC Proceedings 2011, 5(Suppl 9):S17 doi:10.1186/1753-6561-5-S9-S17
Published: 29 November 2011
Genome-wide association studies have been successful at identifying common variants associated with complex diseases, but the variants identified have small effect sizes and account for only a small fraction of the estimated heritability of these diseases. Theoretical and empirical studies suggest that rare variants, which are much less frequent in populations and are poorly captured by single-nucleotide polymorphism chips, could play a significant role in complex diseases. Several new statistical methods have been developed for the analysis of rare variants, for example, the combined multivariate and collapsing method, the weighted-sum method, and a replication-based method. Here, we apply these methods to the simulated data sets of Genetic Analysis Workshop 17, compare their performance, and thereby explore the contribution of rare variants to disease risk. In addition, we investigate the usefulness of extreme phenotypes in identifying rare risk variants when dealing with quantitative traits. Finally, we perform a pathway analysis and show the importance of the vascular endothelial growth factor pathway in explaining different phenotypes.