This article is part of the supplement: Genetic Analysis Workshop 17: Unraveling Human Exome Data
Application of collapsing methods for continuous traits to the Genetic Analysis Workshop 17 exome sequence data
Division of Biostatistics, Washington University School of Medicine, 660 S. Euclid Ave., St. Louis, MO 63110, USA
BMC Proceedings 2011, 5(Suppl 9):S121 doi:10.1186/1753-6561-5-S9-S121
Published: 29 November 2011
Genetic Analysis Workshop 17 used real sequence data from the 1000 Genomes Project and simulated phenotypes influenced by a large number of rare variants. Our aim is to evaluate the performance of several collapsing methods developed for the analysis of multiple rare variants. We apply these methods to the continuous phenotypes Q1 and Q2 in all 200 replicates of the unrelated-individuals data. Within each gene, we collapse (1) all single-nucleotide polymorphisms (SNPs), (2) all SNPs with minor allele frequency (MAF) < 0.05, and (3) nonsynonymous SNPs with MAF < 0.05. For each collapsing scheme we consider two tests: one based on the proportion of variants and one based on the presence or absence of any variant. We also compare our results with a single-marker analysis performed in PLINK. For phenotype Q1, the proportion test applied to collapsed rare nonsynonymous SNPs often performed best, and two genes (FLT1 and KDR) gave statistically significant results; the single-marker analysis in PLINK also gave statistically significant results for some SNPs within these two genes. For phenotype Q2, collapsing rare nonsynonymous SNPs performed best, with almost no difference between the proportion and presence tests. However, neither the collapsing methods nor the single-marker analysis gave statistically significant results at the true causal genes for Q2. We also found that many noncausal genes were highly correlated with causal genes for both Q1 and Q2, which may account for the inflated false-positive rates.
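To make the two collapsing scores concrete, the following is a minimal Python sketch, not the analysis code used in this study. It assumes genotypes are coded as minor-allele counts (0, 1, or 2), that the proportion score is the fraction of retained variant sites at which an individual carries at least one minor allele, and that the presence score is an indicator of carrying any retained variant. The abstract does not specify the test statistic, so the simple linear regression of the continuous trait on the collapsed score is an assumption, and the names `collapse_gene` and `slope_t_statistic` are hypothetical.

```python
import numpy as np


def collapse_gene(genotypes, maf_threshold=0.05, nonsyn_mask=None):
    """Collapse the SNPs of one gene into two per-individual scores.

    genotypes: (n_individuals, n_snps) array of minor-allele counts (0/1/2).
    nonsyn_mask: optional boolean array marking nonsynonymous SNPs.
    Returns (proportion, presence), each of length n_individuals.
    """
    g = np.asarray(genotypes, dtype=float)
    maf = g.mean(axis=0) / 2.0           # sample minor allele frequency per SNP
    keep = maf < maf_threshold           # retain rare variants only
    if nonsyn_mask is not None:
        keep &= np.asarray(nonsyn_mask, dtype=bool)

    carries = g[:, keep] > 0             # True where an individual carries a variant
    if carries.shape[1] == 0:            # no SNP passed the filters
        zeros = np.zeros(g.shape[0])
        return zeros, zeros.copy()

    proportion = carries.mean(axis=1)    # fraction of retained sites carried
    presence = carries.any(axis=1).astype(float)  # 1 if any retained variant is carried
    return proportion, presence


def slope_t_statistic(x, y):
    """t statistic for the slope in a simple linear regression of y on x.

    Assumed test for illustration; the source does not name the statistic.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    sxx = (xc ** 2).sum()
    if sxx == 0:                         # score is constant: no information
        return 0.0
    beta = (xc * yc).sum() / sxx
    resid = yc - beta * xc
    se = np.sqrt((resid ** 2).sum() / (x.size - 2) / sxx)
    return beta / se
```

Setting `maf_threshold` above 0.5 retains every SNP, which corresponds to scheme (1); the default threshold alone gives scheme (2); supplying `nonsyn_mask` as well gives scheme (3). Either score can then be passed to `slope_t_statistic` together with a phenotype vector such as Q1 or Q2.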