This article is part of the supplement: Genetic Analysis Workshop 17: Unraveling Human Exome Data
Propensity score analysis in the Genetic Analysis Workshop 17 simulated data set on independent individuals
1 Department of Psychiatry, David Geffen School of Medicine, University of California, Los Angeles, 695 Charles E. Young Drive South, Los Angeles, CA 90095-1761, USA
2 Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, 695 Charles E. Young Drive South, Box 708822, Los Angeles, CA 90095-7088, USA
BMC Proceedings 2011, 5(Suppl 9):S71 doi:10.1186/1753-6561-5-S9-S71
Published: 29 November 2011
Genetic Analysis Workshop 17 provided simulated phenotypes and exome sequence data for 697 independent individuals (209 case subjects and 488 control subjects). The disease liability in these data was influenced by multiple quantitative traits. We addressed the lack of statistical power in this small data set by limiting the genomic variants included in the study to those with a potential disease-causing effect, thereby reducing the problem of multiple testing. After this adjustment, we could readily detect two common variants (C13S523 and C13S522) that were strongly associated with the quantitative trait Q1. However, we found no significant associations with affected status or with any of the other quantitative traits, and the relationship between disease status and genomic variants remained obscure. To address the challenge of the multivariate phenotype, we used propensity scores to combine covariates with genetic risk factors into a single risk factor and created a new phenotype variable: the probability of being affected given the covariates. Using the propensity score as a quantitative trait in the case-control analysis, we again identified the same two common single-nucleotide polymorphisms (C13S523 and C13S522). In addition, this analysis captured the correlation between Q1 and affected status and reduced the problem of multiple testing. Although the propensity score was useful for capturing and clarifying the genetic contributions of common variants to the disease phenotype and the mediating role of the quantitative trait Q1, the analysis did not increase power to detect rare variants.
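The propensity-score idea described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the toy data, the covariate names (`q1`, `age`), the effect sizes, and the helper functions `fit_logistic`, `propensity_scores`, and `slope` are all assumptions made for illustration. The sketch fits a logistic model for the probability of being affected given the covariates, then treats the fitted probability as a quantitative trait and regresses it on a single variant's genotype.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, steps=1000, lr=0.1):
    """Fit P(affected | covariates) by batch gradient ascent on the log-likelihood."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)  # w[0] is the intercept
    for _ in range(steps):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            err = yi - sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            grad[0] += err
            for j in range(p):
                grad[j + 1] += err * xi[j]
        w = [wj + lr * gj / n for wj, gj in zip(w, grad)]
    return w


def propensity_scores(X, w):
    """Fitted probability of being affected for each individual."""
    return [sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) for xi in X]


def slope(x, y):
    """Ordinary least-squares slope of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx


# Toy data (hypothetical, NOT the GAW17 data): one common variant that
# influences a quantitative trait, which in turn influences disease liability.
rng = random.Random(0)
n = 300
genotype = [rng.choice([0, 1, 2]) for _ in range(n)]   # minor-allele count
q1 = [0.5 * g + rng.gauss(0, 1) for g in genotype]     # trait mediated by the variant
age = [rng.gauss(0, 1) for _ in range(n)]              # standardized covariate
affected = [
    1 if rng.random() < sigmoid(-1.0 + 1.0 * q + 0.5 * a) else 0
    for q, a in zip(q1, age)
]

# Step 1: propensity score = P(affected | covariates).
X = [[q, a] for q, a in zip(q1, age)]
w = fit_logistic(X, affected)
scores = propensity_scores(X, w)

# Step 2: use the propensity score as a quantitative trait for the variant.
beta = slope(genotype, scores)
print("fitted weights:", w)
print("slope of propensity score on genotype:", beta)
```

In this toy setup the variant acts on disease only through `q1`, so any association between genotype and the propensity score reflects the mediating role of the quantitative trait. A real analysis would use an established regression package and would test each variant with an appropriate significance threshold.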