This article is part of the supplement: Genetic Analysis Workshop 17: Unraveling Human Exome Data
Principal components ancestry adjustment for Genetic Analysis Workshop 17 data
1 Department of Applied Mathematics and Statistics, State University of New York at Stony Brook, Stony Brook, NY 11733, USA
2 Henri Begleiter Neurodynamics Laboratory, Department of Psychiatry and Behavioral Sciences, SUNY Downstate Medical Center, 450 Clarkson Avenue, Box 1203, Brooklyn, NY 11203, USA
3 Seaver Autism Center and Department of Psychiatry, Mount Sinai School of Medicine, One Gustave L. Levy Place, Box 1668, New York, NY 10029, USA
BMC Proceedings 2011, 5(Suppl 9):S66 doi:10.1186/1753-6561-5-S9-S66
Published: 29 November 2011
Statistical tests on rare variant data may have type I error rates that differ from their nominal levels. Here, we use the Genetic Analysis Workshop 17 data to estimate the type I error rate and power of three models for identifying rare variants associated with a phenotype. The models use as predictor variables: (1) the number of minor alleles, age, and smoking status; (2) the number of minor alleles, age, smoking status, and the population identity of the subject; and (3) the number of minor alleles, age, smoking status, and an ancestry adjustment based on 10 principal component scores. We studied both a quantitative phenotype and a dichotomized version of that phenotype. For single-nucleotide polymorphisms (SNPs) in genes that are noncausal for the selected phenotype, the model with principal component adjustment has type I error rates closer to the nominal significance level of 0.05 than the model that adjusts directly for population. The type I error rates of the principal component adjustment model are also closer to the nominal level of 0.05 for noncausal SNPs located in genes that are causal for the phenotype. The power to detect causal SNPs with the principal component adjustment model is comparable to the power of the other methods. Power is greater with the underlying quantitative phenotype than with the dichotomized phenotype.
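The three models compared above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes ordinary least squares on a quantitative phenotype, a normal approximation for the p-value of the SNP coefficient, and principal component scores taken from a singular value decomposition of the centered genotype matrix. All data are simulated, and the function names (`ols_snp_pvalue`, `pc_scores`, `one_hot_population`) are hypothetical.

```python
import math
import numpy as np


def ols_snp_pvalue(y, snp, covariates):
    """Two-sided p-value for the SNP coefficient in an OLS fit of y on
    [intercept, snp, covariates]. Assumption: the t statistic is referred
    to a standard normal, which is reasonable only for large samples."""
    n = len(y)
    X = np.column_stack([np.ones(n), snp, covariates])
    beta, _, rank, _ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - rank)          # residual variance
    cov = sigma2 * np.linalg.pinv(X.T @ X)       # covariance of beta
    t = beta[1] / math.sqrt(cov[1, 1])           # column 1 is the SNP
    return math.erfc(abs(t) / math.sqrt(2))


def pc_scores(genotypes, k=10):
    """Scores on the first k principal components of the centered
    genotype matrix (rows are subjects, columns are SNPs)."""
    centered = genotypes - genotypes.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :k] * s[:k]


def one_hot_population(labels):
    """Indicator columns for population, dropping the first level
    as the reference category."""
    levels = np.unique(labels)
    return (labels[:, None] == levels[1:]).astype(float)


# Simulated data (illustrative only, not the GAW17 data).
rng = np.random.default_rng(0)
n, n_snps = 300, 200
pop = rng.integers(0, 3, size=n)                     # three populations
freqs = rng.uniform(0.05, 0.5, size=(3, n_snps))     # population-specific MAFs
G = rng.binomial(2, freqs[pop])                      # minor allele counts
age = rng.uniform(20, 80, size=n)
smoke = rng.integers(0, 2, size=n).astype(float)
# The phenotype depends on population but on no SNP, so every SNP is noncausal.
y = 0.02 * age + 0.5 * smoke + 0.8 * pop + rng.normal(size=n)

base = np.column_stack([age, smoke])
snp = G[:, 0]
p1 = ols_snp_pvalue(y, snp, base)                                             # model 1
p2 = ols_snp_pvalue(y, snp, np.column_stack([base, one_hot_population(pop)])) # model 2
p3 = ols_snp_pvalue(y, snp, np.column_stack([base, pc_scores(G, 10)]))        # model 3
print(p1, p2, p3)
```

Repeating the three fits over many noncausal SNPs and recording the fraction of p-values below 0.05 gives an empirical type I error rate for each model; doing the same over causal SNPs gives an empirical power.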