A three-stage approach for genome-wide association studies with family data for quantitative traits
1 Department of Neurology and Framingham Heart Study, Boston University School of Medicine, Boston, MA, USA
2 The NHLBI's Framingham Heart Study, Framingham, MA, USA
3 Department of Mathematics and Statistics, Boston University, Boston, MA, USA
4 Genetic Epidemiology Program, Hebrew Senior Life Institute for Aging Research and Harvard Medical School, Boston, MA, USA
5 Molecular and Integrative Physiological Sciences Program, Harvard School of Public Health, Boston, MA, USA
6 Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
7 Clinical Research Program, Children's Hospital Boston, Boston, MA, USA
8 Program in Genomics, Department of Medicine, Children's Hospital Boston, Boston, MA, USA
9 Department of Pediatrics, Harvard Medical School, Boston, MA, USA
10 The Center for Population Studies, National Heart, Lung, and Blood Institute, Bethesda, MD, USA
BMC Genetics 2010, 11:40. doi:10.1186/1471-2156-11-40. Published: 14 May 2010
Genome-wide association (GWA) studies that use population-based association approaches may identify spurious associations in the presence of population admixture. In this paper, we propose a novel three-stage approach for GWA studies with family data that is computationally efficient, robust to population admixture, and more powerful than the family-based association test (FBAT).
We propose a three-stage approach for GWA studies with family data. The first stage performs linear regression, ignoring phenotypic correlations among family members. SNPs with a first-stage p-value below a liberal cut-off (e.g. 0.1) are then analyzed in the second stage, which employs a linear mixed effects (LME) model that accounts for within-family correlations. Next, SNPs that reach genome-wide significance in the second stage (e.g. 10^-6 for the 34,625 genotyped SNPs in this paper) are analyzed in the third stage using FBAT, with correction for multiple testing applied only to the SNPs that enter the third stage. Simulations are performed to evaluate the type I error and power of the proposed method compared to LME adjusting for 10 principal components (PCs) of the genotype data. We also apply the three-stage approach to the GWA analyses of uric acid in the Framingham Heart Study's SNP Health Association Resource (SHARe) project.
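The filtering logic of the three stages can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. The function name `three_stage_screen`, the callables `stage1_p`, `stage2_p` and `stage3_p` (which stand in for the linear regression, LME and FBAT tests), the Bonferroni form of the third-stage correction, the overall level `alpha = 0.05`, and the SNP labels are all assumptions made for illustration.

```python
from typing import Callable, Dict, Iterable, List

# A test is represented as a function mapping a SNP identifier to a p-value.
PValueFn = Callable[[str], float]


def three_stage_screen(
    snps: Iterable[str],
    stage1_p: PValueFn,            # stand-in for linear regression ignoring family structure
    stage2_p: PValueFn,            # stand-in for the LME model
    stage3_p: PValueFn,            # stand-in for FBAT
    stage1_cutoff: float = 0.1,    # liberal first-stage cut-off (example value from the text)
    stage2_cutoff: float = 1e-6,   # genome-wide significance level (example value from the text)
    alpha: float = 0.05,           # assumed overall level for the third stage
) -> Dict[str, List[str]]:
    """Return the SNPs that survive each stage of the screen."""
    # Stage 1: cheap screen; only SNPs below the liberal cut-off move on.
    stage1_pass = [s for s in snps if stage1_p(s) < stage1_cutoff]

    # Stage 2: the more expensive test is run only on stage-1 survivors.
    stage2_pass = [s for s in stage1_pass if stage2_p(s) < stage2_cutoff]

    # Stage 3: correct for multiple testing over stage-3 entrants only.
    # A Bonferroni correction is assumed here.
    confirmed: List[str] = []
    if stage2_pass:
        stage3_cutoff = alpha / len(stage2_pass)
        confirmed = [s for s in stage2_pass if stage3_p(s) < stage3_cutoff]

    return {"stage1": stage1_pass, "stage2": stage2_pass, "confirmed": confirmed}


# Toy p-values for four hypothetical SNPs (not data from the paper).
p1 = {"rsA": 0.001, "rsB": 0.5, "rsC": 0.02, "rsD": 0.05}
p2 = {"rsA": 1e-8, "rsC": 1e-3, "rsD": 5e-7}
p3 = {"rsA": 0.01, "rsD": 0.2}

result = three_stage_screen(p1.keys(), p1.__getitem__, p2.__getitem__, p3.__getitem__)
print(result)
```

In this toy run two SNPs enter the third stage, so under the assumed Bonferroni correction each is tested at 0.05 / 2 = 0.025 rather than at a genome-wide threshold. Because later stages are evaluated only for survivors of earlier ones, the costlier tests run on a small fraction of the SNPs.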
Our simulations show that the three-stage approach has no inflated type I error, whether or not population admixture is present. In terms of power, LME adjusting for PCs is only slightly more powerful than the three-stage approach. When applied to the GWA analyses of uric acid in the SHARe project of the Framingham Heart Study, the three-stage approach successfully identified and confirmed three SNPs previously reported as genome-wide significant signals.
For GWA analyses of quantitative traits with family data, our three-stage approach provides another appealing solution to population admixture, in addition to LME adjusting for genetic PCs.