Multi-locus stepwise regression: a haplotype-based algorithm for finding genetic associations applied to atopic dermatitis
1 Max Delbrück Center for Molecular Medicine Berlin-Buch, Berlin, Germany
2 Pediatric Pneumology and Immunology, Charité Universitätsmedizin Berlin, Berlin, Germany
3 Institute of Biochemistry, Charité Universitätsmedizin Berlin, Berlin, Germany
4 Institute for Clinical Molecular Biology, Christian-Albrechts-University, Kiel, Germany
5 Department of Dermatology and Allergy, Technische Universität München, Munich, Germany
6 Division of Environmental Dermatology and Allergy, Helmholtz Zentrum Munich and ZAUM-Center for Allergy and Environment, Technische Universität München, Munich, Germany
7 German Institute of Human Nutrition Potsdam-Rehbrücke, Department of Epidemiology, Nuthetal, Germany
BMC Medical Genetics 2012, 13:8. doi:10.1186/1471-2350-13-8. Published: 27 January 2012
Genome-wide association studies (GWAS) provide an increasing number of single nucleotide polymorphisms (SNPs) associated with diseases. Our aim is to exploit the closely spaced SNPs in candidate regions for a deeper analysis of association beyond single SNP analysis, combining the classical stepwise regression approach with haplotype analysis to identify risk haplotypes for complex diseases.
Our proposed multi-locus stepwise regression starts with an evaluation of all pairwise SNP combinations and then extends each SNP combination stepwise by one SNP from the region, carrying out haplotype regression in each step. The best associated haplotype patterns are kept for the next step, and the final patterns must be corrected for multiple testing. These haplotypes should also be replicated in an independent data set. We applied the method to a region of 259 SNPs from the epidermal differentiation complex (EDC) on chromosome 1q21, using a case-control set (1,914 individuals) from a German GWAS and, for replication, 268 families with at least two affected children.
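The search strategy described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `stepwise_search`, the parameters `keep` and `max_size`, and the `score` callable (which stands in for the haplotype regression and is assumed to return a p-value for a given SNP combination) are all hypothetical. The Bonferroni factor used here, the total number of combinations evaluated, is likewise an assumption; the text states only that a correction for multiple testing is required.

```python
from itertools import combinations
from typing import Callable, List, Tuple

# Hypothetical stand-in for haplotype regression: maps a sorted tuple of
# SNP indices to a p-value for the best haplotype pattern on those SNPs.
Score = Callable[[Tuple[int, ...]], float]


def stepwise_search(
    n_snps: int, score: Score, keep: int = 10, max_size: int = 4
) -> Tuple[List[Tuple[Tuple[int, ...], float, float]], int]:
    """Return (ranked results, number of tests).

    Each result is (snp_combination, raw_p, bonferroni_adjusted_p).
    """
    # Step 1: evaluate every pairwise SNP combination.
    scored = {
        frozenset(pair): score(tuple(sorted(pair)))
        for pair in combinations(range(n_snps), 2)
    }
    n_tests = len(scored)
    best = sorted(scored.items(), key=lambda kv: kv[1])[:keep]
    results = list(best)

    # Later steps: extend each retained combination by one further SNP.
    for _size in range(3, max_size + 1):
        candidates = {}
        for combo, _p in best:
            for snp in range(n_snps):
                if snp in combo:
                    continue
                extended = combo | {snp}
                if extended not in candidates:  # avoid re-testing duplicates
                    candidates[extended] = score(tuple(sorted(extended)))
        n_tests += len(candidates)
        best = sorted(candidates.items(), key=lambda kv: kv[1])[:keep]
        results.extend(best)

    # Assumed correction: Bonferroni over all combinations evaluated.
    ranked = [
        (tuple(sorted(combo)), p, min(1.0, p * n_tests))
        for combo, p in sorted(results, key=lambda kv: kv[1])
    ]
    return ranked, n_tests
```

In practice `score` would fit a haplotype regression on phased or estimated haplotypes for the selected SNPs; any pattern surviving the correction would still need replication in an independent data set, as noted above.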
A 4-SNP haplotype pattern with high statistical significance in the case-control set (p = 4.13 × 10⁻⁷ after Bonferroni correction) could be identified, and it remained significant in the family set after Bonferroni correction (p = 0.0398). Further analysis revealed that this pattern mainly reflects the effect of the well-known FLG gene; however, a FLG-independent haplotype could additionally be found in the case-control set (OR = 1.71, 95% CI: 1.32-2.23, p = 5.6 × 10⁻⁵) and in the family set (OR = 1.68, 95% CI: 1.18-2.38, p = 2.19 × 10⁻³).
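As a rough plausibility check on the reported odds ratios, the standard error of log(OR) can be recovered from a 95% confidence interval and converted into a two-sided Wald p-value. The sketch below assumes the intervals are symmetric Wald intervals on the log scale; the helper name `wald_from_ci` is hypothetical. The resulting p-values need not match the reported ones exactly, since those come from the authors' haplotype regression and the inputs here are rounded.

```python
import math


def wald_from_ci(odds_ratio: float, lower: float, upper: float,
                 z_crit: float = 1.959964):
    """Recover SE(log OR), the Wald z statistic and a two-sided p-value.

    Assumes the CI is exp(log(OR) +/- z_crit * SE), i.e. a Wald interval.
    """
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided normal tail probability: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p


# Values as reported for the FLG-independent haplotype.
se_cc, z_cc, p_cc = wald_from_ci(1.71, 1.32, 2.23)     # case-control set
se_fam, z_fam, p_fam = wald_from_ci(1.68, 1.18, 2.38)  # family set
print(f"case-control: z = {z_cc:.2f}, p = {p_cc:.2e}")
print(f"families:     z = {z_fam:.2f}, p = {p_fam:.2e}")
```

Under this assumption both p-values fall in the same order of magnitude as those reported.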
Our approach is a useful tool for finding allele combinations associated with diseases beyond single SNP analysis in chromosomal candidate regions.