This article is part of the supplement: Genetic Analysis Workshop 14: Microsatellite and single-nucleotide polymorphism
The effect of linkage disequilibrium on linkage analysis of incomplete pedigrees
1 Department of Psychiatry, University of Pennsylvania School of Medicine, 353 Market Street, Philadelphia, PA, USA
2 Department of Psychological Medicine, School of Medicine, Cardiff University, Cardiff, UK
BMC Genetics 2005, 6(Suppl 1):S6 doi:10.1186/1471-2156-6-S1-S6
Published: 30 December 2005
Dense SNP maps can be highly informative for linkage studies. However, when parental genotypes are missing, multipoint linkage scores can be inflated in regions with substantial marker-marker linkage disequilibrium (LD). Such regions were observed in the Affymetrix SNP genotypes for the Genetic Analysis Workshop 14 (GAW14) Collaborative Study on the Genetics of Alcoholism (COGA) dataset, providing an opportunity to test a novel simulation strategy for studying this problem. First, an inheritance vector (with or without linkage present) is simulated for each replicate, i.e., the locations of recombinations and the transmission of parental chromosomes are determined for each meiosis. Then, two sets of founder haplotypes are superimposed onto the inheritance vector: one set inferred from the actual data, which contains the observed pattern of LD; and one set created by randomly selecting parental alleles according to the known allele frequencies, with no correlation (LD) between markers. When this strategy was applied to a map of 176 SNPs (66 Mb of chromosome 7) for 100 replicates of 116 sibling pairs, with no linkage present, significant inflation of multipoint linkage scores was observed in regions of high LD when parental genotypes were set to missing. Similar inflation was observed in analyses of the COGA data for these affected sib pairs with parental genotypes set to missing, but not after the marker map was reduced until r² between any pair of markers was ≤ 0.05. Additional simulation studies of affected sib pairs, assuming uniform LD throughout a marker map, demonstrated inflation of significance levels at r² values greater than 0.05. When genotypes are available only from two affected siblings in many of the families in a sample, trimming SNP maps to limit r² to 0–0.05 for all marker pairs will prevent inflation of linkage scores without sacrificing substantial linkage information.
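The map-trimming rule above (no marker pair with r² greater than 0.05) can be illustrated with a minimal sketch. The abstract does not say how markers were chosen for removal, so the greedy left-to-right selection below is an assumption, as are the function names `r_squared` and `trim_map` and the toy haplotype data. The sketch also assumes phased, biallelic haplotypes coded 0/1, whereas in practice r² would be estimated from inferred haplotypes.

```python
def r_squared(hap_a, hap_b):
    """Pairwise LD r^2 between two biallelic markers.

    hap_a and hap_b hold the 0/1 alleles carried by the same set of
    phased haplotypes at marker A and marker B.
    """
    n = len(hap_a)
    p_a = sum(hap_a) / n
    p_b = sum(hap_b) / n
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0  # monomorphic marker: r^2 is undefined, treated as 0 here
    d = p_ab - p_a * p_b  # LD coefficient D
    return d * d / denom


def trim_map(marker_columns, max_r2=0.05):
    """Greedy trimming (an assumed strategy, not the authors' procedure).

    Walk the map in order and keep a marker only if its r^2 with every
    marker already kept is <= max_r2. Returns the kept marker indices.
    """
    kept = []
    for idx, column in enumerate(marker_columns):
        if all(r_squared(column, marker_columns[k]) <= max_r2 for k in kept):
            kept.append(idx)
    return kept


# Toy data: 4 markers scored on 8 haplotypes (hypothetical values).
m0 = [0, 0, 1, 1, 0, 0, 1, 1]
m1 = [0, 0, 1, 1, 0, 0, 1, 1]  # identical to m0, so r^2 = 1
m2 = [0, 1, 0, 1, 0, 1, 0, 1]  # uncorrelated with m0, so r^2 = 0
m3 = [0, 1, 0, 1, 0, 1, 1, 0]  # r^2 = 0.25 with m2

kept = trim_map([m0, m1, m2, m3])
print(kept)  # -> [0, 2]
```

A greedy pass is order-dependent and does not guarantee the largest possible retained map; a study that cares about information content would likely weigh marker heterozygosity and spacing when choosing which member of a correlated pair to drop.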
Simulation studies on the observed pedigree structures and map can also be used to determine the effect of LD on a particular study.
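The two-stage simulation strategy described in the abstract can be sketched for a single sib pair as follows. This is only an illustration under simplifying assumptions that are not taken from the paper: no linkage is modeled (transmissions are independent of any trait), a single recombination fraction `theta` applies to every marker interval, there is no interference, and the "inferred" founder haplotypes are represented by a small hypothetical pool. All function names are invented for the example.

```python
import random


def simulate_meiosis(n_markers, theta, rng):
    """One meiosis: which parental haplotype (0 or 1) is transmitted at each marker."""
    strand = rng.randint(0, 1)
    path = [strand]
    for _ in range(n_markers - 1):
        if rng.random() < theta:  # recombination in this interval
            strand = 1 - strand
        path.append(strand)
    return path


def inheritance_vector_sibpair(n_markers, theta, rng):
    """Four meioses: sib 1 paternal, sib 1 maternal, sib 2 paternal, sib 2 maternal."""
    return [simulate_meiosis(n_markers, theta, rng) for _ in range(4)]


def founders_with_ld(haplotype_pool, rng):
    """Four founder haplotypes (two per parent) drawn whole from a pool, preserving LD."""
    return [list(rng.choice(haplotype_pool)) for _ in range(4)]


def founders_without_ld(allele_freqs, rng):
    """Four founder haplotypes with alleles drawn independently at each marker (no LD)."""
    return [[1 if rng.random() < f else 0 for f in allele_freqs] for _ in range(4)]


def drop_genes(inh_vec, founders):
    """Superimpose founder haplotypes onto the inheritance vector.

    Returns, for each sib, a (paternal, maternal) pair of haplotypes.
    """
    father, mother = founders[0:2], founders[2:4]
    sibs = []
    for s in range(2):
        pat_path, mat_path = inh_vec[2 * s], inh_vec[2 * s + 1]
        pat = [father[g][m] for m, g in enumerate(pat_path)]
        mat = [mother[g][m] for m, g in enumerate(mat_path)]
        sibs.append((pat, mat))
    return sibs


rng = random.Random(14)
pool = [[0, 0, 0, 0, 0], [1, 1, 1, 1, 1]]  # hypothetical pool with complete LD
freqs = [0.5] * 5                           # matching allele frequencies

# The same inheritance vector receives both sets of founder haplotypes.
vec = inheritance_vector_sibpair(5, 0.01, rng)
sibs_ld = drop_genes(vec, founders_with_ld(pool, rng))
sibs_no_ld = drop_genes(vec, founders_without_ld(freqs, rng))
```

Because both founder sets are dropped onto the same inheritance vector, any difference in linkage scores between the two resulting datasets (after parental genotypes are set to missing) can be attributed to LD rather than to different patterns of allele sharing. Computing those linkage scores requires a multipoint linkage program and is outside the scope of this sketch.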