Both theoretical and applied studies have indicated that single nucleotide polymorphism (SNP) markers can be more powerful and cost-effective in linkage analysis than current microsatellite marker assays. Here we performed a whole-genome scan on 115 White, non-Hispanic families segregating for alcohol dependence, using one 10.3-cM microsatellite marker set and two SNP datasets (0.33-cM and 0.78-cM spacing). Two definitions of alcohol dependence (ALDX1 and ALDX2) were used. Our multipoint nonparametric linkage analysis found that alcoholism was nominally linked to 12 genomic regions. The linkage peaks obtained with the microsatellite marker set and the two SNP sets corresponded closely in general, but the microsatellite marker set was insufficient to detect some nominal linkage peaks. The presence of linkage disequilibrium between markers did not significantly affect the results. Across the entire genome, the SNP datasets had a much higher average linkage information content (0.33 cM: 0.93; 0.78 cM: 0.91) than did the microsatellite marker set (0.57). The linkage peaks obtained from the two SNP datasets were very similar, with some minor differences. We conclude that genome-wide linkage analysis using approximately 5,000 SNP markers evenly distributed across the human genome is sufficient and might be more powerful than current 10-cM microsatellite marker assays.
In traditional linkage analysis for identifying genomic regions related to disease phenotypes, a whole-genome scan is usually performed using a set of 300–400 microsatellite markers evenly spaced across the genome. The amount of inheritance information extracted is critical to the chances of detecting linkage, and it can be increased by genotyping more families or by adding markers. With the rapid discovery of SNPs across the genome and the development of large-scale, high-throughput SNP genotyping approaches, high-density SNP assays throughout the genome may be a more rapid, powerful, and cost-effective tool than microsatellite marker assays in linkage analysis. Recently, both simulation and applied studies have shown that high-density SNPs across the genome may offer several advantages over a low-density microsatellite marker set, including increased power to detect linkage [2-4] and more precise mapping of disease susceptibility loci. The Collaborative Study on the Genetics of Alcoholism (COGA) data provided to participants in the Genetic Analysis Workshop 14 (GAW14) included one 10-cM microsatellite marker set and two high-density SNP genotype datasets, which offered a good opportunity to test the benefit of high-density SNPs relative to lower-density microsatellite markers in a whole-genome linkage scan.
The COGA dataset provided to participants in GAW14 was analyzed in this study. Only families with ethnicity self-reported as White, non-Hispanic were kept for analysis. Two diagnostic criteria for alcoholism were used in our analyses. For the first criterion, a diagnosis of alcoholism required a positive diagnosis by the DSM-III-R criteria and definite "alcoholism" by the Feighner criteria. This is referred to as the COGA criterion for ALDX1. For the second criterion, a diagnosis of alcoholism required only a positive diagnosis by the DSM-IV criteria, which is referred to as the COGA criterion for ALDX2. For each criterion, we classified individuals coded as "pure unaffected" under the COGA definition as unaffected. Individuals who showed some alcohol-related symptoms but did not meet the criterion for affected, and those who never drank alcohol, were classified as "affection status unknown."
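The recoding rule described above can be sketched as follows. This is only an illustration: the category labels ("affected", "pure_unaffected", "some_symptoms", "never_drank") are hypothetical stand-ins, since the actual COGA codes are not given in the text, and the function name is ours.

```python
from enum import Enum


class Status(Enum):
    UNKNOWN = 0
    UNAFFECTED = 1
    AFFECTED = 2


def affection_status(coga_category: str) -> Status:
    """Map a COGA diagnostic category to a linkage affection status.

    The category strings are hypothetical labels chosen for this sketch.
    """
    if coga_category == "affected":
        return Status.AFFECTED
    if coga_category == "pure_unaffected":
        return Status.UNAFFECTED
    # Individuals with some symptoms, never-drinkers, and any
    # unrecognized label are treated as "affection status unknown".
    return Status.UNKNOWN
```

The same function would be applied once per phenotype definition (ALDX1 and ALDX2), since the rule is identical under both criteria.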
Genetic maps and linkage disequilibrium
SNP genetic map positions were interpolated on the deCODE genetic map using their physical positions (NCBI genome build 34.3); markers that could not be placed were discarded. Because strong linkage disequilibrium (LD) might exist among some of the closely spaced SNPs, and LD between SNPs might inflate linkage signals, we used Haploview (version 3.0) to define LD blocks (default method) and selected from each block only the one tagging SNP with the highest heterozygosity.
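The two map-preparation steps above can be sketched as below. The sketch assumes simple linear interpolation between flanking map anchors, which the text does not specify, and it takes LD blocks as already-computed lists of SNP identifiers rather than calling Haploview. All function names and the input layout are hypothetical.

```python
from bisect import bisect_right
from typing import Dict, List, Optional, Sequence, Tuple


def interpolate_cm(bp: int,
                   anchors: Sequence[Tuple[int, float]]) -> Optional[float]:
    """Linearly interpolate a genetic position (cM) from a physical one (bp).

    `anchors` is a list of (bp, cM) map markers sorted by bp. Returns None
    when bp lies outside the anchored range, i.e. the marker cannot be placed.
    """
    if not anchors:
        return None
    positions = [a[0] for a in anchors]
    if bp < positions[0] or bp > positions[-1]:
        return None
    i = bisect_right(positions, bp)
    if i == len(anchors):  # bp coincides with the last anchor
        return anchors[-1][1]
    (bp0, cm0), (bp1, cm1) = anchors[i - 1], anchors[i]
    return cm0 + (cm1 - cm0) * (bp - bp0) / (bp1 - bp0)


def pick_tag_snps(blocks: List[List[str]],
                  het: Dict[str, float]) -> List[str]:
    """Keep, from each LD block, the SNP with the highest heterozygosity."""
    return [max(block, key=lambda snp: het[snp]) for block in blocks if block]
```

For example, with anchors `[(1_000_000, 1.0), (3_000_000, 5.0)]`, a SNP at 2,000,000 bp is placed at 3.0 cM, and a SNP at 500,000 bp is left unplaced. SNPs that fall outside any LD block can be passed to `pick_tag_snps` as one-element blocks so that they are retained.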
We performed multipoint nonparametric linkage analysis using an affected-only allele-sharing method implemented in the ALLEGRO (version 1.2c) software. We employed the Spairs scoring function, which performs well for all disease models, and the exponential allele-sharing model to generate the relevant test statistics. Family scores were combined into an overall score using a weighting scheme in which each family is weighted in proportion to the standard deviation of the score function under the null hypothesis of no linkage, raised to the power 0.5; this is considered about midway between weighting each pair equally and weighting each family equally.
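The weighting scheme can be illustrated with a small sketch. It is a literal reading of the description above and not ALLEGRO's actual implementation: we assume each family contributes a normalized score (mean 0, variance 1 under no linkage), that the weight is the null standard deviation of the family's raw score raised to the power 0.5, and that the overall score is the weighted sum rescaled to unit variance. The function name and arguments are hypothetical.

```python
import math
from typing import Sequence


def combine_family_scores(z: Sequence[float],
                          null_sd: Sequence[float],
                          power: float = 0.5) -> float:
    """Combine normalized family scores into one overall score.

    z        normalized per-family scores (mean 0, variance 1 under the null)
    null_sd  null standard deviation of each family's raw score function
    power    exponent applied to null_sd to obtain the family weight
             (assumption: 0.5, following the description in the text)
    """
    if len(z) != len(null_sd) or not z:
        raise ValueError("z and null_sd must be non-empty and of equal length")
    weights = [sd ** power for sd in null_sd]
    norm = math.sqrt(sum(w * w for w in weights))
    # Dividing by the root sum of squared weights keeps unit null variance.
    return sum(w * zi for w, zi in zip(weights, z)) / norm
```

With `power=0` every family receives the same weight, so varying the exponent shows how the scheme moves between weighting families equally and weighting larger, more variable families more heavily.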
We used 115 White, non-Hispanic families in our analysis. The total number of individuals was 1,245, of whom 1,009 were genotyped. Linkage information content for the two SNP datasets was very similar, except that the less-dense Illumina set had lower linkage information content on the X chromosome because of its poor coverage (Figure 1). Both SNP datasets had significantly higher linkage information content and better coverage than the microsatellite marker data throughout the entire genome (Table 1).
Figure 1. Linkage information content of high-density SNPs vs. microsatellites.
Table 1. Marker information.
For both definitions of alcohol dependence (ALDX1 and ALDX2), we found 12 genomic regions with nominally significant LOD scores (p < 0.05, Table 2). There was good concordance between the two SNP datasets in linkage peaks, except for the second peak on chromosome 6. The linkage peaks discovered by the microsatellite marker assay were also detected in both SNP datasets, with slightly higher LOD scores, with the exception of one peak on chromosome 21. We also detected two additional linkage peaks in both SNP datasets that were missed in the microsatellite assay, probably because of its low linkage information content (chromosome X) or poor coverage (chromosome 6).
Table 2. Maximal LOD scores for loci with increased allele sharing at p < 0.05.
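To relate LOD scores to the nominal threshold p < 0.05, the following sketch converts an allele-sharing LOD score to a one-sided p-value. It assumes the usual large-sample approximation in which 2 ln(10) × LOD follows a 50:50 mixture of a point mass at zero and a chi-square distribution with one degree of freedom; this is our assumption for illustration and not necessarily how the p-values in Table 2 were computed. The function name is hypothetical.

```python
import math


def lod_to_pvalue(lod: float) -> float:
    """Approximate one-sided p-value for an allele-sharing LOD score.

    Assumes 2*ln(10)*LOD is asymptotically a 50:50 mixture of 0 and
    chi-square(1), which gives p = 0.5 * erfc(sqrt(ln(10) * LOD)).
    """
    if lod <= 0.0:
        # No evidence of excess allele sharing.
        return 1.0
    return 0.5 * math.erfc(math.sqrt(math.log(10.0) * lod))
```

Under this approximation, a LOD score of about 0.59 corresponds to p ≈ 0.05, and a LOD score of 3 corresponds to p ≈ 0.0001.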
The impact of LD was investigated using the Affymetrix SNP set, which had many LD blocks across the genome. The results did not change significantly when the analysis was restricted to SNPs in linkage equilibrium, compared with the analysis that did not account for LD (Table 2).
This study supports the benefit of using a high-density SNP marker set compared with a microsatellite marker assay in linkage analysis. Although there were only minor differences between the results from the SNP and microsatellite scans, the traditional microsatellite approach failed to detect some nominal linkage peaks because of its lower linkage information content and poor coverage. The peaks on chromosome 6 (6q27) and X (Xp22) in the SNP assays were two examples of signals not detected in the microsatellite analyses. The good concordance between the two SNP marker sets (Affymetrix and Illumina) in both linkage information content and linkage findings suggests that >5,000 SNPs may be excessive for samples with structures similar to the COGA data, and that a SNP scan with ~5,000 markers distributed evenly across the human genome is sufficiently dense and powerful for whole-genome linkage analysis. Also, with current technology SNP genotyping is more rapid, requires fewer samples, and is more accurate than microsatellite marker genotyping. High-density SNP marker sets also offer better localization of linkage peaks, which may save work for fine mapping in regions showing linkage. Because bi-allelic SNP markers are individually less informative than multi-allelic microsatellite markers, the multipoint method is a better choice for SNP assays. However, estimation of genetic maps for SNPs is less precise than for microsatellite markers because of their lower levels of heterozygosity. In addition, the computational burden increases dramatically as the number of markers increases. These disadvantages might limit the use of SNPs in whole-genome linkage scans.
Our analysis found nominal linkage for alcoholism to 12 genomic regions under both definitions of alcohol dependence (ALDX1 and ALDX2). The results for the two phenotype definitions are somewhat different. It is not clear which criterion is best for identifying genetic susceptibility loci for alcoholism. However, if a genomic region is associated with alcoholism, there should be similar statistical evidence under both criteria. Our finding on chromosome 2 overlaps with that of Reich et al., who reported linkage of alcoholism to 2q13. Two important alcohol-related enzymes are located close to chromosomal regions where we found nominal linkage: the aldehyde dehydrogenase 2 family (ALDH2) is located on 12q24.2 and the cytochrome P450, family 2, subfamily E, polypeptide 1 (CYP2E1) is in 10q24.3–10q26.3 (Table 2). Our finding on chromosome X (Xp22), a region that has shown evidence of linkage to mental retardation, merits further investigation to explore gender differences in alcoholism.
We conclude that a high-density SNP scan may offer a more rapid, cost-effective, and powerful tool in genome-wide linkage analysis compared to traditional 10-cM microsatellite marker scans. However, further investigation is warranted to explore the effects of genetic map and computational issues on the utility of high-density SNP assays in linkage analysis.
COGA: Collaborative Study on the Genetics of Alcoholism
GAW14: Genetic Analysis Workshop 14
LD: Linkage disequilibrium
SNP: Single-nucleotide polymorphism
QM reconstructed the genetic map, carried out the statistical analysis, and drafted the manuscript. YY participated in genetic map reconstruction. YM and JF managed the data. LAF supported this study and helped to draft the manuscript. MAW conceived the study, participated in its design, and helped to draft the manuscript. All authors read and approved the final manuscript.
Kennedy GC, Matsuzaki H, Dong S, Liu WM, Huang J, Liu G, Su X, Cao M, Chen W, Zhang J, Liu W, Yang G, Di X, Ryder T, He Z, Surti U, Phillips MS, Boyce-Jacino MT, Fodor SP, Jones KW: Large scale genotyping of complex DNA.
Middleton FA, Pato MT, Gentile KL, Morley CP, Zhao X, Eisner A, Brown A, Petryshen TL, Kirby AN, Medeiros H, Carvalho C, Macedo A, Dourado A, Coelho I, Valente J, Soares MJ, Ferreira CP, Lei M, Azevedo MH, Kennedy JL, Daley MJ, Sklar P, Pato CN: Genomewide linkage analysis of bipolar disorder by use of a high-density single-nucleotide-polymorphism (SNP) genotyping assay: a comparison with microsatellite marker assays and finding of significant linkage to chromosome 6q22.
John S, Shephard N, Liu G, Zeggini E, Cao M, Chen W, Vasavda N, Mills T, Barton A, Hinks A, Eyre S, Jones KW, Ollier W, Silman A, Gibson N, Worthington J, Kennedy GC: Whole-genome scan, in a complex disease, using 11,245 single-nucleotide polymorphisms: comparison with microsatellites.
Arch Gen Psychiatry 1972, 26:57-63.
Kong A, Gudbjartsson DF, Sainz J, Jonsdottir GM, Gudjonsson SA, Richardsson B, Sigurdardottir S, Barnard J, Hallbeck B, Masson G, Shlien A, Palsson ST, Frigge ML, Thorgeirsson TE, Gulcher JR, Stefansson K: A high resolution recombination map of the human genome.
Reich T, Edenberg HJ, Goate A, Williams JT, Rice JP, Eerdewegh PV, Foroud T, Hesselbrock V, Schuckit MA, Bucholz K, Porjesz B, Li TK, Conneally PM, Nurnberger JI Jr, Tischfield JA, Crowe RR, Cloninger CR, Wu W, Shears S, Carr K, Crose C, Willig C, Begleiter H: Genome-wide search for genes affecting the risk for alcohol dependence.
Claes S, Vogels A, Holvoet M, Devriendt K, Raeymaekers P, Cassiman JJ, Fryns JP: Regional localization of two genes for nonspecific X-linked mental retardation to Xp22.3-p22.2 (MRX49) and Xp11.3-p11.21 (MRX50).