Open Access Highly Accessed Research article

Toward accurate high-throughput SNP genotyping in the presence of inherited copy number variation

Laura E MacConaill (1,2), Micheala A Aldred (3,4), Xincheng Lu (4) and Thomas LaFramboise (4)*

Author Affiliations

1 Dana-Farber Cancer Institute, 44 Binney Street, Boston, Massachusetts 02116, USA

2 The Broad Institute of Harvard and MIT, 7 Cambridge Center, Cambridge, Massachusetts 02141, USA

3 Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, 9500 Euclid Avenue, Cleveland, Ohio 44195, USA

4 Department of Genetics, Case Western Reserve University, 10900 Euclid Avenue, Cleveland, Ohio 44106, USA

BMC Genomics 2007, 8:211  doi:10.1186/1471-2164-8-211

Published: 3 July 2007

Abstract

Background

The recent discovery of widespread copy number variation in humans has forced a shift away from the assumption of two copies per locus per cell throughout the autosomal genome. In particular, a SNP site can no longer always be accurately assigned one of three genotypes in an individual. In the presence of copy number variability, the individual may theoretically harbor any number of copies of each of the two SNP alleles.
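To make the enlarged genotype space concrete, the following minimal Python sketch enumerates the possible allele-specific copy number pairs at a biallelic SNP for a given total copy number. The function names and the generic allele labels "A" and "B" are illustrative assumptions; this is not the authors' method or software, only a way to see how the three classical genotypes generalize.

```python
from typing import List, Tuple

# A genotype is represented here as (copies of allele A, copies of allele B).
# This representation is an assumption made for illustration.
Genotype = Tuple[int, int]


def generalized_genotypes(total_copies: int) -> List[Genotype]:
    """List every (A copies, B copies) pair that sums to total_copies."""
    if total_copies < 0:
        raise ValueError("total_copies must be non-negative")
    return [(a, total_copies - a) for a in range(total_copies, -1, -1)]


def label(genotype: Genotype) -> str:
    """Render a genotype as a string such as 'AAB'; zero copies gives 'null'."""
    a, b = genotype
    text = "A" * a + "B" * b
    return text if text else "null"


if __name__ == "__main__":
    for total in range(0, 5):
        print(total, [label(g) for g in generalized_genotypes(total)])
```

With two copies, the sketch yields the familiar AA, AB and BB. With three copies it yields AAA, AAB, ABB and BBB, and a homozygous deletion (zero copies) leaves a single "null" state.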

Results

To address this issue, we have developed a method to infer a "generalized genotype" from raw SNP microarray data. Here we apply our approach to data from 48 individuals and uncover thousands of aberrant SNPs, most of them in regions not previously reported as copy number variants. We show that our allele-specific copy numbers follow Mendelian inheritance patterns that would be obscured in the absence of SNP allele information. The interplay between duplication and point mutation in our data sheds light on the relative frequencies of these events in human history, showing that at least some of the duplication events were recurrent.
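One way to see why allele information matters for inheritance checks is the simplified Python sketch below. It tests whether a child's allele-specific copy numbers could arise from one transmitted chromosome per parent. It assumes that phase is unknown (so any subset of a parent's copies may lie on the transmitted chromosome) and that no de novo events occur. These assumptions and all names are hypothetical, and the check is not taken from the paper.

```python
from itertools import product
from typing import List, Tuple

# (copies of allele A, copies of allele B); an illustrative representation.
Genotype = Tuple[int, int]


def transmissible_haplotypes(parent: Genotype) -> List[Genotype]:
    """All (A, B) copy counts a parent could carry on one chromosome.

    Assumption: phase is unknown, so any split of the parent's copies
    between its two chromosomes is allowed.
    """
    a, b = parent
    return [(i, j) for i in range(a + 1) for j in range(b + 1)]


def mendelian_consistent(mother: Genotype, father: Genotype, child: Genotype) -> bool:
    """True if the child equals one maternal plus one paternal haplotype."""
    return any(
        (m[0] + f[0], m[1] + f[1]) == tuple(child)
        for m, f in product(
            transmissible_haplotypes(mother), transmissible_haplotypes(father)
        )
    )


if __name__ == "__main__":
    # Classical case: AA x AB can produce AB.
    print(mendelian_consistent((2, 0), (1, 1), (1, 1)))
    # Totals of 3, 2 and 3 look plausible, but neither parent carries a B allele.
    print(mendelian_consistent((3, 0), (2, 0), (2, 1)))
```

In the second call the total copy numbers alone (3, 2 and 3) raise no flag, whereas the allele-specific counts show the trio to be inconsistent under these assumptions.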

Conclusion

This new multi-allelic view of SNPs has a complicated role in disease association studies, and further work will be necessary in order to accurately assess its importance. Software to perform generalized genotyping from SNP array data is freely available online [1].