Research article

Improved detection of global copy number variation using high density, non-polymorphic oligonucleotide probes

Fan Shen1, Jing Huang1, Karen R Fitch1, Vivi B Truong1, Andrew Kirby2, Wenwei Chen1, Jane Zhang1, Guoying Liu1, Steven A McCarroll3, Keith W Jones1 and Michael H Shapero1*

Author Affiliations

1 Affymetrix, Inc., 3420 Central Expressway, Santa Clara, CA 95051, USA

2 Center for Human Genetic Research, Massachusetts General Hospital, Boston, MA 02114, USA

3 Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA


BMC Genetics 2008, 9:27  doi:10.1186/1471-2156-9-27

Published: 28 March 2008



DNA sequence diversity within the human genome may be affected more by copy number variations (CNVs) than by single nucleotide polymorphisms (SNPs). Although the importance of CNVs in genome-wide association studies (GWAS) is becoming widely accepted, the optimal methods for identifying these variants are still under evaluation. We have previously reported a comprehensive view of CNVs in the HapMap DNA collection using high-density 500 K EA (Early Access) SNP genotyping arrays, which revealed more than 1,000 CNVs ranging in size from 1 kb to over 3 Mb. Although the arrays most commonly used for GWAS predominantly interrogate SNPs, CNV detection does not necessarily require DNA probes centered on polymorphic nucleotides and may even be hindered by dependence on a successful SNP genotyping assay.


In this study, we designed and evaluated a high-density array based on non-polymorphic oligonucleotide probes for CNV detection. This approach uncouples copy number detection from SNP genotyping and thus has the potential to significantly improve probe coverage for genome-wide CNV identification. Used in conjunction with a PCR-based, complexity-reduced DNA target, the array queries over 1.3 M independent NspI restriction enzyme fragments in the 200 bp to 1100 bp size range, a several-fold increase in marker density compared with the 500 K EA array. In addition, a novel algorithm was developed and validated to extract CNV regions and boundaries.
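The complexity-reduction step described above can be illustrated with a minimal in-silico sketch. It is not the authors' pipeline. It assumes that NspI recognizes 5'-RCATGY-3' (R = A or G, Y = C or T) and cuts after the G, and that only fragments flanked by NspI sites on both sides are retained. The function names `nspi_fragments` and `size_selected` are hypothetical; the 200 bp to 1100 bp window is taken from the text.

```python
import re

# Assumption (not stated in the abstract): NspI recognizes 5'-RCATGY-3'
# and cuts the top strand between the G and the Y.
# A lookahead is used so that overlapping sites (e.g. ACATGCATGC) are all found.
NSPI_SITE = re.compile(r"(?=[AG]CATG[CT])")
CUT_OFFSET = 5  # distance from the start of the site to the cut position


def nspi_fragments(seq):
    """Return half-open (start, end) coordinates of fragments that are
    flanked by an NspI cut on both sides (complete digest assumed)."""
    cuts = [m.start() + CUT_OFFSET for m in NSPI_SITE.finditer(seq.upper())]
    return list(zip(cuts, cuts[1:]))


def size_selected(fragments, min_len=200, max_len=1100):
    """Keep fragments whose length lies in the size range quoted in the text."""
    return [(a, b) for a, b in fragments if min_len <= b - a <= max_len]
```

Applied to a reference sequence, `size_selected(nspi_fragments(seq))` yields the candidate fragments on which non-polymorphic probes could in principle be placed. Real probe selection would involve further criteria that the abstract does not describe.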


In a well-characterized pair of DNA samples, close to 200 CNVs were identified; nearly 50% of these appear novel, yet were independently validated by quantitative PCR. The results indicate that non-polymorphic probes provide a robust approach for CNV identification, and the increased precision of CNV boundary delineation should allow a more complete analysis of their genomic organization.