Open Access Research article

Cheek swabs, SNP chips, and CNVs: Assessing the quality of copy number variant calls generated with subject-collected mail-in buccal brush DNA samples on a high-density genotyping microarray

Stephen W Erickson1,2*, Stewart L MacLeod2 and Charlotte A Hobbs2

Author Affiliations

1 Department of Biostatistics, College of Medicine, University of Arkansas for Medical Sciences, 4301 W. Markham Street, Mail Slot 781, Little Rock, AR, 72205-7199, USA

2 Department of Pediatrics, College of Medicine, University of Arkansas for Medical Sciences, Arkansas Children’s Hospital Research Institute, Little Rock, AR, USA


BMC Medical Genetics 2012, 13:51  doi:10.1186/1471-2350-13-51

Published: 26 June 2012



Multiple investigators have established the feasibility of using buccal brush samples to genotype single nucleotide polymorphisms (SNPs) with high-density genome-wide microarrays, but there is currently no consensus on the accuracy of copy number variants (CNVs) inferred from these data. Regardless of the source of DNA, it is more difficult to detect CNVs than to genotype SNPs using these microarrays, and it therefore remains an open question whether buccal brush samples provide enough high-quality DNA for this purpose.


To assess the quality of CNV calls generated from DNA extracted from buccal samples, relative to calls generated from blood samples, we evaluated the concordance of calls from individuals who provided both sample types. The Illumina Human660W-Quad BeadChip was used to genotype SNPs and call CNVs in 39 Arkansas participants in the National Birth Defects Prevention Study (NBDPS), including 16 mother-infant dyads, who provided both whole blood and buccal brush DNA samples.


We observed a 99.9% concordance rate of SNP calls in the 39 blood–buccal pairs. From the same dataset, we performed a similar analysis of CNVs. Each of the 78 samples was independently segmented into regions of like copy number using the Optimal Segmentation algorithm of Golden Helix SNP & Variation Suite 7.
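As an illustration of the SNP concordance rate reported above, the following is a minimal sketch of how agreement between the blood and buccal genotype calls of one individual might be computed. The function name, the genotype encoding, and the decision to skip loci with a missing call in either sample are assumptions made for this example; the abstract does not describe how no-calls were handled.

```python
def concordance_rate(blood_calls, buccal_calls, no_call=None):
    """Fraction of loci, among those called in both samples, with identical genotypes.

    blood_calls, buccal_calls: equal-length sequences of genotype calls
    (e.g. "AA", "AB", "BB") for the same loci in the same order.
    Loci where either sample equals `no_call` are skipped (an assumption).
    """
    if len(blood_calls) != len(buccal_calls):
        raise ValueError("both samples must cover the same loci")
    compared = 0
    agree = 0
    for blood, buccal in zip(blood_calls, buccal_calls):
        if blood == no_call or buccal == no_call:
            continue  # not comparable: at least one sample has no call here
        compared += 1
        if blood == buccal:
            agree += 1
    if compared == 0:
        raise ValueError("no locus was called in both samples")
    return agree / compared
```

Averaging this quantity over the 39 blood-buccal pairs would give a study-level rate of the kind quoted above.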

Across 640,663 loci on 22 autosomal chromosomes, segment-mean log R ratios had an average correlation of 0.899 between blood-buccal pairs of samples from the same individual, while the average correlation between all possible blood-buccal pairs of samples from unrelated individuals was 0.318. An independent analysis using the QuantiSNP algorithm produced average correlations of 0.943 between blood-buccal pairs from the same individual versus 0.332 between samples from unrelated individuals.
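The comparison described above, same-individual blood-buccal pairs versus pairs from unrelated individuals, can be sketched as follows. This is a toy illustration, not the authors' pipeline: it assumes each sample has already been reduced to a list of segment-mean log R ratios, one value per locus, and the names `pearson`, `matched_vs_unmatched`, and `related` are hypothetical.

```python
from itertools import product
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    # Raises ZeroDivisionError if either profile is constant across loci.
    return sxy / sqrt(sxx * syy)


def matched_vs_unmatched(blood, buccal, related=frozenset()):
    """Average correlation for same-subject pairs and for unrelated pairs.

    blood, buccal: dicts mapping subject ID -> per-locus segment-mean
    log R ratios (assumed layout). `related` holds frozensets of two
    subject IDs (e.g. a mother and her infant) to exclude from the
    unrelated comparison.
    """
    matched = [pearson(blood[s], buccal[s]) for s in blood if s in buccal]
    unmatched = [
        pearson(blood[a], buccal[b])
        for a, b in product(blood, buccal)
        if a != b and frozenset((a, b)) not in related
    ]
    return sum(matched) / len(matched), sum(unmatched) / len(unmatched)
```

Excluding related pairs matters in a design like this one, because mother-infant dyads share inherited CNVs and would inflate the "unmatched" average.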

Segment-mean log R ratios had an average correlation of 0.539 within mother-offspring dyads of buccal samples, which did not differ significantly from the average correlation of 0.526 within mother-offspring dyads of blood samples (p = 0.302).
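The abstract does not name the test behind this p-value. Purely as an illustration of how per-dyad correlations from the two sample types could be compared, here is a sketch of an exact paired sign-flip permutation test; the choice of test and the function name are assumptions, not the authors' stated method.

```python
from itertools import product


def paired_sign_flip_pvalue(buccal_corrs, blood_corrs):
    """Two-sided exact sign-flip permutation p-value for paired values.

    buccal_corrs[i] and blood_corrs[i] are the mother-offspring
    correlations for dyad i in each sample type. Under the null
    hypothesis, the sign of each paired difference is exchangeable.
    Enumerates all 2**n sign patterns, so it suits small n only
    (16 dyads give 65,536 patterns).
    """
    diffs = [a - b for a, b in zip(buccal_corrs, blood_corrs)]
    observed = abs(sum(diffs))
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        stat = abs(sum(s * d for s, d in zip(signs, diffs)))
        total += 1
        if stat >= observed - 1e-12:  # small tolerance for floating-point ties
            extreme += 1
    return extreme / total
```

A large p-value from a test of this kind would be consistent with the two sample types recovering the mother-offspring signal equally well, though it does not by itself establish equivalence.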


Subject-collected mail-in buccal brush samples performed comparably to blood samples. These results show that such DNA samples can be used for genome-wide scans of both SNPs and CNVs, and that CNV concordance was high with both a change-point-based algorithm and one based on a hidden Markov model (HMM).

SNPs, Single nucleotide polymorphisms; CNVs, Copy number variants; NBDPS, National Birth Defects Prevention Study; Buccal brush