Open Access Research article

Comprehensive assessment of sequence variation within the copy number variable defensin cluster on 8p23 by target enriched in-depth 454 sequencing

Stefan Taudien1*, Karol Szafranski1, Marius Felder1, Marco Groth1, Klaus Huse1, Francesca Raffaelli1, Andreas Petzold1, Xinmin Zhang2, Philip Rosenstiel3, Jochen Hampe4, Stefan Schreiber3,4 and Matthias Platzer1

Author Affiliations

1 Leibniz Institute for Age Research - Fritz Lipmann Institute, Jena, Germany

2 Roche NimbleGen, Inc., Madison WI, USA

3 Institute of Clinical Molecular Biology, Christian-Albrechts-University Kiel, Germany

4 Dept. of General Internal Medicine, Christian-Albrechts-University Kiel, Germany


BMC Genomics 2011, 12:243  doi:10.1186/1471-2164-12-243

Published: 18 May 2011



In highly copy number variable (CNV) regions such as the human defensin gene locus, comprehensive assessment of sequence variations is challenging. PCR approaches are practically restricted to tiny fractions of such regions, and next-generation sequencing (NGS) of whole individual genomes, e.g. by the 1000 Genomes Project, is limited by the sequence depth that is affordable. Combining target enrichment with NGS may represent a feasible approach.


As a proof of principle, we enriched a ~850 kb section comprising the CNV defensin gene cluster DEFB, the invariable DEFA part and 11 control regions from two genomes by sequence capture and sequenced it by 454 technology. We found 6,651 differences from the human reference genome. Comparison with HapMap genotypes revealed sensitivities and specificities in the range of 94% to 99% for the identification of variations.
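The comparison with HapMap genotypes can be illustrated with a minimal Python sketch. It assumes the usual definitions, sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP), where a "positive" is a HapMap site whose genotype differs from the reference. The abstract does not spell out the exact definitions used, so the function name, the input layout and the toy site identifiers below are hypothetical.

```python
def sensitivity_specificity(hapmap_is_variant, called_sites):
    """Compare sequencing calls against HapMap genotypes.

    hapmap_is_variant: dict mapping site id -> True if the HapMap genotype
                       differs from the reference, False if it matches.
    called_sites:      set of site ids at which sequencing called a variant.
    Only sites present in the HapMap dict are evaluated.
    """
    tp = fn = tn = fp = 0
    for site, is_variant in hapmap_is_variant.items():
        called = site in called_sites
        if is_variant and called:
            tp += 1          # true variant, detected
        elif is_variant:
            fn += 1          # true variant, missed
        elif called:
            fp += 1          # reference site, wrongly called
        else:
            tn += 1          # reference site, correctly not called
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy data (hypothetical site ids, not taken from the study)
hapmap = {"s1": True, "s2": True, "s3": False,
          "s4": False, "s5": True, "s6": False}
calls = {"s1", "s2", "s4"}
sens, spec = sensitivity_specificity(hapmap, calls)
```

In this toy example two of the three HapMap variant sites are detected and two of the three reference sites are left uncalled, so both values are 2/3.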

Rigorous filtering based on error probabilities yielded 2,886 unique single nucleotide variations (SNVs), including 358 putative novel ones. DEFB CN determinations by haplotype ratios were in agreement with alternative methods.
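The idea behind estimating CN from ratios can be sketched as follows. This is a simplified illustration and not the authors' procedure: it works on allele read counts at individual variant sites rather than on haplotypes, and it assumes that a variant present on k of n copies is seen in roughly k/n of the reads. The function name, the tolerance and the candidate CN range are assumptions chosen for the example.

```python
def estimate_copy_number(site_counts, min_cn=2, max_cn=12, tolerance=0.05):
    """Return the smallest copy number n for which every observed variant
    read fraction lies within `tolerance` of some multiple k/n.

    site_counts: list of (variant_reads, total_reads) tuples, one per site.
    Returns None if no candidate in [min_cn, max_cn] fits all sites.
    """
    fractions = [v / t for v, t in site_counts if t > 0]
    if not fractions:
        return None
    for n in range(min_cn, max_cn + 1):
        fits = True
        for f in fractions:
            k = round(f * n)              # nearest whole number of copies
            if abs(f - k / n) > tolerance:
                fits = False
                break
        if fits:
            # Smallest n is preferred, because larger n offers more
            # possible ratios and would fit noisy data ever more easily.
            return n
    return None


# Toy read counts (hypothetical): fractions 0.25, 0.50 and 0.75
toy_sites = [(25, 100), (50, 100), (75, 100)]
cn = estimate_copy_number(toy_sites)
```

For the toy counts, the fractions do not fit two or three copies within the tolerance but match 1/4, 2/4 and 3/4 exactly, so the sketch returns 4. Real data would additionally need the error-probability filtering mentioned above before ratios are computed.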


Although currently labor-intensive and costly, target-enriched NGS provides a powerful tool for the comprehensive assessment of SNVs in highly polymorphic CNV regions of individual genomes. Furthermore, it reveals considerable numbers of putative novel variations and simultaneously allows CN estimation.