Comprehensive assessment of sequence variation within the copy number variable defensin cluster on 8p23 by target enriched in-depth 454 sequencing
1 Leibniz Institute for Age Research - Fritz Lipmann Institute, Jena, Germany
2 Roche NimbleGen, Inc., Madison WI, USA
3 Institute of Clinical Molecular Biology, Christian-Albrechts-University Kiel, Germany
4 Dept. of General Internal Medicine, Christian-Albrechts-University Kiel, Germany
BMC Genomics 2011, 12:243 doi:10.1186/1471-2164-12-243
Published: 18 May 2011
In highly copy number variable (CNV) regions such as the human defensin gene locus, comprehensive assessment of sequence variations is challenging. PCR approaches are in practice restricted to small fractions of such regions, and next-generation sequencing (NGS) of whole individual genomes, e.g. by the 1000 Genomes Project, is limited by the affordable sequencing depth. Combining target enrichment with NGS may represent a feasible approach.
As a proof of principle, we enriched a ~850 kb section comprising the CNV defensin gene cluster DEFB, the invariable DEFA part and 11 control regions from two genomes by sequence capture and sequenced it by 454 technology. In total, 6,651 differences from the human reference genome were found. Comparison with HapMap genotypes revealed sensitivities and specificities in the range of 94% to 99% for the identification of variations.
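The comparison with HapMap genotypes can be illustrated with a minimal sketch. It assumes, since the abstract does not give the exact definitions, that a site counts as positive when the reference panel reports a non-reference genotype, so that sensitivity is TP / (TP + FN) and specificity is TN / (TN + FP). The function name `sensitivity_specificity` and the site-to-boolean dictionaries are hypothetical and chosen only for illustration.

```python
def sensitivity_specificity(called, truth):
    """Compare variant calls against reference genotypes (e.g. HapMap).

    called, truth: dicts mapping a site identifier to True (non-reference
    genotype) or False (reference genotype). Only sites present in both
    dicts are evaluated. These inputs are an assumed, simplified format.
    """
    tp = fp = tn = fn = 0
    for site, is_variant in truth.items():
        if site not in called:
            continue  # site not covered by sequencing; skip it
        if is_variant and called[site]:
            tp += 1
        elif is_variant and not called[site]:
            fn += 1
        elif not is_variant and called[site]:
            fp += 1
        else:
            tn += 1
    # Guard against empty classes to avoid division by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


if __name__ == "__main__":
    # Toy data, not taken from the study.
    truth = {1: True, 2: True, 3: False, 4: False, 5: True}
    called = {1: True, 2: False, 3: False, 4: True, 5: True}
    print(sensitivity_specificity(called, truth))
```

In this toy input, two of three true variant sites are recovered and one of two reference sites is called correctly.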
Rigorous filtering based on error probabilities yielded 2,886 unique single nucleotide variations (SNVs), including 358 putative novel ones. DEFB copy number (CN) determinations by haplotype ratios were in agreement with alternative methods.
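The idea behind CN estimation from ratios can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in the study: it assumes that at variable sites within a multi-copy region the fraction of reads carrying one allele lies close to k/CN for some integer k between 1 and CN - 1, and it reports the smallest CN for which the observed fractions fit within a tolerance. The function name `estimate_copy_number`, the input format and the default values of `max_cn` and `tol` are hypothetical.

```python
def estimate_copy_number(allele_counts, max_cn=12, tol=0.03):
    """Estimate total copy number from allele read counts at variable sites.

    allele_counts: list of (reads_with_allele, total_reads) tuples, one per
    site where more than one allele is observed (assumed input format).
    Returns the smallest CN in 2..max_cn whose expected fractions k/CN
    (k = 1..CN-1) match the observed fractions with a mean absolute
    deviation of at most tol, or None if no CN fits.
    """
    fractions = [a / t for a, t in allele_counts if t > 0]
    if not fractions:
        raise ValueError("no sites with read coverage")
    for cn in range(2, max_cn + 1):
        expected = [k / cn for k in range(1, cn)]
        # Distance of each observed fraction to the nearest expected ratio.
        deviations = [min(abs(f - e) for e in expected) for f in fractions]
        if sum(deviations) / len(deviations) <= tol:
            return cn
    return None


if __name__ == "__main__":
    # Toy read counts, not taken from the study.
    counts = [(25, 100), (50, 100), (75, 100), (26, 100)]
    print(estimate_copy_number(counts))
```

Taking the smallest fitting CN is a deliberate choice here, because any multiple of the true CN would fit the same ratios equally well.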
Although currently labor-intensive and costly, target-enriched NGS provides a powerful tool for the comprehensive assessment of SNVs in highly polymorphic CNV regions of individual genomes. Furthermore, it reveals a considerable number of putative novel variations and simultaneously allows CN estimation.