Open Access Research article

Polymorphic segmental duplications at 8p23.1 challenge the determination of individual defensin gene repertoires and the assembly of a contiguous human reference sequence

Stefan Taudien1, Petra Galgoczy1, Klaus Huse1, Kathrin Reichwald1, Markus Schilhabel1, Karol Szafranski1, Atsushi Shimizu2, Shuichi Asakawa2, Adam Frankish3, Ivan F Loncarevic4, Nobuyoshi Shimizu2, Roman Siddiqui1 and Matthias Platzer1

1Genomanalyse, Institut für Molekulare Biotechnologie, Beutenbergstr. 11, D-07745 Jena, Germany

2Keio University School of Medicine, 35 Shinanomachi, Shinjuku-ku, Tokyo 160-8582, Japan

3Wellcome Trust Sanger Institute, Hinxton Cambridge CB10 1SA, UK

4Institut für Humangenetik und Anthropologie, Friedrich-Schiller-Universität Jena, Kollegiengasse 10, D-07743 Jena, Germany

BMC Genomics 2004, 5:92 doi:10.1186/1471-2164-5-92

Published: 10 December 2004

Abstract

Background

Defensins are important components of innate immunity that combat bacterial and viral infections and can even elicit antitumor responses. Clusters of defensin (DEF) genes are located within a 2 Mb region of human chromosome 8p23.1. This DEF locus, however, is one of the regions in the euchromatic part of the final human genome sequence that contain segmental duplications and recalcitrant gaps, indicating high structural dynamics.

Results

We find that inter- and intraindividual genetic variations within this locus prevent correct automatic assembly of the human reference genome (NCBI Build 34), which currently even contains misassemblies. Manual clone-by-clone alignment and gene annotation, together with repeat and SNP/haplotype analyses, yield an alternative alignment that significantly improves the representation of the DEF locus. Our assembly better reflects the experimentally verified variability of DEF gene and DEF cluster copy numbers. It contains an additional DEF cluster, which we propose resides between two already known clusters. Furthermore, manual annotation revealed a novel DEF gene and several pseudogenes, expanding the hitherto known DEF repertoire. Analyses of BAC and working draft sequences of the chimpanzee indicate that its DEF region is as complex as in humans and that DEF genes and a cluster are multiplied. Comparative analysis of human and chimpanzee DEF genes identified differences affecting the protein structure. Whether these might contribute to differences in disease susceptibility between man and ape remains to be resolved. For the determination of individual DEF gene repertoires, we provide a molecular approach based on DEF haplotypes.

Conclusions

Complexity and variability seem to be essential genomic features of the human DEF locus at 8p23.1 and provide an ongoing challenge for the best possible representation in the human reference sequence. Dissection of paralogous sequence variations, duplicon SNPs and multisite variations, as well as haplotypes, by sequencing-based methods is the way forward for future studies of interindividual DEF locus variability and its disease association.

