Methodology article (Open Access)

Copy number variation genotyping using family information

Jen-hwa Chu1*, Angela Rogers1,3, Iuliana Ionita-Laza5, Katayoon Darvishi2, Ryan E Mills6, Charles Lee2 and Benjamin A Raby1,3,4

Author Affiliations

1 Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston MA, USA

2 Department of Pathology, Molecular Genetic Research Unit, Brigham and Women’s Hospital, Boston MA, USA

3 Division of Pulmonary and Critical Care Medicine, Brigham and Women’s Hospital, Boston MA, USA

4 Center for Genomic Medicine, Brigham and Women’s Hospital, Boston MA, USA

5 Mailman School of Public Health, New York NY, USA

6 Department of Human Genetics, University of Michigan, Ann Arbor MI, USA

BMC Bioinformatics 2013, 14:157  doi:10.1186/1471-2105-14-157

Published: 9 May 2013

Abstract

Background

In recent years there has been growing interest in the role of copy number variations (CNVs) in genetic diseases. Although technologies and statistical methods for detecting CNVs from array data have developed rapidly, the data quality limitations inherent to most hybridization techniques remain a challenge for CNV association studies.

Results

To help address these data quality issues in the context of family-based association studies, we introduce a statistical framework for intensity-based array data that takes family information into account when assigning copy number. The method adapts traditional methods for modeling SNP genotype data that assume a Gaussian mixture model: CNV calling is performed for all family members simultaneously, leveraging within-family data to reduce CNV calls that are incompatible with Mendelian inheritance while still allowing de novo CNVs. Applying this method to simulation studies and to a genome-wide association study in asthma, we find that our approach significantly improves CNV call accuracy and reduces Mendelian inconsistency rates and false-positive genotype calls. The results were validated using qPCR experiments.
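
The idea of calling all family members jointly can be illustrated with a minimal sketch for a single parent-parent-child trio at a deletion locus. This is not the authors' implementation: the cluster means, standard deviations, population frequencies, and de novo rate below are hypothetical values chosen for illustration, and the function names are invented for this example. Each copy-number state emits a Gaussian-distributed intensity, and a Mendelian transmission term, softened by a small de novo probability, couples the child's state to the parents' states.

```python
import itertools
import math

# Hypothetical cluster parameters for a deletion locus: mean and standard
# deviation of the log2 intensity ratio for copy numbers 0, 1 and 2.
MEANS = {0: -2.0, 1: -0.6, 2: 0.0}
SDS = {0: 0.4, 1: 0.25, 2: 0.25}
PRIOR = {0: 0.01, 1: 0.18, 2: 0.81}  # assumed population copy-number frequencies
DE_NOVO = 1e-3                        # assumed probability of a non-Mendelian event


def normal_pdf(x, mean, sd):
    """Gaussian density, used as the emission model for one copy-number state."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def transmission(child, father, mother):
    """P(child copy number | parental copy numbers), with a de novo allowance."""
    # A parent with copy number g passes on a copy-bearing chromosome
    # with probability g / 2.
    pf, pm = father / 2.0, mother / 2.0
    mendel = {
        0: (1 - pf) * (1 - pm),
        1: pf * (1 - pm) + (1 - pf) * pm,
        2: pf * pm,
    }[child]
    # Mix in a uniform term so that de novo events keep non-zero probability.
    return (1 - DE_NOVO) * mendel + DE_NOVO / 3.0


def call_independent(x):
    """Most probable copy number for one sample, ignoring relatives."""
    return max(MEANS, key=lambda c: PRIOR[c] * normal_pdf(x, MEANS[c], SDS[c]))


def call_trio(x_father, x_mother, x_child):
    """Most probable (father, mother, child) copy numbers, scored jointly."""
    def joint(states):
        f, m, c = states
        return (PRIOR[f] * normal_pdf(x_father, MEANS[f], SDS[f])
                * PRIOR[m] * normal_pdf(x_mother, MEANS[m], SDS[m])
                * transmission(c, f, m) * normal_pdf(x_child, MEANS[c], SDS[c]))
    return max(itertools.product(MEANS, repeat=3), key=joint)


# A borderline child intensity is called as a deletion on its own, but the
# joint model prefers the Mendelian-consistent call given two clear CN=2 parents.
print(call_independent(-0.5))        # 1
print(call_trio(0.0, 0.05, -0.5))    # (2, 2, 2)
# A clear-cut child deletion is still called, as a de novo event.
print(call_trio(0.0, 0.05, -2.0))    # (2, 2, 0)
```

With these assumed parameters, the joint call overrides a weakly supported inconsistent genotype but not a strongly supported one. That is the behaviour the abstract describes: fewer Mendelian-incompatible calls without ruling out de novo CNVs.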

Conclusions

We have demonstrated that the use of family information can improve the quality of CNV calling, which may in turn yield more powerful association tests of CNVs.