Open Access Research article

Genomic characteristics of cattle copy number variations

Yali Hou [1,2], George E Liu [1]*, Derek M Bickhart [1], Maria Francesca Cardone [3], Kai Wang [4], Eui-soo Kim [1], Lakshmi K Matukumalli [1,5], Mario Ventura [3], Jiuzhou Song [2], Paul M VanRaden [6], Tad S Sonstegard [1] and Curt P Van Tassell [1]

Author affiliations

1 Bovine Functional Genomics Laboratory, ANRI, USDA-ARS, Beltsville, Maryland 20705, USA

2 Department of Animal and Avian Sciences, University of Maryland, College Park, Maryland 20742, USA

3 Department of Genetics and Microbiology, University of Bari, Bari 70126, Italy

4 Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA

5 Bioinformatics and Computational Biology, George Mason University, Manassas, VA 20110, USA

6 Animal Improvement Programs Laboratory, ANRI, USDA-ARS, Beltsville, Maryland 20705, USA

Citation and License

BMC Genomics 2011, 12:127  doi:10.1186/1471-2164-12-127

Published: 23 February 2011

Abstract

Background

Copy number variation (CNV) is an important source of genetic variation that complements single nucleotide polymorphism (SNP). High-density SNP array data are routinely used to detect human CNVs, many of which have significant functional effects on gene expression and human disease. In the dairy industry, a large volume of SNP genotyping data is becoming available and can be used for CNV discovery to understand and accelerate genetic improvement of complex traits.

Results

We performed a systematic analysis of CNV using the Bovine HapMap SNP genotyping data, which include 539 animals from 21 modern cattle breeds and 6 outgroups. After correcting for genomic waves and taking pedigree information into account, we identified 682 candidate CNV regions, which represent 139.8 megabases (~4.60%) of the genome. Selected CNVs were further validated experimentally, and we found that copy number "gain" CNVs were predominantly clustered in tandem rather than existing as interspersed duplications. About 56% of the CNV regions overlap with cattle genes (1,263 in total), which are significantly enriched for immunity, lactation, reproduction and rumination. The overlap between this new dataset and other published CNV studies was less than 40%; however, our large, high-frequency CNV regions (present in > 5% of animals surveyed) showed 90% agreement with other studies. These results highlight the differences and commonalities between technical platforms.
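The relationship between individual CNV calls, merged CNV regions and the fraction of the genome they cover can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract states neither the merging rule nor the genome length used, so `merge_calls`, `coverage_fraction`, the toy coordinates and the toy genome length below are all hypothetical. The only figures taken from the text are 139.8 Mb and ~4.60%, which are used to back-calculate the genome length that those two numbers imply.

```python
from typing import List, Tuple

# (chromosome, start, end) with half-open coordinates; a hypothetical layout
Interval = Tuple[str, int, int]


def merge_calls(calls: List[Interval]) -> List[Interval]:
    """Merge overlapping or abutting calls on the same chromosome into regions."""
    merged: List[Interval] = []
    for chrom, start, end in sorted(calls):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev = merged[-1]
            # Extend the previous region instead of opening a new one
            merged[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            merged.append((chrom, start, end))
    return merged


def coverage_fraction(regions: List[Interval], genome_length: int) -> float:
    """Fraction of the genome covered by non-overlapping regions."""
    covered = sum(end - start for _, start, end in regions)
    return covered / genome_length


# Toy calls (assumed values, not data from the study)
toy_calls = [("chr1", 100, 200), ("chr1", 150, 300),
             ("chr1", 400, 500), ("chr2", 50, 80)]
toy_regions = merge_calls(toy_calls)
toy_fraction = coverage_fraction(toy_regions, genome_length=10_000)

# Genome length implied by the abstract's own figures: 139.8 Mb at ~4.60%
implied_genome_mb = 139.8 / 0.0460

print(toy_regions)
print(f"toy coverage: {toy_fraction:.3%}")
print(f"implied genome length: {implied_genome_mb:.0f} Mb")
```

With the reported values, the implied denominator is roughly 3,039 Mb, which is only a consistency check on the rounded percentage and not a statement about the assembly the authors used.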

Conclusions

We present a comprehensive genomic analysis of cattle CNVs derived from SNP data, which will be a valuable genomic variation resource. Combined with SNP detection assays, gene-containing CNV regions may help identify genes undergoing artificial selection in domesticated animals.