Open Access Software

CNVassoc: Association analysis of CNV data using R

Isaac Subirana [1,2,3], Ramon Diaz-Uriarte [4], Gavin Lucas [2] and Juan R Gonzalez [1,5]*

Author affiliations

1 CIBER Epidemiology and Public Health (CIBERESP), Barcelona, Spain

2 Cardiovascular Epidemiology & Genetics group, Inflammatory and Cardiovascular Disease Programme, Institut Municipal d'Investigació Mèdica (IMIM), Barcelona, Spain

3 Statistics Department, University of Barcelona (UB), Barcelona, Spain

4 Structural Biology and Biocomputing Programme, Spanish National Cancer Centre (CNIO), Madrid, Spain

5 Center for Research in Environmental Epidemiology (CREAL), Barcelona, Spain


Citation and License

BMC Medical Genomics 2011, 4:47  doi:10.1186/1755-8794-4-47

Published: 24 May 2011



Copy number variants (CNV) are a potentially important component of the genetic contribution to risk of common complex diseases. Analysis of the association between CNVs and disease requires that uncertainty in CNV copy-number calls, which can be substantial, be taken into account; failure to consider this uncertainty can lead to biased results. Therefore, there is a need to develop and use appropriate statistical tools. To address this issue, we have developed CNVassoc, an R package for carrying out association analysis of common copy number variants in population-based studies. This package includes functions for testing for association with different classes of response variables (e.g. class status, censored data, counts) under a series of study designs (case-control, cohort, etc) and inheritance models, adjusting for covariates. The package includes functions for inferring copy number (CNV genotype calling), but can also accept copy number data generated by other algorithms (e.g. CANARY, CGHcall, IMPUTE).
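To make concrete what taking call uncertainty into account can mean, the following is a minimal sketch of a latent-class (mixture) likelihood for a binary trait. It is an illustration only: it is written in Python rather than R, it does not use or reproduce the CNVassoc API, and the function names (`latent_class_loglik`, `naive_loglik`) and the one-log-odds-per-copy-number parameterization are assumptions made for this example. Each individual contributes a likelihood averaged over the possible copy-number classes, weighted by that individual's calling probabilities, instead of a likelihood evaluated at a single hard call.

```python
import math


def expit(z):
    """Logistic function: maps log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def latent_class_loglik(y, w, beta):
    """Log-likelihood that averages over uncertain copy-number calls.

    y    : list of 0/1 outcomes (e.g. control/case), one per individual
    w    : list of probability vectors; w[i][g] is the probability that
           individual i carries copy-number class g (hypothetical input,
           e.g. posterior probabilities from a calling algorithm)
    beta : list of log-odds of disease, one per copy-number class
    """
    total = 0.0
    for yi, wi in zip(y, w):
        lik = 0.0
        for wig, bg in zip(wi, beta):
            p = expit(bg)
            # Bernoulli likelihood given class g, weighted by P(class g)
            lik += wig * (p if yi == 1 else 1.0 - p)
        total += math.log(lik)
    return total


def naive_loglik(y, calls, beta):
    """Log-likelihood that treats the hard call as if it were known exactly."""
    total = 0.0
    for yi, g in zip(y, calls):
        p = expit(beta[g])
        total += math.log(p if yi == 1 else 1.0 - p)
    return total


# Toy data (made up for illustration): three copy-number classes 0, 1, 2.
beta = [-1.0, 0.0, 1.0]
y = [1, 0, 1, 0]
hard_calls = [0, 1, 2, 0]
certain_w = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0]]
uncertain_w = [[0.6, 0.4, 0.0], [0.1, 0.8, 0.1], [0.0, 0.3, 0.7], [0.9, 0.1, 0.0]]

ll_certain = latent_class_loglik(y, certain_w, beta)
ll_uncertain = latent_class_loglik(y, uncertain_w, beta)
ll_naive = naive_loglik(y, hard_calls, beta)
```

When every probability vector puts all its mass on one class, the mixture likelihood coincides with the naive one; when the calls are uncertain, the two differ, which is the situation in which ignoring the uncertainty can distort the estimated association.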


Here we present a new R package, CNVassoc, that can deal with different types of CNV data arising from different platforms such as MLPA or aCGH. Through a real data example we illustrate that our method is able to incorporate uncertainty in the association process. We also show that our package can be useful when analyzing imputed data, such as imputed SNPs. Through a simulation study we show that CNVassoc outperforms CNVtools in terms of both computing time and convergence failure rate.


We provide a package that outperforms the existing ones in terms of modelling flexibility, power, convergence rate, ease of covariate adjustment, and requirements for sample size and signal quality. Therefore, we offer CNVassoc as a method for routine use in CNV association studies.