A robust penalized method for the analysis of noisy DNA copy number data
1 Department of Mathematics and Statistics, Oakland University, Rochester, MI 48309, USA
2 Department of Statistics and Actuarial Science, University of Iowa, Iowa City, IA 52246, USA
3 Department of Biostatistics, University of Iowa, Iowa City, IA 52246, USA
BMC Genomics 2010, 11:517 doi:10.1186/1471-2164-11-517
Published: 25 September 2010
Deletions and amplifications of human genomic DNA copy number cause numerous diseases, including various forms of cancer. Detecting DNA copy number variations (CNV) is therefore important for understanding the genetic basis of many diseases. Various techniques and platforms have been developed for genome-wide analysis of DNA copy number, such as array-based comparative genomic hybridization (aCGH) and high-resolution mapping with high-density tiling oligonucleotide arrays. Because these platforms often involve complicated biological and experimental processes, the resulting data can be contaminated by outliers.
We propose a penalized least absolute deviations (LAD) regression model with the adaptive fused lasso penalty for detecting CNV. The method has robustness properties and incorporates both the spatial dependence and the sparsity of CNV into the analysis. Our simulation studies and real data analysis indicate that the proposed method can correctly detect the number and locations of the true breakpoints while appropriately controlling false positives.
The proposed method has three advantages for detecting CNV change points: it has robustness properties; it incorporates both spatial dependence and sparsity; and it estimates the true values at each marker accurately.
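To make the idea of a penalized LAD criterion with a fused lasso penalty concrete, the following is a minimal sketch of how such an objective could be evaluated. It is an illustration under stated assumptions, not the authors' implementation: the function name `lad_afl_objective`, the tuning parameters `lam1` and `lam2`, and the weight vectors `w_sparse` and `w_fuse` are all hypothetical, and the abstract does not specify how the adaptive weights are chosen. The assumed form is an absolute-error loss, plus a weighted penalty on each marker's value (sparsity), plus a weighted penalty on differences between neighboring markers (spatial dependence).

```python
def lad_afl_objective(y, beta, lam1, lam2, w_sparse=None, w_fuse=None):
    """Evaluate an assumed LAD loss plus weighted fused lasso penalty.

    y        : observed values, one per marker (ordered along the genome)
    beta     : candidate fitted values, same length as y
    lam1     : hypothetical tuning parameter for the sparsity penalty
    lam2     : hypothetical tuning parameter for the fusion penalty
    w_sparse : optional per-marker weights (default 1.0 each)
    w_fuse   : optional weights for adjacent differences (default 1.0 each)
    """
    n = len(y)
    if len(beta) != n:
        raise ValueError("y and beta must have the same length")
    if w_sparse is None:
        w_sparse = [1.0] * n
    if w_fuse is None:
        w_fuse = [1.0] * (n - 1)

    # Absolute-error (LAD) loss: grows linearly, not quadratically, in outliers.
    loss = sum(abs(yi - bi) for yi, bi in zip(y, beta))
    # Sparsity term: pushes fitted values toward zero (no copy number change).
    sparsity = sum(w * abs(b) for w, b in zip(w_sparse, beta))
    # Fusion term: pushes neighboring fitted values to be equal.
    fusion = sum(w * abs(beta[i + 1] - beta[i]) for i, w in enumerate(w_fuse))
    return loss + lam1 * sparsity + lam2 * fusion


# Toy data (made up for illustration): a single outlier at the third marker,
# followed by a level shift.
y = [0.0, 0.0, 8.0, 1.0, 1.0, 1.0]
beta_flat = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]   # ignores the outlier
beta_spike = [0.0, 0.0, 8.0, 1.0, 1.0, 1.0]  # chases the outlier

obj_flat = lad_afl_objective(y, beta_flat, lam1=0.5, lam2=2.0)
obj_spike = lad_afl_objective(y, beta_spike, lam1=0.5, lam2=2.0)
print(obj_flat, obj_spike)
```

In this toy setting the piecewise-constant fit that ignores the outlier attains the lower objective value, which is the kind of behavior a robust, fusion-penalized criterion is meant to encourage. The sketch only evaluates the criterion; minimizing it would require a separate optimization routine.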