seqCNA: an R package for DNA copy number analysis in cancer using high-throughput sequencing
1 CIC bioGUNE & CIBERehd, Technologic Park of Bizkaia, Building 502, 48160 Derio, Spain
2 Clinical Analyses Service at the San Carlos Clinical Hospital, Martin Lagos, 28040 Madrid, Spain
3 Dominion Pharmakine S.L., Technologic Park of Bizkaia, Building 801, 48160 Derio, Spain
BMC Genomics 2014, 15:178. doi:10.1186/1471-2164-15-178. Published: 5 March 2014
Copy number alterations, deviations in the amount of genomic content that arise during tumorigenesis, are structural rearrangements that can critically affect gene expression patterns. Copy number alteration profiles also provide insight into cancer discrimination, progression and complexity. For data obtained from high-throughput sequencing, building reliable copy number alteration profiles depends on improving data quality through GC bias correction and on keeping false positives to a minimum.
We introduce seqCNA, a parallelized R package for integrated copy number analysis of high-throughput sequencing cancer data. The package includes novel methodology for (i) filtering, which reduces false positives, and (ii) GC content correction, which improves copy number profile quality, especially at high read coverage and when GC content and copy number are strongly correlated. Appropriate analysis steps are chosen automatically based on the availability of paired-end mapping, matched normal samples and genome annotation.
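To make the idea of GC content correction concrete, the following is a minimal sketch of a generic stratified-median correction in Python. It is not the novel method implemented in seqCNA, which is not described in this abstract. The function name `gc_correct`, its parameters and the toy data are illustrative assumptions only. Each genomic window's read count is rescaled by the ratio between the global median count and the median count of windows with similar GC content.

```python
from collections import defaultdict
from statistics import median


def gc_correct(counts, gc, bin_pct=5):
    """Rescale per-window read counts to reduce GC bias.

    Generic stratified-median correction (an assumption for illustration,
    NOT the seqCNA algorithm).
    counts  : read count per genomic window
    gc      : GC fraction (0..1) per genomic window
    bin_pct : width of each GC stratum, in percentage points
    """
    # Assign each window to a GC stratum using integer percentages,
    # which avoids floating-point edge effects at stratum boundaries.
    keys = [int(round(g * 100)) // bin_pct for g in gc]

    strata = defaultdict(list)
    for c, k in zip(counts, keys):
        strata[k].append(c)

    global_median = median(counts)
    stratum_median = {k: median(v) for k, v in strata.items()}

    corrected = []
    for c, k in zip(counts, keys):
        m = stratum_median[k]
        # A stratum with zero median carries no usable signal.
        corrected.append(c * global_median / m if m > 0 else float("nan"))
    return corrected


# Toy data: the high-GC windows have roughly twice the coverage.
counts = [100, 110, 200, 220]
gc = [0.30, 0.31, 0.50, 0.52]
corrected = gc_correct(counts, gc)
```

In this toy input, coverage doubles with GC content, and after correction both strata are centred on the same median. A simple scheme of this kind cannot separate GC bias from a true copy number change when the two are correlated, which is the situation the abstract singles out as motivating an improved correction.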
seqCNA, available through Bioconductor, provides accurate copy number predictions in tumour data, thanks to extensive filtering and improved GC bias correction, within an integrated and parallelized workflow.