Open Access Methodology article

Inferring copy number and genotype in tumour exome data

Kaushalya C Amarasinghe1, Jason Li1,2, Sally M Hunter3, Georgina L Ryland3, Prue A Cowin4, Ian G Campbell3,5,6 and Saman K Halgamuge1*

Author Affiliations

1 Optimisation and Pattern Recognition group, Mechanical Engineering Department, Melbourne School of Engineering, The University of Melbourne, Parkville, Victoria 3010, Australia

2 Bioinformatics Core Facility, Peter MacCallum Cancer Centre, East Melbourne, Victoria 3002, Australia

3 Cancer Genetics Laboratory, Peter MacCallum Cancer Centre, East Melbourne, Victoria 3002, Australia

4 Cancer Genomics and Genetics Laboratory, Peter MacCallum Cancer Centre, East Melbourne, Victoria 3002, Australia

5 Sir Peter MacCallum Department of Oncology, The University of Melbourne, Parkville, Victoria 3010, Australia

6 Department of Pathology, The University of Melbourne, Parkville, Victoria 3010, Australia

BMC Genomics 2014, 15:732  doi:10.1186/1471-2164-15-732

Published: 28 August 2014

Abstract

Background

Whole exome sequencing is a cost-effective alternative to whole genome sequencing for predicting aberrations in tumours; however, it is predominantly used for variant detection and only infrequently for detecting somatic copy number variation.

Results

We propose a new method to infer copy number and genotypes using whole exome data from paired tumour/normal samples. Our algorithm uses two Hidden Markov Models to predict copy number and genotypes, and computationally resolves polyploidy/aneuploidy, normal cell contamination and signal baseline shift. Our method explicitly detects chromosome arm-level events, which are commonly found in tumour samples. The methods are combined into a package named ADTEx (Aberration Detection in Tumour Exome). We applied our algorithm to a cohort of 17 in-house generated and 18 TCGA paired ovarian cancer/normal exomes and evaluated its performance by comparing its predictions against the copy number variations and genotypes predicted from Affymetrix SNP 6.0 data for the same samples. Further, we carried out a comparison study showing that ADTEx outperformed its competitors in terms of precision and F-measure.
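To make the Hidden Markov Model idea concrete, the following is a minimal sketch of Viterbi decoding over log2 depth ratios, and it is not the ADTEx model itself. The function name `viterbi_copy_number`, the Gaussian emissions, the state set and the values of `sigma` and `stay_prob` are all assumptions chosen for illustration; the sketch also assumes a diploid baseline with no normal cell contamination, which the real method is designed to handle.

```python
import numpy as np

def viterbi_copy_number(log_ratios, states=(1, 2, 3, 4), sigma=0.2, stay_prob=0.95):
    """Return the most likely copy-number path for a series of log2 depth ratios.

    Illustrative assumptions: Gaussian emissions centred on log2(cn / 2),
    a uniform initial distribution, and one shared self-transition probability.
    """
    x = np.asarray(log_ratios, dtype=float)
    cn = np.asarray(states, dtype=float)
    k = len(cn)

    # Expected log2 ratio per state for a pure, diploid-baseline sample.
    means = np.log2(cn / 2.0)

    # Gaussian log-likelihoods (the constant term is dropped; it is the same for all states).
    log_emit = -0.5 * ((x[:, None] - means[None, :]) / sigma) ** 2

    # Transitions: stay with probability stay_prob, otherwise move uniformly.
    trans = np.full((k, k), (1.0 - stay_prob) / (k - 1))
    np.fill_diagonal(trans, stay_prob)
    log_trans = np.log(trans)

    n = len(x)
    score = np.empty((n, k))
    back = np.zeros((n, k), dtype=int)
    score[0] = -np.log(k) + log_emit[0]

    # Forward pass: best score of any path ending in each state.
    for t in range(1, n):
        cand = score[t - 1][:, None] + log_trans  # cand[i, j]: move from state i to state j
        back[t] = np.argmax(cand, axis=0)
        score[t] = cand.max(axis=0) + log_emit[t]

    # Backtrack from the best final state.
    path = np.empty(n, dtype=int)
    path[-1] = int(np.argmax(score[-1]))
    for t in range(n - 2, -1, -1):
        path[t] = back[t + 1, path[t + 1]]
    return [int(states[i]) for i in path]
```

Calling the function on a short series that contains a gain followed by a loss returns one copy-number state per input position. A high `stay_prob` discourages state changes that are supported by only a single noisy observation.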

Conclusions

Our proposed method, ADTEx, uses both depth of coverage ratios and B allele frequencies calculated from whole exome sequencing data to predict copy number variations along with their genotypes. ADTEx is implemented as a user-friendly software package using Python and the R statistical language. Source code and sample data are freely available under the GNU General Public License (GPLv3) at http://adtex.sourceforge.net/.
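As a rough illustration of the two input signals named above, the sketch below computes a depth of coverage log ratio for one target region and a B allele frequency for one SNP. The function names, the library-size normalisation and the handling of zero counts are assumptions made for this example and are not taken from the ADTEx source code.

```python
import math

def doc_log_ratio(tumour_depth, normal_depth, tumour_total, normal_total):
    """log2 ratio of tumour to normal depth for one target region.

    Depths are first divided by each sample's total depth (an assumed,
    simple library-size normalisation). Returns None when the ratio is
    undefined because either depth is zero.
    """
    if tumour_depth <= 0 or normal_depth <= 0:
        return None
    tumour_norm = tumour_depth / tumour_total
    normal_norm = normal_depth / normal_total
    return math.log2(tumour_norm / normal_norm)

def b_allele_frequency(ref_reads, alt_reads):
    """Fraction of reads at a SNP that support the B (non-reference) allele."""
    total = ref_reads + alt_reads
    if total == 0:
        return None  # no coverage at this position
    return alt_reads / total

# Example: a region with 1.5x relative depth in the tumour, and a SNP
# where 10 of 40 reads carry the B allele.
ratio = doc_log_ratio(150, 100, 1000, 1000)
baf = b_allele_frequency(30, 10)
```

In a balanced diploid region, a heterozygous SNP is expected to have a B allele frequency near 0.5 and a log ratio near 0; departures from these values in either signal are what a copy number and genotype caller would interpret.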