Identification of somatic mutations in cancer through Bayesian-based analysis of sequenced genome pairs
1 Translational Genomics Research Institute, Neurogenomics Division, Phoenix, AZ 85004, USA
2 Department of Biomedical Informatics, Arizona State University, Tempe, AZ 85284, USA
3 Translational Genomics Research Institute, Integrated Cancer Genomics Division, Phoenix, AZ 85004, USA
4 Virginia G Piper Cancer Center, Scottsdale, AZ 85258, USA
5 Clinical Translational Research Division, Translational Genomics Research Institute, Scottsdale, AZ 85259, USA
BMC Genomics 2013, 14:302 doi:10.1186/1471-2164-14-302
Published: 4 May 2013
The field of cancer genomics has rapidly adopted next-generation sequencing (NGS) to study and characterize malignant tumors with unprecedented resolution. A central goal in cancer studies is to identify somatic mutations: changes that are specific to a tumor and absent from the individual’s germline. However, false positive and false negative detections often result from insufficient variant evidence, contamination of the biopsy by stromal tissue, sequencing errors, and the erroneous classification of germline variation as tumor-specific.
We have developed a generalized Bayesian analysis framework for matched tumor/normal samples with the purpose of identifying tumor-specific alterations such as single nucleotide mutations, small insertions/deletions, and structural variation. We describe our methodology and discuss its application to other types of paired-tissue analysis, such as the detection of loss of heterozygosity and allelic imbalance. We also demonstrate high sensitivity and specificity in discovering simulated somatic mutations across various combinations of a) genomic coverage and b) emulated heterogeneity.
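To make the idea of weighing tumor evidence against matched-normal evidence concrete, the following is a minimal sketch in Python. It is not Seurat's model, which this abstract does not specify. It assumes a uniform Beta(1, 1) prior on the variant allele fraction in each sample and uses Monte Carlo sampling to estimate the posterior probability that the tumor fraction exceeds the normal fraction by a margin. The function name, the `min_diff` margin, and the read counts in the usage lines are all hypothetical and chosen only for illustration.

```python
import random


def posterior_prob_somatic(tumor_alt, tumor_depth, normal_alt, normal_depth,
                           min_diff=0.05, draws=20000, seed=0):
    """Monte Carlo estimate of P(f_tumor - f_normal > min_diff | read counts).

    Illustrative assumption (not taken from the paper): each sample's variant
    allele fraction has a Beta(1, 1) prior, so observing `alt` variant reads
    out of `depth` total reads gives a Beta(1 + alt, 1 + depth - alt) posterior.
    """
    rng = random.Random(seed)  # fixed seed keeps the estimate reproducible
    hits = 0
    for _ in range(draws):
        f_tumor = rng.betavariate(1 + tumor_alt, 1 + tumor_depth - tumor_alt)
        f_normal = rng.betavariate(1 + normal_alt, 1 + normal_depth - normal_alt)
        if f_tumor - f_normal > min_diff:
            hits += 1
    return hits / draws


# Hypothetical read counts: variant seen in the tumor only (somatic-like)
somatic_like = posterior_prob_somatic(12, 40, 0, 40)

# Hypothetical read counts: variant at similar fractions in both (germline-like)
germline_like = posterior_prob_somatic(20, 40, 19, 40)
```

Under these assumptions, a site supported by variant reads only in the tumor scores close to 1, while a heterozygous germline site scores much lower. Lower coverage or stromal contamination widens or shifts the tumor posterior, which is one way to see why those factors limit detection.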
We present a Java-based implementation of our methods, named Seurat, which is freely available for academic use. By applying Seurat to an experimentally derived cancer dataset, we demonstrate the discovery of different types of somatic change, and we discuss considerations and practices for the accurate detection of somatic events in cancer genomes. Seurat is available at https://sites.google.com/site/seuratsomatic.