Methodology article

MutComFocal: an integrative approach to identifying recurrent and focal genomic alterations in tumor samples

Vladimir Trifonov1,2*, Laura Pasqualucci3,4, Riccardo Dalla Favera3,4,5 and Raul Rabadan1,2

Author Affiliations

1 Department of Biomedical Informatics, New York, NY, 10032, USA

2 Center for Computational Biology and Bioinformatics, New York, NY, 10032, USA

3 Institute for Cancer Genetics, Herbert Irving Comprehensive Cancer Center, Columbia University, New York, NY, 10032, USA

4 Department of Pathology and Cell Biology, Columbia University, New York, NY, 10032, USA

5 Department of Genetics and Development, and Department of Microbiology and Immunology, Columbia University, New York, NY, 10032, USA

BMC Systems Biology 2013, 7:25  doi:10.1186/1752-0509-7-25

Published: 25 March 2013



Most tumors result from the accumulation of genomic alterations in somatic cells. The emerging spectrum of alterations in tumors is complex, and identifying the relevant genes and pathways remains a challenge. Key cancer genes are usually found amplified or deleted in chromosomal regions containing many other genes, which makes the targeted gene hard to pinpoint from copy number data alone. Point mutations, on the other hand, provide precise information about amino acid changes that could be implicated in the oncogenic process. Current large-scale genomic projects provide high-throughput genomic data for a large number of well-characterized tumor samples.


We define a Bayesian approach designed to identify candidate cancer genes by integrating copy number and point mutation information. Our method exploits the concept that small (focal) and recurrent alterations in tumors are more informative in the search for cancer genes than large or sporadic ones. Thus, the algorithm (Mutations with Common Focal Alterations, or MutComFocal) seeks focal copy number alterations and recurrent point mutations within high-throughput data from large panels of tumor samples.
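The intuition behind focality and recurrence can be illustrated with a minimal sketch. This is not the published MutComFocal scoring, which is Bayesian; it is a simplified stand-in under stated assumptions: each altered segment spreads a unit of evidence evenly over the genes it contains (so smaller segments weigh more per gene), point mutations are scored by the fraction of samples carrying them, and the two scores are simply added. The function names, the additive combination, and the gene labels are all hypothetical.

```python
from collections import defaultdict


def focality_scores(cn_samples):
    """Score genes by focal copy number alterations.

    cn_samples: list of samples; each sample is a list of altered
    segments; each segment is a set of gene names it covers.
    Assumption (illustrative only): a segment contributes
    1 / (number of genes it covers) to each of its genes.
    """
    scores = defaultdict(float)
    for segments in cn_samples:
        for genes in segments:
            if not genes:
                continue
            weight = 1.0 / len(genes)  # smaller segment -> larger weight
            for gene in genes:
                scores[gene] += weight
    n = len(cn_samples)
    return {gene: total / n for gene, total in scores.items()}


def recurrence_scores(mut_samples):
    """Score genes by the fraction of samples with a point mutation.

    mut_samples: list of samples; each sample is a set of mutated genes.
    """
    counts = defaultdict(int)
    for genes in mut_samples:
        for gene in genes:
            counts[gene] += 1
    n = len(mut_samples)
    return {gene: count / n for gene, count in counts.items()}


def combined_ranking(cn_samples, mut_samples):
    """Rank genes by the sum of the two scores (a hypothetical combination)."""
    foc = focality_scores(cn_samples)
    rec = recurrence_scores(mut_samples)
    genes = set(foc) | set(rec)
    combined = {g: foc.get(g, 0.0) + rec.get(g, 0.0) for g in genes}
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    # Toy data with made-up gene labels.
    cn = [[{"GENE_A"}], [{"GENE_A", "GENE_B", "GENE_C", "GENE_D"}], [{"GENE_B", "GENE_C"}]]
    mut = [{"GENE_A"}, {"GENE_A", "GENE_D"}]
    for gene, score in combined_ranking(cn, mut):
        print(gene, round(score, 3))
```

In this toy example, a gene hit by a single-gene alteration and by recurrent mutations ranks above genes that only appear inside broad, multi-gene segments, which is the behavior the paragraph above describes.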


We apply MutComFocal to Diffuse Large B-cell Lymphoma (DLBCL) data from four different high-throughput studies, totaling 78 samples assessed for copy number alterations by single nucleotide polymorphism (SNP) array analysis and 65 samples assayed for protein-changing point mutations by whole-exome or whole-transcriptome sequencing. In addition to recapitulating known alterations, MutComFocal identifies ARID1B, ROBO2 and MRS1 as candidate tumor suppressors and KLHL6, IL31 and LRP1 as putative oncogenes in DLBCL.


We present a Bayesian approach for the identification of candidate cancer genes by integrating data collected from a large number of cancer patients across different studies. When trained on a well-studied dataset, MutComFocal is able to identify most of the reported, characterized alterations. The application of MutComFocal to large-scale cancer data provides the opportunity to pinpoint the key functional genomic alterations in tumors.

Keywords: Tumorigenic mutations; Driver genes; Data integration