This article is part of the supplement: SNP-SIG 2011: Identification and annotation of SNPs in the context of structure, function and disease
Domain landscapes of somatic mutations in cancer
- Equal contributors
1 Department of Biological Sciences, University of Maryland, Baltimore County, 1000 Hilltop Circle, Baltimore, MD 21250, USA
2 Division of Imaging and Applied Mathematics, OSEL, CDRH, U.S. Food and Drug Administration, 10903 New Hampshire Avenue, Silver Spring, MD 20993, USA
3 Department of Mathematics and Statistics, University of Maryland, Baltimore County, 1000 Hilltop Circle, Baltimore, MD 21250, USA
BMC Genomics 2012, 13(Suppl 4):S9 doi:10.1186/1471-2164-13-S4-S9
Published: 18 June 2012
Large-scale tumor sequencing projects are now underway to identify genetic mutations that drive tumor initiation and development. Most studies take a gene-based approach to identifying driver mutations, highlighting genes mutated in a large percentage of tumor samples as those likely to contain driver mutations. However, this gene-based approach usually does not consider the position of the mutation within the gene or the functional context that this position provides. Here we introduce a novel method for mapping mutations to the distinct protein domains, not just the individual genes, in which they occur, thus providing the functional context for how the mutation contributes to disease. Furthermore, aggregating mutations from all genes containing a specific protein domain enables the identification of mutations that are rare at the gene level, but that occur frequently within the specified domain. These highly mutated domains potentially reveal disruptions of protein function necessary for cancer development.
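The mapping and aggregation idea described above can be illustrated with a minimal Python sketch. It is not the implementation used in this study: the `Domain` and `Mutation` records, the function names, the gene labels (`GENE_A`, `GENE_B`, `GENE_C`), the coordinates and the sample identifiers are all hypothetical toy values, and the sketch assumes that domain boundaries are available as 1-based, inclusive residue ranges on each protein.

```python
from collections import Counter, defaultdict
from typing import NamedTuple


class Domain(NamedTuple):
    gene: str    # gene whose protein contains the domain
    name: str    # domain family name, shared across genes
    start: int   # first residue of the domain (1-based, inclusive)
    end: int     # last residue of the domain (1-based, inclusive)


class Mutation(NamedTuple):
    sample: str    # tumor sample identifier
    gene: str      # mutated gene
    position: int  # mutated residue (1-based)


def domains_hit(mutation, domains_by_gene):
    """Return the names of all domains in the mutated gene that cover the position."""
    return [
        d.name
        for d in domains_by_gene.get(mutation.gene, [])
        if d.start <= mutation.position <= d.end
    ]


def count_by_gene_and_domain(mutations, domains):
    """Tally mutations per gene and per domain family, pooling domains across genes."""
    domains_by_gene = defaultdict(list)
    for d in domains:
        domains_by_gene[d.gene].append(d)

    gene_counts = Counter()
    domain_counts = Counter()
    for m in mutations:
        gene_counts[m.gene] += 1
        for name in domains_hit(m, domains_by_gene):
            domain_counts[name] += 1
    return gene_counts, domain_counts


# Toy input (hypothetical): three genes share one domain family.
domains = [
    Domain("GENE_A", "Pkinase", 10, 250),
    Domain("GENE_A", "SH2", 300, 380),
    Domain("GENE_B", "Pkinase", 300, 560),
    Domain("GENE_C", "Pkinase", 50, 310),
]
mutations = [
    Mutation("T01", "GENE_A", 120),  # inside Pkinase
    Mutation("T02", "GENE_B", 455),  # inside Pkinase
    Mutation("T03", "GENE_C", 75),   # inside Pkinase
    Mutation("T04", "GENE_A", 500),  # outside any annotated domain
]

gene_counts, domain_counts = count_by_gene_and_domain(mutations, domains)
print(gene_counts)
print(domain_counts)
```

In this toy input no gene carries more than two mutations, yet the shared domain family accumulates three, which is the kind of signal that pooling across genes is meant to expose. A real analysis would also need to normalize for domain length and the number of genes containing the domain before calling a count significant; the sketch omits that step.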
We mapped somatic mutations from the protein coding regions of 100 colon adenocarcinoma tumor samples to the genes and protein domains in which they occurred, and constructed topographical maps to depict the “mutational landscapes” of gene and domain mutation frequencies. We found significant mutation frequency in a number of genes previously known to be somatically mutated in colon cancer patients including APC, TP53 and KRAS. In addition, we found significant mutation frequency within specific domains located in these genes, as well as within other domains contained in genes having low mutation frequencies. These domain “peaks” were enriched with functions important to cancer development including kinase activity, DNA binding and repair, and signal transduction.
Using our method to create the domain landscapes of mutations in colon cancer, we were able to identify somatic mutations with high potential to drive cancer development. Interestingly, the majority of the genes involved have a low mutation frequency. Therefore, the method shows good potential for identifying rare driver mutations in current, large-scale tumor sequencing projects. In addition, mapping mutations to specific domains provides the necessary functional context for understanding how the mutations contribute to the disease, and may reveal novel or more refined gene and domain target regions for drug development.