Open Access Highly Accessed Research article

Discovering cancer genes by integrating network and functional properties

Li Li1, Kangyu Zhang1, James Lee2, Shaun Cordes2, David P Davis2 and Zhijun Tang1*

Author Affiliations

1 Department of Bioinformatics, Genentech Inc., 1 DNA Way, South San Francisco, CA 94080, USA

2 Department of Molecular Biology, Genentech Inc., 1 DNA Way, South San Francisco, CA 94080, USA

BMC Medical Genomics 2009, 2:61  doi:10.1186/1755-8794-2-61

Published: 19 September 2009

Abstract

Background

Identification of novel cancer-causing genes is one of the main goals in cancer research. The rapid accumulation of genome-wide protein-protein interaction (PPI) data in humans has provided a new basis for studying the topological features of cancer genes in cellular networks. It is important to integrate multiple genomic data sources, including PPI networks, protein domains and Gene Ontology (GO) annotations, to facilitate the identification of cancer genes.

Methods

Topological features of the PPI network, protein domain compositions, enrichment of Gene Ontology (GO) categories, and sequence and evolutionary conservation features were extracted and compared between cancer genes and other genes. The predictive power of various classifiers for identifying cancer genes was evaluated by cross-validation. A subset of the predictions was validated experimentally using siRNA knockdown and viability assays in the human colon cancer cell line DLD-1.
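As an illustration of what extracting topological features from a PPI network can look like, the following minimal sketch computes two common node-level measures, degree and local clustering coefficient, from a list of interaction pairs. The abstract does not say which topological features were used, so the choice of these two measures, the function names, and the toy protein labels are all assumptions made for illustration.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from (protein_a, protein_b) pairs."""
    adj = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        adj[a].add(b)
        adj[b].add(a)
    return dict(adj)


def clustering_coefficient(adj, node):
    """Fraction of a node's neighbor pairs that also interact with each other."""
    neighbors = list(adj[node])
    k = len(neighbors)
    if k < 2:
        return 0.0  # undefined for fewer than two neighbors; report 0 by convention
    links = 0
    for i in range(k):
        for j in range(i + 1, k):
            if neighbors[j] in adj[neighbors[i]]:
                links += 1
    return 2.0 * links / (k * (k - 1))


def topological_features(edges):
    """Return {protein: {"degree": ..., "clustering": ...}} for every protein."""
    adj = build_graph(edges)
    return {
        node: {
            "degree": len(adj[node]),
            "clustering": clustering_coefficient(adj, node),
        }
        for node in adj
    }


# Hypothetical toy network: A, B, C form a triangle and D hangs off A.
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "D")]
features = topological_features(toy_edges)
```

Each protein's feature dictionary could then be combined with domain and GO features into one vector per gene before classification.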

Results

Cross-validation showed that classifiers based on support vector machines (SVMs) performed better when topological features from the PPI network, protein domain compositions and GO annotations were included. We then applied the trained SVM classifier to human genes to prioritize putative cancer genes. siRNA knockdown of several SVM-predicted cancer genes greatly reduced cell viability in the human colon cancer cell line DLD-1.
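The cross-validation procedure mentioned above can be sketched as follows. This is only an outline of k-fold evaluation under stated assumptions: the number of folds, the accuracy metric, and the toy feature vectors are hypothetical, and a simple nearest-centroid classifier stands in for the SVM so that the example needs no external libraries.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def train_centroids(X, y):
    """Stand-in classifier (not an SVM): store the mean vector of each class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, row):
    """Assign the label of the closest class centroid (squared Euclidean distance)."""
    def dist2(center):
        return sum((a - b) ** 2 for a, b in zip(row, center))
    return min(centroids, key=lambda label: dist2(centroids[label]))


def cross_validate(X, y, k=5, seed=0):
    """Mean held-out accuracy over k folds; assumes len(X) >= k."""
    accuracies = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train_X = [X[i] for i in range(len(X)) if i not in held]
        train_y = [y[i] for i in range(len(X)) if i not in held]
        model = train_centroids(train_X, train_y)
        correct = sum(predict(model, X[i]) == y[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / len(accuracies)


# Hypothetical toy data: label 1 = "cancer gene", label 0 = "other gene".
X = [[0.1 * i, 0.0] for i in range(10)] + [[10.0 + 0.1 * i, 5.0] for i in range(10)]
y = [0] * 10 + [1] * 10
mean_accuracy = cross_validate(X, y, k=5)
```

Running the same loop once per feature set, for example with and without network features, gives the kind of comparison the results describe, although real data would call for metrics that account for class imbalance.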

Conclusion

Topological features of PPI networks, protein domain compositions and GO annotations are good predictors of cancer genes. Because the SVM classifier integrates multiple features, it is useful for prioritizing candidate cancer genes for experimental validation.