BMC Bioinformatics


Methodology article (Open Access, Highly Accessed)

Incorporating gene co-expression network in identification of cancer prognosis markers

Shuangge Ma1,2*, Mingyu Shi3, Yang Li1,4, Danhui Yi4 and Ben-Chang Shia5

Author Affiliations

1 School of Public Health, Yale University, New Haven, CT 06520, USA

2 Clinical Epidemiology Research Center, VA CT Healthcare System, West Haven, CT 06516, USA

3 Standard and Poor's, New York, NY 10041, USA

4 School of Statistics, Renmin University, Beijing, China

5 Department of Statistics and Information Science & Applied Statistics, Fu Jen Catholic University, Taipei Hsien, R.O.C.


BMC Bioinformatics 2010, 11:271 doi:10.1186/1471-2105-11-271

Published: 20 May 2010

Abstract

Background

Extensive biomedical studies have shown that clinical and environmental risk factors may not have sufficient predictive power for cancer prognosis. The development of high-throughput profiling technologies makes it possible to survey the whole genome and search for genomic markers with predictive power. Many existing studies assume the interchangeability of gene effects and ignore the coordination among them.

Results

We adopt the weighted co-expression network to describe the interplay among genes. Although there are several ways of defining gene networks, the weighted co-expression network may be preferred because it is computationally simple, performs satisfactorily in empirical studies, and does not demand additional biological experiments. For cancer prognosis studies with gene expression measurements, we propose a new marker selection method that properly incorporates the network connectivity of genes. We analyze six prognosis studies on breast cancer and lymphoma. We find that the proposed approach identifies genes that differ significantly from those identified using alternative approaches. A search of the published literature suggests that the genes identified using the proposed approach are biologically meaningful. In addition, they have better prediction performance and reproducibility than genes identified using the alternatives.
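To make the notion of a weighted co-expression network concrete, the following is a minimal sketch of one common construction. It assumes the widely used power adjacency, a_ij = |cor(x_i, x_j)|^beta, with the connectivity of gene i taken as the sum of its adjacencies to all other genes. The abstract does not state the adjacency function or the value of beta used in the paper, so the function names, the default beta = 6, and the toy data below are illustrative assumptions, not the authors' implementation.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def weighted_adjacency(expr, beta=6):
    """Build a[i][j] = |cor(gene_i, gene_j)| ** beta with a zero diagonal.

    expr: list of per-gene profiles, one list of sample values per gene.
    beta: soft-threshold power (assumed value, not taken from the paper).
    """
    p = len(expr)
    adj = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1, p):
            a = abs(pearson(expr[i], expr[j])) ** beta
            adj[i][j] = adj[j][i] = a  # the network is undirected
    return adj


def connectivity(adj):
    """Connectivity of each gene: sum of its adjacencies to all other genes."""
    return [sum(row) for row in adj]


# Hypothetical toy data: 3 genes measured on 4 samples.
expr = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],    # perfectly correlated with gene 0
    [1.0, -1.0, 1.0, -1.0],  # weakly correlated with gene 0
]
adj = weighted_adjacency(expr)
k = connectivity(adj)
```

Raising the absolute correlation to a power keeps every gene pair in the network but shrinks weak correlations toward zero, so no hard cutoff on the correlation is needed. In this toy example genes 0 and 1 receive an adjacency of 1, while the weak correlation between genes 0 and 2 (about -0.45) is reduced to 0.008.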

Conclusions

The network contains important information on the functionality of genes. Incorporating the network structure can improve cancer marker identification.