Constructing disease-specific gene networks using pair-wise relevance metric: Application to colon cancer identifies interleukin 8, desmin and enolase 1 as the central elements
- Equal contributors
1 College of Bioinformatics Science and Technology and Bio-pharmaceutical Key Laboratory of Heilongjiang Province, Harbin Medical University, Harbin 150081, PR China
2 Department of Bioinformatics, Capital University of Medical Sciences, Beijing 100084, PR China
3 Department of Medical Statistics and Epidemiology, School of Public Health, Sun Yat-Sen University, Guangzhou 510080, PR China
4 Department of Molecular Cardiology, Cleveland Clinic Foundation, 9500 Euclid Avenue, Cleveland, Ohio 44195, USA
5 Department of Computer Science, Harbin Institute of Technology, Harbin 150080, PR China
BMC Systems Biology 2008, 2:72 doi:10.1186/1752-0509-2-72Published: 10 August 2008
With the advance of large-scale omics technologies, it is now feasible to reversely engineer the underlying genetic networks that describe the complex interplays of molecular elements that lead to complex diseases. Current networking approaches are mainly focusing on building genetic networks at large without probing the interaction mechanisms specific to a physiological or disease condition. The aim of this study was thus to develop such a novel networking approach based on the relevance concept, which is ideal to reveal integrative effects of multiple genes in the underlying genetic circuit for complex diseases.
The approach started with identification of multiple disease pathways, called a gene forest, in which the genes extracted from the decision forest constructed by supervised learning of the genome-wide transcriptional profiles for patients and normal samples. Based on the newly identified disease mechanisms, a novel pair-wise relevance metric, adjusted frequency value, was used to define the degree of genetic relationship between two molecular determinants. We applied the proposed method to analyze a publicly available microarray dataset for colon cancer. The results demonstrated that the colon cancer-specific gene network captured the most important genetic interactions in several cellular processes, such as proliferation, apoptosis, differentiation, mitogenesis and immunity, which are known to be pivotal for tumourigenesis. Further analysis of the topological architecture of the network identified three known hub cancer genes [interleukin 8 (IL8) (p ≈ 0), desmin (DES) (p = 2.71 × 10-6) and enolase 1 (ENO1) (p = 4.19 × 10-5)], while two novel hub genes [RNA binding motif protein 9 (RBM9) (p = 1.50 × 10-4) and ribosomal protein L30 (RPL30) (p = 1.50 × 10-4)] may define new central elements in the gene network specific to colon cancer. Gene Ontology (GO) based analysis of the colon cancer-specific gene network and the sub-network that consisted of three-way gene interactions suggested that tumourigenesis in colon cancer resulted from dysfunction in protein biosynthesis and categories associated with ribonucleoprotein complex which are well supported by multiple lines of experimental evidence.
This study demonstrated that IL8, DES and ENO1 act as the central elements in colon cancer susceptibility, and protein biosynthesis and the ribosome-associated function categories largely account for the colon cancer tumuorigenesis. Thus, the newly developed relevancy-based networking approach offers a powerful means to reverse-engineer the disease-specific network, a promising tool for systematic dissection of complex diseases.