Open Access · Highly Accessed · Methodology article

MiningABs: mining associated biomarkers across multi-connected gene expression datasets

Chun-Pei Cheng1,2, Christopher DeBoever2, Kelly A Frazer2,3,4*, Yu-Cheng Liu5 and Vincent S Tseng1,6*

Author Affiliations

1 Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan

2 Moores UCSD Cancer Center, University of California San Diego, La Jolla, California, USA

3 Department of Pediatrics and Rady Children’s Hospital, University of California San Diego, La Jolla, California, USA

4 Institute for Genomic Medicine, University of California San Diego, La Jolla, California, USA

5 Department of Environmental and Occupational Health, National Cheng Kung University, Tainan, Taiwan

6 Institute of Medical Informatics, National Cheng Kung University, Tainan, Taiwan

BMC Bioinformatics 2014, 15:173  doi:10.1186/1471-2105-15-173

Published: 8 June 2014

Abstract

Background

Human disease often arises as a consequence of alterations in a set of associated genes rather than alterations to a set of unassociated individual genes. Most previous microarray-based meta-analyses identified disease-associated genes or biomarkers independent of genetic interactions. Therefore, in this study, we present the first meta-analysis method capable of taking gene combination effects into account to efficiently identify associated biomarkers (ABs) across different microarray platforms.

Results

We propose a new meta-analysis approach called MiningABs to mine ABs across different array-based datasets. The similarity between paired probe sequences is quantified and used as a bridge to connect these datasets. The ABs are then identified from an “improved” common logit model (c-LM), obtained by combining several sibling-like LMs in a heuristic genetic algorithm selection process. Our approach is evaluated on two sets of gene expression datasets: (i) four esophageal squamous cell carcinoma datasets and (ii) three hepatocellular carcinoma datasets. Based on an unbiased reciprocal test, we demonstrate that each gene in a group of ABs is required to maintain high cancer sample classification accuracy, and we observe that ABs are not limited to genes common to all platforms. Gene Ontology (GO) enrichment, literature survey, and network analyses indicate that our ABs are not only strongly related to cancer development but also highly connected in a diverse network of biological interactions.
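The bridging idea, linking probes from different platforms by sequence similarity, can be illustrated with a minimal sketch. This is not the MiningABs implementation (which is distributed as Java source). The abstract does not specify the similarity measure, so Python's `difflib.SequenceMatcher` is used here purely as a stand-in. The probe identifiers, sequences, function names, and the 0.9 threshold are all hypothetical.

```python
from difflib import SequenceMatcher


def probe_similarity(seq_a: str, seq_b: str) -> float:
    """Return a similarity score in [0, 1] for two probe sequences.

    Stand-in measure (assumption): difflib's ratio, not the paper's metric.
    """
    return SequenceMatcher(None, seq_a.upper(), seq_b.upper(), autojunk=False).ratio()


def link_probes(platform_a: dict, platform_b: dict, threshold: float = 0.9) -> dict:
    """Map each probe in platform_a to its most similar probe in platform_b.

    A link is kept only if the best score reaches the (hypothetical) threshold.
    """
    links = {}
    for id_a, seq_a in platform_a.items():
        best_id, best_score = None, 0.0
        for id_b, seq_b in platform_b.items():
            score = probe_similarity(seq_a, seq_b)
            if score > best_score:
                best_id, best_score = id_b, score
        if best_id is not None and best_score >= threshold:
            links[id_a] = (best_id, best_score)
    return links


# Made-up probes for two hypothetical array platforms.
platform_a = {
    "A_001": "ACGTACGTACGTACGTACGTACGTA",
    "A_002": "TTTTGGGGCCCCAAAATTTTGGGGC",
}
platform_b = {
    "B_101": "ACGTACGTACGTACGTACGTACGTA",
    "B_102": "GATTACAGATTACAGATTACAGATT",
}

links = link_probes(platform_a, platform_b)
for probe_a, (probe_b, score) in links.items():
    print(f"{probe_a} <-> {probe_b} (similarity {score:.2f})")
```

In this toy input only `A_001` finds a partner, which mirrors the observation that connected datasets need not share every gene: probes without a sufficiently similar counterpart simply remain unlinked rather than being forced into a match.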

Conclusions

The proposed meta-analysis method, MiningABs, efficiently identifies ABs across independently generated array-based datasets, and we show its validity in cancer biology via GO enrichment, literature survey, and network analyses. We postulate that the ABs may facilitate novel target and drug discovery, leading to improved clinical treatment. Java source code, a tutorial, examples, and related materials are available at http://sourceforge.net/projects/miningabs/.

Keywords:
Mining ABs; Associated biomarkers; Meta-analysis; Combination effects; Gene expression