MiningABs: mining associated biomarkers across multi-connected gene expression datasets
1 Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan
2 Moores UCSD Cancer Center, University of California San Diego, La Jolla, California, USA
3 Department of Pediatrics and Rady Children’s Hospital, University of California San Diego, La Jolla, California, USA
4 Institute for Genomic Medicine, University of California San Diego, La Jolla, California, USA
5 Department of Environmental and Occupational Health, National Cheng Kung University, Tainan, Taiwan
6 Institute of Medical Informatics, National Cheng Kung University, Tainan, Taiwan
BMC Bioinformatics 2014, 15:173 doi:10.1186/1471-2105-15-173
Published: 8 June 2014
Human disease often arises from alterations in a set of associated genes rather than from alterations in unassociated individual genes. Most previous microarray-based meta-analyses identified disease-associated genes or biomarkers without considering genetic interactions. In this study, we therefore present the first meta-analysis method that takes gene combination effects into account to efficiently identify associated biomarkers (ABs) across different microarray platforms.
We propose a new meta-analysis approach called MiningABs to mine ABs across different array-based datasets. The similarity between paired probe sequences is quantified and used as a bridge to connect these datasets. The ABs are then identified from an “improved” common logit model (c-LM), which is obtained by combining several sibling-like LMs in a heuristic genetic algorithm selection process. We evaluate our approach on two sets of gene expression datasets: i) four esophageal squamous cell carcinoma datasets and ii) three hepatocellular carcinoma datasets. Based on an unbiased reciprocal test, we demonstrate that each gene in a group of ABs is required to maintain high cancer sample classification accuracy, and we observe that ABs are not limited to genes common to all platforms. Investigating the ABs using Gene Ontology (GO) enrichment, literature survey, and network analyses indicates that our ABs are not only strongly related to cancer development but also highly connected in a diverse network of biological interactions.
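The idea of using probe-sequence similarity as a bridge between platforms can be illustrated with a minimal Java sketch. It is not taken from the MiningABs source code: the abstract does not state how similarity is quantified, so a normalized edit-distance score is swapped in here as a stand-in, and the class name `ProbeBridge`, its methods, the probe identifiers, the sequences, and the 0.8 threshold are all hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: links probes from two platforms by sequence similarity. */
class ProbeBridge {

    /** Levenshtein edit distance between two sequences (two-row dynamic programming). */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // cost of inserting the first j characters of b
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // cost of deleting the first i characters of a
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Assumed similarity score in [0, 1]; 1 means identical sequences. */
    static double similarity(String a, String b) {
        int longest = Math.max(a.length(), b.length());
        if (longest == 0) {
            return 1.0;
        }
        return 1.0 - (double) editDistance(a, b) / longest;
    }

    /** Maps each platform-A probe ID to the platform-B probe IDs scoring at or above the threshold. */
    static Map<String, List<String>> bridge(Map<String, String> platformA,
                                            Map<String, String> platformB,
                                            double threshold) {
        Map<String, List<String>> links = new LinkedHashMap<>();
        for (Map.Entry<String, String> pa : platformA.entrySet()) {
            List<String> matches = new ArrayList<>();
            for (Map.Entry<String, String> pb : platformB.entrySet()) {
                if (similarity(pa.getValue(), pb.getValue()) >= threshold) {
                    matches.add(pb.getKey());
                }
            }
            links.put(pa.getKey(), matches);
        }
        return links;
    }

    public static void main(String[] args) {
        // Made-up probe IDs and sequences, for illustration only.
        Map<String, String> a = new LinkedHashMap<>();
        a.put("A_probe1", "ACGTACGTAC");
        a.put("A_probe2", "TTTTGGGGCC");
        Map<String, String> b = new LinkedHashMap<>();
        b.put("B_probe1", "ACGTACGTAA");
        b.put("B_probe2", "GGCCAATTGG");
        System.out.println(bridge(a, b, 0.8));
    }
}
```

In this toy input, only `A_probe1` and `B_probe1` differ by a single base, so they are the only pair linked at the assumed threshold. A probe without a sufficiently similar counterpart simply gets an empty list, which is compatible with the observation that ABs need not be common to all platforms.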
The proposed meta-analysis method, MiningABs, efficiently identifies ABs from different independently performed array-based datasets, and we show its validity in cancer biology via GO enrichment, literature survey, and network analyses. We postulate that the ABs may facilitate novel target and drug discovery, leading to improved clinical treatment. Java source code, a tutorial, an example, and related materials are available at http://sourceforge.net/projects/miningabs/.