This article is part of the supplement: Proceedings from the Great Lakes Bioinformatics Conference 2011
Reordering based integrative expression profiling for microarray classification
- Equal contributors
1 School of Informatics, Indiana University, Indianapolis, IN 46202, USA
2 Indiana Center for Systems Biology and Personalized Medicine, Indiana University, Indianapolis, IN 46202, USA
3 MedeoLinx, LLC, Indianapolis, IN 46280, USA
BMC Bioinformatics 2012, 13(Suppl 2):S1 doi:10.1186/1471-2105-13-S2-S1
Published: 13 March 2012
Current network-based microarray analysis uses information about interactions among the genes or gene products of interest, but still considers the expression of each gene individually. We propose an organized knowledge-supervised approach, Integrative eXpression Profiling (IXP), to improve microarray classification accuracy and to help discover groups of genes whose signals are too weak to detect individually with traditional methods. To implement IXP, an ant colony optimization reordering (ACOR) algorithm is used to group functionally related genes in an ordered way.
Using Alzheimer's disease (AD) as an example, we demonstrate how to apply the ACOR-based IXP approach to microarray classification. With the microarray dataset GSE1297 (31 samples) as the training set, blinded classification on a second microarray dataset, GSE5281 (151 samples), shows that our approach improves accuracy from 74.83% to 82.78%. A recently published 1372-probe signature for AD achieves only 61.59% accuracy under the same conditions. In an overall comparison of classification performance, the ACOR-based IXP approach also outperforms IXP variants based on classic network ranking, graph clustering, and random ordering.
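To make the reported percentages concrete, the short sketch below converts them into sample counts on the 151-sample test set. The counts themselves (113, 125, and 93 correctly classified samples) are not stated in the text; they are inferred here by rounding, and the function names are illustrative only.

```python
# Infer how many of the 151 test samples each reported accuracy corresponds to.
# The counts are an inference from the published percentages, not source data.

TEST_SAMPLES = 151  # size of GSE5281 as stated in the abstract


def implied_correct(accuracy_pct: float, total: int = TEST_SAMPLES) -> int:
    """Nearest whole number of correct calls implied by a percentage."""
    return round(accuracy_pct / 100 * total)


def accuracy_pct(correct: int, total: int = TEST_SAMPLES) -> float:
    """Accuracy in percent, rounded to two decimals as in the abstract."""
    return round(100 * correct / total, 2)


baseline = implied_correct(74.83)   # accuracy before applying IXP
with_ixp = implied_correct(82.78)   # ACOR-based IXP
signature = implied_correct(61.59)  # 1372-probe signature

# Extra samples classified correctly once IXP is applied.
gained = with_ixp - baseline

for label, n in [("baseline", baseline), ("ACOR-based IXP", with_ixp),
                 ("1372-probe signature", signature)]:
    print(f"{label}: {n}/{TEST_SAMPLES} = {accuracy_pct(n)}%")
print(f"additional correct samples with IXP: {gained}")
```

Under this reading, the improvement of 7.95 percentage points corresponds to 12 additional samples classified correctly.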
The ACOR-based IXP approach can serve as a knowledge-supervised feature transformation that increases classification accuracy by transforming each gene expression profile into an integrated expression profile, whose values are then used as input features for standard classifiers. The IXP approach integrates gene expression information with organized knowledge, namely disease gene/protein network topology information, which is represented as both network node weights (local topological properties) and network node order (global topological characteristics).
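The idea of turning per-gene expression values into integrated features can be illustrated with a minimal sketch. The abstract does not give the IXP formula, so everything below is an assumption: the weighted sliding-window average, the `window` parameter, the function name `integrate_profile`, and the toy gene names, weights, and ordering are all hypothetical stand-ins for the node weights and ACOR ordering described above.

```python
# Hypothetical illustration of a knowledge-supervised feature transformation:
# genes are placed in a given order (standing in for an ACOR ordering) and each
# feature is a node-weighted average over a window of neighbouring genes.
# This is NOT the published IXP formula, only a sketch of the general idea.
from typing import Dict, List, Sequence


def integrate_profile(expression: Dict[str, float],
                      order: Sequence[str],
                      weights: Dict[str, float],
                      window: int = 3) -> List[float]:
    """Return one integrated value per position in the gene ordering."""
    half = window // 2
    features = []
    for i in range(len(order)):
        # Neighbouring genes in the ordering, clipped at both ends.
        neighbours = order[max(0, i - half):min(len(order), i + half + 1)]
        total_weight = sum(weights[g] for g in neighbours)
        weighted_sum = sum(weights[g] * expression[g] for g in neighbours)
        features.append(weighted_sum / total_weight)
    return features


# Toy data: expression of four genes in one sample (assumed values).
expression = {"A": 1.0, "B": 3.0, "C": 5.0, "D": 7.0}
# Node weights, standing in for local topological properties (assumed).
weights = {"A": 1.0, "B": 1.0, "C": 2.0, "D": 1.0}
# Node order, standing in for the global ordering (assumed).
order = ["B", "A", "C", "D"]

features = integrate_profile(expression, order, weights, window=3)
print(features)
```

In such a scheme, the resulting feature vector for each sample would replace the raw per-gene values as input to a standard classifier. Genes that are adjacent in the ordering contribute to the same feature, which is one way a group of individually weak signals could reinforce each other.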