This article is part of the supplement: IEEE 7th International Conference on Bioinformatics and Bioengineering at Harvard Medical School
A stable iterative method for refining discriminative gene clusters
1 Program in Molecular and Computational Biology, University of Southern California, Los Angeles, CA, USA
2 Department of Computer Science, Southern Illinois University, Carbondale, IL 62901, USA
3 Department of Mathematics, National University of Singapore, 2 Science Drive 2, 117543, Singapore
BMC Genomics 2008, 9(Suppl 2):S18 doi:10.1186/1471-2164-9-S2-S18
Published: 16 September 2008
Microarray technology is often used to identify genes that are differentially expressed between two biological conditions. Because microarray datasets contain a small number of samples and a large number of genes, it is usually desirable to identify small gene subsets with distinct patterns between sample classes. Such gene subsets are highly discriminative in phenotype classification because their features are tightly coupled. Unfortunately, classifiers identified in this way tend to generalize poorly to test samples because of overfitting.
We propose a novel approach that combines supervised and unsupervised learning techniques to generate increasingly discriminative gene clusters in an iterative manner. Our experiments on both simulated and real datasets show that the method produces a series of robust gene clusters with good classification performance compared with existing approaches.
This backward approach for refining a series of highly discriminative gene clusters for classification proves to be very consistent and stable when applied to various types of training samples.
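The abstract does not specify the clustering method or the discrimination criterion, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a correlation-based grouping for the unsupervised step and a Welch t-statistic on the cluster mean for the supervised step, and it refines a cluster by backward elimination. All function names (`t_score`, `seed_cluster`, `refine_cluster`) and the simulated data are hypothetical.

```python
import numpy as np

def t_score(profile, labels):
    """Absolute Welch t-statistic of a 1-D profile between classes 0 and 1."""
    a, b = profile[labels == 0], profile[labels == 1]
    se = np.sqrt(a.var(ddof=1) / a.size + b.var(ddof=1) / b.size)
    return abs(a.mean() - b.mean()) / (se + 1e-12)

def seed_cluster(X, seed, size):
    """Unsupervised step (assumed): the `size` genes most correlated with `seed`."""
    corr = np.corrcoef(X)[seed]          # rows of X are genes
    return list(np.argsort(-corr)[:size])

def refine_cluster(X, labels, genes, min_size=2):
    """Supervised backward step (assumed): drop one gene at a time
    while the t-score of the cluster mean keeps improving."""
    genes = list(genes)
    best = t_score(X[genes].mean(axis=0), labels)
    history = [(list(genes), best)]
    while len(genes) > min_size:
        candidates = []
        for g in genes:
            rest = [h for h in genes if h != g]
            candidates.append((t_score(X[rest].mean(axis=0), labels), rest))
        score, rest = max(candidates, key=lambda c: c[0])
        if score <= best:                # no single removal helps: stop
            break
        genes, best = rest, score
        history.append((list(genes), best))
    return history

# Simulated data (an assumption): 200 genes x 20 samples, 5 informative genes.
rng = np.random.default_rng(0)
labels = np.array([0] * 10 + [1] * 10)
X = rng.normal(size=(200, 20))
X[:5, labels == 1] += 2.0

initial = seed_cluster(X, seed=0, size=10)
history = refine_cluster(X, labels, initial)
for genes, score in history:
    print(len(genes), round(score, 2))
```

Each entry in `history` is one cluster in the refinement series, so the series can be inspected or evaluated on held-out samples at every size. Stopping when no single removal improves the score is a design choice of this sketch; the paper may use a different stopping rule.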