This article is part of the supplement: IEEE 7th International Conference on Bioinformatics and Bioengineering at Harvard Medical School
Selecting informative genes for discriminant analysis using multigene expression profiles
1 Russell Investments, Tacoma, WA, USA
2 Department of Statistics, Columbia University, New York, NY, USA
BMC Genomics 2008, 9(Suppl 2):S14 doi:10.1186/1471-2164-9-S2-S14Published: 16 September 2008
Gene expression data extracted from microarray experiments have been used to study the difference between mRNA abundance of genes under different conditions. In one of such experiments, thousands of genes are measured simultaneously, which provides a high-dimensional feature space for discriminating between different sample classes. However, most of these dimensions are not informative about the between-class difference, and add noises to the discriminant analysis.
In this paper we propose and study feature selection methods that evaluate the "informativeness" of a set of genes. Two measures of information based on multigene expression profiles are considered for a backward information-driven screening approach for selecting important gene features. By considering multigene expression profiles, we are able to utilize interaction information among these genes. Using a breast cancer data, we illustrate our methods and compare them to the performance of existing methods.
We illustrate in this paper that methods considering gene-gene interactions have better classification power in gene expression analysis. In our results, we identify important genes with relative large p-values from single gene tests. This indicates that these are genes with weak marginal information but strong interaction information, which will be overlooked by strategies that only examine individual genes.