Open Access Highly Accessed Research article

Classification of microarray data using gene networks

Franck Rapaport1,2*, Andrei Zinovyev1, Marie Dutreix3, Emmanuel Barillot1 and Jean-Philippe Vert2

Author Affiliations

1 Institut Curie, Service de Bioinformatique, 26 rue d'Ulm, F-75248 Paris Cedex 05, France

2 Ecole des Mines de Paris, Centre for Computational Biology, 35 rue Saint-Honoré, 77300 Fontainebleau, France

3 Institut Curie, CNRS-UMR 2027, Bâtiment 110, Centre Universitaire, F-91405 Orsay, France

BMC Bioinformatics 2007, 8:35  doi:10.1186/1471-2105-8-35

Published: 1 February 2007



Microarrays have become extremely useful for analysing genetic phenomena, but relating the results of a microarray analysis (typically a list of genes) to their biological significance is often difficult. Currently, the standard approach is to map the results onto gene networks a posteriori in order to elucidate the functions perturbed at the level of pathways. However, integrating a priori knowledge of the gene networks could help both in the statistical analysis of gene expression data and in their biological interpretation.


We propose a method to integrate a priori knowledge of a gene network into the analysis of gene expression data. The approach is based on the spectral decomposition of gene expression profiles with respect to the eigenfunctions of the graph, resulting in an attenuation of the high-frequency components of the expression profiles with respect to the topology of the graph. We show how to derive unsupervised and supervised algorithms for classifying expression profiles, resulting in classifiers with biological relevance. We illustrate the method with the analysis of a set of expression profiles from irradiated and non-irradiated yeast strains.
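
The spectral idea described above can be illustrated with a minimal sketch. It is not taken from the article: the unnormalized graph Laplacian, the exponential filter exp(-beta * lambda), the parameter name beta, the function names and the four-gene toy network are all assumptions chosen for illustration. The sketch expresses an expression profile in the eigenbasis of the graph Laplacian and shrinks the coefficients attached to large eigenvalues, which correspond to components that vary sharply between neighbouring genes.

```python
import numpy as np


def graph_laplacian(adjacency):
    """Unnormalized Laplacian L = D - A of an undirected graph (assumed choice)."""
    adjacency = np.asarray(adjacency, dtype=float)
    degree = np.diag(adjacency.sum(axis=1))
    return degree - adjacency


def smooth_profile(profile, adjacency, beta=1.0):
    """Attenuate the high-frequency components of one expression profile.

    The filter exp(-beta * eigenvalue) is a hypothetical choice made for
    this sketch; any decreasing function of the eigenvalue would play the
    same role.
    """
    laplacian = graph_laplacian(adjacency)
    # eigh returns eigenvalues in ascending order for a symmetric matrix.
    eigvals, eigvecs = np.linalg.eigh(laplacian)
    # Coordinates of the profile in the graph's eigenbasis.
    coeffs = eigvecs.T @ np.asarray(profile, dtype=float)
    # Small eigenvalues (smooth components) are kept, large ones are damped.
    weights = np.exp(-beta * eigvals)
    return eigvecs @ (weights * coeffs)


def roughness(profile, adjacency):
    """Sum of squared differences across edges, i.e. f^T L f."""
    laplacian = graph_laplacian(adjacency)
    return float(profile @ laplacian @ profile)


# Toy network: four genes connected in a chain 0 - 1 - 2 - 3.
adjacency = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])

# A profile that alternates sign between neighbours (very "high-frequency").
profile = np.array([1.0, -1.0, 1.0, -1.0])
smoothed = smooth_profile(profile, adjacency, beta=1.0)
```

In this toy case a profile that is constant over the connected network passes through unchanged, because it lies along the eigenvector with eigenvalue zero, while the alternating profile is strongly damped. The smoothed profiles could then be passed to any standard unsupervised or supervised method; that downstream step is left out of the sketch.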


Including a priori knowledge of a gene network for the analysis of gene expression data leads to good classification performance and improved interpretability of the results.