Open Access Methodology article

Discovering functional gene expression patterns in the metabolic network of Escherichia coli with wavelets transforms

Rainer König1,2*, Gunnar Schramm2, Marcus Oswald3, Hanna Seitz3, Sebastian Sager4, Marc Zapatka2, Gerhard Reinelt3 and Roland Eils1,2

Author Affiliations

1 Department of Bioinformatics and Functional Genomics, Institute for Pharmacy and Molecular Biotechnology, University of Heidelberg, 69120 Heidelberg, Germany

2 Theoretical Bioinformatics, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany

3 Institute of Computer Science, University of Heidelberg, 69120 Heidelberg, Germany

4 Interdisciplinary Center for Scientific Computing, University of Heidelberg, 69120 Heidelberg, Germany

BMC Bioinformatics 2006, 7:119  doi:10.1186/1471-2105-7-119

Published: 8 March 2006

Abstract

Background

Microarray technology produces gene expression data on a genomic scale for a wide variety of organisms and conditions. However, this vast amount of information must be condensed into manageable and functionally meaningful patterns. Genes may be reasonably combined using knowledge about their interaction behaviour. On a proteomic level, biochemical research has elucidated an increasingly complete image of the metabolic architecture, especially for less complex organisms such as the well-studied bacterium Escherichia coli.

Results

We sought to discover central components of the metabolic network that are regulated by the expression of associated genes under changing conditions. We mapped gene expression data from E. coli under aerobic and anaerobic conditions onto the enzymatic reaction nodes of its metabolic network. An adjacency matrix of the metabolites was created from this graph. A consecutive ones clustering method was used to obtain network clusters in the matrix. A wavelet transform was applied to the adjacency matrices of these clusters to collect features for the classifier. A feature extraction method then selected the most discriminating features. These top-ranking features yielded network sub-graphs representing formate fermentation, in good agreement with the anaerobic response of hetero-fermentative bacteria. Furthermore, we found a switch in the starting point for NAD biosynthesis, and an adaptation of the L-aspartate metabolism, in accordance with its higher abundance under anaerobic conditions.
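The steps described above (expression values placed into a metabolite adjacency matrix, a wavelet transform of that matrix, and a ranking of the resulting coefficients between two conditions) can be illustrated with a minimal sketch. It is not the authors' implementation: the matrix layout, the single-level Haar wavelet, the Welch-style t-score used for ranking, the gene names and all expression values are assumptions chosen purely for illustration, and the consecutive ones clustering step is omitted.

```python
import math
from statistics import mean, stdev


def haar_1d(values):
    """One level of the orthonormal Haar transform: scaled pairwise sums, then differences."""
    s = math.sqrt(2.0)
    pairs = list(zip(values[0::2], values[1::2]))
    return [(a + b) / s for a, b in pairs] + [(a - b) / s for a, b in pairs]


def haar_2d(matrix):
    """Apply one Haar level to every row, then to every column (even-sized matrix assumed)."""
    rows = [haar_1d(row) for row in matrix]
    cols = [haar_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def weighted_adjacency(edges, expression, n):
    """Assumed layout: entry (i, j) holds the expression of the enzyme gene linking metabolites i and j."""
    m = [[0.0] * n for _ in range(n)]
    for (i, j), gene in edges.items():
        m[i][j] = m[j][i] = expression[gene]
    return m


def features(edges, expression, n):
    """Flatten the wavelet coefficients of one sample into a feature vector."""
    return [c for row in haar_2d(weighted_adjacency(edges, expression, n)) for c in row]


def rank_features(group_a, group_b):
    """Rank coefficient indices by absolute Welch-style t-score between two conditions."""
    scores = []
    for k in range(len(group_a[0])):
        a = [f[k] for f in group_a]
        b = [f[k] for f in group_b]
        se = math.sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
        scores.append((abs(mean(a) - mean(b)) / se if se > 0 else 0.0, k))
    return sorted(scores, reverse=True)


# Hypothetical toy network: four metabolites in a chain, three enzyme genes.
edges = {(0, 1): "geneA", (1, 2): "geneB", (2, 3): "geneC"}

# Invented expression values; only geneB differs clearly between conditions.
aerobic = [
    {"geneA": 1.0, "geneB": 1.0, "geneC": 1.1},
    {"geneA": 1.1, "geneB": 1.2, "geneC": 0.9},
    {"geneA": 0.9, "geneB": 0.9, "geneC": 1.0},
]
anaerobic = [
    {"geneA": 1.0, "geneB": 3.0, "geneC": 1.0},
    {"geneA": 0.9, "geneB": 3.3, "geneC": 1.1},
    {"geneA": 1.1, "geneB": 2.8, "geneC": 0.9},
]

feats_a = [features(edges, sample, 4) for sample in aerobic]
feats_b = [features(edges, sample, 4) for sample in anaerobic]
ranked = rank_features(feats_a, feats_b)

for score, index in ranked[:3]:
    print(f"coefficient {index}: score {score:.2f}")
```

Because the Haar transform is orthonormal, each coefficient summarises a local block of the matrix without changing its total energy, so a high-scoring coefficient points back to a small neighbourhood of metabolites whose connecting enzymes change together between the two conditions.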

Conclusion

We developed and tested a novel method, based on a combination of rationally chosen machine learning methods, to analyse gene expression data on the basis of interaction data, using a metabolic network of enzymes. As a case study, we applied our method to E. coli under oxygen-deprived conditions and extracted physiologically relevant patterns that represent an adaptation of the cells to changing environmental conditions. More generally, our concept may be transferred to network analyses of other biological interaction data whenever data for two comparable states of the associated nodes are available.