Machine learning based analyses on metabolic networks supports high-throughput knockout screens
1 Department of Bioinformatics and Functional Genomics, Institute of Pharmacy and Molecular Biotechnology, Bioquant, University of Heidelberg, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany
2 Division of Theoretical Bioinformatics, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 280, 69120 Heidelberg, Germany
3 Department of Discrete Optimization, Interdisciplinary Center for Scientific Computing, University of Heidelberg, Im Neuenheimer Feld 368, 69120 Heidelberg, Germany
4 Center for Molecular Biology Heidelberg (ZMBH), Im Neuenheimer Feld 282, 69120 Heidelberg, Germany
BMC Systems Biology 2008, 2:67 doi:10.1186/1752-0509-2-67Published: 24 July 2008
Computational identification of new drug targets is a major goal of pharmaceutical bioinformatics.
This paper presents a machine learning strategy to study and validate essential enzymes of a metabolic network. Each single enzyme was characterized by its local network topology, gene homologies and co-expression, and flux balance analyses. A machine learning system was trained to distinguish between essential and non-essential reactions. It was validated by a comprehensive experimental dataset, which consists of the phenotypic outcomes from single knockout mutants of Escherichia coli (KEIO collection). We yielded very reliable results with high accuracy (93%) and precision (90%). We show that topologic, genomic and transcriptomic features describing the network are sufficient for defining the essentiality of a reaction. These features do not substantially depend on specific media conditions and enabled us to apply our approach also for less specific media conditions, like the lysogeny broth rich medium.
Our analysis is feasible to validate experimental knockout data of high throughput screens, can be used to improve flux balance analyses and supports experimental knockout screens to define drug targets.