Efficient, sparse biological network determination
Department of Engineering Science, University of Oxford, Parks Road, Oxford OX1 3PJ, UK
BMC Systems Biology 2009, 3:25 doi:10.1186/1752-0509-3-25Published: 23 February 2009
Determining the interaction topology of biological systems is a topic that currently attracts significant research interest. Typical models for such systems take the form of differential equations that involve polynomial and rational functions. Such nonlinear models make the problem of determining the connectivity of biochemical networks from time-series experimental data much harder. The use of linear dynamics and linearization techniques that have been proposed in the past can circumvent this, but the general problem of developing efficient algorithms for models that provide more accurate system descriptions remains open.
We present a network determination algorithm that can treat model descriptions with polynomial and rational functions and which does not make use of linearization. For this purpose, we make use of the observation that biochemical networks are in general 'sparse' and minimize the 1-norm of the decision variables (sum of weighted network connections) while constraints keep the error between data and the network dynamics small. The emphasis of our methodology is on determining the interconnection topology rather than the specific reaction constants and it takes into account the necessary properties that a chemical reaction network should have – something that techniques based on linearization can not. The problem can be formulated as a Linear Program, a convex optimization problem, for which efficient algorithms are available that can treat large data sets efficiently and uncertainties in data or model parameters.
The presented methodology is able to predict with accuracy and efficiency the connectivity structure of a chemical reaction network with mass action kinetics and of a gene regulatory network from simulation data even if the dynamics of these systems are non-polynomial (rational) and uncertainties in the data are taken into account. It also produces a network structure that can explain the real experimental data of L. lactis and is similar to the one found in the literature. Numerical methods based on Linear Programming can therefore help determine efficiently the network structure of biological systems from large data sets. The overall objective of this work is to provide methods to increase our understanding of complex biochemical systems, particularly through their interconnection and their non-equilibrium behavior.