Instituto de Investigaciones Marinas, CSIC (Spanish Council for Scientific Research), C/Eduardo Cabello 6, 36208 Vigo, Spain

Abstract

Optimization aims to make a system or design as effective or functional as possible. Mathematical optimization methods are widely used in engineering, economics and science. This commentary is focused on applications of mathematical optimization in computational systems biology. Examples are given where optimization methods are used for topics ranging from model building and optimal experimental design to metabolic engineering and synthetic biology. Finally, several perspectives for future research are outlined.

Background

To optimize means to find the best solution, the best compromise among several conflicting demands subject to predefined requirements (called constraints). Mathematical optimization has been extremely successful as an aid to better decision making in science, engineering and economics.

Optimization and optimality are certainly not new concepts in biology. The structures, movements and behaviors of animals, and their life histories, have been shaped by the optimizing processes of evolution or of learning by trial and error

First, I will introduce several basic concepts that can help readers unfamiliar with mathematical optimization. The key elements of mathematical optimization problems are the

As an illustrative example, consider the "diet problem", one of the first modern optimization problems

The "diet problem" has certain interesting properties. It is a continuous problem in which both the objective function (total cost, i.e. the sum of the costs of each food purchased) and the constraints are linear with respect to the decision variables, so it belongs to the important class of linear programming, or LP (note that, for historical reasons, "programming" is used here in the sense of planning). These linear constraints define a feasible space (the space of decision variables where the constraints are satisfied) which is a convex polyhedron, so it is a convex problem. Convex optimization problems
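As a minimal sketch of such a linear program, the snippet below solves a tiny diet-like problem with two foods and two nutrient requirements. All numbers are hypothetical and chosen only for illustration. It relies on the fact that, when an LP has a finite optimum, the optimum is attained at a vertex of the feasible polyhedron, so for two variables one can simply enumerate the intersections of pairs of constraint boundaries. This brute-force approach is not how practical LP solvers work; it is only meant to make the geometry concrete.

```python
from itertools import combinations

# Hypothetical data (illustrative only): two foods, two nutrients.
cost = (3.0, 2.0)  # price per unit of food x and food y

# Each constraint is (a, b, rhs), meaning a*x + b*y >= rhs.
constraints = [
    (2.0, 1.0, 8.0),   # nutrient 1 requirement
    (1.0, 2.0, 10.0),  # nutrient 2 requirement
    (1.0, 0.0, 0.0),   # x >= 0
    (0.0, 1.0, 0.0),   # y >= 0
]

def is_feasible(x, y, tol=1e-9):
    """Check that the point satisfies every linear constraint."""
    return all(a * x + b * y >= rhs - tol for a, b, rhs in constraints)

def solve_diet():
    """Enumerate vertices of the feasible polyhedron and keep the cheapest."""
    best = None
    for (a1, b1, r1), (a2, b2, r2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < 1e-12:
            continue  # parallel boundaries, no unique intersection
        # Cramer's rule for the intersection of the two boundary lines.
        x = (r1 * b2 - r2 * b1) / det
        y = (a1 * r2 - a2 * r1) / det
        if is_feasible(x, y):
            total = cost[0] * x + cost[1] * y
            if best is None or total < best[0]:
                best = (total, x, y)
    return best

best_cost, best_x, best_y = solve_diet()
print(f"cheapest diet: x={best_x:.2f}, y={best_y:.2f}, cost={best_cost:.2f}")
```

For these numbers the cheapest feasible diet lies where both nutrient constraints are active, which is typical of LP solutions.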

Nonlinear programming (NLP) deals with continuous problems where some of the constraints or the objective function are nonlinear. NLP problems are much more difficult to solve than LP problems. Further, the presence of nonlinearities in the objective and constraints might imply nonconvexity, which results in the potential existence of multiple local solutions (multimodality). Thus, in nonconvex problems one should seek the globally optimal solution among the set of possible local solutions. For the simple case of only two decision variables, one can visualize the objective function of a multimodal problem as a terrain with multiple peaks. Simple examples of unimodal and multimodal surfaces are presented in Figure 1.

Simple examples (two decision variables, no constraints) of unimodal (1.a) and multimodal (1.b) surfaces, where the z-coordinate of the surface represents the value of the objective function for each pair of decision variables x and y

The solution of multimodal problems is studied by the subfield of global optimization
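The difference between a local and a global solution can be illustrated with a small sketch. The two-variable function below is a hypothetical example (it is not the surface shown in the figure) with two minima, one of which is only locally optimal. Plain gradient descent ends up in whichever basin it starts in, whereas a simple multistart strategy, used here as a stand-in for more sophisticated global optimization methods, repeats the local search from several starting points and keeps the best result.

```python
def f(x, y):
    """Hypothetical multimodal objective with two minima along the x axis."""
    return x**4 - 3.0 * x**2 + x + y**2

def grad(x, y):
    """Analytic gradient of f."""
    return 4.0 * x**3 - 6.0 * x + 1.0, 2.0 * y

def local_descent(x, y, step=0.01, iters=2000):
    """Fixed-step gradient descent: a purely local search."""
    for _ in range(iters):
        gx, gy = grad(x, y)
        x, y = x - step * gx, y - step * gy
    return x, y

def multistart(starts):
    """Run the local search from every start and keep the lowest objective."""
    results = [local_descent(x0, y0) for x0, y0 in starts]
    return min(results, key=lambda p: f(*p))

# A single local search started in the "wrong" basin.
single = local_descent(1.5, 1.0)

# A deterministic grid of starting points covering both basins.
starts = [(x0, y0) for x0 in (-2.0, -1.0, 0.0, 1.0, 2.0) for y0 in (-1.0, 1.0)]
best = multistart(starts)

print("single start:", single, "f =", f(*single))
print("multistart:  ", best, "f =", f(*best))
```

Multistart offers no guarantee of finding the global optimum in general; it merely raises the chance of doing so at the price of more local searches.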

Model-based optimization is a key methodology in engineering, helping in the design, analysis, construction and operation of all kinds of devices. Since engineering approaches are playing a significant role in the rapid evolution of systems biology

In fact, optimization is already playing a key role. Examples of applications of optimization in systems biology, classified by the type of optimization problem, are given in Table

Examples of applications of optimization in systems biology, classified by type of optimization problem (note that several types overlap)

| **Problem type or application** | **Description** | **Examples with references** |
|---|---|---|
| Linear programming (LP) | linear objective and constraints | maximal possible yield of a fermentation |
| Nonlinear programming (NLP) | some of the constraints or the objective function are nonlinear | applications to metabolic engineering and parameter estimation in pathways ^{13}C data |
| Semidefinite programming (SDP) | problems over symmetric positive semidefinite matrix variables with linear cost function and linear constraints | partitioning the parameter space of a model into feasible and infeasible regions |
| Bilevel optimization (BLO) | objective subject to constraints which arise from solving an inner optimization problem | framework for identifying gene knockout strategies |
| Mixed integer linear programming (MILP) | linear problem with both discrete and continuous decision variables | finding all alternate optima in metabolic networks |
| Mixed integer nonlinear programming (MINLP) | nonlinear problem with both discrete and continuous decision variables | analysis and design of metabolic reaction networks and their regulatory architecture |
| Parameter estimation | model calibration minimizing differences between predicted and experimental values | tutorial focused on systems biology |
| Dynamic optimization (DO) | optimization with differential equations as constraints (and possibly time-dependent decision variables) | discovery of biological network design strategies |
| Mixed-integer dynamic optimization (MIDO) | optimization with differential equations as constraints and both discrete and continuous decision variables (possibly time-dependent) | computational design of genetic circuits |

Optimization of biochemical reaction networks

Optimization methods have been applied in both metabolic control analysis

Metabolic engineering exploits an integrated, systems-level approach for optimizing a desired cellular property or phenotype

Coupling constraint-based analysis with optimization has been used to generate a consistent framework for the generation of hypotheses and the testing of functions of microbial cells using genome-scale models
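A common way of writing such a constraint-based problem is as a linear program over the reaction fluxes at steady state. The formulation below is a generic sketch; the symbols are assumptions introduced here for illustration and do not come from the text: $S$ is the stoichiometric matrix, $v$ the vector of fluxes, $c$ a vector of weights selecting the flux or fluxes to be maximized (for example, growth), and $v^{\min}$, $v^{\max}$ the lower and upper flux bounds.

```latex
\begin{aligned}
\max_{v} \quad & c^{\top} v \\
\text{subject to} \quad & S\,v = 0, \\
& v^{\min} \le v \le v^{\max}
\end{aligned}
```

Because both the objective and the constraints are linear in $v$, this is an LP and remains tractable even for networks with thousands of reactions.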

A particularly interesting question in this context concerns the principles behind the optimal metabolic network operation,

Reverse engineering, modeling and experimental design

Reverse engineering in systems biology aims to reconstruct the biochemical interactions from data sets of a particular biological system. Optimization has been used for inferring important biomolecular networks, such as transcriptional regulatory networks

System identification

The problem of parameter estimation in biochemical pathways, formulated as a nonlinear programming problem subject to the pathway model acting as constraints, has also received great attention
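A minimal sketch of this kind of model calibration is given below. It assumes a hypothetical one-parameter model, first-order decay dx/dt = -k x with known initial value, whose analytic solution is used in place of a numerical ODE solver. The data are synthetic and noise-free, generated with an assumed "true" rate constant, and the sum of squared residuals is minimized by golden-section search. Real pathway models have many parameters and noisy data, and usually require nonlinear (often global) optimization solvers.

```python
import math

def model(t, k, x0=10.0):
    """Analytic solution of dx/dt = -k*x with x(0) = x0 (hypothetical model)."""
    return x0 * math.exp(-k * t)

# Synthetic, noise-free "measurements" generated with an assumed true value.
times = [0.0, 1.0, 2.0, 4.0, 6.0]
k_true = 0.5
data = [model(t, k_true) for t in times]

def sse(k):
    """Sum of squared differences between model predictions and data."""
    return sum((model(t, k) - y) ** 2 for t, y in zip(times, data))

def golden_section(func, lo, hi, tol=1e-8):
    """Minimize a unimodal one-dimensional function on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if func(c) < func(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return 0.5 * (a + b)

k_hat = golden_section(sse, 0.0, 2.0)
print(f"estimated k = {k_hat:.4f} (value used to generate the data: {k_true})")
```

Golden-section search is only valid here because the cost is unimodal in k on the search interval; with several parameters the cost surface is frequently multimodal.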

Since biological experiments are both expensive and time-consuming, it would be ideal if one could plan them in an optimal way, i.e. minimizing their cost while maximizing the amount of information to be extracted from them. This is the purpose of optimal experimental design and optimal identification procedures
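The idea can be sketched on a toy problem. Assume a hypothetical decay model x(t) = x0 exp(-k t) and a single measurement with constant noise variance; the information that the measurement carries about k is then proportional to the squared sensitivity of x with respect to k. The snippet searches a grid of candidate sampling times for the most informative one. The model, the nominal parameter value and the noise level are all assumptions made for illustration, and, as is usual for nonlinear models, the resulting design depends on the nominal value of k.

```python
import math

X0 = 10.0        # assumed known initial value
K_NOMINAL = 0.5  # assumed prior guess of the parameter to be estimated
SIGMA = 1.0      # assumed constant measurement standard deviation

def sensitivity(t, k=K_NOMINAL):
    """Derivative of x(t) = X0*exp(-k*t) with respect to k."""
    return -t * X0 * math.exp(-k * t)

def fisher_information(t):
    """Information about k from one measurement taken at time t."""
    return (sensitivity(t) / SIGMA) ** 2

# Candidate sampling times: 0.1, 0.2, ..., 10.0
candidates = [i / 10.0 for i in range(1, 101)]
best_time = max(candidates, key=fisher_information)

print(f"most informative sampling time: t = {best_time}")
```

For this model the optimum can be checked by hand: the squared sensitivity is proportional to t² exp(-2kt), which is maximal at t = 1/k.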

Conclusion

Although, as already mentioned, it would be desirable to formulate all optimization problems as convex ones, on many occasions this is not possible, so we must solve global optimization problems, most of which belong to the class of NP-hard problems

Another important issue is the stochasticity that is inherent in biomolecular systems

As stated in

Moreover, optimization could also be used after the design and construction phases, inside a model predictive control framework

Finally, it should be recognized that standard optimization can sometimes be insufficient for gaining deeper insights into certain aspects of systems biology, such as the evolution of biological systems. While organisms evolve towards optimal properties, the environment may change, or the organisms may even change their own environment, which in turn alters the optimum. An evolutionary system needs continuing development in order to maintain its fitness relative to the systems it is co-evolving with. In other words, everyone has to keep improving in order to survive, which is known as the "Red Queen" effect

Sutherland

Acknowledgements

The author would like to thank Matt Hodgkinson for his valuable comments, and acknowledges financial support from EU project BaSysBio LSHG-CT-2006-037469