The ability to perform quantitative studies using isotope tracers and metabolic flux analysis (MFA) is critical for detecting pathway bottlenecks and elucidating network regulation in biological systems, especially those that have been engineered to alter their native metabolic capacities. Mathematically, MFA models are traditionally formulated using separate state variables for reaction fluxes and isotopomer abundances. Analysis of isotope labeling experiments using this set of variables results in a non-convex optimization problem that suffers from both implementation complexity and convergence problems.
This article addresses the mathematical and computational formulation of 13C MFA models using a new set of variables referred to as fluxomers. These composite variables combine both fluxes and isotopomer abundances, which results in a simply-posed formulation and an improved error model that is insensitive to isotopomer measurement normalization. A powerful fluxomer iterative algorithm (FIA) is developed and applied to solve the MFA optimization problem. For moderate-sized networks, the algorithm is shown to outperform the commonly used 13CFLUX cumomer-based algorithm and the more recently introduced OpenFLUX software that relies upon an elementary metabolite unit (EMU) network decomposition, both in terms of convergence time and output variability.
Substantial improvements in convergence time and statistical quality of results can be achieved by applying fluxomer variables and the FIA algorithm to compute best-fit solutions to MFA models. We expect that the fluxomer formulation will provide a more suitable basis for future algorithms that analyze very large scale networks and design optimal isotope labeling experiments.
Metabolic Pathway Analysis
Metabolism is the complete set of chemical reactions taking place in living cells. These chemical processes form the basis of all life, allowing cells to grow, reproduce, maintain their structure and respond to environmental changes. Metabolic reactions are divided into groups called metabolic pathways, which are typically constructed heuristically according to their connectivity and presumed function . Each metabolic pathway is characterized by a set of chemical reactions that transform substrates into end products while generating intermediate byproducts. Due to its importance in medicine and biotechnology, metabolic pathway research has become a highly active field of investigation .
Initially, the structure of metabolic pathways was examined by identifying their intermediate compounds. Subsequently, the various biochemical reactions connecting these compounds were mapped. Due to the success of this research, the topological structure of many metabolic pathways is nowadays fully documented . The next step in the progression of metabolic pathway research involves quantification of the rates of these various chemical reactions, also known as "fluxes". The values of these rates are affected by various environmental conditions and can change rapidly in response to perturbations. Nevertheless, if the environmental parameters are held fixed and stable, the network can attain a steady state in which the concentrations of all network metabolites are assumed constant over time. This, of course, implies that the rates of their input and output reactions must balance. The latter imposes a set of linear constraints on the metabolic fluxes, known as "stoichiometric balance equations" . Unfortunately, since the number of unknown fluxes typically exceeds the number of independent stoichiometric balances, these constraints are insufficient to completely identify the metabolic network. In order to overcome this lack of information, additional constraints must be provided to the stoichiometric mathematical model to estimate the values of the network fluxes .
13C Isotope Labeling Experiments
Various experimental techniques have been developed to enable measurement of intracellular metabolic fluxes, either directly or indirectly. One of these approaches makes use of isotope labeling experiments. In this method, the metabolic system is administered a known amount of an isotopically labeled substrate (such as glucose labeled with 13C at specific atom positions). By measuring the resulting labeling patterns of intracellular metabolites after steady state has been achieved, additional flux information is obtained.
One major drawback of this experimental approach is the high complexity and computational intensity of the metabolic flux analysis (MFA) required to interpret these labeling measurements. In their series of articles, Wiechert et al. [6-9] constructed a systematic approach for performing this analysis. They show that measurements of the relative abundance of various isotope isomers, also known as "isotopomers", contain enough information to fully identify the metabolic fluxes of the network. Formulating the problem using isotopomer variables (or a transformed set of isotopomer variables referred to as "cumomers"), Wiechert et al. posed the flux estimation problem as a non-convex least-squares minimization, assuming random error is added to their isotopomer measurements. The resulting high-dimensional non-convex problem suffers from various drawbacks, such as slow convergence and notable probability of attaining local minima. Several optimization algorithms have been developed in order to address these drawbacks. Early approaches used iterative parameter-fitting algorithms , evolutionary algorithms  and simulated annealing . Furthermore, several investigations have been conducted in order to assess the accuracy of these results [9,12,13]. Recently, a novel method to decompose the metabolic network into Elementary Metabolite Units (EMUs) was introduced  and implemented into the OpenFLUX software package . This decomposition effectively reduces the size of the optimization problem by efficiently simulating only those isotopomers that contribute to the measurement residuals. Nevertheless, all of these algorithms suffer from several major drawbacks due to the standard isotopomer-flux variables used in formulating the optimization problem:
• Presence of unstable local minima: due to the non-convex nature of the objective function.
• Complex mathematical representation and computational implementation. This results in the need for ad-hoc algorithms and mathematical analysis, and long running times are required for reliable convergence.
The OpenFLUX implementation, for example, may require several dozens of convergence iterations with various initial values in order to achieve acceptable probability of obtaining the optimal set of fluxes in any one of its attempts. In addition, due to the chosen objective function, it is also commonly required to estimate scaling factors for each isotopomer measurement, because of the fact that the available experimental techniques are only capable of measuring isotopomer fractions up to a proportional scaling factor (see Mollney et al.  for further details).
This article introduces a new set of variables for simulating 13C isotope labeling experiments. The main idea underlying this reformulation is that, instead of treating fluxes and isotopomer variables separately, we identify a set of "isotopically labeled fluxes" as our state variables of interest. We refer to these variables as fluxomers. Fluxomers combine flux variables with isotopomer variables and consequently reduce the complexity and nonlinearity of the original isotopomer balance equations. In this article, we show that by reformulating the flux estimation problem in terms of fluxomer variables, it is possible to construct an algorithm that has the following key benefits:
• Provides efficient computation of all isotopomers in a metabolic pathway
• Is robust to measurement noise (i.e., suppresses the effects of measurement errors) and initial conditions
• Eliminates the need for measurement scaling factor estimation
• Poses the problem using simple mathematical expressions, allowing the use of generic optimization algorithms
The rest of the article is constructed as follows. The Results and Discussion section illustrates the advantage of our approach via simulation results comparing fluxomer variables to the commonly used cumomer approach and the more recently introduced EMU approach. The Methods section presents the detailed formulation of the fluxomers optimization problem and the fluxomers iterative algorithm (FIA) that provides a reliable and efficient method for solving it. All source code and executables for our algorithms are freely available at the author's website .
Results and Discussion
We compared our FIA algorithm to the widely used MFA software 13CFLUX , which relies on the cumomer approach, and to the more recent Open-FLUX  software, which is based on the EMU  approach. In order to compare the methods, we conducted flux estimations for various well-studied metabolic pathways. Our first example is based upon the tutorial which Wiechert et al. provide with their 13CFLUX software: the Embden-Meyerhof and Pentose Phosphate metabolic pathways of Escherichia coli . This example compares the running time and robustness of both algorithms in response to input noise. Our second example compares the results and performance of FIA to both an adhoc method and the OpenFLUX algorithm for the analysis of lysine production by C. glutamicum, as described by Becker et al.  and Quek et al. .
FIA vs. 13CFLUX Comparison: Embden-Meyerhof and Pentose Phosphate Pathways
In this section we examine a network representing the Embden-Meyerhof and Pentose Phosphate pathways of E. coli, which is based upon the tutorial supplied by Wiechert et al. as part of their 13CFLUX software package. Since our FIA implementation natively supports 13CFLUX input files (i.e. "FTBL" files), the same input files can be used for both algorithms. (Note, however, that FIA does not require definition of free fluxes nor initial values, and thus these are simply ignored when imported). Figure 1 shows the simple network used along with the nomenclature used in previous publications. In addition to the network structure, the models are provided with flux and isotopic measurements as shown in Table 1.
Figure 1. E.Coli EMP and PPP Metabolic Pathways. The Embden-Meyerhof and Pentose Phosphate metabolic pathways of Escherichia coli.
Table 1. EMP & PPP simulation data
First, we examined the output of the two algorithms for the traditional "noiseless" input file. In order to run the analysis, 13CFLUX requires the user to define a set of "free fluxes" along with their associated initial values . Note that a bad choice of free fluxes or their associated values can result in poor algorithmic performance (both in computation time and accuracy). In fact, under various initial guesses the algorithm did not converge at all. As for FIA, none of the above is required. Since the network along with the given measurements are well defined, in the noiseless case the two algorithms returned similar values for unidirectional fluxes, as can be seen in Table 2. Some slight disagreements were observed for the bi-directional fluxes, which are more poorly identified.
Table 2. Comparison of FIA with 13CFLUX for the simple E.coli metabolic network
We next compared the algorithms' sensitivities to noise. In a series of 10 experiments, white Gaussian noise was added to all of the measured isotopomer values, and the outputs and computation times for both algorithms were recorded. As can be seen in Figure 2, unidirectional fluxes remain quite constant and hardly suffer from the added experimental error (for both algorithms). However, the bi-directional fluxes are affected by the added noise. 13CFLUX suffers from a higher variance spread of the estimated values than FIA (thus is more sensitive to the added measurement noise). Note that the difference arises not only due to the mathematical model used, but also due to the stability properties of the optimization method chosen.
Figure 2. Measured fluxes values. Bidirectional fluxes calculated using FIA and 13CFLUX for noisy measurement set.
We next examined the computational performance of the two methods. Table 3 shows the algorithm running time for convergence (in seconds). The average running time for 13CFLUX was 133 seconds, while for FIA this time was 7 seconds. The running time ratio (13CFLUX/FIA) for individual experiments varied between ×9 to ×75.
Table 3. Algorithm running time comparison for FIA vs. 13CFLUX
FIA vs. OpenFLUX Comparison: Lysine Production by C. glutamicum
In this section we examine the analysis of the central metabolism of two lysine-overproducing strains of Corynebacterium glutamicum: ATCC 13032 (lysCfbr) and its PEFTUfbp mutant. Both express feedback-resistant isoforms of the aspartokinase enzyme lysC, while the latter is additionally engineered to overexpress the glycolytic enzyme fructose-1,6-bisphosphatase. The example is based upon the measurements provided by Becker et al. , who implemented an ad-hoc program to estimate the values of various metabolic fluxes. In their more recent article introducing the OpenFLUX software package , Quek et al. chose to compare their results to those of Becker et al. Therefore, we will expand upon their comparison using our FIA implementation. The input file for FIA was constructed using the measurements and pathway structure given in  and . As described in , the published mass isotopomer fractions were modified for mass interference from non-carbon backbone isotopes using the molecular formula of the amino acid fragments. FIA supports automatic generation of the naturally occurring isotopes correction matrix when the measured molecular formulas are supplied. This adjusts the measured fluxomers vector appearing in the objective function during the process of optimization. If necessary, it is possible not to use this feature but instead to directly supply the algorithm with the corrected measurement values.
When comparing the running times of FIA with OpenFLUX, the different algorithmic approaches of the two must be kept in mind. While OpenFLUX requires the user to supply it with sets of free fluxes, FIA requires no free fluxes nor initial values. Open-FLUX rapidly evaluates dozens of different optimization cycles with random initial values and seeks the best fitting result among them, while FIA uses only one single (longer) run. As such, the convergence probability of OpenFLUX depends on the number of attempts and random values generated during its operation, while the FIA results do not depend on any random value. Furthermore, in its analysis, EMU based algorithms evaluate only the fluxes necessary for measurement comparison, and thus their running time depends both on the metabolic network structure and the amount and location of the given measurements. FIA, on the other hand, can supply the entire set of metabolic fluxes at any given time, with no additional computation requirement (which depends mainly on the network structure).
Measured fluxes as constants
First, we ran the exact same simulation as Quek et al. performed in their article. They supply very accurate (mean error in the order of 0.15 mol%) values for the label measurements, and used the given measured fluxes as if they were noiseless measurements (thus as constants). We start by comparing the simulation time for this simple case. According to  and as validated by us using our computer, OpenFLUX required 50 iterations of about 16 seconds each in order to find a decent minimal point, hence about 800 seconds in total. While so, the FIA analysis took 60 seconds for initial analysis and matrices creation, and 300 further seconds for convergence, thus 360 seconds as a whole. Regarding the simulation results, as one can see in Table 4 and Table 5 the fluxes are very close to those calculated before, and the estimated fluxes FIA returned had the lowest residual value compared to the other methods.
Table 4. Relative mass isotopomer fractions comparison for wild-type and mutant C. glutamicum
Table 5. Metabolic fluxes comparison for wild-type and mutant C. glutamicum
Measured fluxes as measurements
We can also run the same optimization, but weight the given flux measurements by their variances. When running this optimization using OpenFLUX (again using 50 iterations), the amount of time was greatly increased, and ended in around 48 minutes. For FIA, on the other hand, the running time was the same as before, thus about 6 minutes. Comparing the results of the algorithms, OpenFLUX suffered from severe convergence problems. Most of its iterations ended without converging at all, while those that did converge yielded useless results, far from the measurements. FIA, on the other hand, succeeded in converging for all scenarios. For the wildtype lysine producing pathway, the results were very close to the ones before (since the fluxes and measurements were quite accurate). For the mutant example, which was less accurate, a reduction of the residual value was achieved by small changes to the measured fluxes. fluxes and residual values can be examined in Table 4 and Table 5.
Using non-normalized MS measurements
We now show that FIA can easily use incomplete or non-normalzied measurements by examining its performance in the example above. The supplied MS measurements were normalized to the n +1 backbone carbon atoms of the measured metabolites. Instead of using the supplied normalized data, we multiply each set of metabolite measurements by a random constant number. By doing so, we simulate the case in which only the first 3 (2 for GLY) MS peaks were measured, and had not been normalized. The original and supplied non-normalized measurement values can be found in Table 4. Note that the values were corrected by the molecular formulas of the measured fragments (again, can be automatically performed by FIA). In the absence of normalized data, FIA gave estimated fluxes very close to the previous cases, with very low residual values, as can be seen in Table 5. The running time of the algorithm was not affected by the change.
The main contribution of this article is the introduction of fluxomers--a new set of state variables used to simulate 13C metabolic tracer experiments. The fluxomers approach allows the central optimization problem of MFA to be reformulated as a sequence of quadratic programs, which form the basis of the fluxomers iterative algorithm (FIA). Both fluxomers and FIA result in several important benefits compared to flux-isotopomer variables. Among these advantages are (i) a reduction in algorithm running time required for simulation of isotopomer distributions and metabolic flux estimation, (ii) reduced sensitivity to measurement noise and initial flux values and (iii) availability of complete isotopomer information for a given network (as opposed to the EMU approach, which only supplies partial information) without the need for user specification of free fluxes or initial flux values. Additionally, the error model used by the FIA algorithm has the advantage that it depends solely upon isotopomer ratios rather than complete isotopomer fractions, and therefore it eliminates the need to estimate a normalization factor for each measured isotopomer distribution. Our current results show significant improvements even with regards to simplistic tracer experiments (the running times have been improved by an order of ×3 to ×20 compared to the 13CFLUX algorithm, and about ×2 to ×8 compared to the OpenFLUX implementation). It is important to note that the total time required to obtain an MFA solution is controlled both by (i) the time of each iteration and (ii) the number of optimization iterations that are required to achieve a reliable solution. While a single OpenFLUX iteration is certainly faster than a single iteration of FIA, the FIA algorithm was expressly constructed to provide high reliability in achieving the optimal solution. Therefore, FIA was able to consistently find a better optimal solution in less total time in comparison to the other algorithms examined. Furthermore, extending the fluxomers formulation to other global optimization techniques is straightforward. We expect that reformulating more sophisticated MFA problems--for example, involving optimal experimental design or large-scale metabolic networks--in terms of fluxomer variables will lead to dramatic enhancements of algorithmic efficiency and robustness.
In the following, we show how to construct and solve MFA problems using fluxomer variables. First we define and explain the basic properties of fluxomers. Then we show how to express MFA balance equations and measurements in terms of fluxomers. Finally, we formulate the MFA optimization problem and present the FIA algorithm for solving it. Throughout this section we use boldface uppercase letters A to denote matrices, lowercase boldface letters x to denote vectors, and lowercase letters u for scalars. We use the < ○ > product z = x○y to represent the element-wise product vector, i.e. zi = xiyi. The model formulation will be illustrated using the simple metabolic network shown in Figure 3.
Figure 3. Simple metabolic network. (a) Standard network representation. Carbon atoms are drawn explicitly with arrows to indicate atom transitions. Unidirectional arrows represent unidirectional fluxes while bidirectional fluxes (such as flux 5) are represented by bidirectional arrows. (b) Fluxomers representation. Each arrow is a group of fluxomers. X's appear on the appropriate atom positions to indicate summation of divergent fluxomers.
Traditional MFA approaches construct distinct variables for each flux and for each possible labeling state (isotopomer) associated with all metabolites in the network. Fluxomers, on the other hand, are a composite of these two and therefore allow the network state to be described using only one variable type.
Definition 1 (Fluxomer) A fluxomer is the rate that a metabolic reaction transfers labeling from one or more specific substrate isotopomers into product isotopomers.
Taking each fluxomer to be a transformation from one set of labeled atoms into others, we can write its labeling state as an array of binary elements representing the state of each atom it consumes (0 representing an unlabeled atom and 1 representing a labeled atom). Thus, fi(1001) is a fluxomer of reaction i consuming 4 atoms, with its first and last atoms labeled and two middle atoms unlabeled. When using x as an index for one (or more) of the atoms, we denote a sum of fluxomers where the indicated atom can be either labeled or unlabeled (e.g., fi(1x01) is the sum of fi(1001) and fi(1101)). See Figure 3b for a detailed example.
Traditional metabolic fluxes and isotopomer variables can be easily expressed using fluxomers. We start with metabolic fluxes, which are just a sum of their associated fluxomers. For the simple network in Figure 3b we have:
We can also express isotopomer abundances in terms of fluxomer variables for the same example. Because of the assumption that enzymes do not differentiate between the various isotopomers of a given metabolite, the isotopomers within each metabolite pool are distributed uniformly across the outgoing fluxes emanating from that pool. Therefore, the fractional abundance of a given isotopomer within a metabolite pool will determine the fractional contribution of its corresponding fluxomers to the fluxes leaving that pool:
Fluxomer balance equations
We now examine the fluxomer balance equations that describe how fluxomers are propagated through the metabolic network. These balance equations represent the main mathematical device for calculating steady-state fluxomer values for a given network. For ease of notation, let us define the vector of metabolic fluxes in our system by and the vector of fluxomers as . As shown above, the metabolic fluxes are calculated from a linear transformation of the fluxomers. Denoting this linear transformation matrix as U, we can write u = Ux. We now assume that we are given a certain u vector and wish to calculate the fluxomers in our system. We start by considering balances on "simple fluxomers", i.e. those that originate exclusively from a single metabolite pool. (An example of a simple fluxomer is f5(01) in Figure 3, which derives solely from pool B.) Under conditions of metabolic and isotopic steady state, the rate of 01-labeled molecules entering pool B must balance the rate that 01-labeled molecules leave that pool. Therefore, we can construct a balance on fluxomers around pool B as
where g(u) is a function of u alone, and h is a constant vector. Thus, for this simple case we can solve for the outgoing fluxomer f5(01) directly in terms of the fluxomers entering pool B and the total fluxes f2 and f5 leaving pool B.
We now turn to the more complex situation in which the output fluxomer originates from more than one metabolic pool. For example, consider fluxomer f4(0001) coming from pools C and D. Here, the fraction of 0001-labeling carried by flux f4 is proportional to the abundance of 01-labeling in C and 00-labeling in D:
As before, the outgoing fluxomer f4(0001) can be expressed solely in terms of g--a pure function of u (always a rational function of outgoing fluxes)--and a product of linear projections of x.
Without loss of generality, we restrict ourselves to fluxes coming from at most 2 metabolic pools (referred to subsequently as the "left" and "right" pools). When the system reaches steady state, we have
where g is a function , and (H1, H2) are two m × m matrices. This equation allows for the output fluxomers emanating from a specific metabolite pool to be expressed in terms of the total flux vector u and the fluxomers entering the pool. This enables each outgoing fluxomer to be solved "locally" for the incoming fluxomers. Note that this local calculation does not involve any matrix inversions or other expensive computational procedures. If there are no recycle loops in the network so that all possible paths through the network are non-selfintersecting, this equation can be used to solve sequentially for all "downstream" fluxomers in terms of previously calculated "upstream" fluxomers. In the presence of recycle loops an iterative approach can be constructed to solve for the fluxomers while still avoiding repeated matrix inversions.
Constructing the system matrices
In matrix form,
Isotopomer measurement formulation
In the following, we develop a systematic method for expressing measured isotopomer variables using fluxomer notation. The final result of the analysis shows that isotopomer measurements can be written simply as the norm of a linear transformation of fluxomers, thus Err ~ ||Ax||2. First, we briefly summarize the available isotopomer measurements provided by Nuclear Magnetic Resonance (NMR) and Mass Spectrometry (MS) methods. We then discuss the mathematical modeling of these measurements using fluxomer variables.
Available isotopomer measurements
MFA experiments are typically carried out by (i) introducing a labeled substrate into a cell culture at metabolic steady state, (ii) allowing the system to reach an isotopic steady state, and (iii) measuring isotopomer abundances of metabolic intermediates and byproducts using either MS or NMR analysis. These two measurement techniques provide qualitatively different information about isotopic labeling.
• 1H NMR: Measures the fractional 13C enrichment of each proton-bound carbon atom, irrespective of the labeling of its neighboring carbon atoms. Both 12C and 13C atoms are distinguishable in the same spectrum, and therefore the peak areas corresponding to different carbon isotopes can be normalized directly.
• 13C NMR: Quantifies isotopomers based on the presence of multiplet peaks (e.g., doublets, triplets, doublet doublets, etc.) in the spectrum caused by two or more neighboring 13C atoms. Because 12C atoms are undetectable by 13C NMR, it is impossible to quantify the overall fraction of each isotopomer unless 1H NMR spectra are simultaneously obtained. Instead, only the relative ratio of different isotopomers can be assessed by 13C NMR.
• MS: This technique is usually preceded by some form of chromatographic separation (GC or LC) to resolve mixtures into their individual components. These components are then ionized and fragmented in the MS ion source. The ionized particles are separated according to their masses by an electromagnetic filter, and a detector measures the relative abundance of each mass isotopomer. These abundances can be normalized to a fractional scale if all MS peaks corresponding to a particular ion are simultaneously measured.
Previous studies based on flux-isotopomer variables have modeled the measurement error as Gaussian noise added to the fractional isotopomer enrichments. Therefore, if is the vector of measured isotopomer fractions, this model states that , where e is the Gaussian error term. However, a more accurate error model would add the measurement noise directly to the physically measured values. The motivation for the traditionally chosen error model is its relative simplicity when expressed using flux-isotopomer variables. Furthermore, since some isotopomers of a specific metabolite may be unmeasurable, the isotopomer fractions cannot be experimentally determined in many cases. This implies the need for an alternative error model that avoids these shortcomings.
Measurement Error Model
We denote the measured isotopomer abundances by a vector . For NMR analysis, the elements of are proportional to the areas under the different spectral peaks. For MS, they are proportional to the integrated ion counts associated with each mass isotopomer. Since is the measured quantity, the correct error model is an addition of Gaussian noise so that , where m is the "true" measurement value. The measured isotopomer fractions are then expressed as
where εj is a sum of Gaussian variables. Noting that each measurement mj is simply proportional to a linear combination of fluxomers, the residual expression eq. 12 takes the form
where T and V are transformation matrices needed to convert fluxomers to isotopomer measurements and the diag operator converts its vector argument into a diagonal matrix. The resulting expression is both a simple sum of Gaussian vectors and affine in x.
The advantage of this objective function is that it only depends upon the relative isotopomer intensities in the vector but does not depend upon how these intensities have been normalized (as long as the transformation matrix T is constructed accordingly). This eliminates the need to estimate optimal normalization factors that are required by previous algorithms in order to convert experimental measurements into isotopomer fractions. This is true for both MS and NMR measurements, either when conducted alone or used together in the same experiment.
The MFA optimization problem using fluxomers
Now that we have defined both the isotopomer measurements and the feasible solution set, we can formulate the least-squares MFA optimization problem in terms of fluxomer variables. Our objective is to find the flux vector u that minimizes the measurement error. In addition to the fluxomer balances, usually upper bounds uub are provided for all fluxes. As has been proven by Wiechert et al. [6-9], once the inputs to the system and u are set, the solution (x, u) is unique. In other words, the steady-state fluxomer balance equation, eq. 6, is actually an implicit definition of x(u). With this in mind, the MFA optimization problem can be simply defined as
where A selects the measured elements of the fluxomers vector x(u), b contains their associated values, and S is the stoichiometric matrix of the reaction network. Note that b may contain non-zero elements only when associated with measurements of absolute flux values. For isotopomer measurements, the associated elements of b are zero.
Eq. 14 can be solved using various non-convex global optimizing techniques. These optimizers typically require the user to provide subroutines for computing the value of the objective function and its first derivatives at various points along their convergence path. Furthermore, evaluation of the function x(u) and its derivatives are the main (practically only) time-consuming procedures when solving the optimization problem in eq. 14. The mathematical formulation of eq. 14 is similar to the optimization problem resulting when using the labels and fluxes variables, with one exception - the implicit formula for x(u). As shown above, using fluxomers we are able to formulate the propagation equation (and thus solving x(u)) as a multiplication of homogeneous functions of fluxes, and second order functions of fluxomers. Using labels and fluxes, formulating the same equation results in a sum of functions of the same structure, and the homogeneous separation property vanishes. The following sections exploit this unique property of the fluxomers propagation equation in order to achieve great reduction in the system computational complexity, leading to the FIA algorithm.
Fluxomers Iterative Algorithm (FIA)
This section deals with the evaluation of x(u) along with its gradient using the fluxomer formulation. First, we show that x(u) can be calculated iteratively while avoiding repeated matrix inversions. Then, we demonstrate how the number of iterations can be reduced using a Newton-type gradient-based algorithm. Finally, we explain how it is possible to greatly increase the sparsity of the system using a simple linear transformation of variables, which further reduces the number of iterations needed for convergence.
Solving the fluxomer balance equations
A simple approach for computing x given u is to imitate nature. Once a metabolic network reaches steady state (namely, when u is constant), changing its input labeling does not affect its flux values u, but only influences the labeling of its intermediate metabolite pools. The metabolite labeling patterns become gradually mixed and propagated throughout the network until isotopic equilibrium is reached. Accordingly, a simple approach for solving eq. 6 is by using its iterative representation (which is similar to the process taking place in nature):
where xt is the fluxomer vector at iteration t and xt+1 is the fluxomer vector at iteration t + 1. In order to simulate the steady-state labeling, we initialize the system with the vector x0 in which only the input fluxomers are labeled and all others are unlabeled. By recursively substituting x back into the equation, steady state is eventually reached and the final value of x is obtained. (This equation represents a non-linear time-invariant Markov chain.) For the Embden-Meyerhof and Pentose Phosphate Pathway example in the Results and Discussion section, it takes a few hundred iterations to achieve complete stability of the solution (maximal fluxomer value change on the order of 1e-17). Algorithm convergence for a given input vector is retrieved exactly as in the real biological system, and thus a unique solution always exists (for realistic metabolic networks).
We now show it is possible to reach pathway convergence in much fewer iterations. First, we write eq. 6 as
Now, in order to find the values of (x, u) one needs to solve F(x, u) = 0 while holding u constant. We choose to use one of the classic and powerful algorithms for finding roots of an equation, the well known Newton-Raphson [19-21] method. Roughly speaking, the change of the x vector at each iteration is calculated by
In order to determine the root of the propagation equation, FIA starts with an iteration or two using Newton's correction and then continues with the simple "natural" approach. Applying this method to the Embden-Meyerhof and Pentose Phosphate Pathway example in the Results and Discussion section, only a few dozen iterations are now needed. In the next section we show how to reduce both the number of variables and the number of iterations required for convergence by another order of magnitude, without affecting system convergence stability.
Reducing system complexity
The following section introduces a mathematical approach for reducing the number of nonzero elements in our system. Variable reduction techniques such as the recently developed Elementary Metabolite Unit (EMU) network decomposition  were developed for application to systems that are modeled using flux-isotopomer variables. Fluxomers and the FIA algorithm, as opposed to prior approaches, allow us to effectively reduce the number of system variables using a simple linear transformation on x. Our main goal here is to find a transformation for the fluxomer vector x, y = Kx that:
• Reduces the number of its nonzero elements.
• Reduces the computational complexity of solving eq. 16.
• Eases the evaluation of eq. 15.
From eq. 16 we see that the greatest expense is due to inversion of a sum of two linear transformations (H1 and H2) of x. From the iterative propagation equation, eq. 15, we see that x is iteratively calculated by computing the product of the same two matrices. Had it been possible to find a sparse, close-to-diagonal representation for both H1 and H2 by simply multiplying them by the matrix from the right, both problems would be solved.
In order to acomplish the above, we examine the properties of the concatenation of these two matrices which we denote by H. Next we find the LU factorization of H,
with LH lower triangular and UH upper triangular matrices. The matrix LH1 contains the first m rows of LH and LH2 contains the last m rows of LH. Our new set of variables now becomes y = UHx, and the new propagation equation is
When expressed in terms of the variable y, our system becomes much more sparse. This is illustrated in Figure 4 which shows H1, H2, LH1, LH2 and UH for the Embden-Meyerhof and Pentose Phosphate Pathway example. The transformation has two essential benefits:
Figure 4. System matrices complexity reduction. H1, H2, LH1, LH2 and UH for the simple E. coli example. A substantial reduction in nonzero elements between the H and L matrices can clearly be seen.
2. Fewer iterations needed for convergence.
As a matter of fact, this transformation reduced the number of iterations needed for convergence of the simple E. coli example to a total of 5.
As discussed above, our optimization problem seeks the minimum of ||Ax(u) - b||2. In order to converge rapidly, the gradient of the objective function must be computed at each iteration of the algorithm. The key step for computing it is the evaluation of (the derivative of the fluxomers as a function of the metabolic fluxes). Since we have an implicit function F(x, u) along with a valid solution for F(x, u) = 0, the implicit function theorem [22,23] can be used to compute . Because F(x, u) is continuously differentiable around its root, we can write
The initial point
The generation of the initial point for the FIA algorithm is very similar to the standard method used by many iterior point algorithms for finding a valid initial point over a convex linear set. We added a fluxes-measurement regularization factor λ in order to generate an initial point closer to the final estimation (and thus speed up the convergence process). The initial point is generated by solving the following simple convex optimization problem:
with A a matrix that selects the measurable elements of u, the meaured elements of u (if they exist), 0 a vector of zeros, and ub a vector of the flux upper bounds. The regularization factor λ starts with some large value, and if necessary is reduced until the optimal value of s is greater than 0. Note that when λ → 0 the problem reduces to finding a feasibile solution of u, and thus always has a solution (for well-structured networks).
Summarizing the above discussion leads to the following algorithm for efficient solution of the MFA optimization problem using fluxomers:
i. Matrix preparation (run once per network):
ii. Call the optimizer in order to solve
When requested by the optimizer, return x(u) and its first derivative:
1. Set y0 = yinit.
2. Set i = 1.
yi = UH [g(u) ○ (LH1yi - 1) ○ (LH2yi - 1)].
4. If ||yi - yi - 1||2 > εN
(b) Set yi = yi - r according to Newton's method.
5. If ||yi - yi - 1||2 > εe go to 3.
6. Calculate x = [g(u) ○ (LH1yi) ○ (LH2yi)].
The supplied software uses a variant of the "sequential least-squares" algorithm [24,25] for solving the non-convex optimization problem in eq. 14. This essentially breaks the problem into a sequence of convex optimization problems for which the solution can be readily computed. Note that other algorithms can be easily used with the same procedures described above.
OS developed and implemented the theory, algorithm, simulations and software. YCE and JDY initiated and supervisied the study, and contributed to writing the manuscript. All authors reviewed, edited and approved the final manuscript.
JDY is supported by an NSF CAREER Award (CBET-0955251) and by the Vanderbilt Discovery Award program.
Alberts B, Johnson A, Lewis J, Raff M, Roberts K, Walter P: [http:/ / www.amazon.com/ exec/ obidos/ redirect?tag=citeulike07-20&path=AS IN/ 0815332181] webcite
Steuer R: Review: On the analysis and interpretation of correlations in metabolomic data. [http://bib.oxfordjournals.org/cgi/content/abstract/7/2/151] webcite
Berg JM, Tymoczko JL, Stryer L: [http:/ / www.amazon.com/ exec/ obidos/ redirect?tag=citeulike07-20&path=AS IN/ 0716746840] webcite
Nat Biotech 1994, 12(10):994-998. Publisher Full Text
Marx A, de Graaf AA, Wiechert W, Eggeling L, Sahm H: Determination of the Fluxes in the Central Metabolism of Corynebacterium Glutamicum by Nuclear Magnetic Resonance Spectroscopy Combined with Metabolite Balancing.
Schmidt K, Norregaard LC, Pedersen B, Meissner A, Duus JO, Nielsen JO, Villadsen J: Quantification of intracellular metabolic fluxes from fractional enrichment and 13C-13C coupling constraints on the isotopomer distribution in labeled biomass components.
Quek LE, Wittmann C, Nielsen L, Kromer J: Open-FLUX: efficient modelling software for 13C-based metabolic flux analysis. [http://www.microbialcellfactories.com/content/8/1/25] webcite
Becker J, Klopprogge C, Zelder O, Heinzle E, Wittmann C: Amplified expression of fructose 1,6-bisphosphatase in Corynebacterium glutamicum increases in vivo flux through the pentose phosphate pathway and lysine production on different carbon sources.
SIAM Review 1995, 37(4):531-551. Publisher Full Text
Deuflhard P: [http:/ / www.amazon.com/ Newton-Methods-Nonlinear-Problems-D euflhard/ dp/ 3540210997%3FSubscriptionId%3D0JYN1 NVW651KCA56C102%26tag%3Dtechkie-20% 26linkCode%3Dxm2%26camp%3D2025%26cr eative%3D165953%26creativeASIN%3D35 40210997] webcite
Ortega JM, Rheinboldt WC: [http:/ / www.amazon.com/ Iterative-Nonlinear-Equations-Varia bles-Mathematics/ dp/ 0898714613] webcite
Jittorntrum K: An implicit function theorem. [http://www.springerlink.com/content/x746u2457466u402/] webcite
Journal of Optimization Theory and Applications 1978, 25(4):575-577. Publisher Full Text
Krantz SG, Parks HR: [http:/ / www.amazon.com/ Implicit-Function-Theorem-History-A pplications/ dp/ 0817642854%3FSubscriptionId%3D0JYN1 NVW651KCA56C102%26tag%3Dtechkie-20% 26linkCode%3Dxm2%26camp%3D2025%26cr eative%3D165953%26creativeASIN%3D08 17642854] webcite
ACM Trans Math Softw 1994, 20(3):262-281. Publisher Full Text