Abstract
Background
This paper presents a method for modelling dynamical biochemical networks with intrinsic time delays. Since the fundamental mechanisms leading to such delays are many times unknown, non conventional modelling approaches become necessary. Herein, a hybrid semiparametric identification methodology is proposed in which discrete time series are incorporated into fundamental material balance models. This integration results in hybrid delay differential equations which can be applied to identify unknown cellular dynamics.
Results
The proposed hybrid modelling methodology was evaluated using two case studies. The first of these deals with dynamic modelling of transcriptional factor A in mammalian cells. The protein transport from the cytosol to the nucleus introduced a delay that was accounted for by discrete time series formulation. The second case study focused on a simple network with distributed time delays that demonstrated that the discrete time delay formalism has broad applicability to both discrete and distributed delay problems.
Conclusions
Significantly better prediction qualities of the novel hybrid model were obtained when compared to dynamical structures without time delays, being the more distinctive the more significant the underlying system delay is. The identification of the system delays by studies of different discrete modelling delays was enabled by the proposed structure. Further, it was shown that the hybrid discrete delay methodology is not limited to discrete delay systems. The proposed method is a powerful tool to identify time delays in illdefined biochemical networks.
Background
Time delays play a very important role in genetic regulatory systems. Gene regulation and signal transduction as a whole involves the synthesis and maturation of complex proteins. Their synthesis and transport takes a considerable amount of time, which introduces delays in the overall regulation chain. At a process level, metabolic time delays can be observed macroscopically by recognizing a certain time delay between substrate uptake and the corresponding biomass growth or product formation as in cultivations of Saccharomyces cerevisiae [1] or Pichia pastoris [2]. The nature of time delays in regulatory networks is twofold [3]. They are either related to a process that takes an intrinsic time to be accomplished, i.e. some reactions, such as translational or transcriptional reactions, take a significant amount of time to be completed, or as a consequence of the modelling approach used, i.e. lumping a sequence of events might lead to an apparent time delay.
The bottomup systems biology approach for building dynamic network models can be too cumbersome due to their complex nature and lack of fundamental knowledge [47]. Typical limitations are the involvement of large scale kinetic models with poorly defined kinetic parameters, limited generalization capacity and their cost expansive development. In this paper we propose the use of mathematical hybrid semiparametric systems as a cost effective alternative to model biochemical networks with intrinsic time delays, since it is not likely to know in advance which fundamental mechanisms cause such delays. Hybrid semiparametric systems combine fundamental (parametric) biological constraints with more empirical databased (nonparametric) constraints. Mechanistic and nonparametric models can therein be arranged in two possible ways: parallel or serial [4,812]. In the serial structure, which has been the one applied in this study, the biological system dynamics are described by time differentials of classifying variables, while the unknown metabolic functions with intrinsic delays are handled by a nonparametric structure.
A general mathematical representation of delayed dynamics is given by Retarded Functional Differential Equations (RFDE) [13]. After applying certain simplifications, some special concepts arise such as models considering either discrete time delays, [1416], distributed time delays, [1,17,18] or ordinary differential equations (ODE) of kinetic rates [2]. Although with varying performance, these models are shown to be capable of explaining the stability of biochemical networks [1,13,1820].
Similar simplifications of RFDE as reported for parametric models can also be applied to hybrid semiparametric models, i.e. either discrete delays or distributed delays of state variables in the kinetics or differential equations of the kinetics. The latter is not well suited for hybrid modelling, because neither kinetic function nor the kinetic values are known a priori and thus a solution or estimation of the kinetics is not straightforward. Distributed delays are also rather unlikely to be used, because one would have to introduce some function accounting for the delay, which is generally not known. Furthermore, some mathematical postulation of arbitrarily large delays for unknown weighting functions of the delayed variable would have to be assumed and this mathematical convenience is in limit biologically unrealistic (see [13]). Instead, the use of discrete delays in the inputs to the nonparametric structure is proposed herein. This is analogous to the application of discrete time series, namely AutoRegressive (eXogenous), (AR(X)), models. This presents no limitation for application, as it is mathematically clear that a weighted discrete time series is equivalent to the integration of a time delay weighting function and thus analogous to the application of the distributed delay framework.
Unfortunately, the theoretically endless number of time lagged values of one variable would in practice lead to high computational times and identification problems of the network structure and parameters (see [21,22]). In theory, an optimal number of time lagged values exists given by the ratio between redundancy and additional gain of information in the inputs. Several methods for identification of the optimal number of time lagged values such as autocorrelation, crosscorrelation, or partial mutual information, have been proposed [21,22]. However, they either assume that inputs are linearly correlated or are based on maximum information transfer and thus require known outputs, which unfortunately are not directly available in hybrid models.
Hence these methods cannot be applied here and thus the number of lagged values is rather chosen by trial and error, as done by several other authors [23,24]. However, choice by trial and error is not a disadvantage, when (i) the delay is an important property of the system and when (ii) series of delays are systematically studied, since it can be expected that models that account for time delays perform the best when the studied and the "true" delays are congruent.
In this paper hybrid delay differential equations with discrete time series was determined to be a powerful method to identify delayed dynamics of illdefined biochemical networks. This technique is described in detail in the results section. The technique was applied to a typical gene regulatory system where the transport of macromolecules between the cytosol and nucleus introduce strong delay dynamic effects. In addition, heterologous protein expression by recombinant Pichia pastoris was studied by assuming a hypothetical network with distributed time delays.
Results & Discussion
Delay Differential Equation Hybrid Model (DDEHM)
Material balances over intracellular metabolites can be generically stated by the following dynamical equation
where c_{int }is a vector comprising the concentrations of intracellular metabolites, K_{int }a m × q stoichiometric matrix of m metabolites and q metabolic reactions, r_{int }a vector of q kinetic rates, b_{int }a vector of transport fluxes across the cellular membrane and μ the specific growth rate.
If a macroscopic bioreactor model is formulated accounting only for the unbalanced extracellular metabolites, a similar equation is obtained which accounts for the volume dilution term (D·c_{ext }) in substitution of the cell growth dilution term (μ·c_{int }),
Here c_{ext }is a vector of concentrations of extracellular metabolites, K_{ext }a matrix of stoichiometric coefficients, D is the dilution rate, u_{ext }is a vector of volumetric feeding rates, and r_{ext }is the kinetic rate vector.
All the results, presented from this point forward, are derived from eq. (2), which can however be automatically extended to eq. (1).
Delayed reaction kinetics
As suggested by [10], the vector of kinetic rates can be described either mechanistically, statistically or as a mixture of both types of models depending on the a priori knowledge about the metabolic network. A general definition is to state every metabolic flux as the multiplication of a mechanistic term (ψ) with an unknown nonparametric term (ρ) representing the unknown phenomena that must be identified from data.:
with X a vector of input variables and w a vector of empirical parameters. When no a priori mechanistic knowledge is available then the ψ term is dropped and eq. (3) reduces to
As stated previously, the intrinsic causes of delays are the occurrence of several serial reaction steps with slow kinetics. To mimic this effect, and analogous to AR(X)models, both the ψ and ρ kinetic terms are modelled as a function of X , which includes discrete past values of metabolite concentrations, c ,(that can be intracellular or extracellular, depending on the application of eqs. 1 or 2) and/or exogenous inputs:
Here c_{i }means value i of vector c , τ_{i }is the associated time lag, N_{i }defines the number of time lags assumed for each value c_{i }of vector c , s_{j }is the j^{th }exogenous input, τ_{j }the associated time lag and its lag number is defined by M_{j }. Note that the time lags and the numbers of time lags, τ_{i }, τ_{j }, N_{i }, and M_{j }can be chosen independently. However, it might be advantageous to model a time series around rough estimates of the "true" delays.
After considering eq. (2)  (5), it becomes clear that the model equations are Delay Differential Equations (DDE) in which the "retarded" or "lagged" phenomena are accounted by the reaction term, eq. (4).
Several linear or nonlinear regression methods can be used to formulate the unknown nonparametric kinetic function ρ . Here we adopted a three layer back propagation neural network with hyperbolic tangential activation function for the sake of comparability with other hybrid modelling studies since this method is the most reported in the literature [4,812]:
where w , the parameter vector, comprises the weights and biases, w_{1 }, w_{2 }, and b_{1 }, b_{2 }, respectively. The hyperbolic tangential activation function g(·) is,
Note that the incorporation of AR into the hybrid approach results in delay differential equations, which is why the proposed hybrid model is referred to as the Delay Differential Equation Hybrid Model.
Nonparametric structure identification
The identification of the best network architecture by means of a tradeoff between residual minimization, quantity of data and quantity of parameters is a central question when nonparametric models find application. This tradeoff is due to the fact that more parameters on one hand will improve the fitting of the model to the data, but on the other hand might result in parameter overfitting, leading to a degradation of model robustness or/and, even worse, in the addition of synthetic noise to the estimates [21,22].
The architecture of the Artificial Neural Network (ANN) structure involves the variation of the number of layers and the number of nodes. This variability is in this study, prior to application, already reduced by the selection of three layers, namely input, hidden and output layer. The application of three layers is usually sufficient if nonlinear continuous functions are sought to be modelled [22]. Remaining in terms of structural variability is such the evaluation of the variation of numbers of nodes for each hybrid model setup.
Parameter identification
For each nonparametric structure, the respective parameters w must be estimated from data. In this paper a weighted least squares criteria of model residuals in concentrations is adopted:
where P is the number of samples, n is the number of state variables, c_{m, l, i }are measured state variables, c_{l, i}(t, w) are calculated state variables and c_{σ, i }are the standard deviations. The serial hybrid structure, consisting of an ANN and material balances, was shown to be trained best by using the sensitivity approach along with analytical gradients [10]. Here we extended the sensitivity equations to the DDEHM case. The sensitivities equations are derived by differentiating eq. (2) with respect to w while taking into account the time lagged differential variables, which then reads as follows,
where
With ρ and ψ depending on the time lagged concentrations and where
For comparison of timedelay gradients for network training see [4,21].
This least square problem is solved by using the "lsqnonlin" Matlab function which uses a subspace trust region method and is based on the interiorreflective Newton method (Matlab Optimization toolbox) [25]. The sensitivity equations are integrated along with the delay differential model equations. This can either be accomplished using the dde23 Matlab algorithm, which integrates the delay differential equations with the explicit RungeKutta (2,3) pair and interpolant, or by using linear approximation of the differential equations for integration with storage of the respective delay values, which results in a time inexpensive algorithm. For the latter case, unfortunately, some error is introduced along with this simplification. However if average kinetic rates are estimated for each time step, the error is significantly diminished. Initial state values, c(t_{0}), are problem dependent (for instance the initial concentration of biomass or substrate in a bioreactor). The initial values of the sensitivity equations are however zero , because the initial state values, c(t_{0}), are independent of model parameters w . The residual gradients are then obtained using the corresponding sensitivity values. Notice that the lagged values of both state variables and exogenous inputs are assumed to be equal to the initial values c(t_{0}) for all tN_{i}·τ < t_{0 }.
Identification is initialized from a random selection of weight values as usually done for ANNs. The solution space is spanned by these weights and the identification, i.e. the objective to reduce the model residual, is a nonlinear optimization problem. Therefore, one cannot expect to obtain the global minimum as the result of the model's residuals minimum found from one random weight initialization. Instead, several iterations of the same setup with random initialization should be carried out. The greater the number of random initializations, the greater the statistical confidence of the solution [21,22].
However, parameter identification is an iterative process which should be stopped when the model exhibits the best generalization of the target functions [21,22]. This is usually accomplished using two independent data sets: one for identification (also called training) which contains about 2/3 of all data points and another data set for validation with the remaining data. For these data sets some error criteria such as the Mean Least Square Error or the Bayesian Information Criterion (described in detail below) is calculated for the model residuals. Along the iterations, the best parameters are the ones where the selected criterion of the validation data set has its "best" value. A test data set can be used to additionally exploit the generalization capabilities.
Model performance criteria
The model residual, also addressed as the goodness of fit of the model estimates and the data, can be assessed with the Mean Square Error, MSE. The MSE decreases the better the fit and is defined as:
This criterion is directly linked to the least square error which is used for parameter identification.
Due to the reason mentioned above, the MSE criterion is not addressed when it comes to architecture, structure, model comparison or selection. Appropriate criteria are (i) the Akaike Information Criteria, AIC, which is wildly used or (ii) the Bayesian Information Criteria, BIC, which is more appropriate for datasets with more than 46 data points [11,26,27]. Therefore the BIC is applied for model comparison and selection in this study. The BIC is defined as:
where the term in the first bracket is the logarithmic maximum likelihood, π is the number "Pi" and n_{w }is the total number of parameters/weights. In terms of the BIC, the model to be selected is the one that exhibits the larger BIC value for the validation set, see [11,26,27].
Case Study I: Transcription Factor A (TFA) dynamics with discrete time delay
Genetic regulatory systems are built on signal transduction pathways through which specific transcription factors (TF) are phosphorylated. The phosphorylated TFs are then able to bind to responsive DNA sequences thereby regulating the transcription of nearby genes. Herein we consider the example of the TFA model reported by [15] and [16] (see Fig. 1A). In this case, the TF activates its own transcription according to a typical positive feedback loop.
Figure 1. Network Structures. Delay TFA transcription model. (A) true network structure (B) DDEHM network without prior knowledge, (C) DDEHM network with some prior knowledge. In structures (B) and (C), the ANN comprises three layers. The nodes of the input and output layer have linear transition functions, except for the input node of the time which has a hyperbolic tangential transition function as do the nodes of the hidden layer.
The translocation of macromolecules between cytosol and nucleus have a tremendous impact on gene regulation dynamics. Herein we consider a discrete delay for the translocation of TFA as suggested by [15] and [16], giving rise to the following single delay differential equation describing the dynamics of the TFA monomeric concentration in the nucleus, x :
The first term on the righthand side of eq. (14) is the rate of TFA transcription in the cytosol which in the perspective of nucleus is affected by the translocation delay, τ =120 min . The second term refers to TFA dissociation in the nucleus and the third term to a basal transcription rate, R_{bas }, observed at very low TFA concentrations.
Figure 2 shows the simulation of model eq. (14) with the parameters proposed by [15]. The TFA dynamics are of a typical bistable system induced by the increase of the cytosol synthesis rate, k_{f }, at time t = 200 min , forcing the system to jump to another state. The effect of the time delay can be assessed by comparing the fullline (with delay) with the dashedline (without delay). The main consequence of the delay is that the TFA concentration exhibits a "staircase" transition between the steadystates.
Figure 2. Impact of delays on the TFA profile. Demonstration of the impact of the delay on the trajectory of TFA transcription model over time. The TFA model trajectory without delay is the blue dashed line while the TFA trajectory with delay is the green continuous line.
The main goal in this case study is to investigate if the TFA delay dynamics, shown in Fig. 2, can be properly identified by the DDEHM framework proposed in this paper. With this goal in mind, 6 data sets of TFA concentration in the nucleus over time with varying initial concentrations were generated (3 data sets with "clean" data, which were corrupted with white noise in order to obtain the training, validation and test set data).
Formulation and discrimination of a suitable DDEHM structure
The two DDEHM structures, shown in Fig. 1B and 1C, were identified from the simulation data. In the former structure, no prior knowledge about the TFA network is incorporated while in the latter case some prior knowledge inspired in eq. (14) and in autoregulated systems is considered.
In preliminary studies, we concluded that structure (1c) leads to both a faster convergence and improved results than structure (1b) (results not shown). This observation is in line with the study reported by [10], where it was shown that including a priori knowledge in the hybrid structure generally improves their identification capacity.
A selection of results obtained with structure (1c) are presented in Table 1 showing model performance criteria for the training, validating and testing data sets (namely MSE and BIC) over structure parameters. Overall, it can be observed that structures without time delays are in general outperformed by those containing time delays if one of the effectual delays is close to the "true" delay, i.e. a model with a delay mismatch as high as 10% still gives an improved performance in comparison to no delay at all (Table 1). It can also be noticed that the MSE values for the case of one delay tend to improve the closer the effectual delay gets to the "true" delay, peaking when the effectual is the true delay. Also, it strikes that the best models (highest BIC values for the validation set) are obtained mostly for 47 nodes in the hidden layer, an observation that reflects the complexity of the addressed system. Owed to this complexity, are also the strayed deviations in the overall consistent performance in terms of BIC. The consideration of series of delays also gives rise to consistent models, especially if only two delays are considered. When three delays are considered, model performance increases with decreasing number of nodes, which contrasts with the results obtained for one or two delays. Even so, the best values therewith are obtained with 4 numbers of nodes. While the good model performances are due to the fact that the "true" delay is present in the applied models, the slightly worse performance when compared to the single delay models sources from the evitable, additional information. Evitable (correlated) information hampers the model structure identification [21,22], which explains why the model performance for three considered delays decreases with the increasing number of nodes.
Table 1. Results for Case Study I
The most consistent structure with highest predictive power has 5 nodes in the hidden layer and a single delay coincident to the "true" delay of 120 min. The respective BIC value was 5489 while the MSE value was 0,0071 for the test data set. The best structure without time delays, which had also 5 nodes in the hidden layer, showed a fourfold increase in the MSE value for the test data set (0,0210) and a considerably lower BIC value (5997). This result clearly demonstrates the advantage of the delay hybrid modelling approach proposed in this paper.
Comparison of best structures with and without time delays
Figure 3 compares the modelling results for the two best structures with delay τ =120 min and without delay for the validation and test data sets. It can be seen that the model without time delay provides a very smooth transition between the two steadystates. However, the "true" dynamics, i.e. the staircase transition function, of the measured data are not met. In contrast, the model with τ =120 min was able to capture these "staircase" dynamics. The curves both increase slightly in the beginning until a time value of about 320 min, where the first "stair" appears in the data.
Figure 3. Qualitative results on the time curse of TFA. TFA modelling results for the two runs of test data. On the left side the whole simulation region of the data set is shown while on the right side the most interesting section of the respective data set is highlighted: red circles are "measured" TFA data over time, green line are the identification results by model structure (1C) with 5 hidden nodes and τ =120 minutes; blue dashed line are the identification results by model (1C) with 5 hidden nodes and no time delay.
Thereafter, a significant increase can be noticed until a time of about 390 min, where the estimate has a first maximum peak. Then the concentration estimation decreases until a time point of about 450 min. Subsequently, in both cases, the data points are almost completely met by the estimates in the time interval from 450 until 650 min.
Case Study II: Heterologous protein expression by MUT+ Pichia pastoris
In methanol utilizing MUT+ Pichia pastoris strains, fast phenotype, foreign protein expression is controlled by the promoter of the alcohol oxidase gene 1 (AOX1). In typical culture conditions, the yeast cells are first grown on glycerol to reach a certain optimal cell density. Glycerol and most carbon sources other than methanol strongly repress AOX1, thus product is not formed in this phase. Then methanol feeding induces AOX1 over 1000fold [28] thereby initiating foreign protein expression. The transition between glycerol and methanol phases can take between 1 to 4 hours depending on the strategy for methanol feeding. This transition phase corresponds to the time needed by the cells to express the alcohol oxidase enzyme, which is an essential enzyme for the cells to metabolize methanol. Apart from this delay in the transition phase, time delays between methanol uptake, biomass growth and product formation were also observed during the postinduction phase, [2]. In the paper by Bellgardt and coworkers [2], a so called extended regulator model was adopted which is somewhat analogous to a linear distributed time delay kernel of the specific protein synthesis rate over the specific growth rate. The inclusion of such a delay model was essential to fit their data, although the underlying biological fundamentals are not clearly understood. The effects causing such time delays seem to be a principal part of the Pichia pastoris systems. However, they are poorly studied [29] and mechanistically not understood. Thus the nature of the apparent delays can mechanistically not be precisely defined (i.e. as a discrete delay model), wherefore a distributed delay model is the most appropriate representation.
The main goal in this section is to determine if the hybrid methodology proposed herein is able to effectively identify such unknown distributed time delay dynamics in P. pastoris. The P. pastoris network shown in Fig. 4A was used as a case study. This network includes a quadratic distributed delay kernel (Eqs. A5 and A6 of Table 2), which is considered as a strong delay kernel, see Fig. 5. In this case, the cell growth rate and the foreign protein expression rate are delayed in relation to the methanol uptake. The corresponding model equations are listed in Table 2. Note that the reactor balance equations are also listed since they are an important element to generate consistent simulation data.
Figure 4. Pichia pastoris network with delay dynamics. (A) network with a quadratic distributed time delay kernel of cell growth and protein expression over methanol uptake. The respective equations are listed in Table 2. This network was used to generate simulation data (B) Approximation of network (A) by a hybrid network. Structure (B) was investigated to see if the novel framework is able to identify unknown distributed delay dynamics.
Table 2. Mathematical model for data generation
Figure 5. Impact of delays on the specific biomass growth rate. Green full line, is the specific growth rate when considering the network shown in Fig. 4A; blue dashed line is the specific growth rate when no delay between substrate uptake and biomass growth is considered.
Identification of distributed time delays by DDEHM
The hybrid model structure shown in Fig. 4B was adopted to identify the network, Fig. 4A. The neural network assumes no prior knowledge about Fig. 4B and uses as the external excitation signal present delayed methanol concentration values. The specific methanol uptake rate, r_{S}(t), specific growth rate, μ(t), and specific product synthesis rate, r_{P}(t), are the target kinetic variables that need to be identified. Note that in the real system μ(t) and r_{P}(t) are delayed in relation to r_{S}(t) according to a quadratic distributed delay kernel (see Eqs. A9A12 in Table 2). These three kinetic rates are passed to the macroscopic reactor material balances for the calculation of the respective concentrations.
Table 3 shows model performance parameters (BIC and MSE) for the hybrid model structure (Fig. 4B) with varying number of hidden nodes (between 2 and 8) and different series of lagged input variables (between 0 and 4 with intervals of 2 or 2.5 hours). In general, the BIC values for the two series of lagged variables are much better than when no lagged variables are considered. This again confirms the advantages of the DDEHM methodology proposed herein for identifying delayed dynamics.
Table 3. Results for Case Study II
The effect of discrete time delays
It can be further noticed that BIC values of the time series with a time delay of 2.5(h) are slightly better when compared to the ones for time series with a 2(h) time delay. This observation agrees with the results of the previous case study where the performance of the hybrid model peaked the closer the model delay was to the true delay. In this case the maximal weighted delay is 5(h). Moreover, the BIC values tend to improve with increasing number of lagged variables. The increasing number of input lagged variables, which are weighted by the neural network, seem to result in more accurate discrete time approximations of the continuous distributed time delays. In contrast, it can also be observed in Table 3 that with increasing number of delays the best BIC value is more likely to be found for a lower number of nodes in the hidden layer of the ANN. However, this was expected as the BIC is constrained by the number of model parameters. Nevertheless, the same observation is made for the MSE values. Furthermore, it was observed that significantly more random changes of the parameter values were required when the numbers of incorporated delays increased in order to achieve results which were coherent with the ones obtained for smaller numbers of delays.
Comparing standard and DDE hybrid models
Figure 6 shows nonnoisy simulation data and the best modelling results of hybrid models with and without delays for the concentrations of biomass, substrate and product in a fedbatch of the test set. In the figures of biomass and product concentrations, predictions of the hybrid models with delays are practically congruent to the true process behaviour. The intrinsic dynamics of the organism are perfectly met. In clear contrast are the predictions of the hybrid standard model without time delays. Biomass and product concentrations are underpredicted for a time span between 50 to 85 h, then followed by overprediction from 85 h till the end. Before 50 h only insignificant differences between predictions and data are visible. For the substrate concentration, the DDEHM model shows a significant amount of error in a short time window from about 48 to 55 h, which is coincident to a fast decrease in the substrate concentration. It should be noticed that such fast dynamics are rather challenging to integrate (see comments below). As for the standard hybrid model without delays, it predicts accurately substrate dynamics only at the beginning, i.e. from 0 to 30 h. Thereafter, between 30 and 50 h, the model underpredicts substrate concentration, and after 50 h, it overpredicts the substrate concentration.
Figure 6. Qualitative results for trajectories of concentrations. Pichia Pastoris distributed delay modelling result for a fedbatch of the test data set: red circles are "measured" data over time; green line are the identification results by model structure (4B) with 5 hidden nodes and a series of 4 time lagged variables of 2.5 hours; blue dashed line are the identification results by model (4B) with 7 hidden nodes and no time delay.
Note about numerical integration of DDEs
The integration of the hybrid model differential equations using the builtin MATLAB solvers (dde23, ode23) showed to be computationally intensive. A typical training run took 4 to 5 days. Moreover, convergence was sometimes not accomplished due to the limitation of the integration step size and accuracy. On the other hand, the integration of the hybrid model differential equations with the linear approximation only lead to small discrepancies for substrate concentrations if the step size was chosen adequately small, i.e. between 0.050.1 h. Convergence was tested by decreasing even further the integration step without significant changes in the integration results. This approach lead to a reduction of computation of about 75% (i.e. 3 days) when compared to the MATLAB solvers.
Conclusions
Timedelays have a profound impact on cellular regulatory mechanisms. Therefore, their modelling is essential in metabolic engineering and process optimization studies. The detailed mechanisms behind observed time delays are often unknown. The required "omic" data for a fundamental mathematical modelling of such phenomena is generally unavailable at the required time resolution and accuracy. As a result, biochemical delayed dynamics are many times only "measurable" through their external consequences in terms of extracellular properties. We propose herein a hybrid semiparametric modelling method to identify such delayed dynamics. The principle is probing from outside to understand the inner workings. The concept was applied to two illustrative case studies. The overall results show that significantly better prediction qualities of the novel hybrid model when compared to the traditional approach were obtained in all case studies, being the more distinctive the more significant the underlying system delay is. When system and model delay are identical the model quality peaked but even with a delay mismatch as high as 10% in the TFA generegulatory network, modelling results were significantly enhanced in comparison to no delay at all. These results support a system delay identification strategy by studies of different discrete delays in the input variables. For the studies on Pichia pastoris with intrinsic distributed time delays significant enhancements were introduced by the DDEHM model. This suggests that even though the proposed structure bases on discrete time delays directly of external excitation variables, it poses no limitation of applicability. In conclusion the method proposed herein is a powerful tool to identify time delays in illdefined biochemical networks.
Authors' contributions
The project was conceived by RO, JP and SFA. The algorithms were developed by RO and MVS. Data analysis was performed by MVS and RO and manuscript writing were performed by MVS, JP, SFA and RO. All authors read and approved the final manuscript.
Acknowledgements
Sincere thanks for the financial support to the Fundação para a Ciência e a Tecnologia (scholarship reference no.: SFRH/BD/36990/2007) and to Anthony Danko for his corrections on the manuscript.
References

Daugulis AJ, McLellan PJ, Li J: Experimental Investigation and Modeling of Oscillatory Behavior in the Continuous Culture of Zymomonas mobilis.
Biotechnol Bioeng 1997, 56:99105. PubMed Abstract  Publisher Full Text

Ren HT, Yuan JQ, Bellgardt KH: Makrokinetic model for methylotrophic Pichia pastoris based on stoichiometric balance.
J Biotechnol 2003, 106:5368. PubMed Abstract  Publisher Full Text

Nikolov S, Vera J, Kotev V, Wolkenhauer O, Petrov V: Dynamic properties of a delayed protein cross talk model.
BioSystems 2008, 91:5168. PubMed Abstract  Publisher Full Text

Teixeira AP, Carinhas N, Dias JML, Cruz P, Alves PM, Carrondo MJT, Oliveira R: Hybrid semiparametric mathematical systems: Bridging the gap between systems biology and process engineering.
J Biotechnol 2007, 32:418425. Publisher Full Text

Wang YC, Chen BS: Integrated cellular network of transcription regulations and proteinprotein interactions.
BMC Systems Biology 2010, 420. PubMed Abstract  PubMed Central Full Text

Yang SK, Wang YC, Chao CC, Chuang YJ, Lan CY and Chen BS: Dynamic crosstalk analysis among TNFR, TLR4 and IL1R signalings in TNFalphainduced inflammatory responses.
BMC Medical Genomics 2010, 319. PubMed Abstract  PubMed Central Full Text

Wang YC, Lan CY, Hsieh WP, Murillo L, Agabian N, Chen BS: Global screening of potential Candida albicans biofilmrelated transcription factors via network comparison.
BMC Bioinformatics 2010, 1153. PubMed Abstract  PubMed Central Full Text

Schubert J, Simutis R, Dors M, Havlik I, Lubbert A: Hybrid modeling of yeast production processes combination of apriori knowledge on different levels of sophistication.
Chem Eng Technol 1994, 17:1020. Publisher Full Text

Preusting H, Noordover J, Simutis R, Lubbert A: The use of hybrid modelling for the optimization of the penicillin fermentation process.

Oliveira R: Combining first principles modelling and artificial neural networks: a general framework.
Comput Chem Eng 2004, 28:755766. Publisher Full Text

Peres J, Oliveira R, Feyo de Azevedo S: Bioprocess hybrid parametric/nonparametric modelling based on the concept of mixture of experts.
Biochem Eng J 2008, 39:190206. Publisher Full Text

Psichogios DD, Ungar LH: A hybrid neural networkFirst principles approach to process modelling.
AIChE J 1992, 38:14991511. Publisher Full Text

Bocharov GA, Rihan FA: Numerical modelling in biosciences using delay differential equations.
J Comput Appl Math 2000, 125:183199. Publisher Full Text

Wolkowicz GSK, Xia H: Global Asympotic Behavior of Chemostat Model with Discrete Delay.

Smolen P, Baxter DA, Byrne JH: Effects of macromolecular transport and stochastic fluctuations on dynamics of genetic regulatory systems.

Tian T, Burrage K, Burrage PM, Carletti M: Stochastic delay differential equations for genetic regulatory networks.
J Comput Appl Math 2007, 205:696707. Publisher Full Text

Wolkowicz GSK, Xia H, Ruan S: Competition in Chemostat: A Distributed Delay Model and its Global asymptotic Behavior.

Rateitschak K, Wolkenhauer O: Intracellular delay limits cyclic changes in gene expression.
Math Biosci 2007, 205:163179. PubMed Abstract  Publisher Full Text

Chen BS, Chang YT: A systematic molecular circuit design method for gene networks under biochemical time delays and molecular noises.
BMC Systems Biology 2008, 2103. PubMed Abstract  PubMed Central Full Text

Chen BS and Chen PW: On the estimation of robustness and filtering ability of dynamic biochemical networks under process delays, internal parametric perturbations and external disturbances.
Math Biosci 2009, 222:92108. PubMed Abstract  Publisher Full Text

Haykin S: Neural Networks  A Comprehensive Foundation (2nd edition). Prentice Hall Inc., Upper Saddle River, New Jersey, America; 1999.

Bishop CM: Neural Networks for Pattern Recognition. Oxford University Press Inc., New York; 1995.

Parlos AG, Rais OT, Atiya AF: Multistepahead prediction using dynamic recurrent neural networks.
Neural Networks 2000, 13:765786. PubMed Abstract  Publisher Full Text

Qin SJ and McAvoy TJ: Nonlinear FIR Modelling via a Neural Net PLS Approach.

Coleman TF, Li Y: On the Convergence of Reflective Newton Methods for LargeScale Nonlinear Minimization Subject to Bounds.
Math Program 1994, 67:189224. Publisher Full Text

Burnham KP, Anderson DR: Multimodel inference  understanding AIC and BIC in model selection.
Sociol Method Res 2004, 33:261304. Publisher Full Text

Leonard T, Hsu J: Bayesian Methods. Cambridge University Press, New York; 1999.

Khatri N, Hoffmann F: Oxygenlimited control of methanol uptake for improved production of a singlechain antibody fragment with recombinant Pichia pastoris.
Appl Microbiol Biotechnol 2006, 72:492498. PubMed Abstract  Publisher Full Text

Yamashita S, Yurimoto H, Murakami D, Yoshikawa M, Oku M, Sakai Y: Lagphase autophagy in the methylotrophic yeast Pichia pastoris.
Genes Cells 2009, 14:861870. PubMed Abstract  Publisher Full Text