Max Planck Institute for Dynamics of Complex Technical Systems, Sandtorstr. 1, 39106 Magdeburg, Germany

Abstract

Background

Combinatorial complexity is a challenging problem in detailed and mechanistic mathematical modeling of signal transduction. This subject has been discussed intensively and a lot of progress has been made within the last few years. A software tool (BioNetGen) was developed which allows an automatic rule-based set-up of mechanistic model equations. In many cases these models can be reduced by an exact domain-oriented lumping technique. However, the resulting models can still consist of a very large number of differential equations.

Results

We introduce a new reduction technique that allows modularized and highly reduced models to be built. Compared to existing approaches, further reduction of signal transduction networks is possible. The method also provides a new modularization criterion, which allows the model to be dissected into smaller modules, called layers, that can be modeled independently. Hallmarks of the approach are conservation relations within each layer and connection of layers by signal flows instead of mass flows. The reduced model can be formulated directly without previous generation of detailed model equations. It can be understood and interpreted intuitively, as model variables are macroscopic quantities that are converted by rates following simple kinetics. The proposed technique is applicable without using complex mathematical tools and even without detailed knowledge of the mathematical background. However, we provide a detailed mathematical analysis to show performance and limitations of the method. For physiologically relevant parameter domains the transient as well as the stationary errors caused by the reduction are negligible.

Conclusion

The new layer based reduced modeling method allows building modularized and strongly reduced models of signal transduction networks. Reduced model equations can be directly formulated and are intuitively interpretable. Additionally, the method provides very good approximations especially for macroscopic variables. It can be combined with existing reduction methods without any difficulties.

Background

Modeling of signaling pathways

Systems biology aims at a holistic understanding of cellular processes. Mathematical models that integrate the current state of knowledge are analyzed to understand system properties that are not apparent from the characteristics of their components.

Different approaches exist to model and analyze signal transduction systems. Qualitative modeling uses solely structural information about the network and makes no quantitative statements. Examples of qualitative modeling techniques applied to signal transduction pathways are Petri nets

Inside the cell, signals propagate in time and space. Therefore, partial differential equations should be used to describe them exactly. This continuous spatial behavior can often be modeled by assuming fast distribution of all species inside a cellular compartment and transport reactions between the compartments. This leads to model equations in the form of ordinary differential equations (ODEs). If processes are modeled on a molecular level mass action kinetics are a good description of the chemical processes and are frequently used. This view is the basis for the work presented here and also used by many others

The more information about the system is available, the more powerful quantitative modeling techniques become, as prediction accuracy of models grows.

Combinatorial variety

Modeling with ODEs is challenged by biological reality. In signal transduction, association and modification of a relatively small number of different molecules usually give rise to an enormous number of possible protein complexes

Many deterministic models of large signaling networks that have been published within the last years neglect the combinatorial variety of protein complexes. Their focus is on small subsets of the occurring reactions and complexes

An alternative approach was suggested by Blinov et al.

The exact domain-oriented lumping technique proposed by Conzelmann et al.

Insulin signaling and combinatorial variety

The insulin signaling system is of high medical interest and therefore well studied

Defects in the insulin signaling system give rise to insulin resistance, obesity and type II diabetes mellitus

In a simplified form the insulin signaling system will serve as demonstration object for combinatorial complexity. The insulin receptor is a transmembrane protein that is constitutively dimerized

However, first we analyze complexity on a virtual monomer. The receptor monomer (which consists of an

**Combinatorial complexity**. The full combinatorial complexity of the described parts of insulin signaling is demonstrated. The insulin receptor can bind insulin, Shc and IRS. IRS can bind four PI3K molecules, SHP2 and Grb2. Grb2 can bind SOS and phosphorylated SOS. This results in 3^{5}·5 = 1215 different complexes with IRS. For Shc and insulin binding to the receptor monomer there are seven and two possibilities, respectively. Altogether there are 2·7·(3^{5}·5 + 2) = 17038 different complexes of the receptor monomer. As the receptor is a dimer, this gives on the order of 10^{8} different combinations. Free species contribute another 1215 + 10 + 1 + 1 + 1 = 1228 possible species: 1215 for the free IRS complexes, 10 for all combinations of Shc, Grb2 and SOS and one for insulin, PI3K and SHP2 each.
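The counts in this caption can be checked with a few lines of arithmetic. The sketch below assumes that the dimer is an unordered pair of monomer states; this is an assumption made here, since the corresponding expression is not legible in the caption. All variable names are illustrative.

```python
from math import comb

# Complexes that can form on IRS: 3^5 * 5, as stated in the caption.
irs_complexes = 3**5 * 5

# Receptor monomer: 2 insulin states, 7 Shc states, and the IRS site,
# which carries either one of the IRS complexes or one of 2 further states.
monomer_complexes = 2 * 7 * (irs_complexes + 2)

# ASSUMPTION: a dimer is an unordered pair of monomer states,
# so there are n * (n + 1) / 2 combinations.
dimer_complexes = comb(monomer_complexes + 1, 2)

# Free species: IRS complexes, Shc/Grb2/SOS combinations, insulin, PI3K, SHP2.
free_species = irs_complexes + 10 + 1 + 1 + 1

total_species = dimer_complexes + free_species

print(irs_complexes, monomer_complexes, dimer_complexes, free_species, total_species)
```

Under this assumption the dimer count is about 1.45·10^{8}, and the total minus one (insulin held constant) equals 145 156 468, the number of equations listed for detailed modeling.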

Results and discussion

Introduction of the layer based approach

The two most important mathematical tools to tackle the enormous complexity of models describing biological reaction networks are model reduction and modularization techniques. Here, we introduce a new systematic approach that allows considerably reduced models of signaling networks to be created and also provides a new modularization criterion. This criterion suggests separating molecular processes (bindings and post-translational modifications) according to the types of interactions between them.

The basic idea is that three different types of interactions between two processes in signal transduction networks exist. The first interaction type is called all-or-none interaction. In this case, there exists a causal relationship between the two processes, which means that one process must occur before the second process. Most frequently, this kind of interaction is between binding site phosphorylation and effector binding. These two processes can be divided into four molecular events (phosphorylation, dephosphorylation, binding, dissociation). The effector can only bind if the binding site is phosphorylated and dephosphorylation is only possible in the absence of effector. This means that phosphorylation is required for binding and dissociation is required for dephosphorylation.

The second interaction type is called graded interaction. Here, two processes influence each other mutually, as is the case for ligand binding and autophosphorylation of a receptor. This means that kinetic parameters for one process are influenced by the other process. Note that this covers both unidirectional interactions, where the first process influences the second while the second does not influence the first, and bidirectional interactions, where both processes influence each other.

The third type comprises non-interacting, independent processes. They do not influence each other directly, so this type in fact represents no interaction. The concept of all-or-none interactions and graded interactions is demonstrated in Figure

**Graded and all-or-none interactions**. **A) **The reaction cycle of ligand binding and phosphorylation consists of four species that are connected by four reactions. This is a general property of graded interactions. **B) **The processes are coupled via an all-or-none interaction. Therefore the species **C) **All-or-none interaction between binding site phosphorylation and effector binding. The species **D) **The reaction cycle degenerates to a reaction chain which is a hallmark of all-or-none interactions. **E) **In this case, nomenclature can be simplified as the two sites on the receptor are essentially one.
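The structural difference between the two interaction types can be made concrete by enumerating the states of two coupled processes. The following minimal sketch uses a hypothetical encoding in which each state is a pair of flags (first process done, second process done); an all-or-none interaction is modeled by forbidding the second process unless the first has occurred.

```python
from itertools import product

def cycle_species(all_or_none):
    """Enumerate feasible (first_done, second_done) states of two processes."""
    states = []
    for first_done, second_done in product((False, True), repeat=2):
        # All-or-none: the second process requires the first one.
        if all_or_none and second_done and not first_done:
            continue
        states.append((first_done, second_done))
    return states

def reversible_reactions(states):
    """Pairs of feasible states that differ in exactly one process."""
    return [
        (a, b)
        for i, a in enumerate(states)
        for b in states[i + 1:]
        if sum(x != y for x, y in zip(a, b)) == 1
    ]

graded = cycle_species(all_or_none=False)
chain = cycle_species(all_or_none=True)
print(len(graded), len(reversible_reactions(graded)))  # cycle: 4 species, 4 reactions
print(len(chain), len(reversible_reactions(chain)))    # chain: 3 species, 2 reactions
```

With a graded interaction all four states are feasible and form a cycle; with an all-or-none interaction one state drops out and the cycle degenerates to a chain.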

According to our new modularization criterion, no graded interactions are allowed between different modules. Processes of different modules, which we call layers, interact only via all-or-none interactions. Interestingly, the layers defined in this way only exchange information. The number of molecules in each layer stays constant when no synthesis or degradation is considered. This is a major difference from metabolic pathways, where mass flows occur, and it represents the characteristic signal flow within a signal transduction network. The difference is most obvious when one considers that an extracellular signaling molecule causes transmission of a signal to the nucleus of a cell without itself passing the cell membrane. The signals that are exchanged between layers correspond to a very restricted number of macroscopic variables like levels of occupancy or phosphorylation as they are used by Borisov et al.

We also contribute a new aspect to the discussion about modularization of biological networks and the optimal criterion for modularization (see also

In detailed modeling, binding or modification events are represented by a huge number of reactions, since the involved proteins can exist in a large number of feasible configurations. The sum of a certain subset of these reaction rates defines a gross reaction rate of binding or modification. The sum of a certain subset of species corresponds to a macroscopic quantity such as a degree of phosphorylation or occupancy. In the following we show that in the most frequent biological scenarios gross rate kinetics can be formulated using macroscopic variables and have a quite simple structure. Mathematical analysis of detailed and reduced models shows that the dynamics of essential macroscopic quantities are well preserved in most cases. We give qualitative and also some quantitative information about the approximation quality for varying kinetic parameters. A basic advantage of models created with the layer based approach presented here is their intuitive interpretability. Additionally, preceding generation of detailed model equations is not necessary, as reduced models can be built directly by an intuitive step by step procedure. A mathematical formalism to derive the reduced model equations also exists. Thus, one can approach model generation either intuitively or through the mathematical formalism. Since both approaches are equivalent, understanding of the mathematical part is not necessary to create reduced models.

The approach combines qualitative and quantitative system descriptions. The first, qualitative step identifies processes and their interactions to define modules. Inside these modules, processes are described in a highly reduced form using quantitative techniques. All methods of quantitative system analysis can then be applied to the model.

A general problem in modeling of signal transduction is the availability of experimental data. Kinetic data in particular are difficult to measure. The layer based approach also has to face this problem. However, the same kinetic parameters as in detailed mechanistic modeling can be used for the macroscopic quantities that are the states of the model. Therefore, the problem does not become worse when using the reduced modeling technique.

The layer based method can also be combined with the exact domain-oriented lumping technique

Necessary equations for modeling the insulin signaling system

| Scenario | Detailed modeling | Exact lumping | Layer based reduction | Combination |
|---|---|---|---|---|
| 1 | 145 156 468 | 145 156 468 | 214 | 214 |
| 2 | 145 156 468 | 212 | 214 | 56 |

In two scenarios layer based reduction, domain-oriented lumping and their combination are compared with respect to the number of necessary differential equations. The insulin signaling system is used as example system. Insulin concentration is assumed to be constant. The two binding sites on the receptor for Shc and IRS in each case are assumed to be equivalent. **1)**: All binding site phosphorylations and corresponding bindings are all-or-none interactions. All binding sites on the same molecule can perform graded interactions. **2)**: All binding site phosphorylations and corresponding bindings are all-or-none interactions. The phosphorylation state of binding sites does not influence phosphorylation of other binding sites on the same molecule. See Additional files

The approach does not have to be used rigorously throughout the model. Subsystems can also be modeled in the detailed formalism, which simplifies integration of existing models.

A simple example system

In the following the reduction method is presented by introducing general principles, which are exemplified considering a strongly simplified model of insulin receptor signaling (see Figure

**Visualization of the detailed kinetic model**. All possible reactions of the detailed model for the small example system are shown. The model equations are shown in Equation 1.

_{0 }and _{0 }are the total concentrations of receptor

Definitions

a) Molecules and complexes

In order to provide a complete and general description of our method, we introduce a formal nomenclature. Due to its generality this nomenclature might appear cumbersome. A simplified nomenclature can be used in most cases because many examples only comprise a small subset of all formally possible cases. In our examples we consistently use a simplified denotation.

Consider a general signaling protein _{i, a }(e.g. not modified) to other states _{i, b }(e.g. phosphorylated). We denote the molecule with a certain configuration of domain modifications as _{R}[_{1}..._{n}]. We distinguish between the molecule _{R}[_{1}..._{n}] and its concentration _{R}_{1}..._{n}. In a second step we also consider binding of other molecules to a certain binding site of _{E}[_{1}..._{k}], which also provides a number of sites. One of them can bind to _{R}[_{1}..._{n}], _{E}[_{1}..._{k}]}.

The general rule for representation of complexes is as follows. All molecules within a complex are listed comma separated within curly brackets. On each occupied binding site the name of the binding partner is indicated. _{R}[_{1}..._{n}], _{E}[_{1}..._{k}], _{F}[_{1}..._{q}]}. If the same molecule occurs more often than once within the complex, indices have to be used. A graph-oriented molecule representation is suited to solve the problem of correct denotation in difficult cases.

Since our method often works with lumped states comprising a number of molecular species, we introduce the symbol 'X', which is a replacement character for each possible state of a binding site. An 'X' within the site definitions of the molecules indicates all possible modifications on one site. A sequence of three dots following 'X' or before 'X' indicates that there may be additional sites and abbreviates a sequence of 'X'. A sequence of dots between site configurations other than 'X' indicates other sites with distinct configurations (not 'X'). _{R}[_{R }[_{R}[_{R}[_{R}[_{R}[_{R}[

b) Rules and reactions

According to Blinov et al.

_{R}[_{R}[

and a binding reaction as

_{R}[_{E}[_{R}[_{E}[

These rules shall be interpreted as a set of elementary reactions, which all can be modeled using the mass action law. If we consider the simple example defined above a possible reaction rule is

which defines all three elementary reactions describing _{1}, _{3 }and _{7 }in Equation 1 and Figure

are equivalent, since both describe the reactions _{2 }and _{4 }in Equation 1 and Figure
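How a rule expands into elementary reactions can be illustrated for the small example system. The sketch below uses a hypothetical encoding of receptor species as triples (ligand bound, phosphorylated, effector bound) and assumes that the effector can only be bound to a phosphorylated site; it is not the notation used in this work.

```python
from itertools import product

# Hypothetical encoding: (ligand_bound, phosphorylated, effector_bound).
# ASSUMPTION: effector binding requires a phosphorylated site.
species = [s for s in product((0, 1), repeat=3) if s[2] <= s[1]]

def expand_rule(site):
    """List all reversible elementary reactions that switch `site` from 0 to 1."""
    reactions = []
    for s in species:
        if s[site] == 0:
            target = tuple(1 if i == site else v for i, v in enumerate(s))
            if target in species:  # skip transitions into infeasible species
                reactions.append((s, target))
    return reactions

ligand_binding = expand_rule(0)
phosphorylation = expand_rule(1)
effector_binding = expand_rule(2)
print(len(species), len(ligand_binding), len(phosphorylation), len(effector_binding))
```

The expansion yields three elementary reactions for ligand binding and two each for phosphorylation and effector binding, seven reversible reactions in total among six receptor species.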

Our goal is to formulate reduced order models in terms of macroscopic chemical species (denoted by a leading R) and macroscopic gross reactions (denoted by r) converting them. The phosphorylation of a binding site may be described in a reduced manner by the gross reaction

_{R}_{R}

with the rate

_{R}_{R}

The corresponding submodel consists of the two ODEs for the macroscopic variables _{R}_{R}_{j }in dependence of the concentrations of the macroscopic species. In this work we suggest a method for deriving approximative expressions for gross reaction rates.

c) Different types of interactions

As introduced earlier, there exist three structurally different types of interactions. Now we discuss the three types by means of our example (Figure

We start considering the special case of non-interacting processes. If the kinetic parameters describing the reactions _{1 }and _{3 }are identical and the parameters of _{2 }and _{4 }are also identical, the two processes – in this case phosphorylation and binding of

A structurally different type of interaction is given if there is a causal relationship between the two processes, which means that one process must occur before the second process. In many cases this will probably only be an approximation to the real interaction. However, it allows the model to be strongly simplified. One can assume that receptor phosphorylation only occurs if ligand is bound to the receptor and the species _{2 }and _{3 }in Figure

d) Modularization: layers

We define that all binding and modification events coupled by graded interactions form a module which we call a layer. Note that the definition of layers is according to interactions of processes and not according to molecules. Roughly speaking, a layer contains processes, not molecules. However, layers are often dominated by the reactions of a single molecule.

Different layers are only coupled by all-or-none interactions. The definition of layers and resulting modularization is demonstrated for two example systems in Figure

**Interactions define layers**. Processes (white boxes) are coupled by graded (red lines) and all-or-none interactions (green lines). All processes that are coupled by graded interactions are merged into the same module, which is called a layer (blue boxes). Therefore, layers are only connected by all-or-none interactions. **A) **A small example system that is discussed in detail. Occurring processes are ligand binding, binding site phosphorylation and effector binding. The reaction scheme is shown in Figure 5. **B) **A larger example system that is discussed in Additional file

An important feature of this modularization is that it only depends on the definition of interactions. The introduction of additional graded interactions between processes of different layers leads to disruption of the modular structure. In most cases these layers now form one larger common layer.
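Read as a graph problem, the layer definition amounts to finding connected components of the processes under graded interactions only. The following sketch is one possible way to compute this; the function name and the process labels are illustrative and not part of the method description.

```python
def find_layers(processes, graded_interactions):
    """Group processes into layers: connected components under graded interactions."""
    parent = {p: p for p in processes}

    def find(p):
        # Follow parent links to the representative, compressing the path.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for a, b in graded_interactions:
        parent[find(a)] = find(b)

    groups = {}
    for p in processes:
        groups.setdefault(find(p), []).append(p)
    return sorted(sorted(g) for g in groups.values())

processes = ["ligand binding", "phosphorylation", "effector binding"]

# Small example: only ligand binding and phosphorylation interact in a graded way.
two_layers = find_layers(processes, [("ligand binding", "phosphorylation")])

# An additional graded interaction across the boundary merges the layers.
one_layer = find_layers(
    processes,
    [("ligand binding", "phosphorylation"), ("phosphorylation", "effector binding")],
)
print(two_layers)
print(one_layer)
```

All-or-none interactions do not enter the computation at all; they only determine which macroscopic quantities are later exchanged between the resulting layers.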

The term 'layer' results from the typical structure of the modules. As described, usually binding of a signaling molecule and its modifications (usually phosphorylation) form a layer. There is also the possibility that a layer comprises only one process, typically a binding process, as can be seen for

The layers are interconnected by signal flows. This means that information about macroscopic quantities of a layer is exchanged with other layers (signal flow). No mass flows cross layer boundaries. This means that no reaction exists that transports substance from one layer to another. Therefore, in the absence of protein synthesis or degradation, the sum of concentrations of all species for each molecule remains constant within the layers. Within layers there are mass flows defined by reaction equations and corresponding rates as in detailed modeling.

Altogether, layer based reduced modeling and modularization results in a highly structured reduced model that is characterized by mass conservation within layers and signal flow between layers.
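To illustrate mass conservation within layers and signal flow between them, the following minimal sketch simulates a modification layer and an effector layer coupled by an all-or-none interaction. The variable names, the parameter values and the explicit Euler integration are assumptions chosen for illustration only; they are not taken from the model equations of this work.

```python
def simulate_layers(r_total=1.0, e_total=1.0, k_p=1.0, k_dp=0.5,
                    k_e=2.0, k_de=1.0, dt=1e-3, steps=20000):
    """Explicit Euler simulation of two layers (illustrative values only)."""
    u, p = r_total, 0.0        # modification layer: unphosphorylated / phosphorylated sites
    e_free, b = e_total, 0.0   # effector layer: free effector / occupied sites
    trajectory = []
    for _ in range(steps):
        # Signals crossing the layer boundary: p goes to the effector layer,
        # b goes back to the modification layer. No mass is transferred.
        free_phospho = p - b
        r_p = k_p * u - k_dp * free_phospho            # only unoccupied sites dephosphorylate
        r_e = k_e * free_phospho * e_free - k_de * b   # only phosphorylated sites bind effector
        u, p = u - dt * r_p, p + dt * r_p
        e_free, b = e_free - dt * r_e, b + dt * r_e
        trajectory.append((u, p, e_free, b))
    return trajectory

trajectory = simulate_layers()
u, p, e_free, b = trajectory[-1]
print(u + p, e_free + b, p, b)
```

Each gross reaction only moves material inside its own layer, so the sums `u + p` and `e_free + b` stay constant, while the coupling enters solely through the values of `p` and `b`.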

Description of the reduction method

First, we consider the more academic case that all processes within a reaction network either do not interact with each other or provide an all-or-none interaction. Afterwards we will consider general networks also including graded interactions. General considerations are illustrated using the simple example defined above.

a) Networks without graded interactions

**General considerations: **We assume that within a reaction network all occurring processes do not interact, except domain phosphorylation and subsequent effector bindings that are characterized by all-or-none interactions. Phosphorylation usually can be considered as an essential precondition for binding an effector protein. The reverse modification (dephosphorylation) is prevented by binding of this molecule. Both conditions have to hold, otherwise the interaction between phosphorylation and binding will be graded.

We dissect the pathway into layers as defined above. Since the whole network does not contain any graded interaction, each process can be described in a separate layer. Interestingly, the network has another nice property. All rates _{i }that describe one of the occurring processes, e.g. ligand binding, are parametrized by the same kinetic constants in the detailed model. The sum of all corresponding rates _{i }defines a gross rate _{tot }(Equation 8), which in this case can be interpreted as macroscopic mass action kinetics.

Herein, _{gross }is the set of all reactions describing the binding of _{R }[_{R}[_{R}_{R}_{E}

_{R}_{R}_{R}_{E}

Obviously, the same considerations can be made concerning modification processes. So, it is only necessary to balance capital letter species.

Gross rates of bindings as well as modifications can be formulated using the mass action formalism, as the parameters of all single elementary processes are equal. This is also discussed by Borisov et al.
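The statement that a sum of elementary rates with identical kinetic constants can be read as macroscopic mass action kinetics can be checked numerically. In the sketch below, the function names, concentrations and rate constants are arbitrary illustrative values.

```python
import math

def summed_elementary_rate(k_on, k_off, free_species, occupied_species, effector):
    """Sum of elementary binding rates, one per micro-species."""
    return sum(
        k_on * free * effector - k_off * occupied
        for free, occupied in zip(free_species, occupied_species)
    )

def macroscopic_rate(k_on, k_off, free_species, occupied_species, effector):
    """Mass action rate written for the lumped (macroscopic) species."""
    return k_on * sum(free_species) * effector - k_off * sum(occupied_species)

# Arbitrary illustrative concentrations of micro-species with a free or occupied site.
free_species = [0.10, 0.40, 0.25]
occupied_species = [0.05, 0.30, 0.15]

elementary = summed_elementary_rate(0.5, 0.2, free_species, occupied_species, effector=3.0)
gross = macroscopic_rate(0.5, 0.2, free_species, occupied_species, effector=3.0)
print(elementary, gross)
```

The equality holds only because every elementary reaction shares the same pair of constants; with species-specific constants the sum could no longer be factored into macroscopic variables.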

**Example: **The example (Figure

Two binding layers describing ligand and effector binding, and one modification layer describing receptor phosphorylation. Ligand binding can be described by the reaction rule

which can be expanded to the reaction rates _{1}, _{3 }and _{7}. Since all of these reactions are parametrized by the same kinetic parameters we combine them all to one single gross reaction

with the gross reaction rate

The gross reaction

describes binding site phosphorylation and corresponds to the reaction rule

which can be expanded to the reaction rates _{2 }and _{4}.

The corresponding gross reaction rate is given by

in which the lowercase letter concentration

The gross reaction

describes effector binding and corresponds to the rule

which can be expanded to the reaction rates _{5 }and _{6}. The gross reaction rate is

Note that _{E }= _{5 }and _{-E }=

with _{0 }and _{0 }as total concentrations of

Note that instead of

b) Networks including all kinds of interactions

To introduce the general reduced modeling concept, where all kinds of interactions are allowed, we start with an example to illustrate the main features.

**Example: **Again we consider the simplified insulin model introduced above. Now, we assume that ligand binding unidirectionally influences receptor phosphorylation which in turn is an essential precondition for effector binding. Ligand binding and effector binding do not interact directly. From this it follows that the reaction rates _{1}, _{3 }and _{7 }are all parametrized by the same kinetic rate constants. The same holds true for the reaction rates _{5 }and _{6}.

In the reduced model there are two layers. The receptor layer describes ligand binding and receptor phosphorylation, the effector layer effector binding (Figures

**Visualization of the reduced kinetic model**. In this small example system the left module is the receptor layer which includes ligand binding and phosphorylation of the binding site on the receptor. The right module is the effector layer, where binding of

Observe that these six equations are linearly dependent.

The connection between the two layers, i.e. the information exchange, is given by

If we compare the reactions of the reduced model (Figure

As already mentioned, the reaction rates that are merged together (_{3 }and _{7 }as well as _{5 }and _{6}) have the same kinetic rate constants.

Our model shall provide equations for all variables that are given in Equation 22. Hence, all the _{1}, _{3 }and _{E }can be written using the reduced _{2 }and _{4 }one requires the micro-states

with _{I }is the fraction of unoccupied sites from all phosphorylated sites (bound and unoccupied). Note that _{2 }and _{4 }read as in Equation 26a. As occupied binding sites cannot be dephosphorylated, the sum of occupied binding sites (

The approximation is based on the assumption that ligand and effector binding are completely independent.

Accordingly, the calculus of probability suggests that the ratio equation

is fulfilled
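What such a ratio condition expresses can be written generically. The symbols below are hypothetical and introduced only for this illustration: $p_{ab}$ denotes the fraction of receptors with ligand state $a \in \{0,1\}$ and effector state $b \in \{0,1\}$, and $p_{a\cdot}$, $p_{\cdot b}$ are the corresponding marginal fractions. This is a sketch of the probabilistic argument, not a reproduction of Equation 27.

```latex
% Independence of ligand binding (a) and effector binding (b):
p_{ab} = p_{a\cdot}\, p_{\cdot b}, \qquad a, b \in \{0, 1\}.
% Dividing the bound-ligand fraction by the free-ligand fraction
% gives the same ratio with and without effector:
\frac{p_{11}}{p_{01}} = \frac{p_{1\cdot}}{p_{0\cdot}} = \frac{p_{10}}{p_{00}}
\quad\Longleftrightarrow\quad
p_{11}\, p_{00} = p_{10}\, p_{01}.
```

If the two binding processes are not independent, the cross products differ, and the size of that difference measures how far the reconstruction of micro-states can deviate.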

Hence, it is possible to reconstruct all micro-states of the detailed model. The accuracy of this reconstruction highly depends on the validity of the ratio equation (Equation 27). Observe that an erroneous ratio equation may not have a similarly strong impact on the accuracy of the reduced model states, like e.g.

Kinetic parameters and initial conditions for the small example system

| Parameter | Literature value | Unit | Source |
|---|---|---|---|
| _{1} | 0.001 | ^{-1} ^{-1} | [52] |
| _{-1} | 4·10^{-4} | ^{-1} | [52] |
| _{2} | 0 | ^{-1} | assumed |
| _{-2} | 0.00385 | ^{-1} | [51] |
| _{4} | 0.0231 | ^{-1} | [50] |
| _{-4} | 0.00385 | ^{-1} | [51] |
| _{5} | 0.033 | ^{-1} ^{-1} | [53] |
| _{-5} | 0.113 | ^{-1} | [53] |

Initial conditions were 40 ^{-20 }_{1 }= _{3 }= _{7}, _{-1 }= _{-3 }= _{-7}, _{5 }= _{6 }= _{E }and _{-5 }= _{-6 }= _{-E}.

**Simulation results: comparison reduced and detailed model**. For parameter values from literature (Table 2), the deviations of the lumped states from the corresponding sums of states of the detailed model (Equation 22) are negligible. Reconstitution of states of the detailed model is possible with high accuracy. The axis of abscissae is given in

**Summary: **Only one variable has actually been eliminated by model reduction, but the example serves to illustrate the main elements of the method. These are:

• Two kinds of real interactions between two processes exist: all-or-none interactions and graded interactions. The third possibility is that these two processes do not interact.

• Modularization is achieved by analyzing interactions between processes. No graded interactions are allowed between modules which are called layers. Layers are only connected by all-or-none interactions.

• Each layer is modeled independently of the others. Gross reactions are formulated that correspond to reactions of macrostates.

• Dephosphorylation reactions of binding sites need special attention, as approximation of lowercase species is necessary. All other reactions can be formulated as in detailed mechanistic modeling, however, using also macroscopic species.

• Concentrations of macroscopic species (or sums of them) are transferred between the layers. Between layers there is only signal flow, but no mass flow.

• Combinatorial complexity is decreased, as all-or-none interactions between layers reduce the number of binding events and bound species by introducing a macroscopic description of the processes.

• Reduced model equations can be obtained directly without previous generation of detailed mechanistic model equations.

• Model equations can be understood and interpreted intuitively. Model variables are macroscopic quantities that are converted by rates following simple kinetics.

**General considerations: **In reaction networks including graded interactions the formulation of gross reaction rates as defined above becomes more difficult than only with all-or-none interactions. We again start by dissecting the whole network into layers. Processes coupled by graded interactions are merged into one layer. All binding and modification processes within a layer must be directly or indirectly linked by graded interactions, while the different layers are only coupled by all-or-none interactions. The coupled layers only exchange information about macroscopic variables, like phosphorylation degrees and levels of occupancy.

Now we assume that the processes of each layer form an isolated network and formulate a detailed kinetic model of solely these processes. Since processes from other layers are neglected in these considerations combinatorial variety is highly reduced. Each state of the submodel can be interpreted as a sum of states of the complete model, and each single reaction in the submodel represents a number of reactions in the complete network. Interestingly, all reactions of the complete network corresponding to a certain reaction in the subnetwork are parameterized by the same kinetic parameters. The reason for this can be found in the definition of layers. Since there are no graded interactions between layers, alterations in other layers do not change the kinetic properties inside the considered layer.

Observe that we define the reactions of the isolated partial network as gross reactions. We stated earlier that if all reactions forming a gross reaction have the same kinetic parameters it can be formulated using the law of mass action. However, one has to adapt these rates by including mass conservation relations to eliminate lowercase species from the description (as in Equation 9). As processes within a layer can be modeled separately, each layer has to fulfill certain conservation relations. The receptor layer, for example, is characterized by a conservation relation for

_{R}_{1}..._{n }≈ _{I}·_{R}_{1}...._{n}, 0 ≤ _{I }≤ 1

where _{I }is a correction term that is the fraction of phosphorylated binding sites that is unoccupied.

It is assumed that this fraction is identical for all species _{R}[_{1}...._{n}]. Note that the factor _{I }is time dependent as the fraction of phosphorylated binding sites that is unoccupied may change over time. A detailed discussion of approximation quality will be given in the mathematical background section. The considered gross rates for the phosphorylation of _{I }(Equations 29 and 30).
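The correction term described above can be sketched as a small helper. The function names, the argument names and the convention used when no site is phosphorylated are assumptions made for this illustration.

```python
def unoccupied_fraction(phosphorylated_total, occupied, eps=1e-12):
    """Fraction of phosphorylated binding sites that carry no effector."""
    if phosphorylated_total <= eps:
        # ASSUMPTION: with no phosphorylated sites the fraction is undefined;
        # this sketch returns 1.0 to avoid a division by zero.
        return 1.0
    fraction = (phosphorylated_total - occupied) / phosphorylated_total
    # Clip to the admissible range 0 <= fraction <= 1.
    return min(1.0, max(0.0, fraction))

def approximate_unoccupied(macroscopic_concentration, fraction):
    """Approximate a lowercase (unoccupied) species from a macroscopic one."""
    return fraction * macroscopic_concentration

phi = unoccupied_fraction(phosphorylated_total=2.0, occupied=0.5)
print(phi, approximate_unoccupied(0.4, phi))
```

Because the same fraction is applied to every macroscopic species of the layer, the factor has to be re-evaluated at every time step of a simulation.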

Consistent initial values

When comparing the reduced and detailed models, the initial conditions have to be chosen such that the transformation equations hold at the starting point. This is guaranteed for the example system (Figure

Another aspect of choosing proper initial conditions arises from approximation of lowercase species (Equation 29). As there is division by _{R}_{I }(Equation 30), all _{R}_{R}_{R}_{R}

Mathematical background

In this section we analyze the presented reduced modeling method from a mathematical point of view. First, we introduce some general mathematical considerations about model reduction. Afterwards we will show that the layer based reduction method also fits into this general procedure. This will help us to evaluate the method and make statements about approximation errors.

General considerations

The layer based approach allows reduced model equations to be generated directly; a step by step procedure for this is given after the mathematical background. However, in common model reduction techniques the starting point is a detailed mechanistic model of the form

Herein **R**^{n }denotes the state vector, **R**^{m }the system inputs and **R**^{q }the system outputs. Now, the objective is to find another mathematical representation of the dynamic model that allows the output variables to be described approximately by a reduced state vector. In order to achieve this reduction one has to transform the original dynamic system to new coordinates

If

The relevant states

Now, we show that the layer based reduction method fits into the previously introduced general pattern of model reduction. Hence, we first have to define the two sets of states

If we consider the example shown in Figure _{n}) have to be chosen such that the resulting transformation

which transforms the states

In the transformed system the structure of the ODEs for

However, the transformed algebraic equations now also have a different form:

If one now replaces

Approximation of neglected states

After having defined the new coordinates

Borisov et al. showed that if this equation is fulfilled at a point of time t_{0}, it will be fulfilled for all times t > t_{0}. This equation can be simplified by elementary transformations to

which is equivalent to the ratio Equation 27.

In both the examples here and in Figure

The assumption of rapid equilibrium

Approximation quality

In order to mathematically analyze approximation of these ratio equations, we consider a reaction cycle with four different influxes _{i }(see Figure _{i }as shown in Figure

**Typical reaction cycle with input fluxes**. The in-fluxes _{i }result from indirect interactions between processes of different layers. As the processes are assumed to be independent, the rates _{1 }and _{3 }are parametrized by _{1 }and _{-1}. Additionally, the rates _{2 }and _{4 }are parametrized by _{2 }and _{-2}.

Interestingly, this error function fulfills the linear differential equation

with

_{1 }+ _{-1 }+ _{2 }+ _{-2}

and

_{1}·_{2}·_{3}·_{4}·

where the rates _{1 }and _{3 }are parametrized by _{1 }and _{-1 }and the rates _{2 }and _{4 }are parametrized by _{2 }and _{-2}.

It is apparent that the error will completely vanish if

and the dynamic error as

In order to provide at least a rough estimation of the maximal error we assume that _{max }= max (

These equations show that both the steady state error and the maximal dynamic error decrease for increasing values of _{1}, _{-1}, _{2 }and _{-2}, and vanish if one of these values goes to infinity. Note that even for a large error
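The exponential decay of such an error can be checked numerically for a simplified case. The following is a minimal sketch under stated assumptions, not the analysis of the paper: the cycle is closed (all influxes are set to zero), the rates are pseudo-first-order, and the names `k1`, `km1`, `k2`, `km2`, the state labels `x00` to `x11` and the error measure `x00*x11 - x10*x01` are hypothetical choices made for illustration. For this setting the error measure obeys a linear ODE whose decay rate is the sum of the four rate constants.

```python
import math

def cycle_rhs(x, k1, km1, k2, km2):
    """ODE right-hand side for a closed four-state cycle (no influxes).

    State order: x00, x10, x01, x11. Process 1 toggles the first index
    (forward k1, backward km1), process 2 toggles the second (k2, km2).
    """
    x00, x10, x01, x11 = x
    r1 = k1 * x00 - km1 * x10  # x00 <-> x10
    r3 = k1 * x01 - km1 * x11  # x01 <-> x11, same parameters as r1
    r2 = k2 * x00 - km2 * x01  # x00 <-> x01
    r4 = k2 * x10 - km2 * x11  # x10 <-> x11, same parameters as r2
    return [-r1 - r2, r1 - r4, r2 - r3, r3 + r4]

def rk4_step(x, dt, k1, km1, k2, km2):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, h):
        return [b + h * s for b, s in zip(base, slope)]
    s1 = cycle_rhs(x, k1, km1, k2, km2)
    s2 = cycle_rhs(shifted(x, s1, dt / 2), k1, km1, k2, km2)
    s3 = cycle_rhs(shifted(x, s2, dt / 2), k1, km1, k2, km2)
    s4 = cycle_rhs(shifted(x, s3, dt), k1, km1, k2, km2)
    return [xi + dt / 6 * (a + 2 * b + 2 * c + d)
            for xi, a, b, c, d in zip(x, s1, s2, s3, s4)]

def ratio_error(x):
    """Assumed error measure: deviation from the product (ratio) relation."""
    x00, x10, x01, x11 = x
    return x00 * x11 - x10 * x01

def simulate_error(x0, t_end, dt, k1, km1, k2, km2):
    """Integrate the cycle up to t_end and return the error measure there."""
    x = list(x0)
    for _ in range(int(round(t_end / dt))):
        x = rk4_step(x, dt, k1, km1, k2, km2)
    return ratio_error(x)

# Hypothetical initial state and rate constants, chosen only for illustration.
x0 = [0.4, 0.1, 0.1, 0.4]
params = (1.0, 0.5, 2.0, 0.3)
t_end = 0.5
e_numeric = simulate_error(x0, t_end, 1e-3, *params)
e_expected = ratio_error(x0) * math.exp(-sum(params) * t_end)
```

In this simplified setting the simulated error follows the exponential with decay rate `k1 + km1 + k2 + km2`, so faster reactions shrink the error more quickly. With nonzero influxes an additional driving term appears, which this sketch does not cover.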

Transformation and stationary error

The modeling procedure – step by step

The following procedure provides a guide to build reduced models step by step. This procedure is used to generate a large model of insulin signaling that covers all processes discussed in the introduction (Additional file

Layer based reduced modeling of insulin signaling

1. Identify all processes (Figure

2. Define layers: all processes that are coupled by graded interactions are within the same layer. Layers are coupled by all-or-none interactions or do not interact (Figure

3. Model each layer individually.

(a) Define all sums of phosphorylated binding sites _{i }(e.g. _{i}

(b) Define the concentrations of all unoccupied phosphorylated binding sites that are needed as binding partners within the considered layer (e.g.

(c) Define rules and reactions (including dephosphorylation of binding sites) as if there were no other layers and in particular as if there was no binding of effectors (see Figure

(d) Translate each reaction into the corresponding rate by using the desired kinetic law. This step is analogous to detailed mechanistic modeling. For dephosphorylation reactions of binding sites multiply the expression describing dephosphorylation with (_{i }- _{i}_{i }using the appropriate _{i}. This ensures that only unoccupied binding sites are dephosphorylated (see Equation 26a).

(e) Optional: for each molecule that is not degraded or synthesized a conservation relation can be formulated (e.g. _{0 }-

(f) Construct ODEs as a sum of rates for each species that is used in this layer and not defined by an algebraic equation (see Equation 26c).

4. Additional information transfer between layers is allowed, as long as no additional graded interactions are introduced (e.g. _{activ }in Additional file
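Step 3(d) can be illustrated with a small sketch. It assumes mass-action dephosphorylation and assumes that the correction factor is the fraction of phosphorylated binding sites that is unoccupied, as described in the text; the function and parameter names (`k_dephos`, `p_total`, `p_occupied`) are hypothetical and are not taken from the model equations of the paper.

```python
def unoccupied_fraction(p_total, p_occupied):
    """Fraction of phosphorylated binding sites not occupied by an effector.

    p_total:    sum of all phosphorylated binding sites of one kind
    p_occupied: part of p_total that is bound to an effector
    """
    if p_total <= 0.0:
        # The factor divides by the site sum, so the sum must be nonzero
        # (compare the remark on consistent initial values).
        raise ValueError("p_total must be positive")
    return (p_total - p_occupied) / p_total

def dephosphorylation_rate(k_dephos, species, p_total, p_occupied):
    """Mass-action dephosphorylation, scaled so that only unoccupied sites react."""
    return k_dephos * species * unoccupied_fraction(p_total, p_occupied)

# Hypothetical numbers: 40 % of the phosphorylated sites are occupied,
# so the uncorrected rate 2.0 * 3.0 is scaled by 0.6.
example_rate = dephosphorylation_rate(2.0, 3.0, 10.0, 4.0)
```

The guard against a zero site sum mirrors the requirement that the corresponding initial values must not be zero when the reduced model is simulated.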

Documentation and analysis of the larger example system

This procedure also outlines how automation of the modeling procedure can be achieved. Steps 1) to 3c) most probably will be performed by the user, whereas expansion of rules and generation of rates and ODEs could be automated. The modeling procedure then remains in close similarity to automated rule based building of detailed mechanistic models by BioNetGen

A larger example system

To demonstrate the method on a more realistic example, we study an extended subsystem of insulin signaling. This subsystem was also employed to demonstrate the modularization criterion (Figure ^{3 }= 8 differential equations (2 possibilities each for insulin binding, binding site and regulatory phosphorylation), the _{activ }is transferred to the

Layer based reduced model of the larger example system

**Layer based modularization of the larger example system**. In the larger example system the left module describes the receptor layer. The output of this layer is the sum of all species which are phosphorylated on the binding site for _{activ}. All reactions are reversible; arrows define the directions of positive rates. For equations see Additional file

**Simulation results: comparison reduced and detailed model, larger example system**. For parameter values from literature (Additional file

Optimization study of the larger example system

An optimization study was performed to analyze the worst case scenario within physiologic parameter ranges. Measures for reduction quality are the errors in

Synthesis, degradation and transport of proteins

Up to now it was assumed that there is no protein synthesis or degradation. However, synthesis and degradation of free unmodified proteins are easy to handle: the rate of synthesis or degradation just has to be included in the differential equation for the free species. Degradation or synthesis of complexes is also possible, though often less easy to realize. If a scaffold protein or even the receptor is to be degraded, one has to take into account that lumped states corresponding to complexes with this effector or the receptor exist in several layers. In this case, the rate of degradation or synthesis has to be considered in each of these layers to guarantee consistency, so more communication between the layers is necessary: the same rates, modified by different correction terms to reflect complex composition, occur in distinct layers. Transport between different compartments can be handled as degradation in one compartment and synthesis of the same complex in the other. Therefore, synthesis, degradation and transport of complexes are possible within the layer based formalism. A concise modular structure with a minimum of information transfer between the layers is obtained when these processes are limited to free protein species.

Outline: application on insulin signaling

A model for the insulin signaling system which includes all events that were mentioned in the introduction can be built with 64 + 128 + 4 + 11 + 5 + 2 = 214 differential equations instead of 1.5·10^{8} in the detailed case. The 214 equations of the reduced model are composed as follows. In the receptor layer, 2^{6} = 64 equations describe binding of two insulin molecules and phosphorylation of four binding sites (two for Shc and two for IRS). The 2^{7} = 128 equations of the IRS layer derive from six binding sites, each of which can be phosphorylated or unphosphorylated, and from IRS being either bound to the receptor or unbound. Four equations are needed for the Shc layer (Shc binding to the receptor and becoming phosphorylated). SOS and Grb2 are merged into one layer, which allows SOS binding to Grb2 to influence Grb2 binding to IRS and Shc. This layer is described by 11 equations (six equations describing binding of complexes of Grb2 and SOS to IRS and Shc, and five equations for free species; remember that SOS can be phosphorylated). The PI3K layer contains 5 differential equations (binding to four binding sites) and the SHP2 layer 2 (binding to IRS). In both layers the free species is also considered. The two binding sites on the receptor for Shc and IRS in each case are assumed to be equivalent. Other underlying assumptions are specified in Table
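The bookkeeping above can be reproduced with a few lines of Python. This is only an arithmetic sketch; the dictionary keys are informal labels for the layers named in the text, and the counts are copied from the paragraph rather than derived from a model file.

```python
# Number of differential equations per layer, as stated in the text.
layer_equations = {
    "receptor": 2 ** 6,  # two insulin binding events, four binding sites
    "IRS": 2 ** 7,       # six binding sites, bound or unbound to the receptor
    "Shc": 4,
    "Grb2/SOS": 11,      # six complex equations plus five free species
    "PI3K": 5,
    "SHP2": 2,
}

total_equations = sum(layer_equations.values())
detailed_equations = 1.5e8  # size of the detailed model quoted in the text

print(total_equations)                       # 214
print(detailed_equations / total_equations)  # ratio of detailed to reduced size
```

Because the layers are modeled independently, the total is a sum over layers rather than a product over all binding and modification states, which is where the reduction comes from.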

Reduction ratios and combination with domain-oriented lumping

As shown above for insulin signaling, combinatorial complexity increases as the effector becomes subject to additional modification and binding events. The reduction potential of the layer based reduced modeling method strongly increases with increasing combinatorial complexity. For the small example system the number of equations is reduced by 20%, for the larger example system by 48%. For the signaling system described in the introduction, the number of necessary equations is reduced by 99.9999%. This illustrates that even large systems can be described with a number of equations that can be handled by manual modeling. However, molecules with many binding sites or regulatory phosphorylation sites are still difficult to handle. Here combination with the domain-oriented lumping technique
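The quoted reduction for the insulin system can be checked directly from the equation counts. The sketch below uses the 214 ODEs of the layer based model, the 56 ODEs named in the title of the additional file for the combined approach, and the 1.5·10^{8} equations of the detailed model; the helper name `reduction_percent` is hypothetical.

```python
def reduction_percent(n_reduced, n_detailed):
    """Percentage of equations saved relative to the detailed model."""
    return 100.0 * (1.0 - n_reduced / n_detailed)

n_detailed = 1.5e8  # detailed combinatorial model of insulin signaling

layer_based = reduction_percent(214, n_detailed)  # layer based reduced model
combined = reduction_percent(56, n_detailed)      # combined with domain-oriented lumping

print(f"{layer_based:.4f} %")
print(f"{combined:.4f} %")
```

Both values lie very close to 100%, so for systems of this size the absolute number of remaining equations is the more informative measure.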

Layer based reduced model of insulin signaling (214 ODEs)

Application of the domain oriented approach to the insulin model with 214 ODEs

Reduced model resulting from combination of both approaches (56 ODEs)

So, combination of layer based reduced modeling with the domain-oriented approach

Conclusion

We present a reduced modeling approach that makes it possible to tackle the problem of combinatorial complexity in signal transduction and regulation networks. For physiologic systems combinatorial complexity is dramatically decreased, as demonstrated on insulin signaling. Similarly to Pawson and Nash

A modularization principle is introduced that distinguishes between graded interactions and all-or-none interactions. Modules, which correspond to layers of signal transduction, are chosen such that they only interact via all-or-none interactions. All processes inside a layer are connected directly or indirectly via graded interactions. For each layer we formulate a detailed reaction scheme comprising all processes of the current layer but neglecting all other processes. In the subsequent modeling step, gross reaction rates are formulated for all of these reactions. A step-by-step procedure for building reduced modular models is given.

A potential drawback of the method is that even small changes to the assumptions of the model may lead to merging of layers, which results in a smaller reduction of combinatorial complexity. This is the case if additional graded interactions between processes of different layers are introduced.

Mathematical analysis as well as simulation and optimization studies using insulin signaling models show that the approximation quality is excellent for relevant parameter settings. In the considered examples it is even possible to reconstruct the eliminated micro-states with high accuracy. As we also showed, the method allows an enormous reduction of the number of necessary ODEs compared with detailed combinatorial models.

The method can be combined with the domain-oriented lumping technique

Methods

Initial conditions

The total concentration of insulin receptor in hepatocytes was reported to be 10^{5 }receptors per cell ^{-12 }^{23 }molecules there are 1.66·10^{-19 }^{-20 }
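The conversion from receptor number to amount of substance can be sketched as follows. The receptor count is taken from the text and Avogadro's constant is the usual rounded value; the cell volume passed to `concentration_nM` is a hypothetical value used only to show how a concentration would follow, and the function names are illustrative.

```python
AVOGADRO = 6.022e23  # molecules per mol (rounded standard value)

def molecules_to_mol(n_molecules):
    """Convert a molecule count into an amount of substance in mol."""
    return n_molecules / AVOGADRO

def concentration_nM(n_molecules, volume_litre):
    """Concentration in nM for n_molecules in the given volume (litres)."""
    return molecules_to_mol(n_molecules) / volume_litre * 1e9

receptors_per_cell = 1e5  # value reported in the text
receptor_mol = molecules_to_mol(receptors_per_cell)  # about 1.66e-19 mol per cell

# Hypothetical cell volume in litres; an assumption for illustration only.
assumed_cell_volume = 1e-12
receptor_nM = concentration_nM(receptors_per_cell, assumed_cell_volume)
```

The resulting concentration scales inversely with the assumed volume, so a different volume estimate changes the initial condition proportionally.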

Parameter values

Insulin receptor autophosphorylation ^{-1}. Insulin receptor dephosphorylation on the plasma membrane in vitro has a half-life of about 3 ^{-1}. Parameters describing binding of IRS to the insulin receptor were originally reported to describe the binding of the p85 subunit to IRS
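A half-life such as the one quoted above for receptor dephosphorylation can be turned into a rate constant if first-order kinetics are assumed. The sketch below states that assumption explicitly; the time unit of the half-life is not recoverable from the text here, so the result is simply in reciprocal units of whatever time unit is used, and the function name is hypothetical.

```python
import math

def rate_constant_from_half_life(t_half):
    """First-order rate constant k = ln(2) / t_half (same time unit as t_half)."""
    if t_half <= 0.0:
        raise ValueError("half-life must be positive")
    return math.log(2.0) / t_half

# Half-life of about 3 time units, as quoted for receptor dephosphorylation.
k_dephos = rate_constant_from_half_life(3.0)

# Consistency check: after one half-life, half of the material remains.
remaining = math.exp(-k_dephos * 3.0)
```

For processes that do not follow first-order kinetics this conversion is only a rough estimate.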

Abbreviations

nM, nanomolar (10^{-9} mol l^{-1}); ass., assumption; ODE, ordinary differential equation

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

MK developed the method, provided the example systems and performed the optimization study. HC performed the mathematical analysis and participated in development of the method. ME took part in mathematical analysis and development of the method. SE took part in development of the method. EDG initiated and supervised the study. All authors approved the final manuscript.

Acknowledgements

The authors acknowledge support from the German Federal Ministry of Education and Research (Bundesministerium für Bildung und Forschung, BMBF). This work was funded by the Network Systems Biology HepatoSys.