Manchester Centre for Integrative Systems Biology, Manchester Interdisciplinary Biocentre, 131 Princess Street, Manchester, M1 7DN, UK

School of Mathematics, The University of Manchester, Oxford Road, Manchester M13 9PL, UK

School of Chemical Engineering and Analytical Science, The University of Manchester, Oxford Road, Manchester M13 9PL, UK

School of Computer Science, The University of Manchester, Oxford Road, Manchester M13 9PL, UK

Virginia Bioinformatics Institute, Virginia Tech, Washington Street 0499, Virginia 24061, USA

Abstract

Background

Advances in bioinformatic techniques and analyses have led to the availability of genome-scale metabolic reconstructions. The size and complexity of such networks often means that their potential behaviour can only be analysed with constraint-based methods. Whilst requiring minimal experimental data, such methods are unable to give insight into cellular substrate concentrations. Instead, the long-term goal of systems biology is to use kinetic modelling to characterize fully the mechanics of each enzymatic reaction, and to combine such knowledge to predict system behaviour.

Results

We describe a method for building a parameterized genome-scale kinetic model of a metabolic network. Simplified linlog kinetics are used and the parameters are extracted from a kinetic model repository. We demonstrate our methodology by applying it to yeast metabolism. The resultant model has 956 metabolic reactions involving 820 metabolites and, whilst approximative, has a considerably broader remit than any existing model of its type. Control analysis is used to identify key steps within the system.

Conclusions

Our modelling framework may be considered a stepping-stone toward the long-term goal of a fully-parameterized model of yeast metabolism. The model is available in SBML format from the BioModels database (BioModels ID: MODEL1001200000) and at

Background

Recent advances in genome sequencing techniques and bioinformatic analyses have led to an explosion of systems-wide biological data. In turn, the reconstruction of genome-scale networks for micro-organisms has become possible. Whilst the first stoichiometric models were limited to the central metabolic pathways, later efforts such as iFF708

The ability to analyse, interpret and ultimately predict cellular behaviour is a long sought-after goal. The genome sequencing projects are defining the molecular components within the cell, but describing their integrated function will be a challenging task. Ideally, one would like to use enzyme kinetics to characterize fully the mechanics of each reaction, in terms of how changes in metabolite concentrations affect local reaction rates. However, a considerable amount of data and effort is required to parameterize even a small mechanistic model; the determination of such parameters is costly and time-consuming, and moreover much of the required information may be difficult or impossible to determine experimentally. Instead, genome-scale metabolic modelling has relied on constraint-based analysis

In a previous paper, we presented a method for constructing a kinetic model for a metabolic pathway based only on the knowledge of its stoichiometry

**Genome-scale model for yeast**. Compressed ZIP file (220 KB) containing the model in SBML format.


Results and Discussion

Algorithm

Model construction

A number of reconstructions of the metabolic network of yeast based on genomic and literature data have been published. However, due to different approaches utilized in the reconstruction, as well as different interpretations of the literature, the earlier reconstructions differ significantly. A community effort resulted in a consensus network model of yeast metabolism, combining results from previous models (

Species are localized to 15 compartments, including membranes. To limit complexity, we decompartmentalize the model, restricting entities to intra- or extra-cellular space. We also lump together reactions catalyzed by isoenzymes; the resultant model is reduced in size to 1059 reactions, of which 956 are metabolic, involving 1748 species, of which 820 are metabolites (the remaining 938 species are enzymes and enzyme complexes). Estimation of unknown system fluxes is addressed with the use of flux balance analysis (FBA)

That is, we define an objective function $Z = \sum_{j} f_{j} v_{j}$, that we maximize over all possible steady state fluxes (
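The optimisation described here can be sketched on a toy problem. The example below is a minimal illustration only: the four-reaction network, its bounds and the names `N`, `f`, `bounds`, `v_opt` and `Z` are assumptions made for the sketch and are not taken from the yeast model. It uses `scipy.optimize.linprog`, which minimizes, so the objective weights are negated in order to maximize.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (not the yeast model):
#   R0: -> A (uptake)   R1: A -> B   R2: A -> C   R3: B + C -> biomass
# Rows are metabolites A, B, C; columns are reactions R0..R3.
N = np.array([
    [1, -1, -1,  0],   # A
    [0,  1,  0, -1],   # B
    [0,  0,  1, -1],   # C
], dtype=float)

# Objective weights: maximize the flux through R3 only.
f = np.array([0.0, 0.0, 0.0, 1.0])

# Flux bounds (lower, upper); uptake is capped at 10 (arbitrary units).
bounds = [(0, 10), (0, 1000), (0, 1000), (0, 1000)]

# linprog minimizes c @ v, so pass -f to maximize f @ v
# subject to the steady-state condition N @ v = 0.
res = linprog(c=-f, A_eq=N, b_eq=np.zeros(N.shape[0]),
              bounds=bounds, method="highs")
if not res.success:
    raise RuntimeError(res.message)

v_opt = res.x
Z = float(f @ v_opt)
print("objective value:", Z)
print("fluxes:", v_opt)
```

In this toy case the optimum is unique; in a genome-scale network many flux vectors typically achieve the same objective value, which is why a further selection step is needed.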

In a previous paper

• Network stoichiometry ($N$)

• Reference fluxes ($v^{0}$)

• Reference metabolite concentrations ($x^{0}$)

• Elasticities ($\varepsilon$)

To the stoichiometric model, we append kinetics (fluxes, concentrations and elasticities) from the set of models available from the BioModels database (11th release)


**An example of the SBML model's MIRIAM-compliant annotations**. The (concentration) parameter is taken from BioModels ID 70. Since the parameter is not available from yeast, it is flagged as originating from taxonomy 9606 (

Flux estimation

55 reactions in the (decompartmentalized) genome-scale model have fluxes ($v^{T}$) that are defined in models stored in the BioModels database. Of these, the 21 values specific to yeast are presented in the table below:

Selected reaction fluxes used in the model

| **Reaction** | **Flux (mM/s)** |
|---|---|
| acetaldehyde transport | 0.00141 |
| adenylate kinase | 0 |
| alcohol dehydrogenase, reverse rxn (acetaldehyde → ethanol) | 1.17 |
| ATPase, cytosolic | 0.595 |
| enolase | 1.76 |
| ethanol transport | 0.0134 |
| fructose-bisphosphate aldolase | 0.733 |
| glycerol-3-phosphate dehydrogenase (NAD) | 0.149 |
| glycerol-3-phosphatase | 0.051 |
| glyceraldehyde-3-phosphate dehydrogenase | 1.06 |
| glucose transport (uniport) | 0.59 |
| glycerol transport via channel | 0.00141 |
| hexokinase (D-glucose:ATP) | 0.866 |
| phosphofructokinase | 0.606 |
| glucose-6-phosphate isomerase | 0.733 |
| phosphoglycerate kinase | 0.875 |
| phosphoglycerate mutase | 1.76 |
| pyruvate kinase | 1.06 |
| pyruvate decarboxylase | 1.25 |
| triose-phosphate isomerase | 0.395 |
| alpha, alpha-trehalose-phosphate synthase (UDP-forming) | 0.04 |

The data are taken from those models in the BioModels database specific to yeast.

where _{
j
}, then choosing a flux as close as possible to the centre of the box. Iterating, the method minimizes and centres the flux through the network and, in this case, fixes all 956 fluxes to unique values. The algorithm

**Reference fluxes**. Excel spreadsheet (XLS, 105 KB) containing the reference flux for all reactions, as estimated by application of the algorithm


A simple FBA formulation is solved, in order to identify the maximum achievable growth rate. The total flux through the network is then minimized, with the fluxes $v_{j}$ split into their positive and negative parts. The solution of this first iteration provides the minimal total flux through the network ($Z_{1}$). We then find the bounds on each reaction flux, subject to the new constraint that the total flux through the network cannot be larger than $Z_{1}$. The bounds are calculated by maximizing and minimizing the flux of each reaction in turn. These limits are set as the new upper and lower bounds for the fluxes. The "centre" for each flux is the mean of the new bounds, taken as the most representative value of all solutions.

In the second iteration, we place a box around the hull (defining new bounds), before minimizing the distance between the flux of each reaction and the centre value, subject to the constraint that the total network flux cannot exceed $Z_{1}$, as found in the first iteration. In turn, this leads to new bounds and a corresponding centre. Each iteration of the algorithm adds an additional constraint, and the flux is drawn towards the centre of the bounds. After a finite number of iterations, the bounds converge to a single solution, within a specified tolerance.

The algorithm is explained in detail in a previous paper
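The first iteration described above (FBA, minimization of total flux, bounding of each flux and computation of the centre) can be sketched as follows. This is an illustrative reading of the text rather than the authors' implementation: the toy network with two parallel reversible routes, the variable names and the tolerance `tol` are all assumptions, and the later iterations are omitted.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network: R0: -> A, R1 and R2: A <-> B (parallel routes),
# R3: B -> biomass. Rows are metabolites A, B.
N = np.array([[1, -1, -1,  0],
              [0,  1,  1, -1]], dtype=float)
lb = np.array([0.0, -1000.0, -1000.0, 0.0])
ub = np.array([10.0, 1000.0, 1000.0, 1000.0])
f = np.array([0.0, 0.0, 0.0, 1.0])      # maximize flux through R3
m, n = N.shape
tol = 1e-6                              # assumed numerical slack

# Step 1: plain FBA gives the maximum achievable growth rate.
fba = linprog(-f, A_eq=N, b_eq=np.zeros(m),
              bounds=list(zip(lb, ub)), method="highs")
growth = -fba.fun

# Step 2: write v = p - q with p, q >= 0 and minimize sum(p + q),
# i.e. the total flux, while holding growth at its maximum.
D = np.hstack([np.eye(n), -np.eye(n)])  # maps [p; q] to v
A_eq = np.vstack([N @ D, f @ D])
b_eq = np.append(np.zeros(m), growth)
A_ub = np.vstack([D, -D])               # v <= ub and -v <= -lb
b_ub = np.concatenate([ub, -lb])
ones = np.ones(2 * n)
nonneg = [(0, None)] * (2 * n)
tot = linprog(ones, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=nonneg, method="highs")
total_flux = tot.fun

# Step 3: minimize and maximize each flux in turn, with the extra
# constraint that total flux may not exceed the minimum found above.
A_ub2 = np.vstack([A_ub, ones])
b_ub2 = np.append(b_ub, total_flux + tol)
new_lb, new_ub = np.empty(n), np.empty(n)
for k in range(n):
    lo = linprog(D[k], A_ub=A_ub2, b_ub=b_ub2, A_eq=A_eq, b_eq=b_eq,
                 bounds=nonneg, method="highs")
    hi = linprog(-D[k], A_ub=A_ub2, b_ub=b_ub2, A_eq=A_eq, b_eq=b_eq,
                 bounds=nonneg, method="highs")
    new_lb[k], new_ub[k] = lo.fun, -hi.fun

# The centre of each flux range is the mean of its new bounds.
centre = (new_lb + new_ub) / 2
print("growth:", growth, "minimal total flux:", total_flux)
print("centre:", centre)
```

Without the total-flux constraint, the two parallel reversible routes could carry arbitrarily large opposing fluxes; with it, each route is confined to a finite range and the centre splits the flux evenly between them.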

Concentrations

The concentrations of 82 intracellular metabolites are defined in various models within BioModels. Of these, the 22 specific to yeast are presented in the table below.

Selected intracellular metabolite concentrations used in the model

| **Metabolite** | **Concentration (mM)** |
|---|---|
| 3-Phospho-D-glyceroyl phosphate | 2.75 × 10^{-4} |
| D-Glycerate 2-phosphate | 0.0371 |
| 3-Phospho-D-glycerate | 0.278 |
| Acetaldehyde | 0.17 |
| ADP | 1.63 |
| AMP | 0.796 |
| ATP | 1.13 |
| CO2 | 1 |
| Dihydroxyacetone phosphate | 0.59 |
| Ethanol | 50 |
| D-Fructose 2,6-bisphosphate | 0.02 |
| D-Fructose 6-phosphate | 0.112 |
| D-Fructose 1,6-bisphosphate | 2.82 |
| Glyceraldehyde 3-phosphate | 0.069 |
| D-Glucose 6-phosphate | 1.02 |
| D-Glucose | 0.0906 |
| Glycerol | 2.27 |
| Glycerol 3-phosphate | 0.457 |
| Nicotinamide adenine dinucleotide | 1.5 |
| Nicotinamide adenine dinucleotide - reduced | 0.0861 |
| Phosphoenolpyruvate | 0.0302 |
| Pyruvate | 8.36 |

The data are taken from those models in the BioModels database specific to yeast.

Extracellular metabolite concentrations used in the model

| **Metabolite** | **Concentration (mM)** |
|---|---|
| 4-Aminobenzoate | 0.0015 |
| L-Arginine | 1 |
| L-Aspartate | 1 |
| Biotin | 8.2 × 10^{-5} |
| Citrate | 1 |
| Fumarate | 1 |
| D-Glucose | 11.1 |
| L-Glutamate | 1 |
| L-Histidine | 1 |
| myo-Inositol | 0.055 |
| Potassium | 7.11 |
| L-Leucine | 1 |
| L-Lysine | 1 |
| L-Malate | 1 |
| L-Methionine | 1 |
| Sodium | 1.71 |
| Ammonium | 38 |
| (R)-Pantothenate | 0.0042 |
| Pyridoxine | 0.0019 |
| Pyruvate | 1 |
| Riboflavin | 5.3 × 10^{-4} |
| L-Serine | 1 |
| Sulfate | 42.2 |
| Succinate | 1 |
| Thiamin | 0.0012 |
| L-Threonine | 1 |
| L-Tryptophan | 1 |
| L-Valine | 1 |

Values are as defined in the "metabolic footprinting" medium

Elasticities

151 elasticities are calculated from models within BioModels, using symbolic differentiation. For the remaining values, we follow the tendency modelling approach of Visser

An assumption of irreversible mass-action kinetics would lead to reaction rate ^{2 }
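As a small illustration of how a scaled elasticity relates to a rate law, the sketch below estimates d ln v / d ln x numerically for an irreversible mass-action rate. The reaction A + 2 B → C, the rate constant and the reference concentrations are hypothetical, and a central finite difference is used here in place of the symbolic differentiation mentioned above. For such a rate law the scaled elasticity of each substrate equals its kinetic order.

```python
import math

def rate(conc, k=1.0):
    """Irreversible mass-action rate for a hypothetical reaction A + 2 B -> C."""
    return k * conc["A"] * conc["B"] ** 2

def scaled_elasticity(rate_fn, conc, species, rel_step=1e-6):
    """Central-difference estimate of d ln v / d ln x for one species."""
    up, down = dict(conc), dict(conc)
    up[species] = conc[species] * (1 + rel_step)
    down[species] = conc[species] * (1 - rel_step)
    d_log_v = math.log(rate_fn(up)) - math.log(rate_fn(down))
    d_log_x = math.log(1 + rel_step) - math.log(1 - rel_step)
    return d_log_v / d_log_x

# Hypothetical reference concentrations (mM).
reference = {"A": 0.5, "B": 2.0}
eps = {s: scaled_elasticity(rate, reference, s) for s in reference}
print(eps)
```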

Linlog kinetics

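To produce our genome-scale kinetic model of yeast metabolism, the above parameters may be combined in a phenomenological rate law such as linlog kinetics:

$$v = \mathrm{diag}(v^{0})\left(1 + \varepsilon \ln\frac{x}{x^{0}}\right)$$

where $v$ is the flux vector, $v^{0}$ the reference flux vector, $x$ the vector of metabolite concentrations, $x^{0}$ the reference metabolite concentrations and $\varepsilon$ the elasticity matrix; the logarithm and the division are taken elementwise.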
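To make the rate law concrete, the following sketch evaluates linlog kinetics in its commonly used form, v_j = v0_j (1 + Σ_i ε_ji ln(x_i / x0_i)). The function name `linlog_rates` and all numerical values are hypothetical; the example only shows how reference fluxes, reference concentrations and elasticities combine.

```python
import math

def linlog_rates(v_ref, x_ref, elasticity, x):
    """Linlog rates: v_j = v_ref[j] * (1 + sum_i eps[j][i] * ln(x[i] / x_ref[i]))."""
    rates = []
    for j, v0 in enumerate(v_ref):
        log_term = sum(elasticity[j][i] * math.log(x[i] / x_ref[i])
                       for i in range(len(x)))
        rates.append(v0 * (1.0 + log_term))
    return rates

# Hypothetical two-reaction, two-metabolite example.
v_ref = [1.0, 1.0]                 # reference fluxes (mM/s)
x_ref = [2.0, 0.5]                 # reference concentrations (mM)
elasticity = [[-0.5, 0.0],         # reaction 0 is inhibited by metabolite 0
              [1.0, -0.3]]         # reaction 1 consumes 0 and is inhibited by 1

at_reference = linlog_rates(v_ref, x_ref, elasticity, x_ref)
perturbed = linlog_rates(v_ref, x_ref, elasticity, [4.0, 0.5])
print(at_reference, perturbed)
```

At the reference state every logarithm vanishes, so the rates reduce to the reference fluxes; away from it each rate changes linearly in the logarithm of the concentrations.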

Testing

Control analysis

To test the resultant genome-scale model, and to try to identify key steps in the metabolic network of yeast, we calculate the flux control coefficients for reactions, as defined by metabolic control analysis (MCA). MCA studies how the control of fluxes and intermediate concentrations in a metabolic pathway is distributed among the different enzymes that constitute the pathway. Developed independently by Kacser and Burns

Whilst Reder's formula is often used in computational applications, it assumes that a certain matrix is invertible; this may not be true, especially if some reference reaction rates are zero. For example, the number of independent metabolites is often defined solely in terms of stoichiometry as rank(

In Tables

Reactions exerting most control over glucose transport

| **Reaction** | **C^{J}** |
|---|---|
| glucose transport (uniport) | 1.149 |
| glucosamine-6-phosphate deaminase | -0.787 |
| glutamine-fructose-6-phosphate transaminase | -0.655 |
| glutamine synthetase | -0.520 |
| inorganic diphosphatase | 0.421 |
| L-asparaginase | 0.323 |
| ATPase, cytosolic | 0.250 |
| phosphofructokinase | 0.235 |
| glycerol-3-phosphate dehydrogenase (NAD) | -0.233 |
| adenylate kinase (GTP) | 0.231 |

Reactions are ranked in terms of their flux control coefficient. See additional file

Reactions exerting most control over biomass production

| **Reaction** | **C^{J}** |
|---|---|
| glucosamine-6-phosphate deaminase | 0.532 |
| glutamine-fructose-6-phosphate transaminase | 0.441 |
| glutamine synthetase | 0.358 |
| H2O transport via diffusion | 0.212 |
| inorganic diphosphatase | -0.193 |
| glycerol-3-phosphate dehydrogenase (NAD) | 0.189 |
| L-asparaginase | -0.146 |
| adenylate kinase (GTP) | -0.142 |
| glucose transport (uniport) | -0.132 |
| ribonucleoside-triphosphate reductase (UTP) | -0.104 |

Reactions are ranked in terms of their flux control coefficient. See additional file

**Control over glucose transport**. Excel spreadsheet (XLS, 105 KB) containing flux control coefficients for all reactions for control over glucose transport.


**Control over biomass production**. Excel spreadsheet (XLS, 114 KB) containing flux control coefficients for all reactions for control over biomass production.


Implementation

The systems biology approach often involves the development of mechanistic models, such as the reconstruction of dynamic systems from the quantitative properties of their elementary building blocks. Typically, this is performed in a 'bottom-up' manner, whereby models are built up as their individual elements are determined experimentally. Here we propose an alternative, 'top-down' strategy, whereby an approximative model of the whole system is built initially; this model can then be used to guide experimental design and can subsequently be updated as specific knowledge becomes available from experimental results, following the iterative 'cycle of knowledge' approach

The genome-scale model that is produced with the presented methodology is offered in SBML format, with MIRIAM-compliant annotations. Such markup allows automated reasoning about the model's assumptions and provenance

Conclusions

In this paper, we present a novel methodology that can be used to create a parameterized, genome-scale kinetic model of the metabolic network of an organism. The methodology is demonstrated by its application to yeast metabolism, through appending existing kinetic submodels from the BioModels database to a stoichiometric model of yeast. The final model has 956 metabolic reactions involving 820 metabolites and, to our knowledge, has a significantly wider scope than any previous model of comparable type. We demonstrate the usefulness of such a model by applying the principles of metabolic control analysis to identify key steps within the network.

Critically, both the original stoichiometric model, and the kinetic model that constitutes the end-result of the method are available in SBML, using MIRIAM-compliant annotations. Models in BioModels are annotated with computer-readable references such as ChEBI

Our methodology clearly has limitations, in that the linlog framework is only valid in a region near the chosen reference state. Moreover, owing to the lack of available data, many of the parameters used in building the model are unknown and must be estimated through techniques such as flux balance analysis. Nonetheless, our modelling framework is a necessary stepping stone towards the creation of a genome-scale kinetic model, and may thus be considered the first step in the deductive-inductive 'cycle of knowledge' crucial for systems biology

Methods

Control analysis

Let us return to Equation (3), a generalized description of the temporal evolution of a metabolic network in differential equation format. Let us also assume that the reference state ^{-1}·

where

where

In general, the rank(_{0 }<_{0 }link matrix _{
r
}denote a _{0 }× ^{+}' denotes the Moore-Penrose pseudoinverse _{
r
}.

From Equation (6), and noting that the rows of _{
r
}form the identity matrix, we find _{
r
}and hence

where the _{0 }× _{0 }matrix (_{r}·

Having transformed the system, we add a small perturbation to reaction

where _{
j
}denotes the ^{th }standard basis vector and the notation _{
r, j
}is used to denote the ^{th }column of _{
r
}. The new steady state resulting from this perturbation is given by

Using Equation (9), we may resolve the definition of (unscaled) flux control and concentration control coefficients as

and

respectively. If we compare our expressions to those given in Reder

As such, we may see that we have extended Reder's work to encompass the possibility that rank(

Equation (10) may be used to calculate flux control coefficients for our genome-scale model. These parameters may also be defined in their more usual scaled form
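For orientation, the sketch below computes unscaled and scaled control coefficients for a small pathway using the classical formula of Reder in the case where all metabolites are independent (link matrix equal to the identity). It is not the extended expression derived in this section: the three-step pathway, the elasticity values and the variable names are hypothetical. `np.linalg.pinv` is used for the inversion and coincides with the ordinary inverse here, since the toy Jacobian is non-singular.

```python
import numpy as np

# Hypothetical linear pathway: -> S1 -> S2 ->  (reactions R0, R1, R2).
N = np.array([[1, -1,  0],
              [0,  1, -1]], dtype=float)

v_ref = np.array([1.0, 1.0, 1.0])        # reference (steady-state) fluxes
x_ref = np.array([2.0, 0.5])             # reference concentrations
eps_scaled = np.array([[-0.5,  0.0],     # scaled elasticities (reactions x metabolites)
                       [ 1.0, -0.3],
                       [ 0.0,  0.8]])

# Unscaled elasticities dv/dx = diag(v_ref) @ eps @ diag(1 / x_ref).
E = np.diag(v_ref) @ eps_scaled @ np.diag(1.0 / x_ref)

# Jacobian of dx/dt = N v(x) at the reference state.
jacobian = N @ E

# Classical unscaled control coefficients (all metabolites independent).
C_S = -np.linalg.pinv(jacobian) @ N      # concentration control
C_J = np.eye(N.shape[1]) + E @ C_S       # flux control

# Scaled flux control coefficients: diag(1 / v) @ C_J @ diag(v).
C_J_scaled = np.diag(1.0 / v_ref) @ C_J @ np.diag(v_ref)

print(np.round(C_J_scaled, 3))
print("row sums:", C_J_scaled.sum(axis=1))
```

The row sums of the scaled flux control coefficients equal one, as required by the summation theorem of metabolic control analysis, which gives a quick consistency check on any such calculation.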

Nomenclature

The indices and variables appearing throughout the paper are defined in Table

Nomenclature

**Index**

**Description**

**Size**

species/metabolites

reactions

subset of

55

subset of

_{0}

**Variable**

**Description**

**Dimensions**

compartment volumes

^{
J
}

scaled flux control coefficients

^{
J
}

unscaled flux control coefficients

^{
S
}

unscaled concentration control coefficients

_{
j
}

denotes the ^{th }standard basis vector

vector specifying the optimized fluxes

stoichiometric matrix

link matrix

_{0}

time

metabolite concentrations

reference metabolite concentrations

_{
r
}

independent metabolite concentrations

_{0 }× 1

flux vector

reference flux vector

^{min}

lower bounds vector

^{max}

upper bounds vector

^{
T
}

fluxes defined in the BioModels database

55 × 1

optimization objective

maximum achievable growth rate

_{1}

minimal total flux through the network

perturbation

elasticity

unscaled elasticity matrix

Authors' contributions

KS performed the calculations. KS and ES drafted the manuscript. All authors conceived the methodology and read and approved the final manuscript.

Acknowledgements

We are grateful for the financial support of the BBSRC and EPSRC through grant BB/C008219/1 "The Manchester Centre for Integrative Systems Biology (MCISB)". We also thank Michael Howard for invaluable discussions, and our MCISB colleagues.