Open Access Methodology article

Least-squares methods for identifying biochemical regulatory networks from noisy measurements

Jongrae Kim [1], Declan G Bates [1], Ian Postlethwaite [1], Pat Heslop-Harrison [2] and Kwang-Hyun Cho [3,4]*

Author Affiliations

1 Department of Engineering, University of Leicester, Leicester, LE1 7RH, UK

2 Department of Biology, University of Leicester, Leicester, LE1 7RH, UK

3 College of Medicine, Seoul National University, Jongno-gu, Seoul, 110-799, Korea

4 Bio-MAX Institute, Seoul National University, Gwanak-gu, Seoul, 151-818, Korea

BMC Bioinformatics 2007, 8:8  doi:10.1186/1471-2105-8-8

Published: 10 January 2007

Abstract

Background

We consider the problem of identifying the dynamic interactions in biochemical networks from noisy experimental data. Typically, approaches for solving this problem make use of an estimation algorithm such as the well-known linear Least-Squares (LS) estimation technique. We demonstrate that when time-series measurements are corrupted by white noise and/or drift noise, more accurate and reliable identification of network interactions can be achieved by employing an estimation algorithm known as Constrained Total Least Squares (CTLS). The Total Least Squares (TLS) technique is a generalised least squares method to solve an overdetermined set of equations whose coefficients are noisy. The CTLS is a natural extension of TLS to the case where the noise components of the coefficients are correlated, as is usually the case with time-series measurements of concentrations and expression profiles in gene networks.
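The difference between LS and TLS described above can be illustrated with a minimal sketch. It assumes the textbook SVD-based solution of the unconstrained TLS problem and independent white noise on every coefficient; it is not the CTLS algorithm of the article, which additionally accounts for correlated noise. The function names `ls_solve` and `tls_solve` and the synthetic data are hypothetical and chosen for illustration only.

```python
import numpy as np


def ls_solve(A, b):
    """Ordinary least squares: treats A as exact and attributes all error to b."""
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x


def tls_solve(A, b):
    """Basic total least squares via the SVD of the augmented matrix [A | b].

    Assumption: noise on A and b is independent and equally scaled.
    """
    n = A.shape[1]
    C = np.column_stack([A, b])
    # Rows of Vt are right singular vectors, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(C, full_matrices=False)
    v = Vt[-1]  # direction associated with the smallest singular value
    if abs(v[n]) < 1e-12:
        raise ValueError("TLS solution does not exist for this data")
    return -v[:n] / v[n]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x_true = np.array([1.5, -0.7])  # hypothetical parameters to recover
    A_clean = rng.normal(size=(200, 2))
    b_clean = A_clean @ x_true

    # Corrupt both the coefficients and the right-hand side with white noise.
    sigma = 0.3
    A_noisy = A_clean + sigma * rng.normal(size=A_clean.shape)
    b_noisy = b_clean + sigma * rng.normal(size=b_clean.shape)

    print("true:", x_true)
    print("LS  :", ls_solve(A_noisy, b_noisy))
    print("TLS :", tls_solve(A_noisy, b_noisy))
```

When the noise is confined to the right-hand side, ordinary LS is the appropriate estimator. The TLS formulation matters when the coefficient matrix is itself built from noisy measurements, as is the case for time-series data.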

Results

The superior performance of the CTLS method in identifying network interactions is demonstrated on three examples: a genetic network containing four genes, a network describing p53 activity and mdm2 messenger RNA interactions, and a recently proposed kinetic model for interleukin (IL)-6 and (IL)-12b messenger RNA expression as a function of ATF3 and NF-κB promoter binding. For the first example, the CTLS significantly reduces the errors in the estimation of the Jacobian for the gene network. For the second, the CTLS reduces the errors arising from measurements corrupted by white noise and from the effect of neglected kinetics. For the third, it allows the correct identification, from noisy data, of the negative regulation of (IL)-6 and (IL)-12b by ATF3.

Conclusion

The significant improvements in performance demonstrated by the CTLS method under the wide range of conditions tested here, including different levels and types of measurement noise and different numbers of data points, suggest that its application will enable more accurate and reliable identification and modelling of biochemical networks.