Open Access Research article

Relating gene expression data on two-component systems to functional annotations in Escherichia coli

Anne M Denton1, Jianfei Wu1, Megan K Townsend2, Preeti Sule2 and Birgit M Prüß2

1 Department of Computer Science and Operations Research, North Dakota State University, Fargo, ND 58105, USA

2 Department of Veterinary and Microbiological Sciences, North Dakota State University, Fargo, ND 58105, USA


BMC Bioinformatics 2008, 9:294. doi:10.1186/1471-2105-9-294

Published: 25 June 2008

Abstract

Background

Obtaining physiological insights from microarray experiments requires computational techniques that relate gene expression data to functional information. Traditionally, this has been done in two consecutive steps. The first step identifies important genes through clustering or statistical techniques, while the second step assigns biological functions to the identified groups. Recently, techniques have been developed that identify such relationships in a single step.

Results

We have developed an algorithm that relates patterns of gene expression in a set of microarray experiments to functional groups in one step. Our only assumption is that patterns co-occur frequently. The effectiveness of the algorithm is demonstrated as part of a study of regulation by two-component systems in Escherichia coli. The significance of the relationships between expression data and functional annotations is evaluated based on density histograms that are constructed using product similarity among expression vectors. We present a biological analysis of three of the resulting functional groups of proteins, develop hypotheses for further biological studies, and test one of these hypotheses experimentally. A comparison with other algorithms and a different data set is presented.
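The idea of a density histogram built from product similarity can be illustrated with a minimal sketch. This is only one possible reading and not the authors' implementation. It assumes that "product similarity" means the dot product of two expression vectors and that "density" means the number of other genes whose similarity to a given gene exceeds a cutoff. The function names, the `threshold` parameter, and the boolean `in_group` mask are hypothetical.

```python
import numpy as np

def neighbor_counts(expr, threshold):
    """Count, for each gene, the other genes whose dot-product
    similarity to it exceeds `threshold` (an assumed cutoff)."""
    expr = np.asarray(expr, dtype=float)
    sim = expr @ expr.T               # pairwise product similarity
    np.fill_diagonal(sim, -np.inf)    # a gene is not its own neighbor
    return (sim > threshold).sum(axis=1)

def density_histograms(expr, in_group, threshold):
    """Histogram the neighbor counts for genes in a functional group
    and for all genes, using shared integer bins."""
    counts = neighbor_counts(expr, threshold)
    bins = np.arange(counts.max() + 2)
    group_hist, _ = np.histogram(counts[np.asarray(in_group)], bins=bins)
    all_hist, _ = np.histogram(counts, bins=bins)
    return group_hist, all_hist
```

Under these assumptions, a functional group whose histogram is shifted toward higher densities than the histogram for all genes would be a candidate for a significant relationship. How significance is actually assessed is not specified in the abstract.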

Conclusion

Our new algorithm is able to find interesting and biologically meaningful relationships, not found by other algorithms, in previously analyzed data sets. Scaling of the algorithm to large data sets can be achieved based on a theoretical model.

