Open Access Research article

Bayesian integrated modeling of expression data: a case study on RhoG

Rashi Gupta1,2*, Dario Greco2, Petri Auvinen2 and Elja Arjas1,3

Author Affiliations

1 Department of Mathematics and Statistics, University of Helsinki, P.O. Box 68, FIN-00014, Helsinki, Finland

2 Institute of Biotechnology, University of Helsinki, P.O. Box 56, FIN-00014, Helsinki, Finland

3 National Institute for Health and Welfare (THL), Mannerheimintie 166, 00300 Helsinki, Finland

BMC Bioinformatics 2010, 11:295  doi:10.1186/1471-2105-11-295

Published: 1 June 2010

Abstract

Background

DNA microarrays provide an efficient method for measuring the activity of many genes in parallel, and a single array can even cover all known transcripts of an organism. This advantage must be weighed against the fact that analyzing microarray data involves several consecutive steps, each of which is a potential source of error. Because the steps are sequential, errors tend to accumulate when moving from lower-level to higher-level analyses. Eliminating such errors does not seem feasible without completely changing the technology, but one should nevertheless aim to assess realistically the degree of uncertainty involved when drawing final conclusions from such analyses.

Results

We present a Bayesian hierarchical model for finding differentially expressed genes between two experimental conditions. It is an integrated statistical approach in which the correction of signal saturation, systematic array effects, and dye effects, and the identification of differentially expressed genes, are all modeled jointly. The integration allows all these components, and the associated errors, to be considered simultaneously. Inference is based on the full posterior distribution of the gene expression indices and of quantities derived from them, rather than on point estimates. The model was applied to and tested on two different datasets.
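To make the contrast with point estimates concrete, the following is a minimal sketch of how posterior draws of a per-gene expression difference could be summarized. It is not the model described here: the gene names, the threshold of 1.0, the simulated Gaussian draws, and the helper functions `posterior_prob_de` and `credible_interval` are all hypothetical and serve only to illustrate inference from a full set of posterior samples.

```python
import random


def posterior_prob_de(samples, threshold=1.0):
    """Fraction of posterior draws whose absolute difference exceeds threshold."""
    return sum(abs(d) > threshold for d in samples) / len(samples)


def credible_interval(samples, level=0.95):
    """Equal-tailed credible interval read off the sorted posterior draws."""
    s = sorted(samples)
    n = len(s)
    lo = s[int((1 - level) / 2 * n)]
    hi = s[min(n - 1, int((1 + level) / 2 * n))]
    return lo, hi


# Hypothetical posterior draws of a log2 expression difference for two genes.
# In a real analysis these would come from an MCMC sampler, not from gauss().
rng = random.Random(0)
draws = {
    "gene_A": [rng.gauss(2.0, 0.5) for _ in range(5000)],  # shifted away from zero
    "gene_B": [rng.gauss(0.0, 0.5) for _ in range(5000)],  # centered at zero
}

for gene, d in draws.items():
    p = posterior_prob_de(d, threshold=1.0)
    lo, hi = credible_interval(d)
    print(f"{gene}: P(|diff| > 1) = {p:.3f}, 95% interval = ({lo:.2f}, {hi:.2f})")
```

Summaries of this kind report a probability and an interval for each gene, so the uncertainty carried through the earlier modeling steps remains visible in the final call.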

Conclusions

The method provides a way of integrating the various steps of microarray analysis into a single joint analysis, and thereby enables information on differential expression to be extracted in a manner that properly accounts for the various sources of potential error in the process.