BMC Bioinformatics

Methodology article (Open Access, Highly Accessed)

Parameter estimation for robust HMM analysis of ChIP-chip data

Peter Humburg1,2*, David Bulger1 and Glenn Stone2

Author Affiliations

1 Department of Statistics, Macquarie University, North Ryde, NSW 2109, Australia

2 CSIRO Mathematical and Information Sciences, North Ryde, NSW 2113, Australia

BMC Bioinformatics 2008, 9:343 doi:10.1186/1471-2105-9-343

Published: 18 August 2008

Abstract

Background

Tiling arrays are an important tool for the study of transcriptional activity, protein-DNA interactions and chromatin structure on a genome-wide scale at high resolution. Although hidden Markov models have been used successfully to analyse tiling array data, parameter estimation for these models is typically ad hoc. Especially in the context of ChIP-chip experiments, no standard procedures exist to obtain parameter estimates from the data. Common methods for the calculation of maximum likelihood estimates such as the Baum-Welch algorithm or Viterbi training are rarely applied in the context of tiling array analysis.

Results

Here we develop a hidden Markov model for the analysis of ChIP-chip tiling array data on chromatin structure, using t emission distributions to increase robustness to outliers. All model parameters are estimated by maximum likelihood. Two different approaches to parameter estimation are investigated and combined into an efficient procedure.
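
To make the role of the t emission distributions concrete, the following is a minimal sketch, not the implementation used in the article. It assumes a two-state HMM (state 0 for non-enriched probes, state 1 for enriched probes) with location-scale Student's t emissions, and decodes the most likely state path with the standard Viterbi algorithm. The function names `t_logpdf` and `viterbi`, the parameter values and the probe intensities are all invented for illustration; no parameter estimation is performed here.

```python
import math

def t_logpdf(x, mu, sigma, nu):
    """Log density of a location-scale Student's t distribution."""
    z = (x - mu) / sigma
    return (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
            - 0.5 * math.log(nu * math.pi) - math.log(sigma)
            - (nu + 1) / 2 * math.log1p(z * z / nu))

def viterbi(obs, log_init, log_trans, emit_params):
    """Most likely state path for an HMM with t emissions.

    emit_params[s] is a (mu, sigma, nu) tuple for state s.
    """
    n_states = len(log_init)
    # score[s] = best log probability of any path ending in state s
    score = [log_init[s] + t_logpdf(obs[0], *emit_params[s])
             for s in range(n_states)]
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for x in obs[1:]:
        prev, pointers, score = score, [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda r: prev[r] + log_trans[r][s])
            pointers.append(best)
            score.append(prev[best] + log_trans[best][s]
                         + t_logpdf(x, *emit_params[s]))
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]

# Hypothetical parameters, chosen only for this illustration.
log_init = [math.log(0.5), math.log(0.5)]
log_trans = [[math.log(0.9), math.log(0.1)],
             [math.log(0.1), math.log(0.9)]]
emit_params = [(0.0, 0.5, 4.0),   # state 0: non-enriched
               (2.0, 0.5, 4.0)]   # state 1: enriched

clean = [0.0, 0.1, -0.2, 2.0, 2.1, 1.9]
with_outlier = [0.0, 0.1, 8.0, -0.2, 0.1]
print(viterbi(clean, log_init, log_trans, emit_params))
print(viterbi(with_outlier, log_init, log_trans, emit_params))
```

With these made-up settings, the single extreme value 8.0 in `with_outlier` does not pull the decoded path out of state 0: the heavy tails of the t density make the outlier only slightly more likely under the enriched state, which is not enough to pay for two state changes. A Gaussian emission with the same location and scale would penalise the outlier far more strongly under state 0 and would favour a spurious switch.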

Conclusion

We illustrate an efficient parameter estimation procedure that can be used for HMM-based methods in general and leads to a clear increase in performance when compared to the use of ad hoc estimates. The resulting hidden Markov model outperforms established methods such as TileMap in the context of histone modification studies.