Open Access Methodology article

Predicting nucleosome positioning using a duration Hidden Markov Model

Liqun Xi1, Yvonne Fondufe-Mittendorf2, Lei Xia3, Jared Flatow4, Jonathan Widom2* and Ji-Ping Wang1*

Author Affiliations

1 Department of Statistics, Northwestern University, Evanston, IL 60208, USA

2 Department of Biochemistry, Molecular Biology and Cell Biology, Northwestern University, Evanston, IL 60208, USA

3 Department of Electrical Engineering and Computer Science, Evanston, IL 60208, USA

4 Clinical and Translational Sciences Institute, Northwestern University, Chicago, 60611, USA

BMC Bioinformatics 2010, 11:346  doi:10.1186/1471-2105-11-346

Published: 24 June 2010

Abstract

Background

The nucleosome is the fundamental packaging unit of DNA in eukaryotic cells. Its detailed positioning on the genome is closely related to chromosome function. Increasing evidence shows that the genomic DNA sequence itself is highly predictive of nucleosome positioning genome-wide. A fast software tool for predicting nucleosome positioning can therefore help in understanding how a genome's nucleosome organization may facilitate genome function.

Results

We present a duration Hidden Markov model for nucleosome positioning prediction that explicitly models the linker DNA length. The nucleosome and linker models, trained on yeast data, are re-scaled when making predictions for other species to adjust for differences in base composition. A software tool named NuPoP has been developed and is freely available for download in three formats.
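To illustrate what explicitly modeling the linker length means, the following is a minimal sketch of Viterbi decoding in a two-state duration HMM, in which fixed-length nucleosome segments alternate with linker segments whose length is drawn from an explicit distribution. It is not the NuPoP implementation: the function names, the per-base emission probabilities, the uniform linker length distribution, and the 10 bp toy nucleosome length (in place of the biological 147 bp) are all assumptions chosen for illustration, and the handling of truncated segments at the sequence ends is ignored.

```python
import math

NEG_INF = float("-inf")

def log_prefix(seq, probs):
    """Prefix sums of per-base log-probabilities, so any segment scores in O(1)."""
    prefix = [0.0]
    for base in seq:
        prefix.append(prefix[-1] + math.log(probs[base]))
    return prefix

def viterbi_duration_hmm(seq, nuc_len, nuc_probs, link_probs, link_dur):
    """Return 0-based start positions of nucleosomes in the best-scoring parse.

    seq        : DNA string over A/C/G/T
    nuc_len    : fixed nucleosome length (hypothetical toy value in the example)
    nuc_probs  : per-base emission probabilities in the nucleosome state (assumed)
    link_probs : per-base emission probabilities in the linker state (assumed)
    link_dur   : dict mapping linker length d >= 1 to its probability
    """
    n = len(seq)
    nuc_pre = log_prefix(seq, nuc_probs)
    link_pre = log_prefix(seq, link_probs)
    # end_nuc[i] / end_link[i]: best log-score of a parse of seq[:i]
    # whose last segment is a nucleosome / a linker.
    end_nuc = [NEG_INF] * (n + 1)
    end_link = [NEG_INF] * (n + 1)
    back_link = [0] * (n + 1)  # linker length chosen for the best linker ending at i

    for i in range(1, n + 1):
        # Nucleosome covering seq[i - nuc_len:i], preceded by a linker or the start.
        j = i - nuc_len
        if j >= 0:
            prev = 0.0 if j == 0 else end_link[j]
            end_nuc[i] = prev + nuc_pre[i] - nuc_pre[j]
        # Linker covering seq[i - d:i], preceded by a nucleosome or the start.
        for d, p in link_dur.items():
            j = i - d
            if j < 0 or p <= 0:
                continue
            prev = 0.0 if j == 0 else end_nuc[j]
            score = prev + math.log(p) + link_pre[i] - link_pre[j]
            if score > end_link[i]:
                end_link[i] = score
                back_link[i] = d

    if max(end_nuc[n], end_link[n]) == NEG_INF:
        raise ValueError("no valid parse for this sequence length and duration model")

    # Trace back, alternating between the two segment types.
    starts = []
    i = n
    in_nuc = end_nuc[n] >= end_link[n]
    while i > 0:
        if in_nuc:
            i -= nuc_len
            starts.append(i)
        else:
            i -= back_link[i]
        in_nuc = not in_nuc
    return sorted(starts)

if __name__ == "__main__":
    # Toy input: a GC-rich 10 bp block flanked by AT-rich 8 bp stretches.
    toy_seq = "AT" * 4 + "GC" * 5 + "TA" * 4
    nuc = {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1}
    link = {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4}
    durations = {d: 1.0 / 12 for d in range(1, 13)}
    print(viterbi_duration_hmm(toy_seq, 10, nuc, link, durations))
```

Because the linker length enters the score through `link_dur` rather than through a self-transition probability, the linker length distribution can take any shape instead of being forced to be geometric, which is the property a duration HMM adds over a standard HMM.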

Conclusions

Simulation studies show that modeling the linker length distribution and applying a base composition re-scaling method both improve the prediction of nucleosome positioning in terms of sensitivity and false discovery rate. NuPoP provides a user-friendly software tool for predicting the nucleosome occupancy and the most probable nucleosome positioning map for genomic sequences of any size. Compared with two existing methods, NuPoP shows improved sensitivity.
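The two evaluation measures named above can be sketched as follows. This is an illustrative calculation only: the abstract does not state how a predicted nucleosome is matched to a reference one, so the matching rule used here (centers within a fixed tolerance in base pairs) and the function name `sensitivity_and_fdr` are assumptions.

```python
def sensitivity_and_fdr(predicted, reference, tolerance):
    """Compare predicted and reference nucleosome center positions.

    A reference position counts as detected, and a predicted position as
    correct, if a counterpart lies within +/- tolerance bp (assumed rule).
    Sensitivity = detected reference positions / all reference positions.
    FDR         = incorrect predictions / all predictions.
    """
    detected = sum(
        any(abs(r - p) <= tolerance for p in predicted) for r in reference
    )
    correct = sum(
        any(abs(p - r) <= tolerance for r in reference) for p in predicted
    )
    sensitivity = detected / len(reference) if reference else float("nan")
    fdr = 1 - correct / len(predicted) if predicted else float("nan")
    return sensitivity, fdr

if __name__ == "__main__":
    # Hypothetical positions, for illustration only.
    print(sensitivity_and_fdr([100, 300, 520, 900], [105, 310, 700], tolerance=20))
```

Reporting both measures together matters because a method can raise sensitivity simply by predicting more nucleosomes, which the false discovery rate then penalizes.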