Predicting nucleosome positioning using a duration Hidden Markov Model
1 Department of Statistics, Northwestern University, Evanston, IL 60208, USA
2 Department of Biochemistry, Molecular Biology and Cell Biology, Northwestern University, Evanston, IL 60208, USA
3 Department of Electrical Engineering and Computer Science, Northwestern University, Evanston, IL 60208, USA
4 Clinical and Translational Sciences Institute, Northwestern University, Chicago, IL 60611, USA
BMC Bioinformatics 2010, 11:346 doi:10.1186/1471-2105-11-346
Published: 24 June 2010
The nucleosome is the fundamental packing unit of DNA in eukaryotic cells. Its detailed positioning on the genome is closely related to chromosome functions. Increasing evidence has shown that the genomic DNA sequence itself is highly predictive of nucleosome positioning genome-wide. A fast software tool for predicting nucleosome positioning can therefore help in understanding how a genome's nucleosome organization may facilitate genome function.
We present a duration Hidden Markov Model for nucleosome positioning prediction that explicitly models the linker DNA length. The nucleosome and linker models, trained on yeast data, are re-scaled when making predictions for other species to adjust for differences in base composition. A software tool named NuPoP has been developed in three formats and is freely available for download.
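To illustrate the idea of a duration Hidden Markov Model with an explicit linker length distribution, the following minimal Python sketch finds the most probable alternation of linker and nucleosome segments with a Viterbi-style recursion over segment end points. It is not NuPoP's implementation or API: the function names (`prefix_scores`, `viterbi_duration_hmm`), the independent per-base emission probabilities, the uniform linker length prior, and the toy nucleosome length of 4 bp are all assumptions chosen to keep the example small. The sketch also assumes the sequence starts and ends with a linker.

```python
import math

NEG_INF = float("-inf")


def prefix_scores(seq, logp):
    """Cumulative log emission scores, so any segment score is a difference."""
    acc = [0.0]
    for base in seq:
        acc.append(acc[-1] + logp[base])
    return acc


def viterbi_duration_hmm(seq, nuc_len, linker_len_logp, nuc_logp, linker_logp):
    """Return 0-based start positions of nucleosomes in the best segmentation.

    Assumed toy model (hypothetical, for illustration only):
      - segments alternate linker, nucleosome, ..., linker
      - every nucleosome has fixed length nuc_len
      - linker length d has log probability linker_len_logp[d]
      - bases are emitted independently within each segment
    """
    n = len(seq)
    nuc_acc = prefix_scores(seq, nuc_logp)
    lnk_acc = prefix_scores(seq, linker_logp)

    end_nuc = [NEG_INF] * (n + 1)  # best score with a nucleosome ending at i
    end_lnk = [NEG_INF] * (n + 1)  # best score with a linker ending at i
    lnk_back = [0] * (n + 1)       # length of the best linker ending at i
    end_nuc[0] = 0.0               # virtual start, so the first segment is a linker

    for i in range(1, n + 1):
        # A linker of length d ends at i; the duration term is explicit.
        for d, logp_d in linker_len_logp.items():
            if d <= i and end_nuc[i - d] > NEG_INF:
                score = end_nuc[i - d] + logp_d + lnk_acc[i] - lnk_acc[i - d]
                if score > end_lnk[i]:
                    end_lnk[i] = score
                    lnk_back[i] = d
        # A fixed-length nucleosome ends at i and must follow a linker.
        if i >= nuc_len and end_lnk[i - nuc_len] > NEG_INF:
            end_nuc[i] = end_lnk[i - nuc_len] + nuc_acc[i] - nuc_acc[i - nuc_len]

    if end_lnk[n] == NEG_INF:
        raise ValueError("no segmentation is possible under the length model")

    # Trace back from the final linker, alternating linker and nucleosome.
    starts = []
    i = n
    while i > 0:
        i -= lnk_back[i]
        if i > 0:
            starts.append(i - nuc_len)
            i -= nuc_len
    starts.reverse()
    return starts


# Toy data: "nucleosomal" DNA is A-rich, linker DNA is T-rich.
nuc_logp = {"A": math.log(0.7), "C": math.log(0.1), "G": math.log(0.1), "T": math.log(0.1)}
linker_logp = {"A": math.log(0.1), "C": math.log(0.1), "G": math.log(0.1), "T": math.log(0.7)}
linker_len_logp = {d: math.log(1 / 5) for d in range(1, 6)}  # uniform on 1..5

seq = "TT" + "AAAA" + "TTT" + "AAAA" + "TT"
starts = viterbi_duration_hmm(seq, 4, linker_len_logp, nuc_logp, linker_logp)
print(starts)  # [2, 9]
```

Because the linker length enters the score through its own term, replacing the uniform prior with an empirical length distribution changes the preferred spacing between nucleosomes without touching the emission models.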
Simulation studies show that modeling the linker length distribution and applying a base composition re-scaling method both improve the prediction of nucleosome positioning in terms of sensitivity and false discovery rate. NuPoP provides a user-friendly software tool for predicting the nucleosome occupancy and the most probable nucleosome positioning map for genomic sequences of any size. When compared with two existing methods, NuPoP shows improved sensitivity.