Computing DNA duplex instability profiles efficiently with a two-state model: trends of promoters and binding sites
1 Department of Mathematics, University of Illinois at Urbana-Champaign, Urbana, IL, USA
2 National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, Urbana, IL, USA
3 Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA
BMC Bioinformatics 2010, 11:604 doi:10.1186/1471-2105-11-604
Published: 21 December 2010
DNA instability profiles have recently been used to predict the transcription start site and the location of core promoters, and to gain insight into promoter action. It has also been shown that using these profiles can significantly improve the performance of motif-finding programs.
In this work we introduce a new method for computing DNA instability profiles. The model we use is a modified Ising-type model, implemented via statistical mechanics. Our linear-time algorithm computes the profile of a 10,000-base-pair sequence in less than one second. The method also allows computation of the probability that several consecutive bases are unpaired simultaneously, a feature that is not available in other linear-time algorithms. We use the model to compare the thermodynamic trends of promoter sequences from several genomes. In addition, we report results that associate the location of local extrema in the instability profiles with the presence of core promoter elements at these locations and with the location of the transcription start sites (TSS). We also analyzed the instability scores of binding sites of several human core promoter elements. We show that the instability scores of functional binding sites of a given core promoter element differ significantly from the scores of sites with the same motif occurring outside the functional range (relative to the TSS).
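To illustrate how a two-state Ising-type model can yield an opening profile in linear time, the following is a minimal sketch using a standard forward/backward (transfer-matrix) recursion. It is not the authors' implementation: the per-base opening costs in OPEN_COST, the boundary weight SIGMA, and all function names are hypothetical choices made for illustration only. Each base is either closed (weight 1) or open (Boltzmann weight exp(-cost)), and every closed/open boundary between neighbors is multiplied by SIGMA. Rescaling against overflow on very long sequences is omitted for brevity.

```python
import math

# Illustrative opening costs in units of kT (assumed values, not from the paper).
OPEN_COST = {"A": 1.0, "T": 1.0, "G": 2.0, "C": 2.0}
SIGMA = 0.01  # assumed weight applied at each closed/open boundary


def forward_backward(seq, sigma=SIGMA):
    """Return forward table, backward table, and partition function in O(n)."""
    w = [math.exp(-OPEN_COST[b]) for b in seq]  # weight of the open state
    n = len(seq)
    # fwd[i][s]: total weight of sites 0..i with site i in state s
    # (s = 0 closed, s = 1 open).
    fwd = [[0.0, 0.0] for _ in range(n)]
    fwd[0] = [1.0, w[0]]
    for i in range(1, n):
        c, o = fwd[i - 1]
        fwd[i][0] = c + sigma * o
        fwd[i][1] = w[i] * (sigma * c + o)
    # bwd[i][s]: total weight of sites i+1..n-1 given site i is in state s.
    bwd = [[1.0, 1.0] for _ in range(n)]
    for i in range(n - 2, -1, -1):
        c, o = bwd[i + 1]
        bwd[i][0] = c + sigma * w[i + 1] * o
        bwd[i][1] = sigma * c + w[i + 1] * o
    z = fwd[n - 1][0] + fwd[n - 1][1]
    return fwd, bwd, z


def window_open_probability(seq, k=1, sigma=SIGMA):
    """P(bases i..i+k-1 are all open) for every window start i, in O(n)."""
    fwd, bwd, z = forward_backward(seq, sigma)
    # Prefix sums of opening costs give the weight of any all-open run in O(1).
    prefix = [0.0]
    for b in seq:
        prefix.append(prefix[-1] + OPEN_COST[b])
    probs = []
    for i in range(len(seq) - k + 1):
        j = i + k - 1
        # Weight of sites i+1..j all open (site i is already counted in fwd).
        inner = math.exp(-(prefix[j + 1] - prefix[i + 1]))
        probs.append(fwd[i][1] * inner * bwd[j][1] / z)
    return probs
```

With k=1 the function returns a per-base opening profile; a call such as window_open_probability("TATAAAGGCGC", k=3) returns the probability that each run of three consecutive bases is open at once. Under these assumed parameters, AT-rich stretches score as less stable than GC-rich ones.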
The time efficiency of the algorithm and its genome-wide applications make this work of broad interest to scientists interested in transcriptional regulation, motif discovery, and comparative genomics.