Open Access Research article

Flanking sequence context-dependent transcription factor binding in early Drosophila development

Jessica L Stringham (1), Adam S Brown (2), Robert A Drewell (2,3,4) and Jacqueline M Dresch (5,6)*

Author Affiliations

1 Computer Science Department, Harvey Mudd College, 301 Platt Boulevard, Claremont, CA 91711, USA

2 Biology Department, Harvey Mudd College, 301 Platt Boulevard, Claremont, CA 91711, USA

3 Department of Biological Sciences, Mount Holyoke College, South Hadley, MA 01705, USA

4 Department of Biology, Amherst College, Amherst, MA 01002, USA

5 Mathematics Department, Harvey Mudd College, 301 Platt Boulevard, Claremont, CA 91711, USA

6 Department of Mathematics, Amherst College, Amherst, MA 01002, USA

BMC Bioinformatics 2013, 14:298  doi:10.1186/1471-2105-14-298

Published: 4 October 2013

Abstract

Background

Gene expression in the Drosophila embryo is controlled by functional interactions between a large network of protein transcription factors (TFs) and specific sequences in DNA cis-regulatory modules (CRMs). The binding site sequences for any TF can be experimentally determined and represented in a position weight matrix (PWM). PWMs can then be used to predict the location of TF binding sites in other regions of the genome, although there are limitations to this approach as currently implemented.
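As an illustration of the general PWM idea described above, the following minimal sketch builds a log-odds PWM from a handful of aligned sites and slides it along a sequence to find the best-scoring window. The site sequences, the pseudocount, the uniform background frequency of 0.25, and the function names `build_pwm` and `scan` are all assumptions made for this example; they are not data or methods taken from the study.

```python
import math

BASES = "ACGT"

# Hypothetical aligned binding sites (toy data, not from the study).
SITES = ["TTAATCC", "TTAATCC", "CTAATCC", "TTAATCG", "TTAAGCC"]


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Return one {base: log2-odds score} dict per alignment position."""
    pwm = []
    for i in range(len(sites[0])):
        # Start every base at the pseudocount so unseen bases avoid log(0).
        counts = {b: pseudocount for b in BASES}
        for site in sites:
            counts[site[i]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2(counts[b] / total / background) for b in BASES})
    return pwm


def scan(sequence, pwm):
    """Score every window of PWM width; return a list of (start, score)."""
    width = len(pwm)
    return [
        (start, sum(pwm[j][sequence[start + j]] for j in range(width)))
        for start in range(len(sequence) - width + 1)
    ]


pwm = build_pwm(SITES)
hits = scan("GGCATTAATCCGTA", pwm)
best_start, best_score = max(hits, key=lambda hit: hit[1])
```

Here the highest-scoring window starts where the sequence matches the consensus of the toy sites. Note that each window is scored from its own bases alone, so any sequence outside the window has no influence on the prediction.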

Results

In this proof-of-principle study, we analyze 127 CRMs and focus on four TFs that control transcription of target genes along the anterior-posterior axis of the embryo early in development. For all four of these TFs, there is some degree of conserved flanking sequence that extends beyond the predicted binding regions. These conserved flanking sequences may enhance the specificity of TF binding, as their abundance is greatly diminished when we examine only predicted high-affinity binding sites.

Conclusions

Expanding PWMs to include sequence context-dependence will increase their information content and facilitate more efficient functional identification and dissection of CRMs.
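To make the notion of information content concrete, the sketch below computes the standard Shannon information (in bits, relative to a uniform background) of each alignment column and sums it over a motif. It then compares a toy core motif with a version extended by one flanking position on each side. The sequences and the function names `column_information` and `total_information` are hypothetical, and the study may define information content differently.

```python
import math

BASES = "ACGT"


def column_information(column):
    """Information content (bits) of one alignment column, uniform background."""
    counts = {b: 0 for b in BASES}
    for base in column:
        counts[base] += 1
    total = sum(counts.values())
    # Shannon entropy of the observed base frequencies (0 * log 0 treated as 0).
    entropy = -sum(
        (c / total) * math.log2(c / total) for c in counts.values() if c > 0
    )
    return 2.0 - entropy  # 2 bits is the maximum for a four-letter alphabet


def total_information(sites):
    """Sum the per-column information over all aligned positions."""
    return sum(column_information(column) for column in zip(*sites))


# Hypothetical alignments: a 4-bp core and the same sites with flanks added.
core = ["TAAT", "TAAT", "TAAT", "TAAT"]
extended = ["ATAATC", "ATAATG", "ATAATC", "CTAATC"]

core_bits = total_information(core)
extended_bits = total_information(extended)
```

In this toy case the partially conserved flanking columns each contribute additional bits, so the extended matrix carries more information than the core alone. Flanking columns with no conservation would contribute close to zero.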

Keywords:
Transcription factor; Binding site; Position weight matrix; Enhancer; Cis-regulatory module; Drosophila