Optimizing the GATA-3 position weight matrix to improve the identification of novel binding sites
Ottawa Institute of Systems Biology and Department of Biochemistry, Microbiology and Immunology, Faculty of Medicine, University of Ottawa, Ottawa, Ontario, Canada
BMC Genomics 2012, 13:416 doi:10.1186/1471-2164-13-416Published: 22 August 2012
The identifying of binding sites for transcription factors is a key component of gene regulatory network analysis. This is often done using position-weight matrices (PWMs). Because of the importance of in silico mapping of tentative binding sites, we previously developed an approach for PWM optimization that substantially improves the accuracy of such mapping.
The present work implements the optimization algorithm applied to the existing PWM for GATA-3 transcription factor and builds a new di-nucleotide PWM. The existing available PWM is based on experimental data adopted from Jaspar. The optimized PWM substantially improves the sensitivity and specificity of the TF mapping compared to the conventional applications. The refined PWM also facilitates in silico identification of novel binding sites that are supported by experimental data. We also describe uncommon positioning of binding motifs for several T-cell lineage specific factors in human promoters.
Our proposed di-nucleotide PWM approach outperforms the conventional mono-nucleotide PWM approach with respect to GATA-3. Therefore our new di-nucleotide PWM provides new insight into plausible transcriptional regulatory interactions in human promoters.