This article is part of the supplement: Eighth International Conference on Bioinformatics (InCoB2009): Computational Biology
DNA-binding residues and binding mode prediction with binding-mechanism concerned models
1 Department of Computer Science and Information Engineering, National Taiwan University, Taipei, 106, Taiwan, Republic of China
2 Department of Engineering Science and Ocean Engineering, National Taiwan University, Taipei, 106, Taiwan, Republic of China
3 Institute of Biomedical Engineering, National Taiwan University, Taipei, 106, Taiwan, Republic of China
4 Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, Taipei, 106, Taiwan, Republic of China
5 Center for Systems Biology and Bioinformatics, National Taiwan University, Taipei, 106, Taiwan, Republic of China
BMC Genomics 2009, 10(Suppl 3):S23 doi:10.1186/1471-2164-10-S3-S23
Published: 3 December 2009
Protein-DNA interactions are essential for fundamental biological activities, including DNA transcription, replication, packaging, repair and rearrangement. Proteins that interact with DNA can be classified by binding mechanism into two categories: sequence-specific binding and non-specific binding. Sequence-specific binding provides a mechanism for recognizing the correct nucleotide base pairs. Non-specific binding is sequence-independent: the protein interacts with the DNA backbone, which accelerates targeting. Both sequence-specific and non-specific binding residues contribute to the interaction.
The proposed framework has two prediction stages: DNA-binding residue prediction and binding mode prediction. In the first stage, the predictor for sequence-specific DNA-binding residues achieves 96.45% accuracy with 50.14% sensitivity, 99.31% specificity, 81.70% precision, and 62.15% F-measure. The predictor for non-specific DNA-binding residues achieves 89.14% accuracy with 53.06% sensitivity, 95.25% specificity, 65.47% precision, and 58.62% F-measure. When the predictions for sequence-specific and non-specific binding residues are combined with an OR operation, the predictor achieves 89.26% accuracy with 56.86% sensitivity, 95.63% specificity, 71.92% precision, and 63.51% F-measure. In the second stage, protein-DNA binding mode prediction achieves 75.83% accuracy using a multi-class support vector machine.
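The five measures reported above follow the usual confusion-matrix definitions, and the OR combination can be read as a per-residue logical OR of the two predictors' binary outputs. The following is a minimal Python sketch under those assumptions; the function names and the toy labels are illustrative and do not come from the article, and the article's own feature encoding and classifiers are not reproduced here.

```python
from typing import Dict, List, Sequence


def combine_or(pred_a: Sequence[int], pred_b: Sequence[int]) -> List[int]:
    """Per-residue logical OR of two binary prediction vectors (1 = binding)."""
    if len(pred_a) != len(pred_b):
        raise ValueError("prediction vectors must have the same length")
    return [1 if (a or b) else 0 for a, b in zip(pred_a, pred_b)]


def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (sensitivity)."""
    total = precision + recall
    return 2.0 * precision * recall / total if total else 0.0


def residue_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity, precision and F-measure for binary labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("label vectors must have the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)

    def ratio(num: int, den: int) -> float:
        # Guard against empty classes in small examples.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)
    precision = ratio(tp, tp + fp)
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": sensitivity,
        "specificity": ratio(tn, tn + fp),
        "precision": precision,
        "f_measure": f_measure(precision, sensitivity),
    }


if __name__ == "__main__":
    # Toy labels (hypothetical, not data from the study): 1 = DNA-binding residue.
    y_true = [1, 1, 1, 0, 0, 0, 0, 0]
    specific_pred = [1, 0, 0, 0, 0, 0, 0, 0]
    nonspecific_pred = [0, 1, 0, 1, 0, 0, 0, 0]

    combined = combine_or(specific_pred, nonspecific_pred)
    print(residue_metrics(y_true, combined))

    # Consistency check on the reported figures: F-measure from precision and sensitivity.
    print(f_measure(81.70, 50.14))
```

As a sanity check, applying `f_measure` to the reported precision and sensitivity of each predictor reproduces the reported F-measures to within rounding. The OR rule can only add predicted positives relative to either input, which is consistent with the combined sensitivity (56.86%) exceeding that of either individual predictor.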
This article presents the design of a sequence-based predictor that identifies sequence-specific and non-specific DNA-binding residues in transcription factors while taking the binding mechanism into account. Protein-DNA binding mode prediction was introduced to help improve DNA-binding residue prediction. In addition, the results of this study should help with the design of binding-mechanism concerned predictors for other families of proteins that interact with DNA.