Open Access Highly Accessed Methodology article

A local average distance descriptor for flexible protein structure comparison

Hsin-Wei Wang1, Chia-Han Chu2, Wen-Ching Wang2,3 and Tun-Wen Pai1*

Author Affiliations

1 Department of Computer Science and Engineering, National Taiwan Ocean University, Keelung, Taiwan

2 Biomedical Science and Engineering Center, National Tsing Hua University, Hsinchu, Taiwan

3 Institute of Molecular and Cellular Biology and Department of Life Science, National Tsing Hua University, Hsinchu, Taiwan

BMC Bioinformatics 2014, 15:95  doi:10.1186/1471-2105-15-95

Published: 2 April 2014

Abstract

Background

Protein structures are flexible and often undergo conformational changes upon binding to other molecules in order to exert their biological functions. Because protein structure correlates with function, structure comparison allows proteins of unknown function to be classified and their functions predicted. However, most comparison methods treat proteins as rigid bodies and cannot effectively detect similarity between proteins that differ by large conformational changes.

Results

In this paper, we propose a novel descriptor, local average distance (LAD), based on either geodesic distances (GDs) or Euclidean distances (EDs), for pairwise flexible protein structure comparison. The proposed method was compared with 7 structural alignment methods and 7 shape descriptors on two datasets of hinge bending motions from MolMovDB. Our method outperformed all other methods in retrieving similar structures, as measured by precision-recall curve, retrieval success rate, R-precision, mean average precision and F1-measure.
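
Several of the retrieval measures named above have standard information-retrieval definitions. The following is a minimal Python sketch of R-precision, average precision, mean average precision and F1, assuming each query yields a ranked list of structure identifiers and a set of known relevant structures. The function names, the toy identifiers and the use of a fixed cutoff k for F1 are illustrative assumptions, not details taken from the paper, whose exact evaluation protocol may differ.

```python
from typing import Hashable, Iterable, Sequence, Set, Tuple


def r_precision(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Precision within the top R results, where R = number of relevant items."""
    r = len(relevant)
    if r == 0:
        return 0.0
    return sum(1 for item in ranked[:r] if item in relevant) / r


def average_precision(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Mean of the precision values taken at the rank of each relevant hit."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def mean_average_precision(
    queries: Iterable[Tuple[Sequence[Hashable], Set[Hashable]]]
) -> float:
    """Average of average_precision over all (ranked, relevant) query pairs."""
    scores = [average_precision(ranked, relevant) for ranked, relevant in queries]
    return sum(scores) / len(scores) if scores else 0.0


def f1_at_k(ranked: Sequence[Hashable], relevant: Set[Hashable], k: int) -> float:
    """Harmonic mean of precision and recall over the top k results.

    The cutoff k is an assumption of this sketch.
    """
    if k <= 0 or not relevant:
        return 0.0
    true_positives = sum(1 for item in ranked[:k] if item in relevant)
    if true_positives == 0:
        return 0.0
    precision = true_positives / k
    recall = true_positives / len(relevant)
    return 2 * precision * recall / (precision + recall)


# Toy query: five retrieved structures, three of which are truly similar.
toy_ranked = ["a", "b", "c", "d", "e"]
toy_relevant = {"a", "c", "e"}
toy_r_precision = r_precision(toy_ranked, toy_relevant)
toy_average_precision = average_precision(toy_ranked, toy_relevant)
toy_f1 = f1_at_k(toy_ranked, toy_relevant, k=3)
```

In the toy query, two of the top three results are relevant, so R-precision and F1 at k = 3 both equal 2/3. Average precision rewards relevant structures that appear early in the ranking, which is why it is commonly reported alongside the precision-recall curve.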

Conclusions

Both ED- and GD-based LAD descriptors are effective for searching deformed structures and overcome the self-connection problems caused by large bending motions. We have also demonstrated that the ED-based LAD descriptor is more robust than the GD-based one. The proposed algorithm provides an alternative approach for searching structure databases, discovering previously unknown conformational relationships, and reorganizing protein structure classification.