An optimized TOPS+ comparison method for enhanced TOPS models
1 Joint Center for Molecular Modeling, Sanford-Burnham Medical Research Institute, La Jolla, CA 92037, USA
2 School of Information Systems, Computing and Mathematics, Brunel University, Uxbridge, Middlesex UB8 3PH, UK
3 Algorithms, Bioinformatics, Complexity and Formal Methods Research Group, Technical University of Catalonia, E-08034 Barcelona, Spain
BMC Bioinformatics 2010, 11:138 doi:10.1186/1471-2105-11-138
Published: 17 March 2010
Although methods based on highly abstract descriptions of protein structures, such as VAST and TOPS, can perform very fast protein structure comparison, their results can lack a high degree of biological significance. We have previously discussed the basic mechanisms of our novel structure comparison method based on the TOPS+ model (Topological descriptions of Protein Structures Enhanced with Ligand Information). In this paper we show how these results can be significantly improved using parameter optimization; we call the resulting optimized method the advanced TOPS+ comparison method (advTOPS+).
We have developed a TOPS+ string model as an improvement on the TOPS [1-3] graph model. It treats loops as secondary structure elements (SSEs) in addition to helices and strands, represents ligands as first-class objects, and describes interactions between SSEs, and between SSEs and ligands, as incoming and outgoing arcs, annotating each SSE with the interaction direction and type. Benchmarking results of an all-against-all pairwise comparison on a large dataset of 2,620 non-redundant structures from the PDB40 dataset demonstrate the biological significance, in terms of SCOP classification at the superfamily level, of our TOPS+ comparison method.
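As an illustration only, the idea of annotating each element with its incoming and outgoing arcs can be sketched in Python. The element codes ("H", "E", "L", "X"), the interaction type codes ("p", "g"), the function names and the bracketed token format are all hypothetical choices made for this sketch; they are not the actual TOPS+ string encoding.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    # Hypothetical codes: "H" helix, "E" strand, "L" loop, "X" ligand.
    kind: str
    incoming: list = field(default_factory=list)  # types of arcs arriving here
    outgoing: list = field(default_factory=list)  # types of arcs leaving here


def annotate(kinds, interactions):
    """Build elements in sequence order and attach arc annotations.

    `interactions` is a list of (source_index, target_index, type) tuples.
    """
    elements = [Element(k) for k in kinds]
    for src, dst, itype in interactions:
        elements[src].outgoing.append(itype)
        elements[dst].incoming.append(itype)
    return elements


def to_string(elements):
    """Serialize as space-separated tokens of the form KIND[incoming|outgoing]."""
    tokens = []
    for e in elements:
        ins = "".join(sorted(e.incoming))
        outs = "".join(sorted(e.outgoing))
        tokens.append(f"{e.kind}[{ins}|{outs}]")
    return " ".join(tokens)


# Toy structure: strand, loop, helix, loop, strand, plus one ligand.
kinds = ["E", "L", "H", "L", "E", "X"]
# Made-up interactions: strand 0 -> strand 4 ("p"), helix 2 -> ligand 5 ("g").
interactions = [(0, 4, "p"), (2, 5, "g")]

print(to_string(annotate(kinds, interactions)))
# E[|p] L[|] H[|g] L[|] E[p|] X[g|]
```

A flat string of this kind lets two structures be compared with sequence-style algorithms rather than graph matching, which is the general motivation for moving from a graph model to a string model.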
Our advanced TOPS+ comparison performs better on the PDB40 dataset than our basic TOPS+ method, giving 90% accuracy for SCOP alpha+beta, a 6% increase in accuracy compared to the TOPS and basic TOPS+ methods. It also outperforms the TOPS, basic TOPS+ and SSAP comparison methods on the Chew-Kedem dataset, achieving 98% accuracy.