Research article

Comparative mapping of sequence-based and structure-based protein domains

Ya Zhang1,2,3*, John-Marc Chandonia1, Chris Ding2 and Stephen R Holbrook1*

Author Affiliations

1 Physical Biosciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA

2 Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA

3 School of Information Sciences and Technology, The Pennsylvania State University, University Park, PA 16802, USA

BMC Bioinformatics 2005, 6:77  doi:10.1186/1471-2105-6-77

Published: 25 March 2005



Protein domains have long been an ill-defined concept in biology. They are generally described as autonomous folding units with evolutionary and functional independence. Both structure-based and sequence-based domain definitions are widely used, but whether either type of model alone captures all essential features of domains remains an open question.


Here we provide insight into domain definitions through comparative mapping of two domain classification databases, one sequence-based (Pfam) and the other structure-based (SCOP). We define a mapping score to indicate the significance of each mapping, and we study the properties of the resulting mapping matrices.
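As a rough illustration of what a mapping matrix between the two databases might look like, the following Python sketch counts how often a sequence-based family and a structure-based family overlap on the same protein chain. This is not the mapping score defined in the article, which is not given in this summary. The tuple layout, the family labels (`pf_a`, `scop_x`, and so on), the function names, and the simple fraction used in place of a score are all assumptions made for the example.

```python
from collections import defaultdict


def overlap(a_start, a_end, b_start, b_end):
    """Number of residues shared by two inclusive residue ranges."""
    return max(0, min(a_end, b_end) - max(a_start, b_start) + 1)


def mapping_counts(seq_hits, struct_hits, min_overlap=1):
    """Count sequence-domain instances that overlap each structural family.

    Each hit is a hypothetical tuple: (chain_id, family_id, start, end).
    A sequence-domain instance contributes at most once per structural family.
    """
    struct_by_chain = defaultdict(list)
    for chain, fam, start, end in struct_hits:
        struct_by_chain[chain].append((fam, start, end))

    counts = defaultdict(int)
    for chain, seq_fam, s_start, s_end in seq_hits:
        matched = {
            st_fam
            for st_fam, t_start, t_end in struct_by_chain.get(chain, [])
            if overlap(s_start, s_end, t_start, t_end) >= min_overlap
        }
        for st_fam in matched:
            counts[(seq_fam, st_fam)] += 1
    return dict(counts)


def mapping_fractions(counts, seq_hits):
    """Stand-in score: fraction of a sequence family's instances mapped to a structural family."""
    totals = defaultdict(int)
    for _, seq_fam, _, _ in seq_hits:
        totals[seq_fam] += 1
    return {pair: n / totals[pair[0]] for pair, n in counts.items()}


# Made-up annotations, for illustration only.
seq_hits = [
    ("chain1", "pf_a", 10, 100),
    ("chain1", "pf_b", 120, 200),
    ("chain2", "pf_a", 5, 90),
]
struct_hits = [
    ("chain1", "scop_x", 1, 110),
    ("chain1", "scop_y", 115, 210),
    ("chain2", "scop_x", 1, 50),
    ("chain2", "scop_z", 51, 95),
]

counts = mapping_counts(seq_hits, struct_hits)
fractions = mapping_fractions(counts, seq_hits)
```

In this toy data, `pf_a` always coincides with `scop_x` but only sometimes with `scop_z`, which is the kind of one-to-many relationship a mapping matrix makes visible. A real analysis would use the score defined in the full article rather than this fraction.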


The mapping results show a general agreement between the two databases, as well as many interesting areas of disagreement. In the cases of disagreement, the functional and evolutionary characteristics of the domains are examined to determine which domain definition is biologically more informative.