Open Access Research article

A comparison of Pfam and MEROPS: Two databases, one comprehensive, and one specialised.

David J Studholme, Neil D Rawlings, Alan J Barrett and Alex Bateman

Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK


BMC Bioinformatics 2003, 4:17 doi:10.1186/1471-2105-4-17

Published: 9 May 2003

Additional files

Additional file 1:

For each of the MEROPS families that overlapped exactly one Pfam family, the number of unique sequences in the intersection was counted and expressed as a percentage of the total number of unique sequences in the union of the two sets. The numbers of unique sequences found only in the MEROPS family and the numbers found only in the Pfam family were also counted and expressed as percentages of the number of unique sequences in the union of both families.

Format: DOC. Size: 98 KB

This file can be viewed with: Microsoft Word Viewer
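
The percentages described for Additional file 1 can be illustrated with a minimal sketch. It assumes each family is available as a collection of unique sequence identifiers; the function name `overlap_percentages` and the example identifiers are hypothetical and are not taken from the article or from either database.

```python
def overlap_percentages(merops_ids, pfam_ids):
    """Express shared and database-specific sequences as percentages of the union.

    Both arguments are iterables of sequence identifiers (hypothetical input
    format); duplicates are collapsed so that only unique sequences count.
    """
    merops = set(merops_ids)
    pfam = set(pfam_ids)
    union = merops | pfam
    if not union:
        raise ValueError("both families are empty")
    total = len(union)
    return {
        # Sequences present in both families (the intersection).
        "shared": 100.0 * len(merops & pfam) / total,
        # Sequences found only in the MEROPS family.
        "merops_only": 100.0 * len(merops - pfam) / total,
        # Sequences found only in the Pfam family.
        "pfam_only": 100.0 * len(pfam - merops) / total,
    }


if __name__ == "__main__":
    # Made-up identifiers, for illustration only.
    merops_family = ["seqA", "seqB", "seqC"]
    pfam_family = ["seqB", "seqC", "seqD", "seqE"]
    print(overlap_percentages(merops_family, pfam_family))
```

Because all three figures share the same denominator (the union), they sum to 100 for any pair of families.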

Additional file 2:

For each of the MEROPS families that overlapped more than one Pfam family, the number of unique sequences in the intersection was counted and expressed as a percentage of the total number of unique sequences in the union of the two sets. The numbers of unique sequences found only in the MEROPS family and the numbers found only in the Pfam family were also counted and expressed as percentages of the number of unique sequences in the union of both families.

Format: DOC. Size: 114 KB


Additional file 3:

We attempted to build new Pfam families to represent those peptidase families in MEROPS that did not substantially overlap any family in Pfam. Where we successfully created a new family, the Pfam accession number and long name are given.

Format: DOC. Size: 33 KB


Additional file 4:

The sets of member sequences found in each family were compared as described in the text. Those sequences identified as members of a Pfam family but not included in the overlapping MEROPS family were examined in detail to determine the reasons for the discrepancies. As a result, many of these sequences were judged to be bona fide members of the family and were added to the MEROPS database.

Format: DOC. Size: 39 KB


Additional file 5:

The sets of member sequences found in each family were compared as described in the text. Those sequences identified as members of a MEROPS family but not included in the overlapping Pfam family were examined in detail to determine the reasons for the discrepancies. Several sequences had been wrongly included in MEROPS as a result of trivial errors (column 2). Some discrepancies could be explained by differences in the underlying sequence data, where Pfam 7.8 used an older (out-of-date) version of a SwissProt/trEMBL sequence. The remaining discrepancies could be explained by differences in the methods of building families between the two databases.

Format: DOC. Size: 44 KB


