A comparison of Pfam and MEROPS: Two databases, one comprehensive, and one specialised.
* Corresponding author: David J Studholme ds2@sanger.ac.uk
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
BMC Bioinformatics 2003, 4:17 doi:10.1186/1471-2105-4-17
Published: 9 May 2003
Additional files
Additional file 1:
For each of the MEROPS families that overlapped exactly one Pfam family, the number of unique sequences in the intersection was counted and expressed as a percentage of the total number of unique sequences in the union of the two sets. The numbers of unique sequences found only in the MEROPS family and the numbers found only in the Pfam family were also counted and expressed as percentages of the number of unique sequences in the union of both families.
Format: DOC Size: 98KB
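The percentages described for this file can be illustrated with a minimal sketch. It assumes each family is available as a collection of sequence identifiers; the function name `overlap_percentages` and the identifiers `P1` to `P5` are hypothetical and do not come from the paper or from either database.

```python
def overlap_percentages(merops_seqs, pfam_seqs):
    """Express shared and database-specific sequences as percentages of the union.

    Both arguments are iterables of sequence identifiers (hypothetical input
    format); duplicates are collapsed so only unique sequences are counted.
    """
    merops = set(merops_seqs)
    pfam = set(pfam_seqs)
    union = merops | pfam
    if not union:
        raise ValueError("both families are empty")
    total = len(union)
    return {
        "shared": 100.0 * len(merops & pfam) / total,       # intersection
        "merops_only": 100.0 * len(merops - pfam) / total,  # only in MEROPS
        "pfam_only": 100.0 * len(pfam - merops) / total,    # only in Pfam
    }


# Made-up identifiers; "P3" is listed twice to show that duplicates are ignored.
example = overlap_percentages(["P1", "P2", "P3", "P3"], ["P2", "P3", "P4", "P5"])
print(example)
```

Because all three values share the same denominator (the union), they sum to 100% for any pair of families.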
Additional file 2:
For each of the MEROPS families that overlapped more than one Pfam family, the number of unique sequences in the intersection was counted and expressed as a percentage of the total number of unique sequences in the union of the two sets. The numbers of unique sequences found only in the MEROPS family and the numbers found only in the Pfam family were also counted and expressed as percentages of the number of unique sequences in the union of both families.
Format: DOC Size: 114KB
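Additional files 1 and 2 separate MEROPS families by whether they overlap exactly one Pfam family or more than one. A rough sketch of such a partition follows. It assumes that "overlap" means sharing at least one sequence identifier, which may be looser than the criterion actually used in the paper; the function name and all family and sequence names are hypothetical.

```python
def split_by_overlap_count(merops_families, pfam_families):
    """Group MEROPS families by how many Pfam families they overlap.

    Both arguments map a family name to a collection of sequence identifiers.
    ASSUMPTION: two families overlap if they share at least one identifier.
    """
    result = {"one": {}, "many": {}, "none": []}
    for m_name, m_seqs in merops_families.items():
        m_set = set(m_seqs)
        # Names of every Pfam family sharing at least one sequence.
        hits = sorted(p for p, p_seqs in pfam_families.items() if m_set & set(p_seqs))
        if len(hits) == 1:
            result["one"][m_name] = hits
        elif len(hits) > 1:
            result["many"][m_name] = hits
        else:
            result["none"].append(m_name)
    return result


# Made-up families and sequence identifiers, for illustration only.
merops = {"M_a": {"s1", "s2"}, "M_b": {"s3", "s4"}, "M_c": {"s9"}}
pfam = {"PF_x": {"s1"}, "PF_y": {"s3"}, "PF_z": {"s4", "s5"}}
groups = split_by_overlap_count(merops, pfam)
print(groups)
```

Under this assumption, families in the "none" group would correspond to MEROPS families with no counterpart in Pfam.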
Additional file 3:
We attempted to build new Pfam families to represent those peptidase families in MEROPS that did not substantially overlap any family in Pfam. Where we successfully created a new family, the Pfam accession number and long name are given.
Format: DOC Size: 33KB
Additional file 4:
The sets of member sequences found in each family were compared as described in the text. Those sequences identified as members of a Pfam family but not included in the overlapping MEROPS family were examined in detail to determine the reasons for the discrepancies. As a result, many of these sequences were judged to be bona fide members of the family and were added to the MEROPS database.
Format: DOC Size: 39KB
Additional file 5:
The sets of member sequences found in each family were compared as described in the text. Those sequences identified as members of a MEROPS family but not included in the overlapping Pfam family were examined in detail to determine the reasons for the discrepancies. Several sequences had been wrongly included in MEROPS as a result of trivial errors (column 2). Some discrepancies could be explained by differences in the underlying sequence data, where Pfam 7.8 used an older (out-of-date) version of a SwissProt/TrEMBL sequence. The remaining discrepancies could be explained by differences in the methods of building families between the two databases.
Format: DOC Size: 44KB
