Table 1

Definition of terms.

Term | Definition
SP-trEMBL | Sequence dataset containing 2,241,227 sequences from the Swiss-Prot (version 48.1) and TrEMBL (version 31.1) sequence databases.
Integr8_263 | Sequence dataset containing 913,094 sequences from 263 completed genomes listed in the Integr8 genome database.
Pfam_struc | Pfam-A family containing a PDB structure that has not yet been classified into the CATH domain database.
NewFam | Protein families generated in the Gene3D

Marsden et al. BMC Bioinformatics 2007 8:86   doi:10.1186/1471-2105-8-86