Table 2

Percentage of problematic and singleton domain sequences.


                    Percentage of domains (%)
Sequence dataset    Transmembrane & problematic    Singleton

SP-trEMBL           18.5                           22.6
integr8_263         17.9                           24.9
A thaliana          17.5                           16.0
B anthracis         20.3                            8.6
C elegans           19.8                           22.1
D melanogaster      18.7                           18.7
E coli              15.7                            7.3
H sapiens           15.9                           20.9
S cerevisiae        14.9                           24.7
T maritima          13.4                           12.7

The percentage of problematic and singleton domain sequences in Swiss-Prot & TrEMBL, 263 completed genomes, and eight model genomes. Problematic domains are defined as those containing transmembrane helices or significant regions of low complexity or coiled-coil.

Marsden et al. BMC Bioinformatics 2007 8:86 doi:10.1186/1471-2105-8-86