Table 2

A brief summary of the sequence data underlying the emerencia web service at http://emerencia.math.chalmers.se as of May 2005. The BLAST E-value thresholds for "good" and "poor" matches were set arbitrarily: an E-value of 0.0 counts as a good match, and an E-value above 1e-100 counts as a poor match. Graphical illustrations showing the population of the database over time and additional aspects of emerencia are generated automatically on a monthly basis and are available at the above address.
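The two thresholds in the caption can be read as a simple classification rule on the best BLAST E-value of a sequence. The following is a minimal sketch of that reading, not the code used by emerencia: the function name `classify_match`, the constant names, and the "intermediate" label for E-values between the two thresholds are assumptions made here for illustration.

```python
# Thresholds taken from the table caption.
GOOD_E_VALUE = 0.0       # "good" match: E-value equal to 0.0
POOR_E_VALUE = 1e-100    # "poor" match: E-value greater than 1e-100


def classify_match(e_value: float) -> str:
    """Label a best-hit BLAST E-value as good, poor, or intermediate.

    The "intermediate" label is hypothetical; the table only names
    the good and poor categories.
    """
    if e_value < 0.0:
        raise ValueError("E-values cannot be negative")
    if e_value == GOOD_E_VALUE:
        return "good"
    if e_value > POOR_E_VALUE:
        return "poor"
    return "intermediate"


if __name__ == "__main__":
    for e in (0.0, 1e-150, 1e-100, 1e-50):
        print(e, classify_match(e))
```

Under this reading, an E-value of exactly 1e-100 is neither good nor poor, because the poor category is defined by a strict inequality.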

NUMBER OF INSUFFICIENTLY IDENTIFIED SEQUENCES

7528 (21 % of total)

NUMBER OF IDENTIFIED SEQUENCES

28959 (79 % of total)

NUMBER OF INSUFFICIENTLY IDENTIFIED SEQUENCES WITH GOOD MATCHES (E-VALUE = 0.0)

4791 (64 % of the insufficiently identified sequences)

NUMBER OF INSUFFICIENTLY IDENTIFIED SEQUENCES WITH POOR MATCHES (E-VALUE > 1E-100)

1135 (15 % of the insufficiently identified sequences)

TOTAL NUMBER OF SEQUENCES LAST UPDATED BEFORE 1995-01-01

180 (0.5 %)

TOTAL NUMBER OF SEQUENCES LAST UPDATED BEFORE 2000-01-01

3651 (10 %)

TOTAL NUMBER OF SEQUENCES LAST UPDATED BEFORE 2005-01-01

31858 (87 %)

NUMBER OF INSUFFICIENTLY IDENTIFIED SEQUENCES LAST UPDATED BEFORE 2000-01-01

264 (3.5 % of the insufficiently identified sequences)

NUMBER OF INSUFFICIENTLY IDENTIFIED SEQUENCES LAST UPDATED BEFORE 2000-01-01 AND WITH POOR MATCHES (E-VALUE > 1E-100)

17 (0.2 % of the insufficiently identified sequences)

NUMBER OF IDENTIFIED SEQUENCES HAVING AT LEAST ONE INSUFFICIENTLY IDENTIFIED COUNTERPART AS IDENTIFIED BY BLAST

2981 (10 % of the identified sequences)

NUMBER OF IDENTIFIED SEQUENCES WITHOUT INSUFFICIENTLY IDENTIFIED COUNTERPARTS

25978 (90 % of the identified sequences)
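The percentages in the table follow from the raw counts by simple division. The sketch below recomputes them as a consistency check; it assumes that the total number of sequences is the sum of the identified and insufficiently identified counts (the table does not state the total explicitly), and the variable and function names are chosen here for illustration only.

```python
# Counts copied from the table.
insufficient = 7528
identified = 28959
total = insufficient + identified  # assumption: the two classes partition the data


def pct(part: int, whole: int, digits: int = 0) -> float:
    """Return part as a percentage of whole, rounded to `digits` decimals."""
    return round(100 * part / whole, digits)


rows = [
    ("insufficiently identified", insufficient, total, 0),
    ("identified", identified, total, 0),
    ("insufficient, good matches", 4791, insufficient, 0),
    ("insufficient, poor matches", 1135, insufficient, 0),
    ("updated before 1995-01-01", 180, total, 1),
    ("updated before 2000-01-01", 3651, total, 0),
    ("updated before 2005-01-01", 31858, total, 0),
    ("insufficient, before 2000-01-01", 264, insufficient, 1),
    ("insufficient, before 2000-01-01, poor", 17, insufficient, 1),
    ("identified with counterpart", 2981, identified, 0),
    ("identified without counterpart", 25978, identified, 0),
]

if __name__ == "__main__":
    for label, part, whole, digits in rows:
        print(f"{label}: {part} ({pct(part, whole, digits)} %)")
```

The last two rows also sum to the number of identified sequences (2981 + 25978 = 28959), as expected for complementary categories.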


Nilsson et al. BMC Bioinformatics 2005 6:178   doi:10.1186/1471-2105-6-178
