Open Access, Database

Collembase: a repository for springtail genomics and soil quality assessment

Martijn JTN Timmermans1*, Muriel E de Boer1, Benjamin Nota1, Tjalf E de Boer1, Janine Mariën1, Rene M Klein-Lankhorst2, Nico M van Straalen1 and Dick Roelofs1

Author Affiliations

1 Vrije Universiteit, Institute of Ecological Science, Department of Animal Ecology, De Boelelaan 1085, 1081 HV, Amsterdam, The Netherlands

2 PRI Greenomics, Droevendaalse steeg 1, 6708 PB Wageningen, The Netherlands


BMC Genomics 2007, 8:341  doi:10.1186/1471-2164-8-341

Published: 27 September 2007



Environmental quality assessment is traditionally based on responses of reproduction and survival of indicator organisms. For soil assessment, the springtail Folsomia candida (Collembola) is an accepted standard test organism. We argue that environmental quality assessment using gene expression profiles of indicator organisms exposed to test substrates is more sensitive, more toxicant-specific and significantly faster than current risk assessment methods. To apply this species as a genomic model for soil quality testing, we conducted an EST sequencing project and developed an online database.


Collembase is a web-accessible database comprising springtail (F. candida) genomic data. Presently, the database contains information on 8686 ESTs that are assembled into 5952 unique gene objects. Of those gene objects, ~40% showed homology to other protein sequences available in GenBank (blastx analysis; non-redundant (nr) database; expect-value < 10⁻⁵). Software was applied to infer protein sequences. The putative peptides, which had an average length of 115 amino acids (ranging between 23 and 440), were annotated with Gene Ontology (GO) terms. In total, 1025 peptides (~17% of the gene objects) were assigned at least one GO term (expect-value < 10⁻²⁵). Within Collembase, searches can be conducted based on BLAST and GO annotation or cluster name, or by using a BLAST server. The system furthermore enables easy sequence retrieval for functional genomic and quantitative PCR experiments. Sequences have been submitted to GenBank (accession numbers: EV473060–EV481745).
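The figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of Collembase: the helper `accession_count` is a hypothetical name, and it assumes the GenBank accessions form one contiguous, inclusive range from EV473060 to EV481745 with a shared two-letter prefix.

```python
def accession_count(first: str, last: str, prefix_len: int = 2) -> int:
    """Count accessions in an inclusive range that shares one letter prefix.

    Hypothetical helper for illustration; assumes the numeric parts are
    contiguous, which the abstract does not state explicitly.
    """
    if first[:prefix_len] != last[:prefix_len]:
        raise ValueError("accessions must share the same prefix")
    return int(last[prefix_len:]) - int(first[prefix_len:]) + 1


# Counts taken directly from the abstract.
ESTS = 8686
GENE_OBJECTS = 5952
GO_ANNOTATED = 1025

# Size of the assumed accession range EV473060 to EV481745.
n_accessions = accession_count("EV473060", "EV481745")

# Share of gene objects with at least one GO term, and mean ESTs per gene object.
go_fraction = GO_ANNOTATED / GENE_OBJECTS
ests_per_gene_object = ESTS / GENE_OBJECTS

print(f"accessions in range: {n_accessions}")
print(f"GO-annotated share:  {go_fraction:.1%}")
print(f"ESTs per gene object: {ests_per_gene_object:.2f}")
```

Under these assumptions the accession range covers exactly as many records as there are ESTs, and the GO-annotated share agrees with the ~17% reported.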


Collembase is a resource of sequence data on the springtail F. candida. The information within the database will be linked to a custom-made microarray, based on the Agilent platform, which can be applied for soil quality testing. In addition, Collembase supplies information that is valuable for related scientific disciplines such as molecular ecology, ecogenomics, molecular evolution and phylogenetics.