Open Access Research article

Tardigrade workbench: comparing stress-related proteins, sequence-similar and functional protein clusters as well as RNA elements in tardigrades

Frank Förster1, Chunguang Liang1, Alexander Shkumatov2, Daniela Beisser1, Julia C Engelmann1, Martina Schnölzer3, Marcus Frohme4, Tobias Müller1, Ralph O Schill5 and Thomas Dandekar1*

Author Affiliations

1 Dept of Bioinformatics, Biocenter University of Würzburg, 97074 Würzburg, Germany

2 EMBL, Hamburg Outstation, Notkestrasse 85, 22603 Hamburg, Germany

3 Functional Proteome Analysis, German Cancer Research Center, Im Neuenheimer Feld 580, 69120 Heidelberg, Germany

4 University of Applied Sciences, Bahnhofstraße 1, 15745 Wildau, Germany

5 Dept of Zoology, Institute for Biology, Universität Stuttgart, 70569 Stuttgart, Germany

BMC Genomics 2009, 10:469  doi:10.1186/1471-2164-10-469

The electronic version of this article is the complete one and can be found online at: http://www.biomedcentral.com/1471-2164/10/469


Received: 14 April 2009
Accepted: 12 October 2009
Published: 12 October 2009

© 2009 Förster et al; licensee BioMed Central Ltd.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Background

Tardigrades represent an animal phylum with extraordinary resistance to environmental stress.

Results

To gain insights into their stress-specific adaptation potential, we identify major clusters of related and similar proteins, delineate specific functional clusters by comparing all tardigrades and individual species (Milnesium tardigradum, Hypsibius dujardini, Echiniscus testudo, Tulinus stephaniae, Richtersius coronifer), and analyse functional elements in tardigrade mRNAs. We find that 39.3% of the total sequences cluster in 58 clusters of more than 20 proteins. Among these are ten tardigrade-specific as well as a number of stress-specific protein clusters. Tardigrade-specific functional adaptations include strong protein, DNA and redox protection, maintenance and protein recycling. Specific regulatory elements such as lox P DICE elements regulate tardigrade mRNA stability, whereas 14 other RNA elements of higher eukaryotes are not found. Further features of tardigrade-specific adaptation are rapidly identified by sequence and/or pattern search on the web tool tardigrade analyzer (http://waterbear.bioapps.biozentrum.uni-wuerzburg.de). The workbench offers nucleotide pattern analysis for promoter and regulatory element detection (tardigrade specific; nrdb) as well as rapid COG search for function assignments, including species-specific repositories of all analysed data.

Conclusion

Different protein clusters and regulatory elements implicated in tardigrade stress adaptations are analysed including unpublished tardigrade sequences.

Background

Tardigrades are small metazoans resembling microscopic bears ("water-bears", 0.05 mm to 1.5 mm in size) that live in marine, freshwater and terrestrial environments, especially in lichens and mosses [1-3]. They are a phylum of multi-cellular animals capable of reversibly suspending their metabolism and entering a state of cryptobiosis [4,5]. A dehydrated tardigrade, known as the anhydrobiotic tun stage [6,7], can survive for years without water. Moreover, the tun is resistant to extreme pressures and temperatures (low/high), as well as radiation and vacuum [8-13].

Well known species include Hypsibius dujardini, which is an obligatory parthenogenetic species [14]. The tardigrade H. dujardini can be cultured continuously for decades and can be cryopreserved. It has a compact genome, a little smaller than that of Caenorhabditis elegans or Drosophila melanogaster, and the rate of protein evolution in H. dujardini is similar to that of other metazoan taxa [15]. H. dujardini has a short generation time of 13-14 days at room temperature. Embryos of H. dujardini have a stereotyped cleavage pattern with asymmetric cell divisions, nuclear migrations, and cell migrations occurring in reproducible patterns [15]. Molecular data are sparse but include the purinergic receptor occurring in H. dujardini [16].

Milnesium tardigradum is an abundant and ubiquitous terrestrial tardigrade species in Europe and possibly worldwide [17]. It has unique anatomy and motion characteristics compared to other water bears. While most water bears prefer vegetarian food, M. tardigradum is more carnivorous, feeding on rotifers and nematodes. The animals are remarkably tough and long-lived, one of the reasons why M. tardigradum is one of the best-studied species so far.

Questions of general interest are: How related are tardigrade proteins to each other? Which protein families provide tardigrade-specific adaptations? Which regulatory elements influence the mRNA stability? Starting from all published tardigrade sequences as well as 607 unpublished new sequences from Milnesium tardigradum, we analyse tardigrade specific clusters of related proteins, functional protein clusters and conserved regulatory elements in mRNA mainly involved in mRNA stability. The different clusters and identified motifs are analysed and discussed, all data are also available as a first anchor to study specific adaptations of tardigrades in more detail (Tardigrade workbench). Furthermore, the tardigrade analyzer, a sequence server to analyse individual tardigrade specific sequences, is made available. It will be regularly updated to include new tardigrade sequences. It has a number of new features for tardigrade analysis not available from standard servers such as the NIH Entrez system [18]: several new species-specific searches (Echiniscus testudo, Tulinus stephaniae), additional new sequence information (M. tardigradum) and pattern-searches for nucleotide sequences (including pattern search on non-redundant protein database, NRDB). An easy search for clusters of orthologous groups (COG, [19]) different from the COGnitor tool [20] allowing tardigrade specific COG and eukaryotic COG (KOG) searches is also available.

Furthermore, a batch mode allows a rapid analysis of up to 100 sequences simultaneously when uploaded in a file in FASTA format (for tardigrade species or NRDB).
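As a rough illustration of how such a batch upload could be checked before submission, the following sketch parses FASTA text and enforces the limit of 100 sequences stated above. The function names and the error message are hypothetical and do not reflect the server's actual implementation, which is written in Perl.

```python
from typing import Iterator, List, Tuple

MAX_BATCH = 100  # upper limit of sequences per batch, as stated in the text


def read_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence lines may be wrapped
    if header is not None:
        yield header, "".join(chunks)


def check_batch(text: str) -> List[Tuple[str, str]]:
    """Return all records, or raise if the batch exceeds MAX_BATCH (hypothetical check)."""
    records = list(read_fasta(text))
    if len(records) > MAX_BATCH:
        raise ValueError(f"batch holds {len(records)} sequences, limit is {MAX_BATCH}")
    return records
```

Wrapped sequence lines are joined, so a record spread over several lines is returned as one sequence string.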

Two fifths of the tardigrade sequences cluster in longer protein families, and we hypothesise for a number of these that they are implicated in the unique stress adaptation potential of tardigrades. We find also ten tardigrade specific clusters. The unique tardigrade adaptions are furthermore indicated by a number of functional COGs and KOGs identified here, showing a particular emphasis on the protection of proteins and DNA. RNA read out is specifically regulated by several motifs for mRNA stability clearly overrepresented in tardigrades.

Results and Discussion

We analysed all publicly available tardigrade sequences (status 9th of April 2009) as well as 607 unpublished M. tardigradum sequences from our ongoing transcriptome analysis.

Major tardigrade protein clusters of related sequence-similar proteins

All available tardigrade sequences were clustered by the CLANS algorithm [21]. Interestingly, 39.3 % of the predicted proteins (mainly EST-based predictions) cluster in just 58 major families, each with at least 20 sequences [see additional file 1: Table S1]. These include 4,242 EST sequences from a total of 10,787.

Additional file 1. Additional Tables and Figures. The file contains seven additional figures and two additional tables. One of these tables summarizes annotation and different identifiers for 607 new EST sequences from Milnesium tardigradum.

Using these clusters, a number of tardigrade-specific adaptations become apparent (Table 1 [and additional file 1: Table S1]): the clusters include elongation factors (cluster 12), ribosomal RNAs and proteins (cluster 1, 4, 32 and 56) which are part of the transcriptional or translational machinery. Cluster 5 (chitinase binding domain [22]) could provide membrane and structural reorganization or immune protection (e.g. fungi) according to homologous protein sequences characterized in other organisms. Other clusters show protein families related to the tardigrade stress adaptation potential, e.g. ubiquitin-related proteins (cluster 14; maybe stress-induced protein degradation) and cytochrome oxidase-related proteins (cluster 2, suggested to be involved in respiratory chain).

Table 1. CLANS clusters of sequence similar proteins in published tardigrade sequences1

Moreover, proteins responsible for protein degradation (cluster 15) were found as well as proteins regulating peptidases (cluster 16). Cluster 23 consists of 53 heat shock proteins which are involved in many stress response reactions [23]. Few diapause specific proteins (cluster 24) are known from other animals. Diapause is a reversible state of developmental suspension. It is observed in diverse taxa, from plants to animals, including marsupials and some other mammals [24] as well as insects (associated molecular function varies but involves calcium channel inhibition [25]) and should here support the tun formation or regulate other (e.g. developmental) metabolic inactive states. Furthermore, proteins involved in storage or transportation of fatty acids also seem to be important (cluster 31, [26]). Late embryogenesis abundant (LEA) protein expression seems to be linked to desiccation stress and the acquisition of desiccation tolerance in organisms [27] e.g. nematodes [28,29] and rotifers [30]. Thirty-one LEA type 1 family proteins were found in cluster 38.

LEA proteins are widespread among plants and are synthesized in response to certain stresses [31,32]. The LEA type 1 family is well known in higher plants (rice, maize, carrots), where it is synthesized during late embryogenesis and in the ABA stress response. It includes the desiccation-related protein PCC3-06 of Craterostigma plantagineum. The LEA type 1 family occurs in bacteria (e.g. Haemophilus influenzae, Deinococcus radiodurans), but is atypical for animals. However, tardigrades are an animal example where LEA family type 1 is well represented and forms a full cluster.

Moreover, ten clusters (8, 18, 19, 30, 33, 35, 37, 42, 51, 55) consist of proteins which seem to be specific for tardigrades. These show no significant homology to known proteins.

Functional clusters of stress-specific adaptations present in tardigrades

To gain a systematic overview of involved tardigrade functions, all available tardigrade sequences were classified species-specifically according to COG functional category [19,20] as well as according to COG number and molecular function encoded. Note that in this section "protein" implies one type of protein. A COG or KOG often comprises several sequences from different tardigrades. Prokaryotic (COG) and eukaryotic (KOG) gene clusters were compared (Table 2; details on the web at http://waterbear.bioapps.biozentrum.uni-wuerzburg.de/). Again, several tardigrade-specific adaptations stand out, e.g. highly represented COGs regulate translation elongation factor and sulfate adenylate transferase and a strong ubiquitin system. There are many cysteine proteases (21 proteins). For redox protection there are 14 thioredoxin-domain containing proteins and 75 Heme/copper-type cytochrome/quinol-like proteins as well as ubiquinone oxidoreductase subunits (15 proteins). There are ten proteins involved in seleno-cysteine specific translation [33,34]. In eukaryotes, selenoproteins show a mosaic occurrence, with some organisms, such as vertebrates and algae, but notably also tardigrades, having dozens of these proteins, while other organisms, such as higher plants and fungi, have lost all selenoproteins during evolution [34]. Membrane GTPases (25 proteins) are often of Lep A (leader peptidase [35]) type in tardigrades. In general, members of the GTPase superfamily regulate membrane signaling pathways in all cells. However, LepA, as well as NodO, are prokaryotic-type GTPases very similar to protein synthesis elongation factors but apparently have membrane-related functions [35]. It is interesting to observe this prokaryotic-type GTPase in tardigrades. We suggest that it has a similar function as known in other organisms and thus ensures protein translation (elongation factor) coupled to membrane integrity and possibly cytoskeletal rearrangement, which would again boost the tardigrade resistance to stress.

Table 2. Highly represented protein functions in Tardigrades (COGs and KOGs).

The KOGs show similar highly represented families and adaptations. Aberrant proteins are rapidly recognized by ubiquitination-like proteins (220 proteins) and ubiquitin-ligase related enzymes (71 proteins) as well as proteasome regulatory subunits (85 proteins). For protein protection and refolding, disulfide isomerases (26 proteins) and cyclophilin type peptidyl-prolyl cis-trans isomerases (43 proteins; KOG 0879-0885) are available. Connected to redox protection are also thirty AAA+type ATPases and three peroxisome assembly factor 2 containing proteins (KOG0736). This broad effort in protein protection is further supported by molecular chaperones (HSP70, mortalins and others; total of 50 proteins) and chaperonin complex components (32 proteins; KOG0356-0364). There are six superoxide dismutases and six copper chaperones for thioredoxins (37 proteins), glutaredoxin-like proteins (nine) and ten thiodisulfide isomerases as well as 52 glutathione-S-transferases. We found 22 hits to helicases. Tardigrade DNA protection is represented by 52 proteins of the molecular chaperone DNA J family: proteins of the DNA J family are classified into 3 types according to their structural domain decomposition. Type I J proteins are composed of the J domain, a gly-rich region connecting the J domain and a zinc finger domain, and possibly a C-terminal domain. Type II lacks the Zn-finger domain and type III only contains the J domain [36,37]. The latter two are referred to as DnaJ-like proteins. Analysis of the domains present in tardigrade proteins by SMART [38] and Pfam [39] searches reveals only the J domain and in some cases a transmembrane region, identifying them as type III DnaJ-like proteins. For further information on these COGs/KOGs see Table 3.
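The three-type scheme described above can be written down as a small decision rule. The following is only a sketch: the domain labels ("J", "gly_rich", "zinc_finger", "transmembrane") are hypothetical names chosen for illustration and are not the identifiers reported by SMART or Pfam.

```python
def classify_dnaj(domains: set) -> str:
    """Assign a DnaJ type from a set of (hypothetical) domain labels."""
    if "J" not in domains:
        return "not a J protein"
    has_gly = "gly_rich" in domains
    has_zinc = "zinc_finger" in domains
    if has_gly and has_zinc:
        return "type I"    # J domain, gly-rich region and zinc finger domain
    if has_gly and not has_zinc:
        return "type II"   # lacks the zinc finger domain
    if not has_gly and not has_zinc:
        return "type III"  # only the J domain (other regions are ignored here)
    return "unclassified"  # combination not covered by the scheme in the text
```

Under this rule, a protein with a J domain plus a transmembrane region, as reported for some tardigrade sequences, still falls into type III.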

Table 3. Identified DnaJ-family COGs/KOGs in Tardigrades and Milnesium tardigradum1.

Moreover, undesired proteins can be rapidly degraded by cathepsin F-like proteins (31 proteins) or L-like proteins (46 proteins). There are several calcium-dependent protein kinases (25 proteins; KOG0032-0034) and actin-bundling proteins. According to this observation calcium signaling should be implicated in adaptive rearrangement of the cytoskeleton during tardigrade rehydration. The cytoskeleton is a key element in the organisation of eukaryotic cells. It has been described in the literature that the properties of actin are modulated by small heat-shock proteins including a direct actin-small heat-shock protein interaction to inhibit actin polymerization to protect the cytoskeleton [40,41] (compare with the CLANS cluster 24 (Diapause proteins) found in the above analysis).

Translation in tardigrades includes polypeptide release factors (71 proteins) and proteins for translation elongation (77 proteins). There are about 80 GTP-binding ADP-ribosylation factors. The secretion system and Rab/Ras GTPases are fully represented (183 proteins). Seventeen tubulin anchor proteins show that the cytoskeleton is well maintained. Finally, we find 14 TNF-associated factors and 34 apolipoprotein D/lipocalin proteins.

Typical motifs in tardigrade mRNAs

The regulatory motif search showed a number of known regulatory RNA elements involved in tardigrade mRNA regulation (Table 4 for H. dujardini and M. tardigradum). It cannot be formally ruled out that some of these elements work in a tardigrade-modified way. Similarly, there are probably further patterns which are tardigrade specific but are not detected by the UTRscan software [42] applied for the analysis.

Table 4. Regulatory elements in Hypsibius dujardini1 and Milnesium tardigradum2 mRNA sequences.

The RNA elements found include the lox-P DICE element [43] in H. dujardini as top hit with as many as 1,269 ESTs (23.6% of all H. dujardini EST sequences). The cytidine-rich 15-lipoxygenase differentiation control element (15-LOX DICE, [44]) binds KH domain proteins of the type hnRNP E and K (stronger in multiple copies), mediating mRNA stabilization and translational control [43].

Furthermore, a high number of mRNAs contains K-Boxes (cUGUGAUa, [45]) and brd Boxes (AGCUUUA, [46]). All these elements are involved in mRNA storage and mRNA stability. These two elements are potential targets for miRNAs as shown in Drosophila melanogaster [47].

However, in the two tardigrade species compared, only 16 of 30 well known RNA elements are found, suggesting a clear bias in tardigrade mRNA regulation. For example, the widely used AU rich elements in higher organisms [42] such as vertebrates are absent in tardigrades [see additional file 1].

Regulatory elements in tardigrade mRNA are probably important for their adaptation, in particular to support transformation to tun stage and back to active stage again. The list of RNA elements found can be compared for instance to our data on regulatory elements in human anucleate platelets [48] where mRNAs have to be stockpiled for the whole life of the platelet. Due to this comparatively long life, a long mRNA untranslated region is important in these cells. The same should apply to tardigrade mRNAs, since their average UTR is predicted to be long. A different stock-piling scenario occurs in unfertilized eggs, but due to developmental constraints, here localization signals are often in addition important for developmental gradients. We tested for these in tardigrades but did not find a high representation of localization motifs.

Web-tool tardigrade analyzer

We created a convenient platform to allow rapid sequence comparisons of new protein sequences, in particular from new sequencing efforts in tardigrades, to our database by applying rapid heuristic local alignment using BLAST [49] and allowing to search in selected species.

A batch mode allows the analysis of up to 100 sequences simultaneously when uploaded in a file in FASTA format. Output data are displayed according to an enhanced BLAST output format with graphical illustrations. Low expected E-values result for searches using the option of our tardigrade specific databases: a more specific smaller database reduces the probability of false positives. As an alternative for general sequence analysis, a search against the non-redundant database of GenBank can be performed. This takes more computational power and yields higher E-values; however, it identifies functions for most sequences. An additional useful feature is to scan all available data for peptide motifs or PROSITE signatures using a "pattern" module [additional file 1: Fig. S1] or to assign potential functions by COGs [19]. The first is helpful to recognize tardigrade proteins in cases where the tardigrade sequence has diverged far, and only residues critical for function are still conserved as motif signatures. It can also be applied to search for regulatory RNA motifs such as polyadenylation sites (e.g. AAUAAA or AAUUAA) or to recognize promoter modules such as the glucocorticoid receptor element (GRE; palindromic pattern: AGAACAnnnTGTTCT). For this purpose, both the tardigrade sequences and the non-redundant database can be searched (e.g. to look for stress-specific regulatory RNA elements; [additional file 1, Fig. S2]).
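A nucleotide pattern search of this kind can be approximated with regular expressions. The sketch below is an illustration only and is not the code of the "pattern" module: it assumes that "n" stands for any nucleotide and that T and U should be treated as equivalent so that the same pattern works on DNA and RNA sequences; the function names are hypothetical.

```python
import re

# Assumed mapping: T and U are interchangeable, N matches any nucleotide.
SYMBOLS = {"A": "A", "C": "C", "G": "G", "T": "[TU]", "U": "[TU]", "N": "[ACGTU]"}


def pattern_to_regex(pattern: str) -> str:
    """Translate a nucleotide pattern such as AGAACAnnnTGTTCT into a regex string."""
    return "".join(SYMBOLS[symbol] for symbol in pattern.upper())


def find_sites(pattern: str, sequence: str) -> list:
    """Return 0-based start positions of all (possibly overlapping) matches."""
    lookahead = "(?=" + pattern_to_regex(pattern) + ")"
    return [match.start() for match in re.finditer(lookahead, sequence.upper())]


# Example: the GRE pattern and a polyadenylation signal from the text.
gre_hits = find_sites("AGAACAnnnTGTTCT", "ccAGAACAtgaTGTTCTgg")
polya_hits = find_sites("AAUAAA", "GGAATAAACC")
```

The lookahead keeps overlapping occurrences, which matters for short, repetitive signals such as AU-rich stretches.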

Interestingly, this nucleotide (RNA or DNA) specific option is not available on some common servers, e.g. the PHI-BLAST [50] server at NIH. Further options include a user-defined database [additional file 1: Fig. S3] and interactively animated stress clusters (Figure 1).

Figure 1. Functional clusters by CLANS of sequence related proteins in tardigrades. All available [see additional file 1: Figure S5] tardigrade protein sequences were clustered in a 3D sphere according to their sequence distance and were projected to the paper plane. Individual protein functions are colored [for color code see additional file 1: Table S1] and all listed in Table 1. Functional clusters appear as patches of an individual color. Color code and clusters can be interactively examined at the Tardigrade workbench (http://waterbear.bioapps.biozentrum.uni-wuerzburg.de) and are given in [additional file 1: Table S1].

The tool (http://waterbear.bioapps.biozentrum.uni-wuerzburg.de/) allows rapid searches for tardigrade specific sequences, e.g. molecular adaptations against stress [see additional file 1 for screenshots and a tutorial]. For instance, a search for trehalase sequences shows no trehalase mRNA in the H. dujardini sequences. In contrast, there are several heat shock proteins in tardigrades; an example is HSP90 proteins (identified by sequence similarity as well as by a pattern hit based approach using the PROSITE entry PS00298 with the signature Y-x-[NQHD]-[KHR]-[DE]-[IVA]-F-[LM]-R-[ED]; Table 5). Specific COGs are also rapidly assigned for any desired sequence. This includes the option to map the query sequence of interest to any of the known tardigrade specific COGs. Furthermore, nucleotide patterns such as mRNA polyadenylation sites are rapidly identified e.g. in H. dujardini mRNAs [additional file 1: Fig. S4]. Similarly, other mRNA 3'UTR elements can be identified, e.g. AU rich sequences mediating mRNA instability or regulatory K-boxes (motif cUGUGAUa, [45]) in tardigrades.
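To illustrate what a pattern hit based approach involves, the sketch below translates a PROSITE-style signature into a regular expression and applies it to a peptide. It covers only a subset of the PROSITE syntax (x for any residue, [..] for allowed residues, {..} for excluded residues, and repeat counts in parentheses) and is not the implementation used by the tardigrade analyzer; the function name and the test peptide are hypothetical.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern (subset of the syntax) into a regex string."""
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        element = element.strip()  # tolerate spaces around the separators
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = match.groups()
        if core == "x":
            core = "."  # any residue
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"  # excluded residues
        if low is not None:
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(core)
    return "".join(parts)


# Signature quoted in the text for PROSITE entry PS00298 (HSP90).
HSP90_REGEX = prosite_to_regex("Y-x-[NQHD]-[KHR]-[DE]-[IVA]-F-[LM]-R-[ED]")
hit = re.search(HSP90_REGEX, "MKYSNKEIFLRELIS")  # made-up test peptide
```

Because only the residues critical for the signature are constrained, such a pattern can still hit sequences that have diverged too far for a similarity search.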

Table 5. HSP90 proteins identified in Hypsibius dujardini using the Tardigrade analyzer1.

Implications

Tardigrades show a surprisingly large number of related sequences. Certainly, one has to correct for a few genes sequenced from many lineages for phylogenetic studies in tardigrades (cytochrome c, rRNA etc.). However, despite this, a number of tardigrade-specific clusters still remain. Furthermore, Table 1 shows that most of the annotated clusters are stress-related.

Looking at specific protein functions, both COG and KOG proteins show that tardigrades spend an extraordinary effort in protein protection, turnover and recycling as well as redox protection. Some other specific adaptations become apparent also from Table 2, but the complete extent of these adaptations is unclear given the limited sampling of available tardigrade sequences. Furthermore, protection of DNA is critical as it has been shown that tardigrade tuns accumulate DNA damage which first has to be repaired before resurrection occurs [51,52]. Taking this into consideration, DNA J proteins were investigated in more detail since proteins of this family are well represented in tardigrades, including several COGs and KOGs. Several data underline the extremely high resistance of tardigrades to temperature, pressure and radiation as well as a high repair potential regarding DNA [11,51]. Thus, we suggest that the high repair potential is also mediated by this well represented protein family. Phylogenetic analysis (Table 3) shows that these proteins are represented by several KOGs as well as the classic COGs in tardigrades. In particular, the first three KOG families are also used in M. tardigradum, where extreme stress tolerance requires strong repair mechanisms [17]. Furthermore, all these tardigrade proteins in Table 3 are small, having neither zinc-finger domains nor low complexity regions, but instead consisting of single DNA J domains, which would always place them in type III (subfamily C) of DNA-J like proteins. This suggests that the direct interaction with DNA-J like proteins is the key molecular function.

Finally, we could show that there are 16 regulatory elements used in tardigrade mRNA, while a number of other elements known from higher eukaryotic organisms and vertebrates is not used. It is interesting to note that the elements often used in tardigrades are all involved in regulation of mRNA stability. Thus, they may be implicated in stage switching, as presumably in the initial phases of the tun awakening or tun formation, new supply of mRNA is turned off and instead regulation of synthesized mRNA becomes important.

In addition, and for further research, we supply the web tool tardigrade analyzer. There are a number of alternative tools available, e.g. from NCBI (http://www.ncbi.nlm.nih.gov/). However, we offer some species-specific searches not available from these sources as well as RNA and promoter pattern search (not only for tardigrades but also for NRDB; not available from NIH). Furthermore, we provide functional COG prediction, new, unpublished tardigrade sequences from M. tardigradum, all data reported above including the reported sequences and detailed functional clusterings, and regular server updates. A better understanding of the survival mechanisms in these organisms will lead to the development of new methods in several areas of biotechnology, for example for the preservation of biological materials in situ, macromolecules and cells from non-adapted organisms [53]. This is, of course, only a first and very general overview of potential tardigrade-specific adaptations; more species-specific data will be considered as more information becomes available.

Conclusion

Tardigrade genomes invest in stress-specific adaptations; these include major sequence-related protein clusters, functional clusters for stress as well as specific regulatory elements in mRNA. For further tardigrade genome analysis we offer the tardigrade workbench as a flexible tool for rapid and efficient analysis of sequence similarity, protein function and clusters, COG membership and regulatory elements.

Methods

Tardigrade-sequences

The cosmopolitan eutardigrade species M. tardigradum Doyére 1849 (Apochela, Milnesidae) was cultured. Tardigrades were kept and reared on petri dishes (diameter: 9.4 cm) filled with a small layer of agarose (3 %) (peqGOLD Universal Agarose, peqLAB, Erlangen, Germany) and covered with spring water (Volvic™ water, Danone Waters Deutschland, Wiesbaden, Germany) at 20 ± 2°C and a light/dark cycle of 12 h. Rotifers Philodina citrina and nematodes Panagrellus sp. were provided as food source, juvenile tardigrades were also fed with green algae Chlorogonium elongatum. For all experiments adult animals in good physical condition were taken directly from the culture and starved for three days to avoid preparation of additional RNA originating from not completely digested food in the intestinal system. For an overview of RNAs present both in active and tun stage we used a mixture of the same number of animals.

Total RNA extraction was performed using the QIAGEN RNeasy® Mini kit (Qiagen, Hilden, Germany). cDNA was synthesized by reverse transcription of 1 μg total RNA using the Creator™ SMART™ cDNA Library Construction Kit (Clontech-Takara Bio Europe, France). The resulting cDNA was amplified following the manufacturer's protocol and cloned into the pDNR-Lib cloning vector. The resulting plasmids were used to transform Escherichia coli by electroporation. Sequencing of the cDNA library was done on an ABI 3730XL capillary sequencer (GATC Biotech AG, Konstanz, Germany). All obtained EST sequences were deposited with GenBank including the dbEST databank.

Nucleotide sequences from other tardigrades were collected from Genbank. For H. dujardini, the best represented species, we composed 5,235 ESTs. We stored H. dujardini as well as all published sequences of other tardigrade species (e.g. T. stephaniae, E. testudo, M. tardigradum, R. coronifer) in a database (10,787 sequences including translated sequences, details in [additional file 1], status on April, 2009).

CLANS clustering

For a systematic overview of tardigrade specific adaptations we first clustered all published tardigrade nucleotide sequences into functional clusters (Figure 1) using the Cluster analysis of sequences (CLANS) algorithm [21]. All sequences were clustered in 3D space using 0.001 as an E-value cut-off for TBLASTX all-against-all searches [additional file 1: Fig. S4].
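The role of the E-value cut-off can be illustrated with a much simpler stand-in for CLANS. The sketch below does not reproduce the CLANS layout in 3D space; it only keeps pairwise hits at or below the 0.001 cut-off and joins connected sequences into groups (single linkage via union-find), which is an assumption made purely for illustration. The input format, a list of (query, subject, E-value) triples, and the function name are likewise hypothetical.

```python
from collections import defaultdict

E_CUTOFF = 1e-3  # E-value cut-off stated in the text


def connected_groups(hits, cutoff=E_CUTOFF):
    """Group sequence IDs linked by hits with E-value <= cutoff (not the CLANS algorithm)."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for query, subject, evalue in hits:
        root_q, root_s = find(query), find(subject)  # registers both IDs
        if evalue <= cutoff and root_q != root_s:
            parent[root_q] = root_s  # merge the two groups

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    return sorted(groups.values(), key=len, reverse=True)


# Toy input: the weak hit between c and d (E = 0.5) does not join them.
example = connected_groups([("a", "b", 1e-10), ("b", "c", 1e-5), ("c", "d", 0.5)])
```

Counting the groups with at least 20 members in such an output would correspond to the notion of a major family used in the Results, although the actual cluster boundaries in CLANS come from its graph layout rather than from single linkage.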

Identification of regulatory elements

For this the ESTs of H. dujardini and M. tardigradum were systematically screened using the software UTRscan [42]. This software screens 30 regulatory elements for RNA regulation with a focus on 3' UTR elements and stability of mRNA. The default settings for batch mode were used and all reported elements were collected.

COG clustering and identification

In order to acquire a systematic overview of the functionalities, we used the latest version of the COG/KOG databases (ftp://ftp.ncbi.nih.gov/pub/COG), and the BLAST hits from both nucleotide search and protein search were clustered according to their COG ID. Searches were carried out in parallel on all the tardigrade species including M. tardigradum, H. dujardini, E. testudo, T. stephaniae and R. coronifer. The results are summarized in a table shown in the tardigrade analyzer; the background color from cold to warm (blue to red) indicates the cluster size, which enables an easy comparison. Moreover, users can click the COG ID and the hit number. The server then reports the corresponding sequence ID, description, conservation and the homologous entries recorded in the database. The server with its data is automatically updated bi-monthly according to the latest tardigrade databases.
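Grouping hits by COG ID amounts to a simple tally, sketched below. The input format (pairs of sequence ID and COG/KOG ID, one per best hit) and the function name are assumptions for illustration; the server itself is implemented in Perl and its internal data model is not described here.

```python
from collections import defaultdict


def cluster_sizes(assignments):
    """Count distinct sequences per COG/KOG ID, largest cluster first."""
    members = defaultdict(set)
    for sequence_id, cog_id in assignments:
        members[cog_id].add(sequence_id)  # a set ignores duplicate hits
    return sorted(
        ((cog_id, len(sequences)) for cog_id, sequences in members.items()),
        key=lambda item: (-item[1], item[0]),
    )


# Hypothetical sequence IDs; KOG0736 is one of the KOGs named in the text.
table = cluster_sizes([
    ("seq1", "KOG0736"),
    ("seq2", "KOG0736"),
    ("seq1", "KOG0736"),
    ("seq3", "KOG0032"),
])
```

The resulting cluster sizes are the quantity that the color scale of the summary table (blue for small, red for large) would encode.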

Tardigrade workbench

The tardigrade workbench is implemented in Perl using the BioPerl modules [54]. NCBI BLAST version 2.2.17 is included in the software package. A PostgreSQL 8.1.9 database manages the tardigrade entries to accelerate user queries. The COG cluster information is automatically updated each week and warehoused on the server. In addition, running the tardigrade workbench requires an Apache server; a Linux system with at least 2 GB memory is highly recommended.

Authors' contributions

FF did tardigrade protein data analysis including CLANS clustering and RNA motif analysis. CL established the current version of the tardigrade workbench including programming new routines, data management and nucleotide motif analysis. AS did the initial setup of the server, of the virtual ribosome and the CLANS clustering. DB, JE, MS and MF participated in tardigrade data analysis. TM gave expert advice and input on statistics, RS gave expert advice on tardigrade physiology and zoology. TD led and guided the study including analysis of data and program, supervision, and manuscript writing. All authors participated in the writing of the manuscript and approved the final version.

Acknowledgements

Stylistic corrections by Rosemary Wilson from EMBL Hamburg are gratefully acknowledged. Support by the state Bavaria, DFG (TR34A5) and the German Federal Ministry of Education and Research, BMBF (0313838A, 0313838B, 0313838C, 0313838D, 0313838E) is gratefully acknowledged.

References

  1. Marcus E, Dahl F: Spinnentiere oder Arachnoidea IV. Bärtierchen (Tardigrada). Urban & Fischer Bei Elsevier; 1928. OpenURL

  2. Marcus E: Zur Ökologie und Physiologie der Tardigraden.

    Zool Jahrb Abt Phys 1928, 44:323-370. OpenURL

  3. Nelson DR: Current Status of the Tardigrada: Evolution and Ecology.

    Integr Comp Biol 2002, 42:652-659. Publisher Full Text OpenURL

  4. Keilin D: The Leeuwenhoek Lecture: The problem of anabiosis or latent life: History and current concept.

    Proc R Soc Lond B Biol Sci 1959, 150:149-191. PubMed Abstract | Publisher Full Text OpenURL

  5. Ramazzotti G, Maucci W: The Phylum Tardigrada.

    Memorie dell'Istituto Italiano di Idrobiologia, Pallanza 1983, 41:309-314.

  6. Baumann H: Die Anabiose der Tardigraden.

    Zool Jahrb 1922, 45:501-556.

  7. Baumann H: Bemerkungen zur Anabiose von Tardigraden.

    Zool Anz 1927, 72:175-179.

  8. Horikawa DD, Sakashita T, Katagiri C, Watanabe M, Kikawada T, Nakahara Y, Hamada N, Wada S, Funayama T, Higashi S, Kobayashi Y, Okuda T, Kuwabara M: Radiation tolerance in the tardigrade Milnesium tardigradum.

    Int J Radiat Biol 2006, 82:843-848.

  9. Hengherr S, Worland MR, Reuner A, Brümmer F, Schill RO: Freeze tolerance, supercooling points and ice formation: comparative studies on the subzero temperature survival of limno-terrestrial tardigrades.

    J Exp Biol 2009, 212:802-807.

  10. Hengherr S, Worland MR, Reuner A, Brümmer F, Schill RO: High-Temperature Tolerance in Anhydrobiotic Tardigrades Is Limited by Glass Transition.

    Physiol Biochem Zool 2009, 82(6):749-755.

  11. Jönsson KI, Rabbow E, Schill RO, Harms-Ringdahl M, Rettberg P: Tardigrades survive exposure to space in low Earth orbit.

    Curr Biol 2008, 18:R729-R731.

  12. Jönsson KI, Schill RO: Induction of Hsp70 by desiccation, ionising radiation and heat-shock in the eutardigrade Richtersius coronifer.

    Comp Biochem Physiol B Biochem Mol Biol 2007, 146:456-460.

  13. Wright JC: Cryptobiosis 300 Years on from van Leuwenhoek: What Have We Learned about Tardigrades?

    Zoologischer Anzeiger - A Journal of Comparative Zoology 2001, 240:563-582.

  14. Ammermann D: The cytology of parthenogenesis in the tardigrade Hypsibius dujardini.

    Chromosoma 1967, 23(2):203-213.

  15. Gabriel WN, McNuff R, Patel SK, Gregory TR, Jeck WR, Jones CD, Goldstein B: The tardigrade Hypsibius dujardini, a new model for studying the evolution of development.

    Dev Biol 2007, 312:545-559.

  16. Bavan S, Straub VA, Blaxter ML, Ennion SJ: A P2X receptor from the tardigrade species Hypsibius dujardini with fast kinetics and sensitivity to zinc and copper.

    BMC Evol Biol 2009, 9:17.

  17. Kinchin I, Dennis R: The biology of tardigrades. Portland Press London; 1994.

  18. Baxevanis AD: Searching the NCBI databases using Entrez.

    Curr Protoc Hum Genet 2006, Chapter 6:Unit 6.10.

  19. Tatusov RL, Galperin MY, Natale DA, Koonin EV: The COG database: a tool for genome-scale analysis of protein functions and evolution.

    Nucleic Acids Res 2000, 28:33-36.

  20. Tatusov RL, Fedorova ND, Jackson JD, Jacobs AR, Kiryutin B, Koonin EV, Krylov DM, Mazumder R, Mekhedov SL, Nikolskaya AN, Rao BS, Smirnov S, Sverdlov AV, Vasudevan S, Wolf YI, Yin JJ, Natale DA: The COG database: an updated version includes eukaryotes.

    BMC Bioinformatics 2003, 4:41.

  21. Frickey T, Lupas A: CLANS: a Java application for visualizing protein families based on pairwise similarity.

    Bioinformatics 2004, 20:3702-3704.

  22. Tjoelker LW, Gosting L, Frey S, Hunter CL, Trong HL, Steiner B, Brammer H, Gray PW: Structural and functional definition of the human chitinase chitin-binding domain.

    J Biol Chem 2000, 275:514-520.

  23. Qian SB, McDonough H, Boellmann F, Cyr DM, Patterson C: CHIP-mediated stress recovery by sequential ubiquitination of substrates and Hsp70.

    Nature 2006, 440:551-555.

  24. Chen WH, Ge X, Wang W, Yu J, Hu S: A gene catalogue for post-diapause development of an anhydrobiotic arthropod Artemia franciscana.

    BMC Genomics 2009, 10:52.

  25. Kim YJ, Nachman RJ, Aimanova K, Gill S, Adams ME: The pheromone biosynthesis activating neuropeptide (PBAN) receptor of Heliothis virescens: identification, functional expression, and structure-activity relationships of ligand analogs.

    Peptides 2008, 29:268-275.

  26. Alvarez-Ordóñez A, Fernández A, López M, Bernardo A: Relationship between membrane fatty acid composition and heat resistance of acid and cold stressed Salmonella senftenberg CECT 4384.

    Food Microbiol 2009, 26:347-353.

  27. Tunnacliffe A, Wise MJ: The continuing conundrum of the LEA proteins.

    Naturwissenschaften 2007, 94:791-812.

  28. Browne JA, Dolan KM, Tyson T, Goyal K, Tunnacliffe A, Burnell AM: Dehydration-specific induction of hydrophilic protein genes in the anhydrobiotic nematode Aphelenchus avenae.

    Eukaryot Cell 2004, 3:966-975.

  29. Goyal K, Tisi L, Basran A, Browne J, Burnell A, Zurdo J, Tunnacliffe A: Transition from natively unfolded to folded state induced by desiccation in an anhydrobiotic nematode protein.

    J Biol Chem 2003, 278:12977-12984.

  30. Tunnacliffe A, Lapinski J, McGee B: A putative LEA protein, but no trehalose, is present in anhydrobiotic bdelloid rotifers.

    Hydrobiologia 2005, 546:315-321.

  31. Kobayashi F, Maeta E, Terashima A, Takumi S: Positive role of a wheat HvABI5 ortholog in abiotic stress response of seedlings.

    Physiol Plant 2008, 134:74-86.

  32. Hong-Bo S, Zong-Suo L, Ming-An S: LEA proteins in higher plants: structure, function, gene expression and regulation.

    Colloids Surf B Biointerfaces 2005, 45:131-135.

  33. Fagegaltier D, Lescure A, Walczak R, Carbon P, Krol A: Structural analysis of new local features in SECIS RNA hairpins.

    Nucleic Acids Res 2000, 28:2679-2689.

  34. Lobanov AV, Hatfield DL, Gladyshev VN: Eukaryotic selenoproteins and selenoproteomes.

    Biochim Biophys Acta 2009, in press.

  35. March PE: Membrane-associated GTPases in bacteria.

    Mol Microbiol 1992, 6:1253-1257.

  36. Walsh P, Bursać D, Law YC, Cyr D, Lithgow T: The J-protein family: modulating protein assembly, disassembly and translocation.

    EMBO Rep 2004, 5:567-571.

  37. Cheetham ME, Caplan AJ: Structure, function and evolution of DnaJ: conservation and adaptation of chaperone function.

    Cell Stress Chaperones 1998, 3:28-36.

  38. Letunic I, Doerks T, Bork P: SMART 6: recent updates and new developments.

    Nucleic Acids Res 2009, 37:D229-D232.

  39. Finn RD, Tate J, Mistry J, Coggill PC, Sammut SJ, Hotz HR, Ceric G, Forslund K, Eddy SR, Sonnhammer ELL, Bateman A: The Pfam protein families database.

    Nucleic Acids Res 2008, 36:D281-D288.

  40. Mounier N, Arrigo AP: Actin cytoskeleton and small heat shock proteins: how do they interact?

    Cell Stress Chaperones 2002, 7:167-176.

  41. Sun Y, MacRae TH: Small heat shock proteins: molecular structure and chaperone function.

    Cell Mol Life Sci 2005, 62:2460-2476.

  42. Pesole G, Liuni S: Internet resources for the functional analysis of 5' and 3' untranslated regions of eukaryotic mRNAs.

    Trends Genet 1999, 15:378.

  43. Ostareck-Lederer A, Ostareck DH, Hentze MW: Cytoplasmic regulatory functions of the KH-domain proteins hnRNPs K and E1/E2.

    Trends Biochem Sci 1998, 23:409-411.

  44. Ostareck-Lederer A, Ostareck DH, Standart N, Thiele BJ: Translation of 15-lipoxygenase mRNA is inhibited by a protein that binds to a repeated sequence in the 3' untranslated region.

    EMBO J 1994, 13:1476-1481.

  45. Lai EC, Burks C, Posakony JW: The K box, a conserved 3' UTR sequence motif, negatively regulates accumulation of enhancer of split complex transcripts.

    Development 1998, 125:4077-4088.

  46. Lai E: Micro RNAs are complementary to 3' UTR sequence motifs that mediate negative post-transcriptional regulation.

    Nat Genet 2002, 30:363-364.

  47. Lai EC, Tam B, Rubin GM: Pervasive regulation of Drosophila Notch target genes by GY-box-, Brd-box-, and K-box-class microRNAs.

    Genes Dev 2005, 19:1067-1080.

  48. Dittrich M, Birschmann I, Pfrang J, Herterich S, Smolenski A, Walter U, Dandekar T: Analysis of SAGE data in human platelets: features of the transcriptome in an anucleate cell.

    Thromb Haemost 2006, 95:643-651.

  49. Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.

    Nucleic Acids Res 1997, 25:3389-3402.

  50. Zhang Z, Schäffer AA, Miller W, Madden TL, Lipman DJ, Koonin EV, Altschul SF: Protein sequence similarity searches using patterns as seeds.

    Nucleic Acids Res 1998, 26:3986-3990.

  51. Neumann S, Reuner A, Brümmer F, Schill RO: DNA damage in storage cells of anhydrobiotic tardigrades.

    Comp Biochem Physiol A Mol Integr Physiol 2009, 153:425-429.

  52. Schill R, Neumann S, Reuner A, Brümmer F: Detection of DNA damage with single-cell gel electrophoresis in anhydrobiotic tardigrades.

    Comp Biochem Physiol A Mol Integr Physiol 2008, 151:32-32.

  53. Schill RO, Mali B, Dandekar T, Schnölzer M, Reuter D, Frohme M: Molecular mechanisms of tolerance in tardigrades: new perspectives for preservation and stabilization of biological material.

    Biotechnol Adv 2009, 27:348-352.

  54. Stajich JE, Block D, Boulez K, Brenner SE, Chervitz SA, Dagdigian C, Fuellen G, Gilbert JGR, Korf I, Lapp H, Lehväslaiho H, Matsalla C, Mungall CJ, Osborne BI, Pocock MR, Schattner P, Senger M, Stein LD, Stupka E, Wilkinson MD, Birney E: The Bioperl toolkit: Perl modules for the life sciences.

    Genome Res 2002, 12:1611-1618.