MASiVEdb: the Sirevirus Plant Retrotransposon Database
1 Institute of Agrobiotechnology, Centre for Research and Technology Hellas, Thessaloniki, 57001, Greece
2 Department of Genetics and Plant Breeding, Aristotle University of Thessaloniki, Thessaloniki, 54006, Greece
3 Central European Institute of Technology, Masaryk University, Brno, Czech Republic
BMC Genomics 2012, 13:158 doi:10.1186/1471-2164-13-158Published: 30 April 2012
Sireviruses are an ancient genus of the Copia superfamily of LTR retrotransposons, and the only one that has exclusively proliferated within plant genomes. Based on experimental data and phylogenetic analyses, Sireviruses have successfully infiltrated many branches of the plant kingdom, extensively colonizing the genomes of grass species. Notably, it was recently shown that they have been a major force in the make-up and evolution of the maize genome, where they currently occupy ~21% of the nuclear content and ~90% of the Copia population. It is highly likely, therefore, that their life dynamics have been fundamental in the genome composition and organization of a plethora of plant hosts. To assist studies into their impact on plant genome evolution and also facilitate accurate identification and annotation of transposable elements in sequencing projects, we developed MASiVEdb (Mapping and Analysis of SireVirus Elements Database), a collective and systematic resource of Sireviruses in plants.
Taking advantage of the increasing availability of plant genomic sequences, and using an updated version of MASiVE, an algorithm specifically designed to identify Sireviruses based on their highly conserved genome structure, we populated MASiVEdb (http://bat.infspire.org/databases/masivedb/ webcite) with data on 16,243 intact Sireviruses (total length >158Mb) discovered in 11 fully-sequenced plant genomes. MASiVEdb is unlike any other transposable element database, providing a multitude of highly curated and detailed information on a specific genus across its hosts, such as complete set of coordinates, insertion age, and an analytical breakdown of the structure and gene complement of each element. All data are readily available through basic and advanced query interfaces, batch retrieval, and downloadable files. A purpose-built system is also offered for detecting and visualizing similarity between user sequences and Sireviruses, as well as for coding domain discovery and phylogenetic analysis.
MASiVEdb is currently the most comprehensive directory of Sireviruses, and as such complements other efforts in cataloguing plant transposable elements and elucidating their role in host genome evolution. Such insights will gradually deepen, as we plan to further improve MASiVEdb by phylogenetically mapping Sireviruses into families, by including data on fragments and solo LTRs, and by incorporating elements from newly-released genomes.