This article is part of the supplement: European Molecular Biology Network (EMBnet) Conference 2008: 20th Anniversary Celebration. Leading applications and technologies in bioinformatics
RetroTector online, a rational tool for analysis of retroviral elements in small and medium size vertebrate genomic sequences
1 Physiology unit, Dpt of Neuroscience, Box 593, Uppsala, Sweden
2 Linnaeus Centre for Bioinformatics, Biomedical Centre, Box 598, 751 24 Uppsala, Sweden
3 Section of Virology, Department of Medical Sciences, Uppsala University, Academic Hospital, 751 85 Uppsala, Sweden
BMC Bioinformatics 2009, 10(Suppl 6):S4 doi:10.1186/1471-2105-10-S6-S4Published: 16 June 2009
The rapid accumulation of genomic information in databases necessitates rapid and specific algorithms for extracting biologically meaningful information. More or less complete retroviral sequences, also called proviral or endogenous retroviral sequences; ERVs, constitutes at least 5% of vertebrate genomes. After infecting the host, these retroviruses have integrated in germ line cells, and have then been carried in genomes for at least several 100 million years. A better understanding of structure and function of these sequences can have profound biological and medical consequences.
RetroTector© (ReTe) is a platform-independent Java program for identification and characterization of proviral sequences in vertebrate genomes. The full ReTe requires a local installation with a MySQL database. Although not overly complicated, the installation may take some time. A "light" version of ReTe, (RetroTector online; ROL) which does not require specific installation procedures is provided, via the World Wide Web.
ROL http://www.fysiologi.neuro.uu.se/jbgs/ webcite was implemented under the Batchelor web interface (A Lövgren et al). It allows both GenBank accession number, file and FASTA cut-and-paste admission of sequences (5 to 10 000 kilobases). Up to ten submissions can be done simultaneously, allowing batch analysis of <= 100 Megabases. Jobs are shown in an IP-number specific list. Results are text files, and can be viewed with the program, RetroTectorViewer.jar (at the same site), which has the full graphical capabilities of the basic ReTe program. A detailed analysis of any retroviral sequences found in the submitted sequence is graphically presented, exportable in standard formats. With the current server, a complete analysis of a 1 Megabase sequence is complete in 10 minutes. It is possible to mask nonretroviral repetitive sequences in the submitted sequence, using host genome specific "brooms", which increase specificity.
Proviral sequences can be hard to recognize, especially if the integration occurred many million years ago. Precise delineation of LTR, gag, pro, pol and env can be difficult, requiring manual work. ROL is a way of simplifying these tasks.
ROL provides 1. annotation and presentation of known retroviral sequences, 2. detection of proviral chains in unknown genomic sequences, with up to 100 Mbase per submission.