Open Access Research article

Analysis of a comprehensive dataset of diversity generating retroelements generated by the program DiGReF

Thomas Schillinger1, Mohamed Lisfi2, Jingyun Chi2, John Cullum2 and Nora Zingler1,3*

Author Affiliations

1 Department of Molecular Genetics, University of Kaiserslautern, Kaiserslautern, Germany

2 Department of Genetics, University of Kaiserslautern, Kaiserslautern, Germany

3 Department of Biology - Group of Molecular Genetics, University of Kaiserslautern, Paul-Ehrlich-Straße Building 24, Room 117, D-67663, Kaiserslautern, Germany


BMC Genomics 2012, 13:430  doi:10.1186/1471-2164-13-430

Published: 28 August 2012

Additional files

Additional file 1:

Program DiGReF. Software to search for DGRs in a list of sequences supplied as GI numbers. Requires BioPerl to run.

Format: PL Size: 32KB

Open Data

Additional file 2:

Results from BLASTp search and DiGReF analysis. Part A of the table lists the GI numbers of all RTs from the PSI-BLAST search that were positive in a DiGReF analysis with default settings (cutoff of seven or more adenine exchanges in a 50 bp window). Part B shows the additional 47 hits obtained when lowering this cutoff to five or more A substitutions. Only six of these are most likely DGRs (i.e. they feature a (L/I/V)GxxxSQ or (L/I/V)GxxxNQ sequence, and their VR is part of an ORF and not a low-complexity repeat). Part C lists the remaining GI numbers of RTs from the PSI-BLAST search that yielded no hit in the DiGReF analysis.
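The two criteria named in this description (a minimum number of adenine exchanges within a 50 bp window, and the (L/I/V)GxxxSQ or (L/I/V)GxxxNQ motif in the RT) can be illustrated with a minimal Python sketch. It is not taken from the DiGReF source, which is a Perl/BioPerl program; the function names are hypothetical, and the sketch assumes that the template and variable repeats are already aligned without gaps and that mismatches at non-adenine positions are simply ignored.

```python
import re

# Illustrative sketch only; NOT the DiGReF implementation.
# All function names here are hypothetical.

# (L/I/V)GxxxSQ or (L/I/V)GxxxNQ, where x is any amino acid.
RT_MOTIF = re.compile(r"[LIV]G.{3}[SN]Q")


def has_rt_motif(protein_seq: str) -> bool:
    """Return True if the protein contains the motif described in the text."""
    return RT_MOTIF.search(protein_seq.upper()) is not None


def adenine_exchanges(template: str, variable: str) -> int:
    """Count positions where the template has A and the variable copy does not.

    Assumption: both sequences are aligned, ungapped, and of equal length.
    """
    if len(template) != len(variable):
        raise ValueError("sequences must have equal length")
    return sum(
        1
        for t, v in zip(template.upper(), variable.upper())
        if t == "A" and v != "A"
    )


def passes_cutoff(template: str, variable: str,
                  min_exchanges: int = 7, window: int = 50) -> bool:
    """Return True if any window holds at least `min_exchanges` A exchanges."""
    if len(template) != len(variable):
        raise ValueError("sequences must have equal length")
    # A pair shorter than the window is treated as one single window.
    last_start = max(len(template) - window, 0)
    return any(
        adenine_exchanges(template[s:s + window],
                          variable[s:s + window]) >= min_exchanges
        for s in range(last_start + 1)
    )


# Toy sequences invented for illustration (not data from the study).
toy_template = "AAAAAAAACC"
toy_variable = "GGGGGGGACC"
toy_count = adenine_exchanges(toy_template, toy_variable)
```

Lowering `min_exchanges` to 5 corresponds to the relaxed cutoff described for Part B; the extra checks mentioned there (VR inside an ORF, not a low-complexity repeat) are not covered by this sketch.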

Format: XLSX Size: 49KB

Open Data

Additional file 3:

Program ConvertGB. Software to convert the output from DiGReF into GenBank format. Requires BioPerl to run.

Format: PL Size: 11KB

Open Data

Additional file 4:

Complete NJ tree of DGR RTs. Protein sequences of DGR RTs were aligned using COBALT, and a Neighbor-Joining tree was built with PHYLIP. Bootstrap values >50 are indicated. Phylogenetic groups of organisms that also cluster at the RT level are marked. Distances are indicated as expected substitutions per site. A group II intron reverse transcriptase from Bacillus halodurans (GI 47076650) was used as an outgroup to root the tree.

Format: JPEG Size: 7.1MB

Open Data

Additional file 5:

NJ tree of 16S rRNAs from organisms featuring DGRs. 16S rRNA sequences were collected from the SILVA database where available. A Neighbor-Joining tree was built using MEGA5. Distances are indicated as expected substitutions per site.

Format: PDF Size: 13KB


Open Data

Additional file 6:

Alignment of DGR RTs. MAFFT alignment of the 155 DGR RTs (yellow) identified in this study. For comparison with other known RTs, RTs from 8 group II introns (pink), 8 non-LTR retrotransposons (blue), 9 Retroviridae (purple) and 8 telomerases (green) were also included. Conserved domains are indicated as black bars above the alignment. Conserved amino acids are highlighted with colors reflecting their chemical properties.

Format: PNG Size: 3.1MB

Open Data

Additional file 7:

DGR RTs contain a positively charged region at their C-terminus. This file shows a section of the alignment of DGR RTs (Additional file 6) comprising the region C-terminal to domain 5. Only positively charged amino acids are highlighted in red. In DGR RTs, domain 7 is often followed by a patch with high positive charge (up to 11 positively charged amino acids in a 20 amino acid region), a feature that is not found in other RT enzymes.

Format: PNG Size: 1.2MB

Open Data