Open Access Research article

BEL/Pao retrotransposons in metazoan genomes

Nicole de la Chaux1,2* and Andreas Wagner1,2,3

Author Affiliations

1 Department of Biochemistry, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland

2 The Swiss Institute of Bioinformatics, Basel, Switzerland

3 The Santa Fe Institute, Santa Fe, NM, USA

BMC Evolutionary Biology 2011, 11:154  doi:10.1186/1471-2148-11-154

Published: 4 June 2011

Additional files

Additional file 1:

Nucleotide sequences of all identified BEL/Pao elements. Nucleotide sequences, in fasta format, of all BEL/Pao elements that we identified de novo. Each record (element) has a unique fasta identifier consisting of three parts: (i) element identifier, (ii) internal family identifier as listed in additional file 4, and (iii) species identifier as listed in additional file 2. The three parts are joined by the underscore symbol '_'. For example, the identifier 5_NC-3_1 represents element 5, which belongs to family NC-3 and is present in species 1 (Drosophila melanogaster).

Format: ZIP Size: 17.7MB Download file

Open Data
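
The identifier scheme described for additional file 1 can be unpacked with a few lines of Python. This is only a minimal sketch: the function and field names are hypothetical, and it assumes that the element and species parts never contain an underscore, so that the first and last underscores delimit the family identifier.

```python
from typing import NamedTuple


class ElementId(NamedTuple):
    element: str  # element identifier, e.g. "5"
    family: str   # internal family identifier (additional file 4), e.g. "NC-3"
    species: str  # internal species identifier (additional file 2), e.g. "1"


def parse_element_id(header: str) -> ElementId:
    """Split a fasta identifier of the form element_family_species."""
    # Drop the leading '>' of a fasta header and anything after the first space.
    name = header.lstrip(">").split()[0]
    # Split at the first and the last underscore, so the family part stays
    # whole even if it contained an underscore (an assumption, see above).
    element, rest = name.split("_", 1)
    family, species = rest.rsplit("_", 1)
    return ElementId(element, family, species)


# The example identifier from the file description.
print(parse_element_id(">5_NC-3_1"))
# ElementId(element='5', family='NC-3', species='1')
```

The species field can then be looked up in additional file 2 and the family field in additional file 4.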

Additional file 2:

List of genomes, Repbase Update elements, and Gypsy Database elements used. All metazoan genomes that we used are listed together with an internal identifier. For each genome we also give the current URL from which the genome sequence can be accessed, the accession numbers, the number of sequences of the genome included in our analysis, the overall number of nucleotides, the number of BEL/Pao elements we identified in that genome, the number of different BEL/Pao families we identified in that genome, and the subkingdom/superphylum/phylum to which the species belongs. We also list all mammalian genomes we used and all genomes we excluded from our analysis. Furthermore, we give the names of all BEL/Pao elements from Repbase Update and from the Gypsy Database, together with the name of the species in which they occur and an internal identifier. The internal identifiers are also used for the nucleotide sequences in additional file 1 and in the sequence alignment of additional file 7.

Format: XLS Size: 181KB Download file

Open Data

Additional file 3:

BEL/Pao copy number per Mbp. A) The histogram shows the number of genomes containing a given number of BEL/Pao elements per Mbp. The inset shows the number of genomes containing between zero and one BEL/Pao element per Mbp. The eight genomes containing more than one BEL/Pao element per Mbp come from either fruit fly or mosquito species. B) Relationship between total copy number and copy number per Mbp for each genome. Each point in the graph represents one genome and shows its total BEL/Pao copy number and its copy number per Mbp. Note the logarithmic scale on both axes.

Format: PDF Size: 245KB Download file

Open Data

Additional file 4:

BEL/Pao families and their copy numbers. For each BEL/Pao family we list the copy number, the species in which the family is present, and the superfamily to which it belongs. Families that we did not use in the phylogenetic tree construction are not assigned to a superfamily.

Format: XLS Size: 183KB Download file

Open Data

Additional file 5:

Species-specific family classification. We describe where the among-species family classification used in the main text agrees with, and where it differs from, the within-species family classification.

Format: PDF Size: 16.4MB Download file

Open Data

Additional file 6:

Amino acid sequences of consensus domains. A fasta file with the amino acid sequences of all domain consensus sequences we used for the phylogenetic tree reconstruction. Each identifier consists of the family identifier (see additional file 4) and the domain name. For example, NC-1_integrase represents the consensus sequence of the integrase domain of family NC-1.

Format: TXT Size: 548KB Download file

Open Data
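
Because each identifier in additional file 6 combines a family identifier and a domain name, the consensus sequences can be grouped by family when the file is read. The following is a minimal sketch, not the authors' pipeline: the function name is hypothetical, the sequences in the usage example are made-up placeholders, and the code assumes that family identifiers contain no underscore, so that the first underscore separates family from domain.

```python
from collections import defaultdict
from typing import Dict, Iterable


def group_domains_by_family(lines: Iterable[str]) -> Dict[str, Dict[str, str]]:
    """Read fasta lines and return {family: {domain: sequence}}."""
    families: Dict[str, Dict[str, str]] = defaultdict(dict)
    family = domain = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Header such as ">NC-1_integrase": split at the first underscore
            # (assumes the family identifier itself has no underscore).
            family, domain = line[1:].split()[0].split("_", 1)
            families[family][domain] = ""
        elif family is not None:
            # Sequence lines may be wrapped, so append to the current record.
            families[family][domain] += line
    return dict(families)


# Placeholder records; the residues are invented for illustration only.
example = [">NC-1_integrase", "ACDE", "FGH", ">NC-3_integrase", "KLM"]
print(group_domains_by_family(example))
```

With a real file, the same function can be called on an open file handle, for instance `group_domains_by_family(open(path))`, where `path` is the local name of the downloaded file.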

Additional file 7:

Multiple alignment of family domains. The multiple alignment of 893 concatenated domain sequences. The phylogenetic tree is based on this multiple alignment.

Format: TXT Size: 1.3MB Download file

Open Data

Additional file 8:

Phylogenetic tree of BEL/Pao elements with species names. The figure shows the same tree as Figure 4 in the main text, but labeled with the names of the species in which the elements occur. If a clade contained only element families from the same species or from very closely related species (e.g. mosquito species), the clade was collapsed to reduce the size of the tree. All species names are shown at the leaves of the tree. If all species in one clade of the tree belonged to the same genus, such as the genus Drosophila, only the genus name is shown, with the number of species in brackets. Major clades are highlighted in different colors.

Format: PDF Size: 414KB Download file

Open Data

Additional file 9:

Structural information of superfamilies. The table shows the minimal, maximal, and median element lengths and the minimal, maximal, and median LTR lengths (all in base pairs) for the major superfamilies we identified.

Format: PDF Size: 10KB Download file

Open Data

Additional file 10:

List of phyla covered by each superfamily. The table shows the phyla in which BEL/Pao superfamily members were identified.

Format: PDF Size: 19KB Download file

Open Data