Research article (Open Access)

Comparative genomic analysis of human infective Trypanosoma cruzi lineages with the bat-restricted subspecies T. cruzi marinkellei

Oscar Franzén1*, Carlos Talavera-López1, Stephen Ochaya1, Claire E Butler2, Louisa A Messenger3, Michael D Lewis3, Martin S Llewellyn3, Cornelis J Marinkelle4, Kevin M Tyler2, Michael A Miles3 and Björn Andersson1*

Author Affiliations

1 Department of Cell and Molecular Biology, Karolinska Institutet, Box 285, Stockholm, SE, 171 77, Sweden

2 Norwich Medical School, University of East Anglia, Norwich, Norfolk, NR4 7TJ, United Kingdom

3 Department of Pathogen Molecular Biology, Faculty of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine, Keppel Street, London, United Kingdom

4 Centro de Investigaciones en Microbiología y Parasitología Tropical, Universidad de los Andes, Santafé de Bogotá, Colombia

BMC Genomics 2012, 13:531  doi:10.1186/1471-2164-13-531

Published: 5 October 2012

Additional files

Additional file 1:

Figure S1. Flow cytometry analysis of the T. c. marinkellei genome size. Description: Fluorescence emission histograms for propidium iodide-labelled epimastigotes showing relative DNA contents of T. c. cruzi Esm/3 (TcII), T. c. cruzi Sylvio X10/4 (TcI) and T. c. marinkellei B7/11.

Format: PDF; Size: 165KB

Additional file 2:

Figure S2. Histogram and smoothed density estimate of assembly-wide coverage differences between Tcm and Tcc X10. Description: (A) Histogram of percentage differences in short-read coverage between homologous regions. Percentages have been corrected for genome size. Vertical red lines indicate the lower and upper 2.5% quantiles. (B) Smoothed density estimate of the histogram in (A), created using the logspline R package.

Format: PDF; Size: 96KB
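
The red lines in panel (A) of Figure S2 mark the lower and upper 2.5% quantiles of the coverage differences. A minimal sketch of how such cut-offs could be computed from a list of per-region percentage differences is given below. The function names and the linear-interpolation convention are assumptions made for illustration; the source does not state which quantile definition was used, and the genome-size correction is not reproduced here.

```python
def quantile(values, q):
    """Return the q-th quantile (0 <= q <= 1) of values.

    Uses linear interpolation between neighbouring order statistics.
    This convention is an assumption, not taken from the article.
    """
    xs = sorted(values)
    if not xs:
        raise ValueError("quantile() needs at least one value")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")
    pos = q * (len(xs) - 1)          # fractional index into the sorted list
    lower = int(pos)                 # index of the order statistic below pos
    upper = min(lower + 1, len(xs) - 1)
    frac = pos - lower
    return xs[lower] + frac * (xs[upper] - xs[lower])


def tail_cutoffs(diffs, tail=0.025):
    """Return the (lower, upper) cut-offs that leave `tail` of the data on each side."""
    return quantile(diffs, tail), quantile(diffs, 1.0 - tail)
```

Regions whose coverage difference falls outside the two cut-offs would be the ones lying beyond the red lines in the histogram.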

Additional file 3:

Figure S3. Sequence variation of the TcMUCII mucin gene family. Description: Entropy plots of the TcMUCII mucin gene family. TcMUCII mucin genes were extracted from Tcm, Tcc X10 and Tcc CLBR non-Esm. Sequences were aligned with ClustalW v2.1. Sequence entropy was calculated using the entropy function of the R package bio3d. Only alignment positions with less than 10% gaps were included in the analysis. The normalized entropy score was then plotted as a function of alignment position, where conserved sites (low entropy) score 1 and diverse (high entropy) sites score 0. The analysis indicated that the 5′ and 3′ termini of TcMUCII mucin genes are generally the most conserved in all three genomes and that the central region is the most variable.

Format: PDF; Size: 558KB
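
The scoring described for Figure S3 (skip gap-rich columns, then report a normalized score where 1 is conserved and 0 is maximally diverse) can be illustrated with a short sketch. This is not a reimplementation of the bio3d entropy function: the four-letter nucleotide alphabet, the normalization 1 - H / log2(4), and the function name are assumptions chosen to make the idea concrete.

```python
import math
from collections import Counter


def conservation_scores(alignment, max_gap_fraction=0.10, alphabet="ACGT"):
    """Return (position, score) pairs for an alignment given as equal-length strings.

    Columns whose gap fraction is not below max_gap_fraction are skipped.
    score = 1 - H / log2(len(alphabet)), so a fully conserved column scores 1.
    The normalization is an illustrative assumption, not the bio3d definition.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    max_entropy = math.log2(len(alphabet))
    scores = []
    for pos in range(length):
        column = [seq[pos].upper() for seq in alignment]
        if column.count("-") / len(column) >= max_gap_fraction:
            continue  # too many gaps: excluded from the analysis
        counts = Counter(c for c in column if c in alphabet)
        total = sum(counts.values())
        if total == 0:
            continue  # only ambiguous symbols in this column
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        scores.append((pos, 1.0 - entropy / max_entropy))
    return scores
```

Plotting the returned score against position would give a curve of the kind the description refers to, with conserved termini near 1 and a variable central region closer to 0.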

Additional file 4:

Table S1. Maxicircle gene coordinates and metrics. Description: Gene metrics for T. c. cruzi and T. c. marinkellei maxicircles, including coordinates, average identity and length.

Format: PNG; Size: 49KB

Additional file 5:

Figure S4. Maxicircle phylogenetic tree. Description: Maximum likelihood phylogenetic tree of the maxicircle sequences from T. c. marinkellei, T. c. cruzi Sylvio X10, T. c. cruzi CL Brener and T. c. cruzi Esmeraldo, using T. brucei and L. tarentolae as outgroups. The full maxicircle sequences were aligned with ClustalW v2.1 and the resulting alignment was filtered using Gblocks (default settings). The tree was inferred using MEGA v5.1 from 13,731 (49%) alignment positions.

Format: PDF; Size: 134KB

Additional file 6:

Table S2. Ratio of non-synonymous and synonymous nucleotide substitutions. Description: Orthologous gene pairs between T. c. marinkellei and T. c. cruzi CL Brener displaying elevated dN/dS (> 1.1). The yn00 program was used to calculate dN and dS.

Format: PDF; Size: 148KB
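
Table S2 lists orthologous gene pairs with dN/dS above 1.1. As a rough sketch of that selection step, the function below takes per-pair dN and dS values (for example, parsed from yn00 output) and keeps the pairs whose ratio exceeds the threshold. The input tuple layout, the function name and the handling of dS = 0 are assumptions made for illustration; the source does not describe how such cases were treated.

```python
def elevated_dnds(pairs, threshold=1.1):
    """Return (gene_id, dN/dS) for pairs whose ratio exceeds threshold.

    pairs: iterable of (gene_id, dN, dS) tuples (hypothetical layout).
    Pairs with dS <= 0 are skipped because the ratio is undefined there;
    this choice is an assumption, not something stated in the article.
    """
    hits = []
    for gene_id, dn, ds in pairs:
        if ds <= 0:
            continue
        omega = dn / ds
        if omega > threshold:
            hits.append((gene_id, omega))
    # Highest ratios first, which is a convenient order for a summary table.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

The gene identifiers and values passed in would come from the actual pairwise comparisons; none are supplied here.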

Additional file 7:

Figure S5. Disruption of sequence co-linearity. Description: Disruption of chromosomal co-linearity between T. c. marinkellei and T. c. cruzi CL Brener non-Esmeraldo-like (A), and between T. c. cruzi Sylvio X10 and T. c. cruzi CL Brener non-Esmeraldo-like (B). Black chromosomes prefixed with ‘Chr’ represent sequences from T. c. cruzi CL Brener, whereas white chromosomes prefixed with ‘contig’ represent sequences from the Tcm and Tcc X10 assemblies. Alignments were generated using the promer software (Kurtz et al., 2004). Chromosomal stretches marked in green represent gaps in the assembly. Only gaps larger than 5 kb are shown. The outermost numbers are sequence identifiers.

Format: PNG; Size: 820KB

Additional file 8:

Figure S6. PCR validation of synteny breaks. Description: PCR validation results from a few regions containing synteny breaks in T. c. marinkellei and T. c. cruzi Sylvio X10.

Format: XLS; Size: 36KB

Additional file 9:

Figure S7. Phylogenetic tree of VIPER elements. Description: Maximum likelihood phylogenetic tree of VIPER retroelements from T. c. marinkellei, T. c. cruzi CLBR and T. c. cruzi X10. Colors: blue (Tcm), green (Tcc CLBR), red (Tcc X10). VIPER elements were identified with RepeatMasker and only elements longer than 2000 bp were included: 209 sequences in total (35 from Tcm, 57 from Tcc X10 and 117 from Tcc CLBR). The average branch lengths were 0.0682 (Tcm), 0.039 (Tcc X10) and 0.0455 (Tcc CLBR). The alignment was constructed with ClustalW and manually inspected. Gblocks was used to remove ambiguously aligned regions, which left a total of 1518 positions for inferring the phylogeny. The maximum likelihood tree was inferred with RAxML using the GTRCAT model and 100 bootstrap replicates.

Format: XLS; Size: 47KB
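
The length filter described for Figure S7 (keep only elements longer than 2000 bp, then count them per genome) can be sketched as follows. The input layout of (genome label, start, end) with 1-based inclusive coordinates is a hypothetical simplification of repeat annotation output, and the function name is likewise an assumption.

```python
from collections import Counter


def tally_long_elements(elements, min_length=2000):
    """Count elements longer than min_length bp for each genome label.

    elements: iterable of (genome, start, end) with 1-based inclusive
    coordinates (hypothetical layout); start may exceed end for hits
    on the reverse strand.
    """
    counts = Counter()
    for genome, start, end in elements:
        length = abs(end - start) + 1
        if length > min_length:  # strictly longer, as in the description
            counts[genome] += 1
    return counts
```

Applied to the real annotations, the per-genome counts would be expected to sum to the 209 sequences reported above (35 + 57 + 117).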