Open Access Research article

Widespread, focal copy number variations (CNV) and whole chromosome aneuploidies in Trypanosoma cruzi strains revealed by array comparative genomic hybridization

Todd A Minning1, D Brent Weatherly1, Stephane Flibotte2 and Rick L Tarleton1*

Author Affiliations

1 Center for Tropical and Emerging Global Diseases, University of Georgia, Athens, Georgia, 30602, USA

2 Department of Zoology, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada

BMC Genomics 2011, 12:139  doi:10.1186/1471-2164-12-139

Published: 7 March 2011

Additional files

Additional file 1:

Microsoft PowerPoint file of the CGH_Viewer view of T. cruzi chromosome 35 showing the CNV generated by knockout of one copy each of ECH1 and ECH2 (enoyl-CoA hydratase/isomerase family protein; Tc00.1047053511529.160, Tc00.1047053511529.150). Each dot represents an oligonucleotide probe. The CL-Brener strain, which was used as the reference strain for genome sequencing, is a hybrid; thus probes were designed to non-Esmeraldo (non-Esm) sequences (blue), Esmeraldo-like (Esm) sequences (green), non-Esm gene family sequences (black), and Esm gene family sequences (gray). Positive log2 ratios of signal intensities (wild type strain/knockout strain) represent deletion in the knockout strain and negative log2 ratios represent amplification in the knockout strain, relative to wild type (wt) T. cruzi. Units for the X axis (Position) are base pairs. Inset in panel A is the GBrowse view of the locus (ECH genes, purple circle). Panel B is a close-up view of the locus on chromosome 35.

Format: PPT Size: 780KB

Additional file 2:

PowerPoint file of all of the array data ordered by chromosome. Representative plots from 16 hybridizations for each T. cruzi chromosome. Each dot represents an oligonucleotide probe. The CL-Brener strain, which was used as the reference strain for genome sequencing, is a hybrid; thus probes were designed to non-Esmeraldo (non-Esm) sequences (blue), Esmeraldo-like (Esm) sequences (green), non-Esm gene family sequences (black), and Esm gene family sequences (gray). In each panel, positive log2 ratios of signal intensities (test strain/reference) represent amplification in the test strain and negative log2 ratios represent deletion in the test strain, relative to CL-Brener, which was the reference strain in all hybridizations. Boxed regions were the features selected for chromosome typing as explained in Figure 3. The different patterns observed for each chromosome are displayed at the bottom with letters corresponding to the typing letters in Figure 3. Blue boxes denote lower copy number in the test strain versus the reference strain, red boxes higher copy number in the test strain, and black boxes equal copy number in the test and reference strains. For each chromosome, the CL-Brener type was the default type "A." While two strains may have been assigned to the same CNV signature type for a given chromosome, they were not necessarily identical for that chromosome, as not every CNV on every chromosome was used in the typing (such an analysis would render every chromosome for every strain unique and make finding common patterns impossible). Note that chromosomes 4, 5, 18, 28, and 29 did not present sufficiently informative typing regions; thus, all strains were type "A" for these chromosomes. The CNV were also haplotype specific. In some cases this made the up or down calls (box color) appear incorrect, especially if the log2 ratio for the feature was off the scale, making it appear as if the boxed region refers to the other haplotype. For example, see PalDa for chromosome 8: in the boxed region the green (Esm) probes appear to be up, yet the box is blue, indicating lower copy number, because the feature refers to non-Esm probes, which are off the scale of the figure. Lastly, the boxes are guides to identifying the CNV used for typing, but they do not represent the exact bounds of those typing regions. In some cases, due to the scale of the figure, CNV that are close to the typing CNV appear to be part of the typing CNV.

Format: PPT Size: 15.8MB

Additional file 3:

A Java-based executable file for viewing all of the CGH array data from this study. The CGH data were visualized and explored using 'CGH_Viewer,' which was written in the Java programming language. Additional file 3 is a Windows executable (.exe) that will install the CGH_Viewer, along with Java if it is not detected, on the target computer. The CGH_Viewer takes as input the mapping of probes from the microarray to the assembled chromosomes of T. cruzi as well as the results of one or more experiments between two strains. The probe-to-chromosome mapping file and the 16 experimental files are provided in the "Data" sub-folder. The results (dot plots of log2 ratios) of multiple chromosomes and multiple experiments may be viewed simultaneously. Zooming on an area of interest on a chromosome will show the same region across all experiments. The data can be filtered by 1) the haplotype of the sequence from which the probes were designed, 2) the ID or name of the sequence from which the probe was designed, 3) the raw intensity of the probe (a quality measurement), or 4) a sub-region of a chromosome. For additional information, see the provided documentation in the CGH_Viewer program folder (c:\Program Files\CGH_Viewer\).
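
The four filter criteria listed above can be illustrated with a minimal Python sketch. This is not the CGH_Viewer code (which is written in Java) and does not reflect its file formats; the `Probe` record, its field names, the `filter_probes` function, and the example sequence IDs are all hypothetical and chosen only to mirror the description.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class Probe:
    """Hypothetical probe record; field names are assumptions, not the viewer's format."""
    chromosome: int
    position: int         # base-pair coordinate on the assembled chromosome
    haplotype: str        # e.g. "Esm" or "non-Esm"
    sequence_id: str      # ID or name of the sequence the probe was designed from
    raw_intensity: float  # raw signal, used here as a quality measure
    log2_ratio: float     # log2(test strain / reference)


def filter_probes(
    probes: Sequence[Probe],
    haplotype: Optional[str] = None,
    sequence_id: Optional[str] = None,
    min_intensity: Optional[float] = None,
    region: Optional[Tuple[int, int, int]] = None,  # (chromosome, start, end), inclusive
) -> List[Probe]:
    """Keep probes that satisfy every criterion that was supplied."""
    kept = []
    for p in probes:
        if haplotype is not None and p.haplotype != haplotype:
            continue
        if sequence_id is not None and p.sequence_id != sequence_id:
            continue
        if min_intensity is not None and p.raw_intensity < min_intensity:
            continue
        if region is not None:
            chrom, start, end = region
            if p.chromosome != chrom or not (start <= p.position <= end):
                continue
        kept.append(p)
    return kept
```

For example, `filter_probes(probes, haplotype="Esm", region=(35, 0, 2000))` would return only Esm probes in the first 2000 bp of chromosome 35 under these assumptions.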

Format: EXE Size: 92.1MB

Additional file 4:

Excel file containing the average normalized log2 ratios of signal intensities (test strain/CL-Brener) for each coding and non-coding region in the T. cruzi genome for each of the hybridizations performed in this study. Esmeraldo-like (Esm) and non-Esmeraldo (non-Esm) probes for the indicated regions are averaged separately. Density is the genomic range (in base pairs) divided by the number of probes covering that range.
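
As a rough illustration of the two quantities described above, the following Python sketch averages log2 ratios separately per haplotype for one region and computes density as range divided by probe count. The function name, the tuple layout, the inclusive coordinate convention (`end - start + 1`), and the choice to compute density per haplotype are assumptions made for the example; the spreadsheet itself may have been produced differently.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Tuple


def summarize_region(
    start: int,
    end: int,
    probes: Iterable[Tuple[int, str, float]],  # (position, haplotype, log2 ratio)
) -> Dict[str, Dict[str, float]]:
    """Average log2 ratios per haplotype and report probe density for one region."""
    ratios_by_haplotype = defaultdict(list)
    for position, haplotype, log2_ratio in probes:
        if start <= position <= end:
            ratios_by_haplotype[haplotype].append(log2_ratio)

    region_length = end - start + 1  # assumption: inclusive base-pair coordinates
    summary = {}
    for haplotype, ratios in ratios_by_haplotype.items():
        summary[haplotype] = {
            "mean_log2": mean(ratios),
            "n_probes": len(ratios),
            # Density as defined in the text: base pairs per probe (lower = denser).
            "density": region_length / len(ratios),
        }
    return summary
```

Note that with this definition a smaller density value means more closely spaced probes.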

Format: XLS Size: 10.2MB

Additional file 5:

Microsoft Word file containing a chart of the distribution of genes associated with hotspot regions having fewer than 5 candidates on the arrays. Candidates were determined by having at least 5 probes within a maximum sequence size of 500 bp.
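
The candidate criterion (at least 5 probes within at most 500 bp) can be checked with a simple sliding window over sorted probe positions. The sketch below is one possible reading of that criterion, not the authors' procedure: the function name is hypothetical, and measuring the span between probe start coordinates is an assumption.

```python
from typing import Sequence


def has_candidate_window(
    positions: Sequence[int],
    min_probes: int = 5,
    max_span: int = 500,
) -> bool:
    """Return True if any min_probes consecutive probes fall within max_span bp."""
    ordered = sorted(positions)
    # Compare each probe with the one (min_probes - 1) places after it.
    for i in range(len(ordered) - min_probes + 1):
        span = ordered[i + min_probes - 1] - ordered[i]
        if span <= max_span:
            return True
    return False
```

A region with fewer than 5 probes can never qualify, since the loop body is never reached.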

Format: DOCX Size: 15KB
