Methodology article (Open Access)

Gene Set Enrichment Analysis (GSEA) of Toxoplasma gondii expression datasets links cell cycle progression and the bradyzoite developmental program

Matthew McKnight Croken1, Weigang Qiu2, Michael W White3 and Kami Kim1*

Author Affiliations

1 Departments of Medicine, Microbiology & Immunology and Pathology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, 10461 Bronx, NY, USA

2 Department of Biological Sciences, Hunter College of the City University of New York, New York 10065, NY, USA

3 Departments of Molecular Medicine and Global Health, University of South Florida, Tampa 33612, FL, USA

BMC Genomics 2014, 15:515  doi:10.1186/1471-2164-15-515

Published: 24 June 2014



Large amounts of microarray expression data have been generated for the apicomplexan parasite Toxoplasma gondii in an effort to identify genes critical for virulence or developmental transitions. However, researchers’ ability to analyze these data is limited by the large number of unannotated genes, including many that appear to encode conserved hypothetical proteins restricted to Apicomplexa. Further, differential expression of individual genes is not always informative, and interpreting it often relies on investigators to draw big-picture inferences without the benefit of context. We hypothesized that customizing gene set enrichment analysis (GSEA) for T. gondii would enable us to rigorously test whether groups of genes serving a common biological function are co-regulated during the developmental transition to the latent bradyzoite form.


Using publicly available T. gondii expression microarray data, we created Toxoplasma gene sets related to bradyzoite differentiation, oocyst sporulation, and the cell cycle. We supplemented these with lists of genes derived from community annotation efforts that identified the contents of the parasite-specific organelles: rhoptries, micronemes, dense granules, and the apicoplast. Finally, we created gene sets based on metabolic pathways annotated in the KEGG database and on Gene Ontology terms associated with the available gene annotations. These gene sets were used to perform GSEA on two sets of published T. gondii expression data that characterized the T. gondii stress response and differentiation to the latent bradyzoite form.
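To make concrete what testing a gene set for enrichment involves, the following is a minimal Python sketch of the weighted running-sum enrichment score that GSEA is generally built around. It is an illustration under stated assumptions, not the pipeline used in this study: the gene identifiers, the ranking scores, and the function name `enrichment_score` are all hypothetical, and the permutation step that GSEA uses to assess significance is omitted.

```python
def enrichment_score(ranked_genes, scores, gene_set, p=1.0):
    """Return a GSEA-style running-sum enrichment score.

    ranked_genes: gene IDs ordered from most up- to most down-regulated.
    scores: ranking metric for each gene, in the same order.
    gene_set: collection of gene IDs to test for enrichment.
    p: weight exponent; p=0 gives an unweighted, Kolmogorov-Smirnov-like score.
    """
    members = set(gene_set)
    hits = [gene in members for gene in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranked list only partially")

    # Total weight of the genes in the set, used to normalize each hit step.
    hit_weight_total = sum(abs(s) ** p for s, h in zip(scores, hits) if h)
    if hit_weight_total == 0:
        raise ValueError("all gene-set members have a zero ranking score")

    running = 0.0
    best = 0.0
    for s, is_hit in zip(scores, hits):
        if is_hit:
            running += abs(s) ** p / hit_weight_total  # step up at a set member
        else:
            running -= 1.0 / n_miss                    # step down otherwise
        if abs(running) > abs(best):
            best = running  # keep the maximum deviation from zero
    return best


# Hypothetical ranked list (placeholder IDs, not real T. gondii gene IDs).
ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
metric = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]

es_top = enrichment_score(ranked, metric, {"g1", "g2"})     # set at the top
es_bottom = enrichment_score(ranked, metric, {"g5", "g6"})  # set at the bottom
```

A positive score indicates that the set's members cluster toward the top of the ranked list, and a negative score indicates that they cluster toward the bottom. In practice the score is compared against a null distribution obtained by permutation before any conclusion is drawn.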


GSEA provides evidence that cell cycle regulation and bradyzoite differentiation are coupled. Δgcn5A mutants, which are unable to induce bradyzoite-associated genes in response to alkaline stress, have different patterns of cell cycle and bradyzoite gene expression from stressed wild-type parasites. Extracellular tachyzoites resemble a transitional state whose gene expression differs from that of both replicating intracellular tachyzoites and in vitro bradyzoites: they express genes that are enriched in bradyzoites as well as genes that are associated with the G1 phase of the cell cycle. The gene sets we have created are readily modified to reflect ongoing research and will help researchers take a knowledge-based approach to data analysis, facilitating the development of new insights into the intricate biology of Toxoplasma gondii.
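As an illustration of how a gene set might be updated as annotations change, the sketch below stores a set in the tab-delimited GMT layout commonly used by GSEA software (set name, description, then member genes on one line). Whether the authors distribute their gene sets in this format is an assumption, and the set name, description, and gene identifiers here are hypothetical placeholders.

```python
def to_gmt_line(name, description, genes):
    """Serialize one gene set as a GMT line: name, description, then genes."""
    return "\t".join([name, description, *sorted(genes)])


def parse_gmt_line(line):
    """Parse one GMT line back into (name, description, set of genes)."""
    name, description, *genes = line.rstrip("\n").split("\t")
    return name, description, set(genes)


# Hypothetical gene set; the IDs are placeholders, not real annotations.
gene_sets = {"bradyzoite_induced": {"geneA", "geneB"}}

# Revising the set when new evidence appears is a one-line change.
gene_sets["bradyzoite_induced"].add("geneC")

gmt_line = to_gmt_line(
    "bradyzoite_induced",
    "hypothetical example set",
    gene_sets["bradyzoite_induced"],
)
```

Keeping each set as a single plain-text line means that a revision is a small, reviewable edit rather than a change to analysis code.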

Gene expression; Transcriptome; Parasite; Bradyzoite; Tachyzoite; Differentiation; Development