A comprehensive metatranscriptome analysis pipeline and its validation using human small intestine microbiota datasets

Milkha M Leimena (1,2), Javier Ramiro-Garcia (1,2,3), Mark Davids (1,3), Bartholomeus van den Bogert (1,2), Hauke Smidt (2), Eddy J Smid (1,4), Jos Boekhorst (6,7), Erwin G Zoetendal (1,2), Peter J Schaap (1,3)* and Michiel Kleerebezem (1,2,5,7)*

Author Affiliations

1 TI Food and Nutrition (TIFN), P.O. Box 557, 6700 AN, Wageningen, The Netherlands

2 Laboratory of Microbiology, Dreijenplein 10, Wageningen 6703 HB, The Netherlands

3 Laboratory of System and Synthetic Biology, Wageningen University, Dreijenplein 10, 6703 HB, Wageningen, The Netherlands

4 Laboratory of Food Microbiology, Wageningen University, P.O. Box 8129, 6700 EV, Wageningen, The Netherlands

5 Host-Microbe Interactomics Group, Wageningen University, P.O. Box 338, 6700 AH, Wageningen, The Netherlands

6 Centre for Molecular and Biomolecular Informatics, Radboud University Medical Centre, Nijmegen, The Netherlands

7 NIZO Food Research B.V., P.O. Box 20, 6710 BA, Ede, The Netherlands

BMC Genomics 2013, 14:530  doi:10.1186/1471-2164-14-530

Published: 2 August 2013



Next-generation sequencing (NGS) technologies can be applied to complex microbial ecosystems for metatranscriptome analysis through direct cDNA sequencing, known as RNA sequencing (RNA-seq). RNA-seq generates large and highly complex datasets, and their comprehensive interpretation requires a reliable bioinformatic pipeline. In this study, we focus on the development of such a metatranscriptome pipeline, which we validate using Illumina RNA-seq datasets derived from the small intestine microbiota of two individuals with an ileostomy.


The metatranscriptome pipeline developed here enabled effective removal of rRNA-derived sequences, followed by confident assignment of the predicted function and taxonomic origin of the mRNA reads. Phylogenetic analysis of the small intestine metatranscriptome datasets revealed a strong similarity to the community composition profiles obtained from 16S rDNA and rRNA pyrosequencing, indicating considerable congruency between community composition (rDNA) and the taxonomic distribution of overall (rRNA) and specific (mRNA) activity among its microbial members. Reproducibility of the metatranscriptome sequencing approach was established by independent duplicate experiments. In addition, a comparison of metatranscriptome analyses based on single-end and paired-end sequencing indicated that paired-end sequencing does not provide improved functional or phylogenetic insights. Metatranscriptome functional mapping allowed the analysis of global and genus-specific activity of the microbiota, and illustrated the potential of these approaches to unravel syntrophic interactions in microbial ecosystems.


A reliable pipeline for metatranscriptome data analysis was developed and evaluated using RNA-seq datasets obtained from the human small intestine microbiota. The design of the pipeline is generic, so it can be applied to (bacterial) metatranscriptome analysis in any chosen niche.

Keywords: Metatranscriptome; Bioinformatic pipeline; Human small intestine microbiota; Illumina sequencing; Single-end reads; Paired-end reads; COG; KEGG; Metabolic pathways