Open Access Highly Accessed Methodology article

Differential expression analysis for paired RNA-seq data

Lisa M Chung1*, John P Ferguson2, Wei Zheng3, Feng Qian4, Vincent Bruno5, Ruth R Montgomery4 and Hongyu Zhao1*

Author Affiliations

1 Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut, USA

2 Department of Statistics, George Washington University, Washington, DC, USA

3 Novartis Institutes for BioMedical Research, Cambridge, Massachusetts, USA

4 Section of Rheumatology, Yale School of Medicine, New Haven, Connecticut, USA

5 Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, Maryland, USA

BMC Bioinformatics 2013, 14:110  doi:10.1186/1471-2105-14-110

Published: 27 March 2013

Abstract

Background

RNA-Seq technology measures transcript abundance by generating sequence reads and counting their frequencies across different biological conditions. To identify differentially expressed genes between two conditions, it is important to consider the experimental design as well as the distributional properties of the data. In many RNA-Seq studies, the expression data are obtained as multiple pairs, e.g., pre- vs. post-treatment samples from the same individual. We seek to incorporate this paired structure into the analysis.

Results

We present a Bayesian hierarchical mixture model for RNA-Seq data that separately accounts for the variability within and between individuals in a paired data structure. The method assumes a Poisson distribution for the data, mixed with a gamma distribution to account for variability between pairs. The effect of differential expression is modeled by a two-component mixture model. The performance of this approach is examined using simulated and real data.
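
To illustrate why mixing a Poisson distribution with a gamma distribution captures extra variability, the following minimal sketch computes the marginal distribution of a count whose Poisson rate is itself gamma-distributed. It is not the authors' model or parameterization: the function names and the shape and rate values are hypothetical, and the sketch only shows the standard result that the marginal has mean shape/rate and variance shape/rate + shape/rate^2, which exceeds the Poisson variance.

```python
import math


def poisson_gamma_pmf(y, shape, rate):
    """Marginal P(Y = y) when Y | lam ~ Poisson(lam) and lam ~ Gamma(shape, rate).

    This is the negative binomial form obtained by integrating out lam.
    Computed on the log scale for numerical stability.
    """
    log_p = (
        math.lgamma(y + shape)
        - math.lgamma(shape)
        - math.lgamma(y + 1)
        + shape * math.log(rate / (rate + 1.0))
        - y * math.log(rate + 1.0)
    )
    return math.exp(log_p)


def marginal_moments(shape, rate, y_max=2000):
    """Sum the marginal pmf over 0..y_max and return (total mass, mean, variance)."""
    probs = [poisson_gamma_pmf(y, shape, rate) for y in range(y_max + 1)]
    total = sum(probs)
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return total, mean, var


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    shape, rate = 2.0, 0.1
    total, mean, var = marginal_moments(shape, rate)
    print("total probability:", total)
    print("mean:", mean, "theory:", shape / rate)
    print("variance:", var, "theory:", shape / rate + shape / rate**2)
```

With these illustrative values the gamma mixing adds a variance term of shape/rate^2 on top of the Poisson variance, which is the kind of between-pair variability the gamma component is meant to absorb.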

Conclusions

In this setting, our proposed model detects differential expression with higher sensitivity than existing methods. Application to real RNA-Seq data demonstrates the usefulness of this method for detecting altered expression in genes with low average expression levels or shorter transcript lengths.