
Empirical Bayes analysis of sequencing-based transcriptional profiling without replicates

Zhijin Wu1*, Bethany D Jenkins2,3, Tatiana A Rynearson3, Sonya T Dyhrman4, Mak A Saito5, Melissa Mercier3 and LeAnn P Whitney2

Author Affiliations

1 Center for Statistical Sciences and Department of Community Health, Box G-121S-7, Brown University, Providence RI 02912, USA

2 Department of Cell and Molecular Biology, The University of Rhode Island, 120 Flagg Road, Kingston, RI 02881, USA

3 The Graduate School of Oceanography, University of Rhode Island, South Ferry Road, Narragansett, RI 02882, USA

4 Biology Department, Woods Hole Oceanographic Institution, Woods Hole MA 02543, USA

5 Marine Chemistry and Geochemistry Department, Woods Hole Oceanographic Institution, 360 Woods Hole Rd, Woods Hole MA 02543, USA


BMC Bioinformatics 2010, 11:564  doi:10.1186/1471-2105-11-564

Published: 16 November 2010



Recent technological advances have made high-throughput sequencing an increasingly popular approach for transcriptome analysis. Reported advantages of sequencing-based transcriptional profiling over microarrays include lower technical variability. However, advances in technology do not remove biological variation between replicates, and this variation is often neglected in analyses.


We propose an empirical Bayes method, titled Analysis of Sequence Counts (ASC), to detect differential expression based on sequencing technology. ASC borrows information across sequences to establish a prior distribution of sample variation, so that biological variation can be accounted for even when replicates are not available. Compared to current approaches that simply test for equality of proportions in two samples, ASC is less biased towards highly expressed sequences and can identify more genes with a greater log fold change at lower overall abundance.
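To make the comparison concrete, the following is a minimal sketch of the kind of baseline described above: a pooled two-sample z-test for equality of proportions applied to the counts of a single sequence. It is not part of ASC. The function name, the library sizes, and the example counts are all hypothetical and chosen only for illustration.

```python
import math

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided pooled z-test for equality of two proportions.

    x1, x2: read counts for one sequence in samples 1 and 2.
    n1, n2: total read counts (library sizes) of the two samples.
    Assumes at least one of x1, x2 is non-zero.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return z, p_value

# Hypothetical library sizes and counts (illustration only).
n1 = n2 = 1_000_000
low_abundance = two_proportion_z_test(10, n1, 20, n2)       # 2-fold change
high_abundance = two_proportion_z_test(1000, n1, 1200, n2)  # 1.2-fold change

print("low abundance, 2-fold:    z=%.2f p=%.3g" % low_abundance)
print("high abundance, 1.2-fold: z=%.2f p=%.3g" % high_abundance)
```

With these made-up counts, the modest 1.2-fold change in the abundant sequence gets a far smaller p-value than the 2-fold change in the rare one. This is the bias towards highly expressed sequences that the paragraph above refers to.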


ASC unifies the biological and statistical significance of differential expression by estimating the posterior mean of log fold change and estimating false discovery rates based on the posterior mean. The implementation in R is available at webcite.
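The idea of a posterior mean of log fold change can be illustrated with a generic normal-normal shrinkage sketch. This is not the ASC model, whose prior and likelihood are not specified in this abstract. The sketch assumes equal library sizes, a zero-mean normal prior whose variance is estimated from all sequences by the method of moments, and a delta-method approximation to the sampling variance of each log ratio. The function name, the pseudocount, and the counts are hypothetical.

```python
import math
from statistics import mean

def shrink_log_fold_changes(counts_a, counts_b, pseudo=0.5):
    """Posterior mean of log2 fold change under a toy normal-normal model."""
    pairs = list(zip(counts_a, counts_b))
    # Observed log2 fold change; the pseudocount avoids log(0).
    obs = [math.log2((b + pseudo) / (a + pseudo)) for a, b in pairs]
    # Approximate sampling variance of each log2 ratio (delta method,
    # Poisson-like counts): larger for low-abundance sequences.
    samp_var = [(1 / (a + pseudo) + 1 / (b + pseudo)) / math.log(2) ** 2
                for a, b in pairs]
    # Prior variance borrowed from all sequences (method of moments,
    # zero prior mean assumed), floored at zero.
    prior_var = max(mean(d * d for d in obs) - mean(samp_var), 0.0)
    # Posterior mean: observed value shrunk towards zero, more strongly
    # when the sampling variance is large relative to the prior variance.
    return [d * prior_var / (prior_var + v) for d, v in zip(obs, samp_var)]

# Hypothetical counts for six sequences in two samples (illustration only).
counts_a = [5, 500, 50, 2000, 8, 300]
counts_b = [10, 1000, 45, 2100, 2, 900]
posterior = shrink_log_fold_changes(counts_a, counts_b)

for a, b, m in zip(counts_a, counts_b, posterior):
    print(f"{a:>5} vs {b:>5}: posterior mean log2 FC = {m:+.3f}")
```

In this toy model, the first two sequences both show roughly a 2-fold change, but the low-count estimate is shrunk much more than the high-count one. Ranking on such a posterior mean weighs effect size and uncertainty together instead of relying on a p-value alone.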