Figure 2.

Histograms of p-values for gene expression differences between duplicate experiments on the same biological sample. (a) Duplicate experiments from the same DNA library sequenced in different lanes. p-values were calculated from the binomial distribution. (Datasets compared: Bullard SRR037457 vs. SRR037458.) (b) When the binomial distribution is applied to the same biological sample prepared in two different libraries, more genes have small p-values than expected, which erroneously suggests the existence of significantly differentially expressed genes when there should not be any. (Datasets compared: Bullard SRR037467 vs. SRR037471.) (c) When the same two libraries are compared using the beta-binomial distribution, the high density at small p-values is no longer present. Peak of proportion normalization was used in these calculations. The histograms were drawn using the R package Bum-class [27].
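
The contrast between panels (b) and (c) can be illustrated with a minimal sketch for a single gene. This is not the authors' implementation: the function names, the two-sided p-value definition (summing all outcomes no more probable than the observed one), the null proportion p0 = 0.5, and the overdispersion value rho = 0.05 are all assumptions chosen for illustration, and the normalization step mentioned in the caption is not reproduced here.

```python
import math

def log_choose(n, k):
    # log of the binomial coefficient C(n, k)
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)

def log_beta(a, b):
    # log of the Beta function B(a, b)
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def binom_pmf(k, n, p):
    # P(X = k) for X ~ Binomial(n, p), with 0 < p < 1
    return math.exp(log_choose(n, k) + k * math.log(p) + (n - k) * math.log(1 - p))

def betabinom_pmf(k, n, p, rho):
    # P(X = k) for a beta-binomial with mean proportion p and
    # overdispersion rho in (0, 1); rho is a hypothetical parameter here.
    a = p * (1 - rho) / rho
    b = (1 - p) * (1 - rho) / rho
    return math.exp(log_choose(n, k) + log_beta(k + a, n - k + b) - log_beta(a, b))

def two_sided_p(k, n, pmf):
    # Sum the probabilities of all outcomes that are no more likely
    # than the observed count k (small tolerance for floating point ties).
    observed = pmf(k)
    total = sum(q for q in (pmf(i) for i in range(n + 1)) if q <= observed * (1 + 1e-9))
    return min(1.0, total)

# Hypothetical gene: 60 reads in library 1 and 40 reads in library 2.
x1, x2 = 60, 40
n = x1 + x2
p0 = 0.5    # assumed null proportion (equal normalized library sizes)
rho = 0.05  # assumed overdispersion between libraries

p_bin = two_sided_p(x1, n, lambda k: binom_pmf(k, n, p0))
p_bb = two_sided_p(x1, n, lambda k: betabinom_pmf(k, n, p0, rho))
print(f"binomial p-value:      {p_bin:.4f}")
print(f"beta-binomial p-value: {p_bb:.4f}")
```

Under these assumptions the beta-binomial p-value is larger than the binomial one for the same counts, because the extra between-library variance makes a given read imbalance less surprising. This is the mechanism by which the excess of small p-values in panel (b) would be removed in panel (c).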

Cai et al. BMC Bioinformatics 2012 13(Suppl 13):S5 doi:10.1186/1471-2105-13-S13-S5