Methodology article

An integrated hierarchical Bayesian approach to normalizing left-censored microRNA microarray data

Jia Kang1* and Ethan Yixun Xu2,3

Author Affiliations

1 Department of Biometrics Research, Merck Research Laboratories, Rahway, NJ 07065, USA

2 Department of Safety Assessment, Merck Research Laboratories, West Point, PA 19486, USA

3 Present Address: Discovery Informatics, Infinity Pharmaceuticals, 780 Memorial Drive, Cambridge, MA 02139, USA

BMC Genomics 2013, 14:507 doi:10.1186/1471-2164-14-507

Published: 26 July 2013



MicroRNAs (miRNAs) are small endogenous ssRNAs that regulate target gene expression post-transcriptionally through the RNAi pathway. Normalization, which aims to remove systematic between-array bias, is a critical pre-processing step for detecting differentially expressed miRNAs. Most normalization methods adopted for miRNA data are the same methods used to normalize mRNA data. However, miRNA data differ substantially from mRNA data, mainly because of a possibly larger proportion of differentially expressed miRNA probes and a much larger percentage of left-censored miRNA probes that fall below the detection limit (DL). Taking these characteristics of miRNA data into account, we present a hierarchical Bayesian approach that integrates normalization, missing data imputation, and feature selection in a single model.
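To make the notion of left-censoring at a detection limit concrete, the following is a minimal Python sketch. It is not the hierarchical Bayesian model proposed in this article. It assumes, purely for illustration, that the log-intensities of a single probe are normally distributed, and that a value at or below the DL contributes the probability mass P(X ≤ DL) to the likelihood instead of a density. The function names and the example values are hypothetical.

```python
import math

def normal_logpdf(x, mu, sigma):
    """Log-density of a Normal(mu, sigma) distribution at x."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)

def normal_logcdf(x, mu, sigma):
    """Log of P(X <= x) for X ~ Normal(mu, sigma).

    Uses Phi(z) = 0.5 * erfc(-z / sqrt(2)), which is adequate for moderate z;
    a very extreme lower tail would underflow and needs a dedicated routine.
    """
    z = (x - mu) / sigma
    return math.log(0.5 * math.erfc(-z / math.sqrt(2.0)))

def censored_loglik(values, detection_limit, mu, sigma):
    """Log-likelihood of one probe's log-intensities under left-censoring.

    Entries that are None, or at or below the detection limit, are treated
    as censored: only the fact that they lie below the limit is used.
    """
    total = 0.0
    for v in values:
        if v is None or v <= detection_limit:
            total += normal_logcdf(detection_limit, mu, sigma)
        else:
            total += normal_logpdf(v, mu, sigma)
    return total

# Hypothetical log2 intensities for one probe across five arrays;
# None marks a measurement reported as below the detection limit.
probe = [None, None, 6.1, 6.8, 7.4]
dl = 5.0  # assumed detection limit on the log2 scale

# Compare two candidate means under the same spread.
for mu in (5.5, 7.0):
    print(mu, censored_loglik(probe, dl, mu, sigma=1.0))
```

Treating censored values this way keeps the information that they fall below the DL without substituting an arbitrary number for them. The sketch covers only that single ingredient and leaves out normalization and feature selection.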


Results from both simulated and real data suggest that the Bayesian method outperforms other widely used normalization methods in detecting truly differentially expressed miRNAs. In addition, our findings demonstrate the necessity of normalizing miRNA data and the robustness of our Bayesian approach to violations of the standard assumptions adopted in mRNA normalization methods.


Our study indicates that normalization procedures can have a profound impact on the detection of truly differentially expressed miRNAs. Although the proposed Bayesian method was formulated to handle normalization issues in miRNA data, we expect that biomarker discovery with other high-dimensional profiling techniques in which a significant proportion of data points are left-censored (e.g., proteomics) might also benefit from this approach.

Keywords: miRNA; Normalization; Hierarchical Bayesian modeling; Detection limit; Variable selection