Differential meta-analysis of RNA-seq data from multiple studies
1 INRA, UMR1313 Génétique animale et biologie intégrative, 78352 Jouy-en-Josas, France
2 AgroParisTech, UMR1313 Génétique animale et biologie intégrative, 75231 Paris 05, France
3 Université Lille Nord de France, UDSL, EA2694 Biostatistics, Lille, France
4 Inria Lille Nord Europe, MODAL, Lille, France
BMC Bioinformatics 2014, 15:91. doi:10.1186/1471-2105-15-91. Published: 29 March 2014
High-throughput sequencing is now regularly used for studies of the transcriptome (RNA-seq), particularly for comparisons among experimental conditions. At present, such experiments typically include only a small number of biological replicates, which leads to low power for detecting differential expression. As sequencing costs continue to decrease, it is likely that additional follow-up studies will be conducted to re-address the same biological question.
We demonstrate how p-value combination techniques previously used for microarray meta-analyses can be used for the differential analysis of RNA-seq data from multiple related studies. We compare these techniques with a negative binomial generalized linear model (GLM) that includes a fixed study effect, using both simulated data and real data from human melanoma cell lines. The GLM with fixed study effect performed well for low inter-study variation and small numbers of studies, but was outperformed by the meta-analysis methods for moderate to large inter-study variability and larger numbers of studies.
The p-value combination techniques illustrated here are valuable tools for performing differential meta-analyses of RNA-seq data, as they appropriately account for biological and technical variability within studies as well as additional study-specific effects. An R package, metaRNASeq, is available on CRAN (http://cran.r-project.org/web/packages/metaRNASeq).
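To make the idea of p-value combination concrete, the following minimal sketch combines the per-study p-values obtained for a single gene. The abstract does not name the specific techniques, so two standard choices, Fisher's method and the weighted inverse normal method, are assumed here. The function names `fisher_combine` and `inverse_normal_combine` are hypothetical; they are not the metaRNASeq interface, and the sketch assumes independent one-sided p-values lying strictly between 0 and 1.

```python
import math
from statistics import NormalDist


def fisher_combine(pvalues):
    """Fisher's method: -2 * sum(log p) is chi-square with 2k df under the null."""
    k = len(pvalues)
    half_stat = -sum(math.log(p) for p in pvalues)  # half of the Fisher statistic
    # Closed-form chi-square survival function for an even number of df (2k):
    # exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return min(1.0, math.exp(-half_stat) * total)


def inverse_normal_combine(pvalues, weights=None):
    """Weighted inverse normal combination of one-sided p-values."""
    nd = NormalDist()
    if weights is None:
        weights = [1.0] * len(pvalues)  # assumption: equal weight per study
    norm = math.sqrt(sum(w * w for w in weights))
    # Map each p-value to a normal quantile, then take a normalized weighted sum.
    z = sum(w * nd.inv_cdf(1.0 - p) for w, p in zip(weights, pvalues)) / norm
    return 1.0 - nd.cdf(z)


# Hypothetical p-values for one gene from three independent studies.
gene_pvalues = [0.04, 0.10, 0.03]
print(fisher_combine(gene_pvalues))
print(inverse_normal_combine(gene_pvalues, weights=[3, 2, 4]))
```

In this sketch the weights stand in for something like the number of replicates per study, which is an assumption made for illustration. In a full analysis the combined p-values would be computed for every gene and then adjusted for multiple testing.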