Predicting gene regulatory networks of soybean nodulation from RNA-Seq transcriptome data
1 Department of Computer Science, University of Missouri, Columbia, MO 65211, USA
2 Informatics Institute, University of Missouri, Columbia, MO, USA
3 C.S. Bond Life Science Center, University of Missouri, Columbia, MO, USA
4 Divisions of Plant Science and Biochemistry, Columbia, MO, USA
5 Current address: Department of Genetics, Geisel School of Medicine at Dartmouth, Hanover, NH 03755, USA
BMC Bioinformatics 2013, 14:278 doi:10.1186/1471-2105-14-278
Published: 22 September 2013
High-throughput RNA sequencing (RNA-Seq) is a revolutionary technique for studying the transcriptome of a cell under various conditions at a systems level. Although RNA-Seq has been widely used to generate experimental data in the last few years, few computational methods are available to analyze this large volume of transcription data. In particular, computational methods for constructing gene regulatory networks from RNA-Seq expression data covering hundreds or even thousands of genes are lacking and urgently needed.
We developed an automated bioinformatics method that predicts gene regulatory networks from the quantitative expression values of differentially expressed genes, using RNA-Seq transcriptome data of a cell at different stages and under different conditions and integrating transcriptional, genomic and gene function data. We applied the method to RNA-Seq transcriptome data generated for soybean root hair cells at three developmental stages of nodulation after rhizobium infection. The method predicted a soybean nodulation-related gene regulatory network consisting of 10 regulatory modules common to all three stages, plus 24, 49 and 70 modules specific to the first, second and third stage, respectively. Each module contains a group of co-expressed genes together with several transcription factors that collaboratively control their expression under different conditions. Eight of the 10 common regulatory modules were supported by at least two kinds of validation, such as independent DNA binding motif analysis, gene function enrichment tests, and previous experimental data in the literature.
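The abstract does not specify the algorithm behind the regulatory modules, so the following is only a minimal sketch of the general idea of a module (a group of co-expressed genes plus candidate transcription factors), not the authors' method. It assumes a simple Pearson-correlation criterion; the gene and transcription factor names, the expression values, and the 0.9 threshold are all hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / (sx * sy)

def coexpression_modules(expr, threshold=0.9):
    """Greedy grouping (an assumption, not the paper's procedure):
    a gene joins the first module whose seed gene it correlates with
    at or above the threshold, otherwise it seeds a new module."""
    modules = []
    for gene, profile in expr.items():
        for module in modules:
            if pearson(expr[module[0]], profile) >= threshold:
                module.append(gene)
                break
        else:
            modules.append([gene])
    return modules

def candidate_regulators(module, expr, tf_expr, threshold=0.9):
    """Transcription factors whose profile is strongly correlated
    (positively or negatively) with the module's mean profile."""
    n_conditions = len(expr[module[0]])
    mean_profile = [
        sum(expr[g][i] for g in module) / len(module)
        for i in range(n_conditions)
    ]
    return [
        tf for tf, profile in tf_expr.items()
        if abs(pearson(profile, mean_profile)) >= threshold
    ]

# Hypothetical expression values across four conditions.
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 6.0, 8.2],
    "geneC": [5.0, 3.0, 2.0, 1.0],
    "geneD": [9.0, 6.5, 4.0, 2.5],
}
tf_expr = {
    "tf1": [0.5, 1.0, 1.6, 2.0],
    "tf2": [3.0, 3.1, 2.9, 3.0],
}

modules = coexpression_modules(expr)
for module in modules:
    print(module, candidate_regulators(module, expr, tf_expr))
```

In this toy data, the absolute value of the correlation is used so that a transcription factor can be flagged as a candidate activator or repressor. A real analysis would also need the genomic and gene function evidence that the abstract mentions, which this sketch leaves out.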
We developed a computational method to reliably reconstruct gene regulatory networks from RNA-Seq transcriptome data. The method can generate valuable hypotheses for interpreting biological data and for designing biological experiments such as ChIP-Seq, RNA interference, and yeast two-hybrid experiments.