Structured association analysis leads to insight into Saccharomyces cerevisiae gene regulation by finding multiple contributing eQTL hotspots associated with functional gene modules
1 Joint Carnegie Mellon – University of Pittsburgh PhD Program in Computational Biology, Carnegie Mellon University, Pittsburgh, PAUSA
2 Lane Center for Computational Biology, Carnegie Mellon University, Pittsburgh, PAUSA
3 Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PAUSA
4 Machine Learning Department, Carnegie Mellon University, Pittsburgh, PAUSA
BMC Genomics 2013, 14:196 doi:10.1186/1471-2164-14-196Published: 21 March 2013
Association analysis using genome-wide expression quantitative trait locus (eQTL) data investigates the effect that genetic variation has on cellular pathways and leads to the discovery of candidate regulators. Traditional analysis of eQTL data via pairwise statistical significance tests or linear regression does not leverage the availability of the structural information of the transcriptome, such as presence of gene networks that reveal correlation and potentially regulatory relationships among the study genes. We employ a new eQTL mapping algorithm, GFlasso, which we have previously developed for sparse structured regression, to reanalyze a genome-wide yeast dataset. GFlasso fully takes into account the dependencies among expression traits to suppress false positives and to enhance the signal/noise ratio. Thus, GFlasso leverages the gene-interaction network to discover the pleiotropic effects of genetic loci that perturb the expression level of multiple (rather than individual) genes, which enables us to gain more power in detecting previously neglected signals that are marginally weak but pleiotropically significant.
While eQTL hotspots in yeast have been reported previously as genomic regions controlling multiple genes, our analysis reveals additional novel eQTL hotspots and, more interestingly, uncovers groups of multiple contributing eQTL hotspots that affect the expression level of functional gene modules. To our knowledge, our study is the first to report this type of gene regulation stemming from multiple eQTL hotspots. Additionally, we report the results from in-depth bioinformatics analysis for three groups of these eQTL hotspots: ribosome biogenesis, telomere silencing, and retrotransposon biology. We suggest candidate regulators for the functional gene modules that map to each group of hotspots. Not only do we find that many of these candidate regulators contain mutations in the promoter and coding regions of the genes, in the case of the Ribi group, we provide experimental evidence suggesting that the identified candidates do regulate the target genes predicted by GFlasso.
Thus, this structured association analysis of a yeast eQTL dataset via GFlasso, coupled with extensive bioinformatics analysis, discovers a novel regulation pattern between multiple eQTL hotspots and functional gene modules. Furthermore, this analysis demonstrates the potential of GFlasso as a powerful computational tool for eQTL studies that exploit the rich structural information among expression traits due to correlation, regulation, or other forms of biological dependencies.