Methodology article

A Bayesian variable selection procedure to rank overlapping gene sets

Axel Skarman1, Mohammad Shariati1,2, Luc Janss1, Li Jiang1 and Peter Sørensen1*

Author Affiliations

1 Department of Molecular Biology and Genetics, Aarhus University, Blichers Allé 20, PO Box 50, Aarhus, Tjele DK-8830, Denmark

2 Department of Animal Science, Ferdowsi University of Mashhad, Mashhad, 91775, Iran

BMC Bioinformatics 2012, 13:73  doi:10.1186/1471-2105-13-73

Published: 3 May 2012



Background

Genome-wide expression profiling using microarrays or sequence-based technologies allows us to identify genes and genetic pathways whose expression patterns influence complex traits. Different methods to prioritize gene sets, such as the genes in a given molecular pathway, have been described. In many cases, these methods test one gene set at a time, and therefore do not consider overlaps among the pathways. Here, we present a Bayesian variable selection method to prioritize gene sets that overcomes this limitation by considering all gene sets simultaneously. We applied Bayesian variable selection to differential expression to prioritize the molecular and genetic pathways involved in the responses to Escherichia coli infection in Danish Holstein cows.


Results

We used a Bayesian variable selection method to prioritize Kyoto Encyclopedia of Genes and Genomes pathways. We used our data to study how the variable selection method was affected by overlaps among the pathways. In addition, we compared our approach to another that ignores the overlaps, and studied the differences in the prioritization. The variable selection method was robust to a change in prior probability and stable given a limited number of observations.


Conclusions

Bayesian variable selection is a useful way to prioritize gene sets while considering their overlaps. Ignoring the overlaps gives different and possibly misleading results. Additional procedures may be needed in cases of highly overlapping pathways that are hard to prioritize.
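
To illustrate why ignoring overlaps can mislead, the following minimal sketch compares a one-at-a-time score with a joint fit on toy data. It is not the Bayesian variable selection procedure of this article: the joint fit is plain ordinary least squares, used here only as a simple stand-in for "considering all gene sets simultaneously". The gene sets `set_A` and `set_B`, the gene-level scores `y`, and the helper `marginal_effect` are all hypothetical.

```python
import numpy as np

# Hypothetical toy data: 12 genes and two overlapping gene sets (not from the article).
n_genes = 12
gene_sets = {
    "set_A": set(range(0, 6)),  # genes 0..5
    "set_B": set(range(2, 8)),  # genes 2..7, shares genes 2..5 with set_A
}

# Indicator matrix: rows are genes, columns are gene sets (1.0 = gene is a member).
names = sorted(gene_sets)
X = np.array([[1.0 if g in gene_sets[s] else 0.0 for s in names]
              for g in range(n_genes)])

# Assumed gene-level evidence: only membership in set_A raises a gene's score.
y = X[:, names.index("set_A")].copy()


def marginal_effect(x, scores):
    """Mean score inside the set minus mean score outside it."""
    return scores[x == 1].mean() - scores[x == 0].mean()


# One-at-a-time view: each set is scored while ignoring every other set.
marginal = {s: marginal_effect(X[:, j], y) for j, s in enumerate(names)}

# Joint view: intercept plus all set indicators fitted at once (ordinary least squares).
design = np.column_stack([np.ones(n_genes), X])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
joint = dict(zip(names, coef[1:]))

overlap = len(gene_sets["set_A"] & gene_sets["set_B"])
print("shared genes:", overlap)
print("one-at-a-time effects:", marginal)
print("joint effects:", joint)
```

In this toy setting `set_B` receives a clearly positive one-at-a-time score only because it shares four genes with `set_A`, whereas the joint fit assigns the whole effect to `set_A`. A variable selection approach goes further by also estimating, for each set, a posterior probability of being included in the model.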

Keywords: Bayesian variable selection; Gene set; Overlap