Open Access Software

kruX: matrix-based non-parametric eQTL discovery

Jianlong Qi1,2, Hassan Foroughi Asl3, Johan Björkegren3,4 and Tom Michoel1,5*

Author Affiliations

1 School of Life Sciences – LifeNet, Freiburg Institute for Advanced Studies (FRIAS), University of Freiburg, Freiburg, Germany

2 Epigenomic Mapping Centre, McGill University, Montreal, Canada

3 Cardiovascular Genomics Group, Division of Vascular Biology, Department of Medical Biochemistry and Biophysics, Karolinska Institute, Stockholm, Sweden

4 Department of Medical Pathology and Forensic Medicine, University of Tartu, Tartu, Estonia

5 Division of Genetics & Genomics, The Roslin Institute, The University of Edinburgh, EH25 9RG Easter Bush, Midlothian, UK

BMC Bioinformatics 2014, 15:11  doi:10.1186/1471-2105-15-11

Published: 14 January 2014

Abstract

Background

The Kruskal-Wallis test is a popular non-parametric statistical test for identifying expression quantitative trait loci (eQTLs) from genome-wide data due to its robustness against variations in the underlying genetic model and expression trait distribution, but testing billions of marker-trait combinations one-by-one can become computationally prohibitive.

Results

We developed kruX, an algorithm implemented in Matlab, Python and R that uses matrix multiplications to calculate the Kruskal-Wallis test statistic for several million marker-trait combinations at once. kruX is more than ten thousand times faster than computing associations one-by-one on a typical human dataset. We used kruX and a dataset of more than 500k SNPs and 20k expression traits measured in 102 human blood samples to compare eQTLs detected by the Kruskal-Wallis test to eQTLs detected by the parametric ANOVA and linear model methods. We found that the Kruskal-Wallis test is more robust against data outliers and heterogeneous genotype group sizes and detects a higher proportion of non-linear associations, but is more conservative for calling additive linear associations.
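
The matrix formulation can be illustrated with a minimal NumPy sketch. This is not the kruX code itself: the function name `kruskal_wallis_matrix`, the 0/1/2 genotype coding, and the assumptions of no missing values and no tied expression values (so no tie correction is applied) are all illustrative choices. The idea is that the rank sum of every trait within every genotype group of every marker is a single matrix product between a genotype indicator matrix and the matrix of expression ranks, from which the standard statistic H = 12/(N(N+1)) * sum_g R_g^2/n_g - 3(N+1) follows for all marker-trait pairs at once.

```python
import numpy as np

def kruskal_wallis_matrix(genotypes, expression):
    """Kruskal-Wallis H for all marker-trait pairs (illustrative sketch).

    genotypes:  (n_markers, n_samples) integer array coded 0, 1, 2 (assumed coding)
    expression: (n_traits, n_samples) float array, assumed free of ties and
                missing values
    Returns an (n_markers, n_traits) array of test statistics.
    """
    n_samples = expression.shape[1]
    # Rank each trait across samples (1..N). The double argsort gives
    # correct ranks only when there are no ties.
    ranks = expression.argsort(axis=1).argsort(axis=1) + 1.0
    stat = np.zeros((genotypes.shape[0], expression.shape[0]))
    for g in (0, 1, 2):
        indicator = (genotypes == g).astype(float)  # markers x samples
        group_size = indicator.sum(axis=1)          # samples per marker in group g
        rank_sums = indicator @ ranks.T             # markers x traits, one product
        nonempty = group_size > 0                   # empty groups contribute nothing
        stat[nonempty] += rank_sums[nonempty] ** 2 / group_size[nonempty, None]
    return 12.0 / (n_samples * (n_samples + 1)) * stat - 3.0 * (n_samples + 1)

# Usage on small simulated data (sizes chosen arbitrarily for illustration).
rng = np.random.default_rng(0)
genotypes = rng.integers(0, 3, size=(1000, 102))
expression = rng.normal(size=(200, 102))
h_stats = kruskal_wallis_matrix(genotypes, expression)
print(h_stats.shape)
```

Because the only heavy operations are three dense matrix products, the work is delegated to optimized linear algebra routines rather than a loop over individual marker-trait pairs. Converting the statistics to p-values would additionally need the number of non-empty genotype groups per marker, since that sets the degrees of freedom.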

Conclusion

kruX enables the use of robust non-parametric methods for massive eQTL mapping without the need for a high-performance computing infrastructure and is freely available from http://krux.googlecode.com.

Keywords:
eQTL; Non-parametric methods; Matrix algebra; Software