Open Access Research article

Comparison of information-theoretic to statistical methods for gene-gene interactions in the presence of genetic heterogeneity

Lara Sucheston1,2, Pritam Chanda3, Aidong Zhang3, David Tritchler1,4,5 and Murali Ramanathan6*

Author Affiliations

1 Department of Biostatistics, State University of New York, Buffalo, NY 14260, USA

2 Department of Cancer Prevention and Control, Roswell Park Cancer Institute, Buffalo, NY 14263, USA

3 Department of Computer Science and Engineering, State University of New York, Buffalo, NY 14260, USA

4 Department of Biostatistics, University of Toronto

5 Ontario Cancer Institute, Toronto, Ontario, Canada

6 Department of Pharmaceutical Sciences, State University of New York, Buffalo, NY 14260, USA

BMC Genomics 2010, 11:487  doi:10.1186/1471-2164-11-487

Published: 3 September 2010

Abstract

Background

Multifactorial diseases such as cancer and cardiovascular diseases are caused by the complex interplay between genes and environment. The detection of these interactions remains challenging due to computational limitations. Information-theoretic approaches use computationally efficient directed search strategies and thus provide a feasible solution to this problem. However, the power of information-theoretic methods for interaction analysis has not been systematically evaluated. In this work, we compare the power and Type I error of an information-theoretic approach with those of existing interaction analysis methods.

Methods

The k-way interaction information (KWII) metric for identifying variable combinations involved in gene-gene interactions (GGI) was assessed using several simulated data sets under models of genetic heterogeneity driven by susceptibility-increasing loci with varying allele frequencies, penetrance values and heritability. The power and proportion of false positives of the KWII were compared to those of multifactor dimensionality reduction (MDR), the restricted partitioning method (RPM) and logistic regression.
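As a rough illustration of what the KWII measures, the sketch below computes it from plug-in entropy estimates. The abstract does not state the formula, so this assumes the standard interaction-information definition, an alternating sum of joint entropies over all non-empty subsets of the variable set. The function names `entropy` and `kwii` and the toy XOR data are hypothetical and are not taken from the article.

```python
from collections import Counter
from itertools import combinations
from math import log2


def entropy(columns):
    """Plug-in Shannon entropy (bits) of the joint distribution of the columns."""
    n = len(columns[0])
    counts = Counter(zip(*columns))  # frequency of each joint value
    return -sum((c / n) * log2(c / n) for c in counts.values())


def kwii(variables):
    """Assumed definition: KWII(S) = -sum over non-empty T in S of (-1)^(|S|-|T|) H(T)."""
    k = len(variables)
    total = 0.0
    for size in range(1, k + 1):
        sign = (-1) ** (k - size)
        for subset in combinations(variables, size):
            total += sign * entropy(subset)
    return -total


# Toy data (hypothetical): the phenotype is the XOR of two binary "genotypes".
snp_a = [0, 0, 1, 1]
snp_b = [0, 1, 0, 1]
phenotype = [a ^ b for a, b in zip(snp_a, snp_b)]

pair_a = kwii([snp_a, phenotype])          # A alone with the phenotype
pair_b = kwii([snp_b, phenotype])          # B alone with the phenotype
triple = kwii([snp_a, snp_b, phenotype])   # A, B and the phenotype jointly
```

In this toy case each pairwise KWII is zero (neither locus alone is informative about the phenotype), while the three-way KWII is 1 bit, which is the kind of purely interactive signal the metric is meant to surface. For two variables the expression reduces to the mutual information.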

Results

The power of the KWII was considerably greater than that of MDR on all six simulation models examined. For a given disease prevalence at high values of heritability, the power of both RPM and KWII was greater than 95%. For models with low heritability and/or genetic heterogeneity, the power of the KWII was consistently greater than that of RPM; the improvements in power for the KWII over RPM ranged from 4.7% to 14.2% at α = 0.001 in the three models at the lowest heritability values examined. The KWII performed similarly to logistic regression.

Conclusions

Information-theoretic models are flexible and have excellent power to detect GGI under a variety of conditions that characterize complex diseases.