An application of conditional logistic regression and multifactor dimensionality reduction for detecting gene-gene interactions on risk of myocardial infarction: The importance of model validation
1 Department of Biostatistics, University of Alabama at Birmingham, Birmingham, AL 35294-0022, USA
2 Section of Cardiovascular Medicine, Department of Medicine, Yale University School of Medicine, New Haven, CT 06510, USA
3 Center for Human Genetics Research, Department of Molecular Physiology and Biophysics, Vanderbilt University Medical School, Nashville, TN 37232-0700, USA
4 Section of Health Policy and Administration, Department of Epidemiology and Public Health and Robert Wood Johnson Clinical Scholars Program, Yale University School of Medicine, New Haven, CT 06510, USA
5 Yale-New Haven Hospital Center for Outcomes Research and Evaluation, New Haven, CT 06510, USA
6 Division of Preventive Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02215, USA
7 Center for Cardiovascular Disease Prevention, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02215, USA
8 Departments of Medicine and Pharmacology, Vanderbilt University Medical School, Nashville, TN 37232-0700, USA
BMC Bioinformatics 2004, 5:49. doi:10.1186/1471-2105-5-49. Published: 30 April 2004
We examined interactions among the angiotensin-converting enzyme (ACE) insertion/deletion, plasminogen activator inhibitor-1 (PAI-1) 4G/5G, and tissue plasminogen activator (t-PA) insertion/deletion gene polymorphisms on risk of myocardial infarction, using data from 343 matched case-control pairs from the Physicians Health Study. We analyzed the data with both conditional logistic regression and the multifactor dimensionality reduction (MDR) method. One advantage of MDR is that it provides an internal prediction error for validation. We summarize our use of this internal prediction error for model validation.
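For orientation, the standard conditional likelihood for 1:1 matched pairs with a two-locus interaction term can be written as below. This is a generic textbook form, not necessarily the exact model the authors fitted: the genotype codings a (ACE) and p (PAI-1), the coefficient names, and the restriction to two loci are assumptions made for illustration.

```latex
% x_{j1}: covariates of the case in pair j; x_{j0}: covariates of its matched control.
L(\beta) \;=\; \prod_{j=1}^{343}
  \frac{\exp\!\left(\beta^{\top} x_{j1}\right)}
       {\exp\!\left(\beta^{\top} x_{j1}\right) + \exp\!\left(\beta^{\top} x_{j0}\right)},
\qquad
\beta^{\top} x \;=\; \beta_{A}\, a \;+\; \beta_{P}\, p \;+\; \beta_{AP}\, a\, p .
```

Under this form, a gene-gene interaction corresponds to a nonzero product coefficient, so the test of interest is H0: β_AP = 0. The pair-specific intercepts cancel out of each factor, which is why matching is handled without estimating them.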
The overall results for the two methods were consistent, with both suggesting an interaction between the ACE I/D and PAI-1 4G/5G polymorphisms. However, under ten-fold cross-validation, the 46% prediction error of the final MDR model was not significantly lower than that expected by chance.
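The validation idea above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' software: genotypes are assumed to be coded 0/1/2, the matched-pair structure is ignored when forming folds, and all function names (`fit_cells`, `error_rate`, `cv_prediction_error`, `permutation_p_value`) are hypothetical. Each multilocus genotype cell is labelled high-risk when its case:control ratio exceeds the overall ratio, the best locus combination is chosen inside each training fold, and a label-permutation test asks whether the held-out error is lower than chance.

```python
import random
from collections import Counter
from itertools import combinations


def fit_cells(genotypes, labels, loci):
    """Return the set of multilocus genotype cells labelled high-risk."""
    cases, controls = Counter(), Counter()
    for g, y in zip(genotypes, labels):
        (cases if y == 1 else controls)[tuple(g[i] for i in loci)] += 1
    # High-risk: the cell's case:control ratio exceeds the overall ratio.
    n_cases, n_controls = sum(cases.values()), sum(controls.values())
    return {c for c in cases if cases[c] * n_controls > controls[c] * n_cases}


def error_rate(high_risk, genotypes, labels, loci):
    """Fraction of subjects whose high/low-risk label disagrees with case status."""
    wrong = sum((tuple(g[i] for i in loci) in high_risk) != (y == 1)
                for g, y in zip(genotypes, labels))
    return wrong / len(labels)


def cv_prediction_error(genotypes, labels, n_loci=2, k=10, seed=0):
    """Mean held-out error; the locus combination is selected within each training fold."""
    idx = list(range(len(labels)))
    random.Random(seed).shuffle(idx)
    combos = list(combinations(range(len(genotypes[0])), n_loci))
    fold_errors = []
    for f in range(k):
        test = idx[f::k]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        g_tr, y_tr = [genotypes[i] for i in train], [labels[i] for i in train]
        g_te, y_te = [genotypes[i] for i in test], [labels[i] for i in test]
        fitted = [(loci, fit_cells(g_tr, y_tr, loci)) for loci in combos]
        # Pick the combination with the lowest training error, then score it on held-out data.
        loci, high = min(fitted, key=lambda m: error_rate(m[1], g_tr, y_tr, m[0]))
        fold_errors.append(error_rate(high, g_te, y_te, loci))
    return sum(fold_errors) / k


def permutation_p_value(genotypes, labels, n_perm=100, seed=0, **kwargs):
    """Share of label permutations whose CV error is at most the observed CV error."""
    rng = random.Random(seed)
    observed = cv_prediction_error(genotypes, labels, **kwargs)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if cv_prediction_error(genotypes, shuffled, **kwargs) <= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Synthetic data only: three loci, case status unrelated to genotype.
    rng = random.Random(1)
    genos = [[rng.randrange(3) for _ in range(3)] for _ in range(200)]
    noise = [rng.randrange(2) for _ in genos]
    print(permutation_p_value(genos, noise, n_perm=50))
```

Because the whole selection procedure is rerun on every permuted data set, the null distribution reflects the search over locus combinations. A naive comparison of the held-out error against 50% would not account for that search, which is one reason an apparently significant model can fail to validate.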
The significant interaction initially observed does not validate and may represent a type I error. As data-driven analytic methods continue to be developed and used to examine complex genetic interactions, it will become increasingly important to stress model validation in order to ensure that significant effects represent true relationships rather than chance findings.