BMC Bioinformatics
Methodology article
MegaSNPHunter: a learning approach to detect disease predisposition SNPs and high level interactions in genome wide association study
Xiang Wan1, Can Yang1, Qiang Yang2, Hong Xue3, Nelson LS Tang4 and Weichuan Yu1
1 Department of Electronic and Computer Engineering, Hong Kong University of Science and Technology, Hong Kong, PR China
2 Department of Computer Science, Hong Kong University of Science and Technology, Hong Kong, PR China
3 Department of Biochemistry, Hong Kong University of Science and Technology, Hong Kong, PR China
4 Laboratory for Genetics of Disease Susceptibility, Li Ka Shing Institute of Health Sciences, The Chinese University of Hong Kong, Hong Kong, PR China
BMC Bioinformatics 2009, 10:13 doi:10.1186/1471-2105-10-13
Published: 9 January 2009
Abstract
Background
Interactions among multiple single nucleotide polymorphisms (SNPs) are widely hypothesized to affect an individual's susceptibility to complex diseases. Although many methods have been proposed to identify and quantify the importance of multi-SNP interactions, few of them can handle genome-wide data, because the search space grows combinatorially and high-order interactions are difficult to evaluate statistically with limited samples.
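To give a sense of the scale of the combinatorial search space mentioned above, the following minimal sketch counts the candidate SNP subsets for a given interaction order. The figure of 500,000 SNPs is an illustrative assumption (roughly the scale of a 500K genotyping array), and the function name is hypothetical; this is simple counting, not part of the MegaSNPHunter method.

```python
from math import comb


def interaction_search_space(num_snps: int, order: int) -> int:
    """Return the number of distinct SNP subsets of size `order`.

    This is the binomial coefficient C(num_snps, order), i.e. the number
    of candidate interactions an exhaustive search would have to test.
    """
    return comb(num_snps, order)


if __name__ == "__main__":
    # Assumption for illustration only: about 500,000 SNPs per study.
    num_snps = 500_000
    for order in (2, 3):
        count = interaction_search_space(num_snps, order)
        print(f"order {order}: {count:.3e} candidate interactions")
```

Under this assumption there are about 1.25 × 10^11 candidate pairs and on the order of 10^16 candidate triples, which is why exhaustive testing of high-order interactions quickly becomes impractical.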
Results
Three comparative experiments are designed to evaluate the performance of MegaSNPHunter. The first uses synthetic data generated from epistasis models. The second uses a genome-wide study of Parkinson's disease (data acquired using Illumina HumanHap300 SNP chips). The third uses the rheumatoid arthritis study from the Wellcome Trust Case Control Consortium (WTCCC), which uses the Affymetrix GeneChip 500K Mapping Array Set. MegaSNPHunter outperforms the best solution in this area and reports many potential interactions for the two real studies.
Conclusion
The experimental results on synthetic data and two real data sets demonstrate that our proposed approach outperforms the best currently available solution for handling large-scale SNP data, both in speed and in the detection of potential interactions that were not previously identified. To our knowledge, MegaSNPHunter is the first approach capable of identifying disease-associated SNP interactions from WTCCC studies, and it is promising for practical disease prognosis.