Functionally informative tag SNP selection using a pareto-optimal approach: playing the game of life

Lee, Phil Hyoun; Jung, Jae-Yoon; Shatkay, Hagit

doi:10.1186/1471-2105-10-S13-O5

Volume 10 Supplement 13

Highlights from the Fifth International Society for Computational Biology (ISCB) Student Council Symposium

Oral presentation
Open access
Published: 19 October 2009

BMC Bioinformatics volume 10, Article number: O5 (2009)

Introduction

A major focus of current epidemiology, medicine, and pharmacogenomics is identifying the single nucleotide polymorphisms (SNPs) that underlie the etiology of common and complex diseases. Because the human genome contains a tremendous number of SNPs, there is a clear need to prioritize them in order to reduce the genotyping and analysis overhead associated with disease-gene studies. Tag SNP selection and functional SNP selection are the two main approaches to this SNP selection problem, yet little has been done so far to combine these distinct and possibly competing approaches effectively. Here we present a new multi-objective optimization framework for identifying SNPs that are both informative as tags and functionally significant.

Methods

Our SNP selection algorithm is based on the notion of Pareto optimality [1], which has been extensively used for addressing multi-objective optimization problems in game theory, economics and engineering. We describe the details of its three main steps as follows.
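As an illustration of the notion of Pareto optimality used here, the following minimal sketch treats each candidate solution as a pair of scores, both to be maximized. The names `dominates` and `pareto_front` and the example scores are assumptions made for this sketch, not part of the authors' system.

```python
from typing import List, Tuple

# (tagging informativeness, functional significance); both are maximized.
Score = Tuple[float, float]


def dominates(a: Score, b: Score) -> bool:
    """True if a is at least as good as b on every objective
    and strictly better on at least one."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def pareto_front(scores: List[Score]) -> List[Score]:
    """Keep the scores that no other score dominates."""
    return [s for s in scores if not any(dominates(t, s) for t in scores)]


# Hypothetical scores for five candidate SNP subsets.
candidates = [(0.90, 0.20), (0.70, 0.60), (0.60, 0.50),
              (0.40, 0.80), (0.30, 0.30)]
front = pareto_front(candidates)
print(front)
```

In this toy input, (0.60, 0.50) and (0.30, 0.30) are both dominated by (0.70, 0.60), so three trade-off solutions remain. None of the three is better than the others on both objectives at once, which is why a Pareto approach returns a collection of solutions rather than a single one.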

STEP 1. Computing Linkage Disequilibrium of SNPs

To efficiently compute the score of tagging informativeness, we calculate the pair-wise LD between all pairs of candidate SNPs in advance. As a measure of pair-wise LD, following Carlson et al. [2], we currently use the coefficient of determination, r².
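The pairwise computation can be sketched as follows, under the assumption that each SNP is available as a vector of phased haplotype alleles coded 0/1 (the abstract does not specify the input format). With allele frequencies p_A and p_B and joint frequency p_AB, the standard definition is D = p_AB - p_A p_B and r² = D² / (p_A (1 - p_A) p_B (1 - p_B)). The function names are hypothetical.

```python
from typing import List, Sequence


def r_squared(a: Sequence[int], b: Sequence[int]) -> float:
    """r^2 between two biallelic SNPs given as 0/1 haplotype vectors."""
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("haplotype vectors must be non-empty and equal length")
    p_a = sum(a) / n
    p_b = sum(b) / n
    p_ab = sum(x * y for x, y in zip(a, b)) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        # A monomorphic SNP carries no LD information; report 0 by convention.
        return 0.0
    d = p_ab - p_a * p_b
    return d * d / denom


def ld_matrix(snps: List[Sequence[int]]) -> List[List[float]]:
    """Precompute the symmetric matrix of pairwise r^2 values."""
    m = len(snps)
    ld = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i, m):
            ld[i][j] = ld[j][i] = r_squared(snps[i], snps[j])
    return ld


# Toy data: SNPs 0 and 1 are perfectly correlated, SNP 2 is independent.
haplotypes = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 0, 1]]
ld = ld_matrix(haplotypes)
```

Computing the matrix once up front means that later evaluations of a candidate tag set reduce to table lookups.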

STEP 2. Retrieving Functional Significance of SNPs

We currently use the FS score of SNPs obtained from F-SNP [3], which assesses the deleterious functional effects of SNPs, using 16 bioinformatics tools, with respect to protein translation, splicing regulation, transcriptional regulation, and post-translational modification.

STEP 3. Selecting Functionally Informative Tag SNPs

Our selection algorithm is based on multi-objective simulated-annealing (SA) search. We also introduce two heuristics for generating a new neighboring solution to guide efficient search while expediting convergence. Figure 1 summarizes the proposed algorithm.
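To make the idea of a multi-objective SA search concrete, here is a generic, archive-based sketch; it is not the authors' algorithm. In particular, the two neighbor-generation heuristics are not described in this abstract, so the sketch uses a plain random swap instead. The fixed subset size `k`, the r² threshold of 0.8, the two objective definitions in `evaluate`, the cooling schedule, and the toy data are all assumptions.

```python
import math
import random
from typing import Dict, FrozenSet, List, Tuple

Score = Tuple[float, float]


def dominates(a: Score, b: Score) -> bool:
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def evaluate(subset: FrozenSet[int], r2: List[List[float]],
             fs: List[float], threshold: float = 0.8) -> Score:
    """Assumed objectives: fraction of SNPs tagged by the subset,
    and mean functional score of the subset."""
    n = len(fs)
    tagged = sum(1 for i in range(n)
                 if any(r2[i][j] >= threshold for j in subset))
    return tagged / n, sum(fs[j] for j in subset) / len(subset)


def pareto_sa(r2: List[List[float]], fs: List[float], k: int,
              steps: int = 2000, t0: float = 1.0, cooling: float = 0.995,
              seed: int = 0) -> Dict[FrozenSet[int], Score]:
    n = len(fs)
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < number of SNPs")
    rng = random.Random(seed)
    current = frozenset(rng.sample(range(n), k))
    cur_score = evaluate(current, r2, fs)
    archive = {current: cur_score}  # non-dominated subsets seen so far
    temp = t0
    for _ in range(steps):
        # Generic neighbor move: swap one selected SNP for an unselected one.
        out_snp = rng.choice(sorted(current))
        in_snp = rng.choice([i for i in range(n) if i not in current])
        cand = (current - {out_snp}) | {in_snp}
        cand_score = evaluate(cand, r2, fs)
        if dominates(cur_score, cand_score):
            # Worse move: accept with a probability that shrinks as temp cools.
            loss = sum(c - d for c, d in zip(cur_score, cand_score))
            accept = rng.random() < math.exp(-loss / temp)
        else:
            accept = True
        if accept:
            current, cur_score = cand, cand_score
            if not any(dominates(s, cand_score) for s in archive.values()):
                archive = {sub: s for sub, s in archive.items()
                           if not dominates(cand_score, s)}
                archive[current] = cand_score
        temp *= cooling
    return archive


# Toy data: two LD blocks ({0, 1, 2} and {3, 4}) and hypothetical FS scores.
blocks = [0, 0, 0, 1, 1]
r2 = [[1.0 if i == j else (0.9 if blocks[i] == blocks[j] else 0.05)
       for j in range(5)] for i in range(5)]
fs = [0.9, 0.8, 0.1, 0.2, 0.1]
archive = pareto_sa(r2, fs, k=2)
for subset, score in sorted(archive.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(subset), score)
```

In this toy data, exhaustive enumeration shows two non-dominated subsets: {0, 1}, which has the highest mean functional score but tags only one block, and {0, 3}, which tags every SNP at a lower mean functional score. The search space has only ten subsets, so a run of this length should recover both.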

Conclusion

We applied our system to 34 disease-susceptibility genes for lung cancer, which is one of the most extensively studied cancer types due to its high mortality rate [4]. Our algorithm always finds a collection of Pareto optimal SNP subsets that perform better than the subsets selected by other SNP selection approaches, with respect to both tagging informativeness and functional significance (shown in Figure 2). Moreover, our system improves upon general-purpose search algorithms for identifying Pareto optimal solutions (p-values are 1.37e-004, 3.11e-015, 2.43e-149 and 3.89e-179).

References

1. Kirman AP: Pareto as an economist. The New Palgrave: A Dictionary of Economics 1987, 5: 804–808.
2. Carlson CS, Eberle MA, Rieder MJ, Yi Q, Kruglyak L, Nickerson DA: Selecting a maximally informative set of single-nucleotide polymorphisms for association analyses using linkage disequilibrium. Am J Hum Genet 2004, 74(1): 106–120. 10.1086/381000
3. Lee P, Shatkay H: F-SNP: computationally predicted functional SNPs for disease association studies. Nucleic Acids Res 2008, 36(Database issue): D820–D824.
4. Zhu Y, Hoffman A, Wu X, Zhang H, Zhang Y, Leaderer D, Zheng T: Correlating observed odds ratios from lung cancer case-control studies to SNP functional scores predicted by bioinformatics tools. Mutation Research 2008, 639: 80–88.

Author information

Authors and Affiliations

School of Computing, Queen's University, Kingston, ON, K7L 3N6, Canada
Phil Hyoun Lee, Jae-Yoon Jung & Hagit Shatkay

Rights and permissions

Open Access This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Lee, P.H., Jung, JY. & Shatkay, H. Functionally informative tag SNP selection using a pareto-optimal approach: playing the game of life. BMC Bioinformatics 10 (Suppl 13), O5 (2009). https://doi.org/10.1186/1471-2105-10-S13-O5
