Open Access Highly Accessed Research article

Predicting functionally important SNP classes based on negative selection

Mark A Levenstien and Robert J Klein*

Author Affiliations

1 Program in Cancer Biology and Genetics, Memorial Sloan-Kettering Cancer Center, New York, NY 10065, USA

BMC Bioinformatics 2011, 12:26  doi:10.1186/1471-2105-12-26

Published: 19 January 2011

Abstract

Background

With the advent of cost-effective genotyping technologies, genome-wide association studies allow researchers to examine hundreds of thousands of single nucleotide polymorphisms (SNPs) for association with human disease. Recently, many researchers applying this strategy have detected strong disease associations with SNP markers that are either not in linkage disequilibrium with any nonsynonymous SNP or located at large distances from any annotated gene. In such cases, there is no well-established standard practice for effectively selecting SNPs for follow-up studies. We aim to identify and prioritize groups of SNPs that are more likely to affect phenotypes in order to facilitate efficient SNP selection for follow-up studies.

Results

Based on the annotations available in the Ensembl database, we categorized SNPs in the human genome into classes related to regulatory attributes, such as epigenetic modifications and transcription factor binding sites, in addition to classes related to gene structure and cross-species conservation. Using the distribution of derived allele frequencies (DAF) within each class, we assessed the strength of natural selection for each class relative to the genome as a whole. We applied this DAF analysis genome-wide to SNPs resequenced by Perlegen. Regulatory elements annotated by Ensembl, such as specific histone methylation sites, as well as classes defined by cross-species conservation, showed evidence of negative selection relative to the genome as a whole.
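
The idea behind a DAF comparison can be illustrated with a minimal sketch. The abstract does not state which statistic was used, so the code below is not the authors' method: it assumes, purely for illustration, that a class under negative selection shows an excess of rare derived alleles, and it summarizes that excess as a ratio of low-DAF fractions. The function names, the 0.1 threshold, and the toy frequencies are all hypothetical.

```python
def daf_histogram(dafs, n_bins=10):
    """Return the fraction of SNPs falling in each equal-width DAF bin on [0, 1]."""
    counts = [0] * n_bins
    for f in dafs:
        # A DAF of exactly 1.0 is placed in the last bin.
        idx = min(int(f * n_bins), n_bins - 1)
        counts[idx] += 1
    total = len(dafs)
    return [c / total for c in counts]


def low_daf_fraction(dafs, threshold=0.1):
    """Fraction of SNPs whose derived allele frequency is below the threshold."""
    return sum(1 for f in dafs if f < threshold) / len(dafs)


def low_daf_enrichment(class_dafs, genome_dafs, threshold=0.1):
    """Ratio of the low-DAF fraction in a SNP class to that of the genome-wide set.

    Under the assumption stated above, a ratio above 1 means the class is
    shifted toward rare derived alleles, which is the pattern expected
    under negative (purifying) selection.
    """
    return low_daf_fraction(class_dafs, threshold) / low_daf_fraction(genome_dafs, threshold)


# Toy, made-up frequencies (not data from the study).
class_dafs = [0.02, 0.05, 0.08, 0.5]
genome_dafs = [0.05, 0.3, 0.6, 0.9]
enrichment = low_daf_enrichment(class_dafs, genome_dafs)
print(enrichment)
```

In practice a comparison of this kind would also need a significance test and far larger SNP sets; the ratio here only conveys the direction of the shift.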

Conclusions

These results highlight which annotated classes are under purifying selection, have putative functional importance, and contain SNPs that are strong candidates for follow-up studies after genome-wide association. Such SNP annotation may also be useful in interpreting results of whole-genome sequencing studies.