SNPs in Multi-Species Conserved Sequences (MCS) as useful markers in association studies: a practical approach
1 Center for Human Genetics Research and Department of Molecular Physiology and Biophysics, Vanderbilt University Medical Center, Nashville, TN, USA
2 Genome Technology Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
3 Center for Human Genetics and Department of Medicine, Duke University Medical Center, Durham, NC, USA
4 Department of Neurology, University of California San Francisco, San Francisco, CA, USA
5 Institute of Human Genomics, University of Miami Miller School of Medicine, Miami, FL, USA
BMC Genomics 2007, 8:266 doi:10.1186/1471-2164-8-266
Published: 6 August 2007
Although genes play a key role in many complex diseases, the specific genes involved in most of these diseases remain largely unidentified. Their discovery will hinge on identifying key sequence variants that are conclusively associated with disease. While much attention has focused on variants in protein-coding DNA, variants in noncoding regions may also play important roles in complex disease by altering gene regulation. Because the vast majority of noncoding genomic sequence is of unknown function, identifying "functional" variants that cause disease is all the more challenging. However, evolutionary conservation can serve as a guide to regions of noncoding or coding DNA that are likely to have biological function and thus may be more likely to harbor SNPs with functional consequences. To help bias marker selection in favor of such variants, we devised a process that prioritizes annotated SNPs for genotyping studies based on their location within Multi-species Conserved Sequences (MCSs), and we used this process to select SNPs in a region of linkage to a complex disease. This allowed us to evaluate the utility of the chosen SNPs for further association studies. Previously, a region of chromosome 1q43 was linked to multiple sclerosis (MS) in a genome-wide screen. We chose annotated SNPs in this region based on their location within MCSs (termed MCS-SNPs). We then obtained genotypes for 478 MCS-SNPs in 989 individuals from MS families.
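The core selection step described above, keeping annotated SNPs that lie within conserved intervals, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `snps_in_mcs`, the SNP identifiers, and all coordinates are hypothetical, and the sketch assumes a single chromosome with non-overlapping, half-open intervals.

```python
from bisect import bisect_right

def snps_in_mcs(snps, mcs_intervals):
    """Return the SNPs whose position falls inside any conserved interval.

    snps: list of (snp_id, position) tuples on one chromosome.
    mcs_intervals: list of (start, end) tuples, half-open [start, end),
        assumed non-overlapping (an assumption of this sketch).
    """
    intervals = sorted(mcs_intervals)
    starts = [start for start, _ in intervals]
    selected = []
    for snp_id, pos in snps:
        # Index of the last interval that starts at or before this position.
        i = bisect_right(starts, pos) - 1
        if i >= 0 and pos < intervals[i][1]:
            selected.append((snp_id, pos))
    return selected

# Made-up example data, for illustration only.
snps = [("rsA", 150), ("rsB", 320), ("rsC", 505), ("rsD", 900)]
mcs = [(100, 200), (500, 600)]
mcs_snps = snps_in_mcs(snps, mcs)
print(mcs_snps)
```

With these toy inputs, only the SNPs at positions 150 and 505 are retained, since they are the only ones inside a conserved interval. Sorting the interval starts once and using binary search keeps each lookup logarithmic in the number of intervals, which matters when a linkage region contains many annotated SNPs.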
Analysis of our MCS-SNP genotypes from the 1q43 region and comparison to HapMap data confirmed that annotated SNPs in MCS regions are frequently polymorphic and show subtle signatures of selective pressure, consistent with previous reports of genome-wide variation in conserved regions. We also present an online tool that allows MCS data to be directly exported to the UCSC genome browser so that MCS-SNPs can be easily identified within genomic regions of interest.
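As a rough illustration of what exporting MCS data to the UCSC genome browser could involve, the sketch below writes conserved intervals as a BED-format custom track (tab-separated chromosome, 0-based start, end, and name, preceded by a `track` line). It does not reproduce the online tool mentioned above: the function name `mcs_to_bed_track`, the track name, the feature names, and the coordinates are all assumptions made for this example.

```python
def mcs_to_bed_track(chrom, intervals, track_name="MCS"):
    """Format conserved intervals as BED custom-track text.

    chrom: chromosome name as used by the browser, e.g. "chr1".
    intervals: list of (start, end) tuples, 0-based half-open as in BED.
    track_name: label for the track line (hypothetical default).
    """
    lines = [
        f'track name="{track_name}" '
        'description="Multi-species conserved sequences"'
    ]
    for n, (start, end) in enumerate(sorted(intervals), start=1):
        # Feature names MCS_1, MCS_2, ... are placeholders for this sketch.
        lines.append(f"{chrom}\t{start}\t{end}\tMCS_{n}")
    return "\n".join(lines)

# Made-up coordinates, for illustration only.
bed_text = mcs_to_bed_track("chr1", [(500, 600), (100, 200)])
print(bed_text)
```

Text in this form can be pasted into the browser's custom track page, where the intervals appear alongside the annotated SNP tracks so that SNPs overlapping an MCS can be picked out visually.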
Our results showed that MCS can easily be used to prioritize markers for follow-up and candidate gene association studies. We believe that this novel approach demonstrates a paradigm for expediting the search for genes contributing to complex diseases.