Open Access | Highly Accessed | Research article

CAG-encoded polyglutamine length polymorphism in the human genome

Stefanie L Butland1, Rebecca S Devon2, Yong Huang1, Carri-Lyn Mead1, Alison M Meynert1, Scott J Neal2, Soo Sen Lee1, Anna Wilkinson1, George S Yang3, Macaire MS Yuen1, Michael R Hayden2,4, Robert A Holt3,5, Blair R Leavitt2,4* and BF Francis Ouellette1,4*

1 UBC Bioinformatics Centre, Michael Smith Laboratories, University of British Columbia, Vancouver, Canada

2 Centre for Molecular Medicine and Therapeutics, Child and Family Research Institute, Department of Medical Genetics, University of British Columbia, Vancouver, Canada

3 Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, Canada

4 Department of Medical Genetics, University of British Columbia, Vancouver, Canada

5 Department of Psychiatry, University of British Columbia, Vancouver, Canada

* Contributed equally

BMC Genomics 2007, 8:126. doi:10.1186/1471-2164-8-126

Published: 22 May 2007

Abstract

Background

Expansion of polyglutamine-encoding CAG trinucleotide repeats has been identified as the pathogenic mutation in nine different genes associated with neurodegenerative disorders. The majority of individuals clinically diagnosed with spinocerebellar ataxia do not have mutations within known disease genes, and it is likely that additional ataxias or Huntington disease-like disorders will be found to be caused by this common mutational mechanism. We set out to determine the length distributions of CAG-polyglutamine tracts for the entire human genome in a set of healthy individuals in order to characterize the nature of polyglutamine repeat length variation across the human genome, to establish the background against which pathogenic repeat expansions can be detected, and to prioritize candidate genes for repeat expansion disorders.

Results

We found that repeats, including those in known disease genes, have unique distributions of glutamine tract lengths, as measured by fragment analysis of PCR-amplified repeat regions. This emphasizes the need to characterize each distribution and avoid making generalizations between loci. The best predictors of known disease genes were occurrence of a long CAG-tract uninterrupted by CAA codons in their reference genome sequence, and high glutamine tract length variance in the normal population. We used these parameters to identify eight priority candidate genes for polyglutamine expansion disorders. Twelve CAG-polyglutamine repeats were invariant and these can likely be excluded as candidates. We outline some confusion in the literature about this type of data, difficulties in comparing such data between publications, and its application to studies of disease prevalence in different populations. Analysis of Gene Ontology-based functions of CAG-polyglutamine-containing genes provided a visual framework for interpretation of these genes' functions. All nine known disease genes were involved in DNA-dependent regulation of transcription or in neurogenesis, as were all of the well-characterized priority candidate genes.
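The two predictors named above (a long CAG run uninterrupted by CAA codons in the reference sequence, and high variance in glutamine tract length among normal alleles) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the toy coding sequence, and the allele lengths are all hypothetical, and the use of population variance is an assumption, since the abstract does not state which variance estimator was used.

```python
from statistics import pvariance

GLN_CODONS = ("CAG", "CAA")  # both codons encode glutamine


def longest_glutamine_tract(cds: str) -> str:
    """Return the longest in-frame run of CAG/CAA codons in a coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    best, current = [], []
    for i in range(0, usable, 3):
        codon = cds[i:i + 3]
        if codon in GLN_CODONS:
            current.append(codon)
            if len(current) > len(best):
                best = current[:]
        else:
            current = []
    return "".join(best)


def longest_pure_cag_run(tract: str) -> int:
    """Length, in codons, of the longest CAG run not interrupted by CAA."""
    best = run = 0
    for i in range(0, len(tract), 3):
        run = run + 1 if tract[i:i + 3] == "CAG" else 0
        best = max(best, run)
    return best


def tract_length_variance(allele_lengths) -> float:
    """Population variance of glutamine tract lengths (estimator is an assumption)."""
    return pvariance(allele_lengths)


# Hypothetical reference coding sequence: 5 CAG, one CAA interruption, 3 CAG.
toy_cds = "ATG" + "CAG" * 5 + "CAA" + "CAG" * 3 + "CCG" + "TAA"
tract = longest_glutamine_tract(toy_cds)
tract_codons = len(tract) // 3           # glutamine tract length in codons
pure_run = longest_pure_cag_run(tract)   # longest uninterrupted CAG run

# Hypothetical tract lengths (in glutamines) observed in healthy individuals.
variable_locus = [19, 19, 20, 22, 25]
invariant_locus = [10, 10, 10, 10]
```

In this toy case the glutamine tract spans nine codons but the longest pure CAG run is only five, which shows why the two measures are tracked separately. A locus whose alleles all share one length has zero variance, matching the invariant repeats that the abstract suggests can likely be excluded as candidates.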

Conclusion

This publication makes freely available the normal distributions of CAG-polyglutamine repeats in the human genome. Using these background distributions, against which pathogenic expansions can be identified, we have begun screening for mutations in individuals clinically diagnosed with novel forms of spinocerebellar ataxia or Huntington disease-like disorders who do not have identified mutations within the known disease-associated genes.

