Open Access Research article

Bovine proteins containing poly-glutamine repeats are often polymorphic and enriched for components of transcriptional regulatory complexes

Vicki Whan1, Matthew Hobbs2, Sean McWilliam1, David J Lynn3, Ylva Strandberg Lutzow1, Mehar Khatkar2, William Barendse1, Herman Raadsma2 and Ross L Tellam1*

Author Affiliations

1 CSIRO Livestock Industries, Queensland Bioscience Precinct, 306 Carmody Rd, St Lucia, Queensland 4067, Australia

2 Centre for Advanced Technologies in Animal Genetics and Reproduction (ReproGen), The University of Sydney, PMB3 Camden, NSW 2570, Australia

3 Animal Bioscience Department, Teagasc, Dunsany, County Meath, Ireland


BMC Genomics 2010, 11:654 doi:10.1186/1471-2164-11-654

Published: 23 November 2010



About forty human diseases are caused by repeat instability mutations. A distinct subset of these diseases results from extreme expansions of polymorphic trinucleotide repeats, typically CAG repeats encoding poly-glutamine (poly-Q) tracts in proteins. Polymorphic repeat length variation is also apparent in human poly-Q encoding genes from normal individuals. As these coding sequence repeats are subject to selection in mammals, it has been suggested that normal variations in some of these typically highly conserved genes are implicated in morphological differences between species and phenotypic variations within species. At present, poly-Q encoding genes in non-human mammalian species are poorly documented, as are their functions and propensities for polymorphic variation.


The current investigation identified 178 bovine poly-Q encoding genes (Q ≥ 5) and, within this group, 26 genes whose orthologs in both human and mouse did not contain poly-Q repeats. The bovine poly-Q encoding genes typically had ubiquitous expression patterns, although there was a bias towards expression in epithelia, brain and testes. They were also unusually large. Analysis of gene ontology terms revealed that the encoded proteins were strongly enriched for functions associated with transcriptional regulation, and many contributed to physical interaction networks in the nucleus, where they presumably act cooperatively in transcriptional regulatory complexes. In addition, the coding sequence CAG repeats in some bovine genes affected mRNA splicing, thereby generating unusual transcriptional diversity, which in at least one instance was tissue-specific. The poly-Q encoding genes were prioritised, using multiple criteria, for their likelihood of being polymorphic, and the highest ranking group was then experimentally tested for polymorphic variation within a cattle diversity panel. Extensive and meiotically stable variation was identified.
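As an illustration of the Q ≥ 5 criterion only, the following minimal sketch scans a protein sequence for uninterrupted runs of at least five glutamine residues. It is not the pipeline used in the study, which this abstract does not describe. The function name `find_polyq_tracts`, the requirement that runs be uninterrupted, and the example sequence are assumptions made for illustration.

```python
import re


def find_polyq_tracts(protein_seq, min_len=5):
    """Return (start, length) for each uninterrupted run of Q residues.

    Hypothetical helper: `min_len=5` mirrors the Q >= 5 threshold quoted
    in the abstract; treating only uninterrupted runs as tracts is an
    assumption of this sketch.
    """
    # Q{5,} matches five or more consecutive glutamine (Q) residues.
    pattern = re.compile(r"Q{%d,}" % min_len)
    return [(m.start(), len(m.group()))
            for m in pattern.finditer(protein_seq.upper())]


if __name__ == "__main__":
    # Made-up sequence: a 7-residue tract, a 3-residue run below the
    # threshold, and a 5-residue tract.
    example = "MSTQQQQQQQAPLQQQKQQQQQG"
    print(find_polyq_tracts(example))  # [(3, 7), (17, 5)]
```

Positions are zero-based offsets into the protein sequence. A real screen would run such a function over every predicted protein in a genome build and would need to decide how to handle tracts interrupted by single non-Q residues.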


Transcriptional diversity can potentially be generated in poly-Q encoding genes by the impact of CAG repeat tracts on mRNA alternative splicing. This effect, combined with the physical interactions of the encoded proteins in large transcriptional regulatory complexes, suggests that polymorphic variations of proteins in these complexes have strong potential to affect phenotype.