Chromosome 17q21.31 contains a common inversion polymorphism of approximately 900 kb in populations with European ancestry. Two divergent MAPT haplotypes, H1 and H2, have been described, with distinct linkage disequilibrium patterns across the region that reflect the inversion status at this locus. The MAPT H1 haplotype has been associated with progressive supranuclear palsy, corticobasal degeneration, Parkinson’s disease and Alzheimer’s disease, while the H2 haplotype is linked to recurrent deletion events associated with the 17q21.31 microdeletion syndrome, a disease characterized by developmental delay and learning disability.
In this study, we investigate the effect of the inversion on the expression of genes in the 17q21.31 region. We find that the expression of several genes within and at the borders of the inversion is affected, with effects specific either to whole blood or to different regions of the human brain. The H1 haplotype was associated with increased expression of LRRC37A4, PLEKHM1 and MAPT. In contrast, decreased expression of MGC57346, LRRC37A and CRHR1 was associated with H1.
Studies thus far have focused on the expression of MAPT in the inversion region. However, our results show that the inversion status affects expression of other genes in the 17q21.31 region as well. Given the link between the inversion status and different neurological diseases, these genes may also be involved in disease pathology, possibly in a tissue-specific manner.
The chromosomal band 17q21.31 contains a common inversion polymorphism linked with neurodegenerative diseases including progressive supranuclear palsy, corticobasal degeneration, Parkinson’s disease, and Alzheimer’s disease. This inversion of approximately 900 kb is mostly present in populations with European ancestry (Figure 1) [5,6]. The region contains a number of genes, including corticotrophin releasing hormone receptor 1 (CRHR1) and microtubule-associated protein tau (MAPT). Two divergent MAPT haplotypes, H1 and H2, have been described, with distinct linkage disequilibrium patterns across the region that reflect the inversion status at this locus. The H2 haplotype is inverted and is relatively common in Europeans (~20%) but almost absent in African and Asian populations. This configuration is associated with fecundity and appears to be under positive selection in European populations.
Figure 1. 17q21.31 inversion region. Schematic overview of the 17q21.31 region in the direct orientation corresponding with the H1 haplotype, together with the cis-region of 1 Mb on each side (Chr17:39,899,921-42,989,253; UCSC browser, March 2006 assembly, http://genome.ucsc.edu). The pink segment shows the 40,850,001-41,850,000 region. Expression of genes colored red is increased in the H1/H1 haplotype (LRRC37A4 in blood, PLEKHM1 and MAPT in one or more brain regions), and expression of those in green is decreased (MGC57346 in blood, CRHR1 and LRRC37A in one or more brain regions).
Specific H1 haplotypes are associated with the neurodegenerative disorders, whereas the H2 haplotype is linked to recurrent deletion events resulting in the 17q21.31 microdeletion syndrome, characterized by developmental delay and learning disability [7,8]. The H1 haplotype is linked with an increased expression of MAPT, resulting in overproduction and aggregation of hyperphosphorylated protein Tau in neuronal cell bodies, which is linked to disease pathology of a number of neurodegenerative disorders [9-12]. However, little is known about the regulation of expression levels of other genes in the region.
In this study we used single nucleotide polymorphism (SNP) genotype data to reconstruct inversion haplotypes in the chromosome 17q21.31 region and studied the effect of the inversion status on gene expression of all known genes in the region in whole blood and different regions of the human brain.
Regulation of expression in whole blood
A principal component analysis (PCA) was applied to 38 SNPs in the 40,850,001-41,850,000 region on chromosome 17. The first principal component (PC1) represents the 17q21.31 inversion genotypes homozygous H1, heterozygous H1/H2 and homozygous H2/H2. Of the 437 individuals, 6 were excluded because of ambiguity in master genotype call (>3SD from mean PC1 value of genotype cluster). This resulted in three distinct clusters of individuals, representing H1/H1 (252 individuals, 59%), H1/H2 (160 individuals, 37%) and H2/H2 (19 individuals, 4%) genotypes (Hardy Weinberg p = 0.31), depicted in Figure 2. For 22 of the 56 genes in the region, 28 gene expression probes were available in the blood dataset (full list in Additional file 1: Table S1).
Additional file 1. Table S1. This table contains a list of genes in the 17q21.31 region and the expression probes for these genes in both datasets. ProbeIDs colored red are expressed above detection level in the corresponding dataset. Black probes represent probes available on the array, but not detectable in the tissue studied.
Figure 2. The 17q21.31 inversion haplotypes in the whole blood dataset. Scatterplot of PC1 and PC2 values in the 17q21.31 region (constructed using 38 SNPs in the 40,850,001-41,850,000 interval). Individuals fall into three clusters, representing H1/H1 (blue), H1/H2 (green) and H2/H2 (orange).
A linear regression analysis showed a positive association of LRRC37A4 (B = 0.37, p = 1.4 × 10⁻⁵⁵, Figure 3) with the number of H1 alleles. In contrast, MGC57346 was negatively associated with the H1/H1 genotype (B = −0.19, p = 1.2 × 10⁻¹⁶, Figure 3). Results are given in Table 1. We did not detect expression of the MAPT gene in blood.
Figure 3. Expression changes associated with the 17q21.31 inversion haplotypes in the whole blood dataset. Barplot shows the mean log2 expression values (error bars represent standard deviation) of LRRC37A4 and MGC57346 for H1/H1, H1/H2 and H2/H2 genotypes.
Table 1. Expression probes associated with 17q21.31 inversion region in different datasets
Regulation of expression in human brain
We consulted a publicly available human brain expression dataset consisting of frontal cortex, temporal cortex, cerebellum and pons of 144 individuals. Master genotypes of the chromosome 17q21.31 inversion were reconstructed using PC1 values of 60 SNPs in the 40,850,001-41,850,000 interval. This resulted in three distinct clusters of individuals, representing H1/H1 (77 individuals, 53%), H1/H2 (54 individuals, 38%) and H2/H2 (13 individuals, 9%) genotypes (Hardy Weinberg p = 0.43).
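The Hardy Weinberg p-values quoted for the genotype clusters can be checked from the reported counts alone. The text does not state which test was used; the sketch below assumes a one-degree-of-freedom chi-square goodness-of-fit test, and the function name is illustrative only. Under that assumption, the whole blood counts (252, 160, 19) and the brain counts (77, 54, 13) give values that round to the reported 0.31 and 0.43.

```python
import math


def hwe_chi2_pvalue(n_h1h1, n_h1h2, n_h2h2):
    """Chi-square (1 df) test of Hardy Weinberg equilibrium for a biallelic locus.

    Assumption: the paper does not name the test; this is one common choice.
    """
    n = n_h1h1 + n_h1h2 + n_h2h2
    # Frequency of the H1 allele estimated from the genotype counts
    p = (2 * n_h1h1 + n_h1h2) / (2 * n)
    q = 1.0 - p
    observed = (n_h1h1, n_h1h2, n_h2h2)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 df
    return math.erfc(math.sqrt(chi2 / 2.0))


p_blood = hwe_chi2_pvalue(252, 160, 19)  # whole blood clusters
p_brain = hwe_chi2_pvalue(77, 54, 13)    # brain clusters
```

An exact test would be an equally reasonable choice for the small H2/H2 cluster and may give slightly different values.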
For 36 of the 56 genes in the inversion region, 40 expression probes were available in every brain area (full list in Additional file 1: Table S1). A linear regression analysis of allele dosage was performed for each brain area separately (Table 1). In line with the literature, we found higher expression of MAPT to be associated with the H1/H1 genotype in frontal cortex (B = 0.34, p = 4.8 × 10⁻⁶) and cerebellum (B = 0.36, p = 1.6 × 10⁻⁸). In addition, the H1/H1 genotype was associated with increased expression of PLEKHM1 in cerebellum (B = 0.18, p = 1.9 × 10⁻⁶). In contrast, lower expression of CRHR1 was associated with this genotype in cerebellum (B = −0.89, p = 5.2 × 10⁻⁵); CRHR1 expression was not detected in the other brain regions. Finally, decreased expression of LRRC37A was associated with the H1/H1 genotype in frontal cortex (B = −0.25, p = 7.4 × 10⁻¹³), temporal cortex (B = −0.29, p = 6.9 × 10⁻¹⁶) and pons (B = −0.17, p = 1.5 × 10⁻⁸).
No polymorphic SNPs were detected in the probe sequences that could have confounded the hybridization signal. When aligning the LRRC37A probe sequence (50 nucleotides, identical on the H-12.v3 and H-8.v2 Illumina beadarrays) to RefSeq RNA sequences (NCBI BLAST; http://blast.ncbi.nlm.nih.gov/Blast.cgi), we found that this probe aligns significantly not only with LRRC37A (100%), but also with LRRC37A2 (100%), LRRC37A3 (100%) and LRRC37A4 (94%). Therefore, the strong association may be the result of non-specific binding to more than one target gene in this gene family. Alignment of all other significant probe sequences, including that of LRRC37A4, did not suggest non-specific binding.
The chromosome 17q21.31 inversion of the MAPT (microtubule-associated protein Tau) locus represents one of the most structurally complex and evolutionarily dynamic regions of the genome. The distinct clades of haplotypes (H1 and H2) represent the direct and inverted orientation of the inversion, each with different functional impacts. Specific H1 haplotypes are associated with neurodegenerative disorders such as progressive supranuclear palsy and Parkinson’s disease, whereas the H2 haplotype is associated with recurrent microdeletions resulting in the 17q21.31 microdeletion syndrome [7,8]. Neurodegenerative diseases associated with the H1 haplotype exhibit aggregation of hyperphosphorylated protein Tau in neuronal cell bodies [2,12].
Gene expression differences have been described for MAPT, but there has been no systematic approach to study the effect of inversion status on expression of the other genes at this locus. We used principal component analysis to identify inversion haplotypes at chromosome 17q21.31, and observed that the effect of inversion status is not limited to MAPT expression levels, but also affects several other genes in the 17q21.31 region. In line with the literature, we found increased expression of MAPT to be associated with the number of H1 alleles in brain. However, we only observed this in frontal cortex and cerebellum, suggesting that regulation of this gene may differ between brain regions. A previous study identified a specific sequence variant in MAPT (htSNP167/rs242557) in the 17q21.31 region on the H1 haplotype regulating the expression of MAPT in neuronal and non-neuronal cell lines. In this study we focused on the effect of the entire inversion on gene expression, using a robust principal component analysis strategy. Genotype data for this particular SNP were available in the brain dataset, but the SNP was not significantly associated with gene expression values.
Importantly, we observed that genes other than MAPT are functionally regulated by the inversion haplotypes as well and may therefore be of importance in diseases associated with the inversion region.
The expression of CRHR1 (corticotrophin releasing hormone receptor 1) is significantly decreased in the H1/H1 haplotype and that of PLEKHM1 (pleckstrin homology domain containing, family M (with RUN domain) member 1) increased. These associations were found in cerebellum only. The PLEKHM1 gene is involved in osteopetrosis by affecting vesicular transport and therefore osteoclast-osteoblast cross-talk. Currently there is no functional data available on MGC57346 (hypothetical protein LOC401884), which was found to be differentially expressed due to inversion status in whole blood. MGC57346 consists of 4 exons, of which 3 are shared with the long isoform of CRHR1 (Figure 1). It is therefore possible that these are different splice forms of a single gene, which would suggest that the concurrent association findings with inversion status represent a single event. The fact that the directionality of MGC57346 expression in blood and CRHR1 expression in cerebellum is the same in this study supports this view. The CRHR1 gene is a critical part of the hypothalamic-pituitary-adrenal (HPA) axis that mediates the stress response and has been implicated in the pathophysiology of stress-related psychiatric disorders. Of the two receptors in this system (CRHR1 and CRHR2), overactivity of CRHR1 in anxiety and depression has been a consistent finding in animal studies. In humans, there is evidence for an interaction of CRHR1 function and stressful life events on vulnerability to depression and alcoholism through regulation of the HPA axis and possibly additional interaction with serotonin transporter loci. In addition, multiple sclerosis (MS) has been associated with HPA-axis activity, specifically with genetic variants in CRHR1. We find increased expression of CRHR1 associated with the H2 configuration, suggesting that CRHR1 activity and/or stress response might also be altered in or contributing to the H2-related phenotypes such as developmental delay and learning disability.
There is no functional data available for leucine rich repeat containing 37, member A4 (LRRC37A4), expressed in whole blood, or for LRRC37A (leucine rich repeat containing 37A), expressed in brain. For both genes we observed a significant association with inversion status, but with opposite effects. It is important to note that the LRRC37 gene family is located at both inversion breakpoints and is therefore likely to be affected by copy number variations that are associated with 17q21.31 inversion status (Figure 1). Of the LRRC37 family, member A4 (LRRC37A4) in particular has been shown to be the most variable in copy number. For these reasons, the association between inversion status and gene expression levels of these genes could be entirely due to differences in copy number linked to the H1 and H2 haplotypes.
A recent study finds a strong association between germline hypomethylation and genomic instability, describing that DNA methylation deserts are highly enriched for structural rearrangements. The authors report that rare CNVs that are associated with several neuropsychiatric disorders are significantly linked with local hypomethylation. In fact, germline hypomethylation seems to play a more important role in chromosomal rearrangement than the presence of segmental duplications. Future studies should address whether inversion status of the 17q21.31 region can be linked to (large-scale) changes in epigenetic tags.
In conclusion, our results indicate that the chromosome 17q21.31 inversion polymorphism associated with several neurodegenerative disorders affects the expression of multiple genes besides MAPT in a tissue-specific manner. These other genes may therefore also play a role in the pathophysiology of these neurodegenerative disorders.
Whole blood dataset
The whole blood dataset consisting of 437 healthy controls is described elsewhere [21,22]. In short, this dataset consists of 244 males and 193 females with a mean age of 62 years, who were recruited as controls in a genetic study of amyotrophic lateral sclerosis. These control subjects were selected for being in good general health and unaffected by neurological and neurodegenerative diseases. Genotypes were generated on the Illumina 370 k chip according to the manufacturer’s protocol at deCODE genetics in Iceland. QC included missing genotypes per individual <0.05, genotype rate per SNP >0.05, MAF >0.05, Hardy Weinberg p<0.00001. Gene expression data for this control dataset were generated using the Illumina H-12.v3 beadarray and quantile normalized and log2 transformed using the PreprocessCore package in R. Expression probes were filtered for a mean detection value > 0.90.
We consulted a publicly available brain expression dataset of 144 individuals. Gene expression data were generated on Illumina H-8.v2 beadarrays (GEO; GSE15745) and genotype data on Illumina 550 k chips (dbGAP; phs000249.v1.p1). Data are available for four different brain regions: cerebellum, frontal cortex, temporal cortex and pons. Data were log2 transformed and quantile normalized using the Lumi package for R. Expression data were filtered for each brain region separately with a detection p-value threshold of 0.01.
Principal component analysis
We applied a principal component analysis (PCA) using SNP data to reconstruct the inversion genotypes homozygous H1, heterozygous H1/H2, and homozygous H2. The first two principal components (PC1 and PC2) were calculated with genotypes numerically encoded as m/m = 0, m/M = 1, M/M = 2, where m and M are minor and major alleles, respectively, in the 17q21.31 region; 40,850,001-41,850,000 (corresponding to coordinates in UCSC assembly March 2006; http://genome.ucsc.edu). We centered and normalized the genotype matrix and used it for PCA as described by Price et al.
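The PCA step can be illustrated with a minimal sketch. It assumes NumPy is available and that the normalization follows the scheme of Price et al. as commonly described (each SNP centered on its mean and scaled by the square root of p(1 − p), with p a smoothed allele frequency estimate); the function name and the individuals-by-SNPs matrix layout are choices made here for illustration, not details taken from the original analysis.

```python
import numpy as np


def first_principal_component(genotypes):
    """Return PC1 scores for a genotype matrix coded 0/1/2.

    genotypes: array of shape (n_individuals, n_snps), each entry the
    number of major alleles (m/m = 0, m/M = 1, M/M = 2).
    Assumption: no missing genotypes; real data would need imputation
    or masking before this step.
    """
    g = np.asarray(genotypes, dtype=float)
    n_individuals = g.shape[0]
    # Smoothed allele frequency per SNP, which avoids division by zero
    # for monomorphic SNPs
    p = (1.0 + g.sum(axis=0)) / (2.0 + 2.0 * n_individuals)
    # Center each SNP on its mean and scale by sqrt(p * (1 - p))
    z = (g - g.mean(axis=0)) / np.sqrt(p * (1.0 - p))
    # Singular value decomposition; the left singular vectors scaled by
    # the singular values are the principal component scores
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    return u[:, 0] * s[0]
```

When the SNPs in the interval are in strong linkage disequilibrium with the inversion, the PC1 scores are expected to fall into three groups corresponding to zero, one or two copies of H2. The sign of PC1 is arbitrary, so the group order has to be fixed by reference to known H1- or H2-tagging alleles.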
A linear regression analysis of allele dosage (H1/H1 = 2, H1/H2 = 1 and H2/H2 = 0) was performed on transcripts in the 17q21.31 cis-region, defined as 1 Mb on either side of the inversion (39,899,921-42,989,253, Figure 1) and containing 56 genes (RefSeq genes from UCSC browser, March 2006 assembly; http://genome.ucsc.edu) listed in Additional file 1: Table S1. In the whole blood dataset, age and gender were taken as covariates. In brain, covariates also included post-mortem interval, brain bank and hybridization batch. We report the unadjusted p-value and B value, the latter indicating the change in expression associated with each copy of the H1 allele (0, 1 or 2). Bonferroni correction was applied to determine significance thresholds for the number of probes tested; p < 0.05/28 = 0.0018 for the whole blood dataset and p < 0.05/40 = 0.00125 for each brain region, assuming independence between regions.
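The allele dosage model can be sketched as an ordinary least squares fit in which expression is regressed on the number of H1 alleles plus covariates. The code below assumes NumPy; the function names are hypothetical, and categorical covariates such as brain bank or hybridization batch would need to be dummy coded before being passed in. It returns B, its standard error and the t statistic; a p-value would follow from the t distribution with the residual degrees of freedom.

```python
import numpy as np


def allele_dosage_regression(expression, h1_dosage, covariates):
    """Regress log2 expression on H1 dosage (0, 1 or 2) and covariates.

    Returns (B, standard error of B, t statistic) for the dosage term.
    """
    y = np.asarray(expression, dtype=float)
    dosage = np.asarray(h1_dosage, dtype=float)
    cov = np.asarray(covariates, dtype=float).reshape(len(y), -1)
    # Design matrix: intercept, H1 dosage, then the covariates
    x = np.column_stack([np.ones(len(y)), dosage, cov])
    beta, *_ = np.linalg.lstsq(x, y, rcond=None)
    residuals = y - x @ beta
    dof = len(y) - x.shape[1]
    sigma2 = residuals @ residuals / dof
    # Standard error of the dosage coefficient (column index 1)
    se = np.sqrt(sigma2 * np.linalg.inv(x.T @ x)[1, 1])
    return beta[1], se, beta[1] / se


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


blood_threshold = bonferroni_threshold(0.05, 28)  # 28 probes in whole blood
brain_threshold = bonferroni_threshold(0.05, 40)  # 40 probes per brain region
```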
Significant expression probes were subsequently tested for common polymorphic SNPs in the 50-mer probe sequence based on genomic location (provided by Illumina) and HapMap SNPs release 27. The threshold for common SNPs was a minor allele frequency (MAF) >1%.
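The probe check amounts to an interval lookup: a SNP is flagged when it lies within the genomic span of the probe and its MAF exceeds 1%. The following sketch uses a hypothetical tuple layout for the SNP records and assumes probe and SNP coordinates are on the same chromosome and genome build.

```python
def snps_in_probe(probe_start, probe_end, snps, maf_threshold=0.01):
    """Return common SNPs that fall inside a probe's genomic interval.

    snps: iterable of (snp_id, position, maf) tuples (hypothetical layout).
    Coordinates are treated as inclusive at both ends.
    """
    return [
        (snp_id, position, maf)
        for snp_id, position, maf in snps
        if probe_start <= position <= probe_end and maf > maf_threshold
    ]
```

A probe with an empty result under this filter contains no common SNP that could bias hybridization between haplotypes.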
The authors declare that they have no competing interests.
SDJ performed statistical analyses and drafted the manuscript. IC generated inversion haplotypes. EJ and ES provided technical support. LHvdB and JHV prepared datasets for analysis. RAO conceived of the study, participated in its design and helped to draft the manuscript. All authors read and approved the final manuscript.
The authors thank Carolien de Kovel for statistical support. This work was supported by funding from the US National Institutes of Health MH078075 and NS058980 (RAO) and the Amyotrophic Lateral Sclerosis Association (RAO and LHvdB).
Cruts M, Rademakers R, Gijselinck I, van der Zee J, Dermaut B, de Pooter T, de Rijk P, Del-Favero J, van Broeckhoven C: Genomic architecture of human 17q21 linked to frontotemporal dementia uncovers a highly homologous family of low-copy repeats in the tau region.
Koolen DA, Vissers LE, Pfundt R, de Leeuw N, Knight SJ, Regan R, Kooy RF, Reyniers E, Romano C, Fichera M, et al.: A new chromosome 17q21.31 microdeletion syndrome associated with a common inversion polymorphism.
Myers AJ, Pittman AM, Zhao AS, Rohrer K, Kaleem M, Marlowe L, Lees A, Leung D, McKeith IG, Perry RH, et al.: The MAPT H1c risk haplotype is associated with increased expression of tau and especially of 4 repeat containing transcripts.
Rademakers R, Melquist S, Cruts M, Theuns J, Del-Favero J, Poorkaj P, Baker M, Sleegers K, Crook R, De Pooter T, et al.: High-density SNP haplotyping suggests altered regulation of tau gene expression in progressive supranuclear palsy.
Gibbs JR, van der Brug MP, Hernandez DG, Traynor BJ, Nalls MA, Lai SL, Arepalli S, Dillman A, Rafferty IP, Troncoso J, et al.: Abundant quantitative trait loci exist for DNA methylation and gene expression in human brain.
Del Fattore A, Fornari R, Van Wesenbeeck L, de Freitas F, Timmermans JP, Peruzzi B, Cappariello A, Rucci N, Spera G, Helfrich MH, et al.: A new heterozygous mutation (R714C) of the osteopetrosis gene, pleckstrin homolog domain containing family M (with run domain) member 1 (PLEKHM1), impairs vesicular acidification and increases TRACP secretion in osteoclasts.
Ressler KJ, Bradley B, Mercer KB, Deveau TC, Smith AK, Gillespie CF, Nemeroff CB, Cubells JF, Binder EB: Polymorphisms in CRHR1 and the serotonin transporter loci: gene x gene x environment interactions on depressive symptoms.
Briggs FB, Bartlett SE, Goldstein BA, Wang J, McCauley JL, Zuvich RL, De Jager PL, Rioux JD, Ivinson AJ, Compston A, et al.: Evidence for CRHR1 in multiple sclerosis using supervised machine learning and meta-analysis in 12,566 individuals.
Li J, Harris RA, Cheung SW, Coarfa C, Jeong M, Goodell MA, White LD, Patel A, Kang SH, Shaw C, et al.: Genomic hypomethylation in the human germline associates with selective structural mutability in the human genome.
Saris CG, Horvath S, van Vught PW, van Es MA, Blauw HM, Fuller TF, Langfelder P, DeYoung J, Wokke JH, Veldink JH, van den Berg LH, Ophoff RA: Weighted gene co-expression network analysis of the peripheral blood from Amyotrophic Lateral Sclerosis patients.
Dubois PC, Trynka G, Franke L, Hunt KA, Romanos J, Curtotti A, Zhernakova A, Heap GA, Ádány R, Aromaa A, Bardella MT, van den Berg LH, Bockett NA, de la Concha EG, Dema B, Fehrmann RS, Fernández-Arquero M, Fiatal S, Grandone E, Green PM, Groen HJ, Gwilliam R, Houwen RH, Hunt SE, Kaukinen K, Kelleher D, Korponay-Szabo I, Kurppa K, MacMathuna P, Mäki M, et al.: Multiple common variants for celiac disease influencing immune gene expression.
Nat Genet 2010, 42(4):295-302.
Epub 2010 Feb 28.