Open Access Highly Accessed Open Badges Research article

Shared ancestral susceptibility to colorectal cancer and other nutrition related diseases

Stefanie Huhn1*, Melanie Bevier1, Anja Rudolph2, Barbara Pardini3, Alessio Naccarati3, Rebecca Hein28, Michael Hoffmeister4, Ludmila Vodickova35, Jan Novotny6, Hermann Brenner4, Jenny Chang-Claude2, Kari Hemminki17, Pavel Vodicka35 and Asta Försti17

Author Affiliations

1 Department of Molecular Genetic Epidemiology, German Cancer Research Center DKFZ, Heidelberg 69120, Germany

2 Division of Cancer Epidemiology, German Cancer Research Center DKFZ, Heidelberg 69120, Germany

3 Institute of Experimental Medicine, Academy of Sciences of the Czech Republic, Prague, 14200, Czech Republic

4 Division of Clinical Epidemiology and Aging Research, German Cancer Research Center DKFZ, Heidelberg, 69120, Germany

5 Institute of Biology and Medical Genetics, 1st Faculty of Medicine, Charles University, Prague 2, 12000, Czech Republic

6 Department of Oncology, General Teaching Hospital, Prague 2, 12808, Czech Republic

7 Center of Primary Health Care Research, Clinical Research Center, Lund University, Malmö, SE-20502, Sweden

8 PMV Forschungsgruppe, University of Cologne, Cologne, 50931, Germany

For all author emails, please log on.

BMC Medical Genetics 2012, 13:94  doi:10.1186/1471-2350-13-94

The electronic version of this article is the complete one and can be found online at:

Received:28 February 2012
Accepted:28 September 2012
Published:5 October 2012

© 2012 Huhn et al.; licensee BioMed Central Ltd.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.



The majority of non-syndromic colorectal cancers (CRCs) can be described as a complex disease. A two-stage case–control study on CRC susceptibility was conducted to assess the influence of the ancestral alleles in the polymorphisms previously associated with nutrition-related complex diseases.


In stage I, 28 single nucleotide polymorphisms (SNPs) were genotyped in a hospital-based Czech population (1025 CRC cases, 787 controls) using an allele-specific PCR-based genotyping system (KASPar®). In stage II, replication was carried out for the five SNPs with the lowest p values. The replication set consisted of 1798 CRC cases and 1810 controls from a population-based German study (DACHS). Odds ratios (ORs) and 95% confidence intervals (CIs) for associations between genotypes and CRC risk were estimated using logistic regression. To identify signatures of selection, Fay-Wu’s H and Integrated Haplotype Score (iHS) were estimated.


In the Czech population, carriers of the ancestral alleles of AGT rs699 and CYP3A7 rs10211 showed an increased risk of CRC (OR 1.26 and 1.38, respectively; two-sided p≤0.05), whereas carriers of the ancestral allele of ENPP1 rs1044498 had a decreased risk (OR 0.79; p≤0.05). For rs1044498, the strongest association was detected in the Czech male subpopulation (OR 0.61; p=0.0015). The associations were not replicated in the German population. Signatures of selection were found for all three analyzed genes.


Our study showed evidence of association for the ancestral alleles of polymorphisms in AGT and CYP3A7 and for the derived allele of a polymorphism in ENPP1 with an increased risk of CRC in Czechs, but not in Germans. The ancestral alleles of these SNPs have previously been associated with nutrition-related diseases hypertension (AGT and CYP3A7) and insulin resistance (ENPP1). Future studies may shed light on the complex genetic and environmental interactions between different types of nutrition-related diseases.

Colorectal cancer; Nutrition; Complex diseases


Colorectal cancer (CRC; OMIM ID: 114500) is among the most common cancers in industrialized countries and one of the leading causes of cancer-related mortality [1]. The incidence rates vary among different groups and populations depending on sex, age and country, with higher rates among males than females and increasing with age [2]. The differences in the incidence rates across the globe are mainly attributed to differences in diet and other environmental factors.

In both sporadic and familial CRC, genes and environment together contribute to the risk of CRC. The majority of non-syndromic CRCs can be characterized as a complex disease [3]. The major factors that modify CRC risk are obesity, diabetes, red meat consumption, physical inactivity, alcohol consumption, chronic inflammation, and cigarette smoking [4-6]. Additionally, the intake of vitamin D, calcium, fruit and vegetables may potentially influence the risk of CRC [5]. Gene-environment interactions also underlie other complex diseases, such as obesity (OMIM ID: 601665) and diabetes mellitus type II (T2D; OMIM ID: 125853).

A specific feature of CRC and other complex diseases is that they are mainly diseases of humans living in industrialized societies, in environments with almost unlimited food supply and low energy expenditure [5,7]. Nutrition is one of the most important environmental traits influencing the fitness of an individual. In the past centuries, the genetic constitution of an individual was supposed to optimize food utilization in order to protect against malnutrition. In modern societies, the ancestral genetic constitution might not be beneficial anymore, because it does not protect against the relatively new condition of overnutrition. Therefore, variants that protect against overnutrition and related health issues are supposed to be rare [7]. Nevertheless, genetic variants that promote a carbohydrate-based nutrition as well as genetic variants that show ancestral susceptibility to a nutrition-related disease have already been described (inter alia [8-10]). Signatures of such processes can be detected in the human genome using genome-wide approaches that evaluate differences in the world-wide allele frequencies and haplotype distribution (inter alia [11-15]).

Based on the interplay between genetic and environmental risk factors in nutrition-related complex diseases we posed the following hypothesis: “Polymorphisms with ancestral alleles associating with a nutrition-related disease and showing signatures of recent selection may be associated with CRC risk.” We outline here the methods used for selecting such candidate genes and show the results when the selected variants were tested for CRC risk in two large case–control studies.


Candidate SNP selection for the case–control study

The study focused on SNPs, for which ancestral alleles have previously been associated with nutrition-related complex diseases other than CRC, such as obesity, T2D and metabolic syndrome. Information about such SNPs was collected from 30 published reports by browsing the PubMed database ( webcite) [16] for the keywords “diabetes”, “obesity”, “metabolic syndrome” (OMIM ID: 605552) and “hypertension” (OMIM ID: 145500) up to 06/2009. Most of the articles were based on genome-wide association studies or were meta-analyses. A complete list of the publications can be found at the reference list of the 1.

Additional file 1. Shared Ancestral Susceptibility to Colorectal Cancer.pdf. Title and description of the dataset: Table I. Information about genes and SNPs considered as candidates for the case–control study.

Format: PDF Size: 128KB Download file

This file can be viewed with: Adobe Acrobat ReaderOpen Data

From these 30 reports, associations with the risk of the diseases and with the related quantitative traits were retrieved. The quantitative traits for diabetes were fasting glucose level and insulin resistance. For obesity, the traits were body mass index (BMI) and waist to hip ratio. The quantitative traits for hypertension and the metabolic syndrome were high-density lipoprotein (HDL) level, low-density lipoprotein (LDL) level, triglycerides level, salt sensitivity, blood pressure and insulin resistance. A complete list of the reported associations can be found in the Additional file 1.

The candidate SNP selection for the association study took place in three major phases (Figure 1):

thumbnailFigure 1. Workflow for the selection of candidate SNPs for the ancestral susceptibility project.

(1) “Selection of Candidate SNPs”: All published SNPs were evaluated for the nature of the risk allele – either ancestral (A) or derived (D) - and the allele frequency differences between African, European and Asian populations (YRI: Sub-Saharan African population, Yoruba in Ibadan, Nigeria; CEU: Caucasian population, Utah residents with Northern and Western European ancestry from the CEPH collection; HCB: East Asian population, Han Chinese in Beijing, China; JPT: East Asian population, Japanese in Tokyo, Japan). An allele was considered a “risk allele” when it was associated with a significantly increased risk of a key-disease (OR>1; statistical significance based on the criteria of the original publication), or when it was associated with a significant increase of quantitative values in the original publication. The nature of the risk allele was determined by using the NCBI database ( webcite) [16].

The reported ancestral susceptibility SNPs that showed an absolute allele frequency difference of >45% between the African and any non-African population were chosen for further investigation. The threshold value of 45% was set to detect variants with a “major-to-minor” allele change between populations, thus indicating a possible influence of selective pressure. A second, lower threshold (25%) was set for the difference between the YRI and the CEU population to acknowledge the more recent separation of the European than the East Asian population from the African population.

(2) “Candidate Gene Definition”: The SNPs that passed the first selection criteria were evaluated for their location in the genome, possible functional effects, linkage disequilibrium (LD) with other polymorphisms within the gene region and the number of candidate SNPs in the gene region.

(3) “Tagging SNP Approach”: In addition to the evaluation of the reported ancestral susceptibility SNPs, a tagging SNP approach was carried out for each candidate gene or gene region using the genotyping data of the CEU population and HaploView© software [17]. Next to a minor allele frequency (MAF) of ≥5%, a tagging SNP had to feature the following parameters:

– be or capture a phase 1 SNP and/or

– be or capture a functional polymorphism and/or

– capture a maximal number of SNPs within a candidate gene or gene region with >25% allele frequency difference between the YRI and the CEU population

In the majority of cases, the reported ancestral susceptibility SNP itself was genotyped. When that was not possible (e.g. because the assay design failed due to the structure of the surrounding sequence) another SNP that was in LD with the reported SNP (r2>0.9) was selected for genotyping in order to indirectly gain information about the reported ancestral susceptibility SNP. For large, diverse genes/gene clusters, additional tagging SNPs were selected in order to gain more knowledge about the genes. These tagging SNPs should also fulfil the criterion of >25% allele frequency difference between the YRI and CEU population (Table 1).

Table 1. Information about genes and SNPs selected for the SNP association study

Allele frequencies were obtained from the NCBI database for the submitter population IDs: HapMap -CEU, -YRI, -HCB and -JPT (NCBI dbSNP Build 130; webcite) [16]. LD/r2 was obtained from HapMap using the HapMap3 - and HapMart Genome Browser (Release #2, Phase 3; webcite) [28,29] and HaploView© software (Version 3, Release 2, Analyse Panel CEU) [17] with the implemented Tagger tagSNP selection algorithm [30]. The data about possible functionality originate from the NCBI database ( webcite) [16] and the PolyPhen ( webcite) [inter alia [31].

Study populations

First, a hospital-based study population from the Czech Republic was analyzed. Between 09/2004 and 05/2009, 1025 CRC cases were recruited by nine oncological departments in the Czech Republic [32]. The sampled patients showed positive colonoscopic results for malignancy, histologically confirmed as colon or rectal carcinomas. The patients who met the Amsterdam criteria I or II for hereditary nonpolyposis colorectal cancer (OMIM ID: 120435) were excluded from the study. During the same time period, 787 healthy controls were recruited by five gastroenterological departments of the Czech Republic [32]. They were individuals undergoing colonoscopy for various gastrointestinal complaints, such as macroscopic bleeding, positive faecal occult blood test or abdominal pain of unknown origin. Only individuals showing negative colonoscopic results for malignancies, colorectal adenomas, benign polyps or inflammatory bowel disease were eligible for the study. Beside general information about sex and age, information about BMI (OMIM ID: 606641) and diabetes status was available for most of the individuals (Table 2).

Table 2. Characteristics of the Czech and the DACHS study population, at the time of diagnosis for cases and at the time of sampling for controls

The SNPs that showed nominally significant associations in the Czech population were additionally analysed in a German population-based case–control study. The DACHS (Darmkrebs: Chancen der Verhütung durch Screening) study contributed 1798 cases and 1810 matched controls recruited from 01/2003 to 12/2007 in South-West Germany [33,34]. The patients included in the study had a first diagnosis of invasive CRC. As controls, individuals were randomly selected from lists of residents provided by the population registries. In the detailed standardized questionnaires, information about BMI at least five years before sampling and diabetes status was available in addition to general information about sex and age [34]. Table 2 outlines the characteristics of the Czech and the DACHS population relevant for the study.

Ethical standards

The study was approved by the Ethics Committees of the Institute of Experimental Medicine, Academy of Sciences of the Czech Republic, Prague (Czech Republic); Institute for Clinical and Experimental Medicine and Faculty, Thomayer Hospital, Prague (Czech Republic); Medical Faculty of the University of Heidelberg (Germany) and the State Medical Boards of Baden-Württemberg and Rheinland-Pfalz (Germany). Written informed consent was obtained from all study participants.


Genotyping was performed using a competitive allele-specific PCR genotyping system (KASPar®; KBiosciences, UK). PCR reactions were carried out in a 384-plate format using 3ng DNA per reaction in a 4μl reaction volume according to the optimal PCR protocol suggested by KBiosciences. The genotype detection was performed using an ABI PRISM 7900-HT Sequence Detection System with SDS 2.2 software (Applied Biosystems). For internal quality control, 7% of the Czech samples and 5% of the DACHS samples were randomly selected and included as duplicates. The concordance rate between the original and the duplicate samples was ≥ 99%. The average call rate was 98.2% in the Czech population and 97.1% in the DACHS population.

Statistical analysis

The observed genotype frequencies in the controls were tested for Hardy-Weinberg equilibrium (HWE) using χ2 tests [35]. Odds ratios (ORs) and 95% confidence intervals (CIs) for associations between genotypes and CRC risk were estimated by logistic regression (PROC LOGISTIC, SAS Version 9.2; SAS Institute, Cary, NC) [36]. The estimated effects for all SNPs refer to the ancestral allele (A). Due to the low allele frequency shown by the majority of the tested polymorphisms the dominant model was applied in all estimations. The ORs were adjusted for age and sex. Additionally, a pooled analysis of the two studies was conducted. The ORs for the pooled analysis were adjusted for age, sex and study population.

Gene-gene interaction was studied for pair-wise interaction using logistic regression. To acknowledge the fact that men are in higher risk for CRC than women and that BMI is one of the most important risk factors for non-syndromic CRC, an analysis stratified by sex and BMI was performed. The threshold value for BMI was chosen according to the median BMI in the respective study population. To assess effect modification by sex and BMI, multiplicative interaction terms were utilized in multivariate regression models.

P values ≤ 0.05 were considered statistically significant. Bonferroni correction was not applied because it would have been overly conservative since the SNP selection was hypothesis-driven and all selected SNPs have previously been associated with a phenotype predisposing to CRC. Instead, a replication study using the German sample population was conducted to validate the initial results in the Czech population.

Signatures of selection

Next to allele frequency differences between the African and the non-African populations, the study aimed to detect additional signatures of selection in those genes that were associated with CRC in the case–control study. Highly variable allele frequencies in different populations might be attributable to processes such as genetic drift, bottleneck events or founder effects that occur during the separation from the ancestral population. In order to encounter this problem, methods, which are less susceptible to demographic influences, were applied to investigate signatures of selective pressure. Instead of the traditional FST value and Tajiman D test, Fay-Wu’s H and the Standardized Integrated Haplotype Score (|iHS|) were estimated using the Haplotter web application that was developed on genome-wide HapMap data ( webcite) [13,15]. The Fay-Wu’s H algorithm detects unusual excess of high frequency derived alleles in a gene region. Strong negative Fay-Wu’s H values are considered as signatures for a selective sweep [12,15]. The iHS measures the length of haplotypes around a given SNP in comparison to the whole genome. Values < −1.5 and > 1.5 (|1.5|) give conclusive evidence for natural selection while values < −2 or > 2 (|2.0|) give evidence for a powerful selection signal [15,37]. Values were estimated for the YRI, the CEU and the East Asian (ANS) population.


Candidate SNP selection

With the keywords “diabetes”, “obesity”, “metabolic syndrome” and “hypertension”, we found in PubMed 30 publications, which reported 246 polymorphisms to be associated with the key-disease or with a related quantitative trait. Supplementary material (Additional file 1) provides information about all the 246 polymorphisms with the corresponding chromosomal position, allele status and frequency, reported associations with the diseases or the traits, information about the functionality of the SNPs and a complete reference list.

Carriers of the ancestral (A) allele of 130 SNPs had an increased risk of a key-disease (52.8%), whereas 106 SNPs were associated with an increased risk due to the derived (D) allele. The majority of the SNPs were located in introns (all 43.1%; A 48.5%; D 40.6%) or intergenic regions (all 38.6%; A 34.6%; D 47.2%). Less than 15% of all SNPs (A 17%; D 13%) were nonsense, missense, cds-synonymous or UTR SNPs.

The frequency differences estimated for the ancestral susceptibility alleles among the four worldwide populations were highly variable (YRI vs. any: mean 34.8; range 0.0-94.4; YRI vs. CEU: mean 20.7; range 0.0-87.3). Twenty-nine SNPs fulfilled the selection criteria of allele frequency difference >45% between the African and the non-African populations and >25% between the YRI and the CEU populations. These SNPs with their corresponding genes were considered as candidates for the CRC case–control study. After considering the location and the function of the SNPs and the LD characteristics of the gene regions, 28 SNPs in 15 genes were selected for the case–control study (Table 1).

Association study

The genotype distribution of 27 of the 28 SNPs genotyped in the Czech control population was according to HWE. For ERBB3 rs11171739, the genotype distribution deviated from HWE (p <0.0001) and the SNP was not considered in the further analyses. Except for CYP3A5 rs776746, which was monomorphic in the Czech cohort, none of the observed allele frequencies differed significantly from the allele frequencies given in the NCBI database (CEU population).

Three SNPs in three genes showed modest associations with the risk of CRC in the Czech population (Table 3). The ancestral alleles of two SNPs were associated with an increased risk of CRC: AGT rs699 (OR 1.26; 95% CI 1.01-1.57) and CYP3A7 rs10211 (OR 1.38; 95% CI 1.04-1.83). In contrast, the ancestral allele of ENPP1 rs1044498 SNP was associated with a decreased risk of CRC (OR 0.79; 95% CI 0.63-1.00). The gene-gene interaction analysis showed no evidence of epistasis (data not shown).

Table 3. Genotype distribution of the polymorphisms in the Czech population

A replication study in the German DACHS population was carried out for the five SNPs with the lowest p values (rs699, rs10211, rs1044498, rs12592797, rs7298565). None of the analysed polymorphisms were associated with the risk of CRC in the German population alone or in the pooled analysis of the two populations (Figure 2).

thumbnailFigure 2. Comparative data plot of the OR and 95% CI of the SNPs analyzed in the Czech and the DACHS cohort; dominant model, individual Czech and DACHS data adjusted for age and sex; joint data adjusted for age and sex, stratified by study; CZ Czech cohort; OR odds ratio; CI confidence interval.

In the data stratified by sex, ENPP1 rs1044498 was associated with a decreased risk of CRC in the male subgroup of the Czech population (OR 0.61; 95% CI 0.45 - 0.83; p 0.0015; pinteraction 0.01). No association was detected in the Czech female subgroup and in the German study (data not shown).

In the data stratified by BMI, modest associations were detected in the Czech subgroup with a BMI >27 for AGT rs699 (OR 1.54; CI 1.05 - 2.25; p 0.027; pinteraction 0.036) and CYP3A7 rs10211 (OR 1.78; CI 1.08 - 2.93; p 0.023; pinteraction 0.113). No association was detected in the German study (data not shown).

Signatures of selection

One important signature of selective pressure was already a criterion to select the SNPs for the study: a high allele frequency difference between HapMap populations. Thus, all SNPs that were associated with the risk of CRC fulfilled this criterion. The highest allele frequency difference among all analysed SNPs was found in ENPP1 rs1044498 (YRI vs. HCB 94.4%; YRI vs. CEU 87.3%). The SNPs AGT rs699 and CYP3A7 rs10211 showed an allele frequency difference of 52.5% and 67% (YRI vs. CEU), respectively.

Fay-Wu’s H and |iHS| were analysed as further signatures of selection. All estimates of Fay-Wu’s H and |iHS| refer to SNPs that are linked to the genotyped SNPs (r2=1.0), because direct values were not available in the Haplotter web application webcite[13,15]. Signatures of a selective sweep indicated by Fay-Wu’s H were found for CYP3A7 rs2687075 (r2 to rs10211=1.0), with strong negative H estimates in the ANS population (−28.8) and in the CEU population (−49.6). The other two analysed polymorphisms did not show strong negative H values (> −8.0) or even showed positive values (Figure 3a). Values of |iHS| >1.5 that give conclusive evidence for natural selection in the CEU population were found for SNPs in AGT (rs2148582, r2 = 1.0 to rs699; |iHS| = 1.62), ENPP1 (rs6926970, r2 = 1.0 to rs1044498; |iHS| = 1.58) and CYP3A7 (rs2687075, r2 = 1.0 to rs10211; |iHS| = 1.67) (Figure 3b).

thumbnailFigure 3. Plot of Fay-Wu’s H (a) and plot of the Standardized Integrated Haplotype Score (|iHS|) (b). Estimates for the SNPs that were associated with CRC in the Czech population. Comparison of the African (YRI), European (CEU) and East Asian (ANS) population. All estimates refer to SNPs that are linked to the genotyped SNPs because direct values were not available for the genotyped SNPs. * indicate values that provide conclusive evidence for natural selection [12,15,37].


The present study intended to crosslink susceptibility variants of nutrition-related complex diseases to CRC. In fact, the results in the Czech hospital-based case–control study suggested that polymorphisms in AGT, CYP3A7 and ENPP1 may be associated with the risk of CRC. However, replication in the population-based German DACHS population did not confirm the associations.

From the 246 SNPs that have been reported to be associated with a nutrition-related disease, 130 showed ancestral susceptibility to overall risk of obesity, T2D, metabolic syndrome or hypertension. However, only 29 SNPs fulfilled the initial selection criterion of ≥45% allele frequency difference between the YRI and any HapMap population, indicating selective pressure. Except ABCA1 that has been found to be mutated in CRC tumour samples [38,39], none of the 15 genes of the present study has previously been associated with the risk of CRC ( webcite; CancerGEMKB/ webcite) [40].

The association study in the Czech population indicated ancestral susceptibility to the risk of CRC for the missense AGT SNP rs699 and to the 3’UTR SNP rs10211 in CYP3A7. Interestingly, SNPs in these two genes feature similar phenotypic effects, such as predisposing to hypertension and salt sensitivity [18].

Published data about AGT suggests that the ancestral allele of the probable pathogenic SNP rs699 (M268T), as well as the ancestral alleles of the missense SNP rs4762 (T207M) and the 5′UTR SNP rs5051, predispose to essential hypertension, increased plasma angiotensinogen and increased frequency of preeclampsia (OMIM ID: 189800) [18,41,42]. Additionally, rs5051 (r2=0.95 to rs699) has been demonstrated to affect the transcription rate of AGT[41,42]. AGT (angiotensinogen [serpin peptidase inhibitor, clade A, member 8]) is an important member of the renin-angiotensin system that regulates blood pressure and fluid homeostasis probably through influencing sodium sensitivity [41,42].

Also the intronic CYP3A5 SNP rs776746 has previously been associated with hypertension and salt sensitivity [9,18]. This SNP has been reported to result in an incorrectly spliced mRNA and in a truncated non-functional protein. In the Czech population, rs776746 was monomorphic. However, the CYP3A7 SNP rs10211 - also located within the same cytochrome P450 gene cluster and linked to rs776746 in the HapMap CEU population (r2=0.82) - showed nominally significant association with the risk of CRC. The genes of the cytochrome P450 gene family encode for some of the most important enzymes involved in the metabolism of various xenobiotics and endogenous substrates such as cholesterol, steroids, environmental carcinogens and drugs [43]. In particular, CYP3 enzymes are responsible for the metabolism of eicosanoids.

The allele frequencies of the genotyped SNPs rs699 and rs10211 and the two functional SNPs rs5051 and rs776746 are highly variable among worldwide populations, with higher frequencies of the derived alleles in non-African populations while the ancestral alleles predominate in the African population. The values of |iHS| determined for SNPs that are fully linked to rs699 and rs10211 provided conclusive evidence for natural selection in the population with European ancestry. Additionally, rs2687075 (r2=1.0 to rs10211) in CYP3A7 showed strong negative values of Fay-Wu’s H that were considered as signatures for a selective sweep in non-African populations [12,15]. Previous studies have already suggested AGT and CYP3A5 as targets of selection, and have additionally connected the two genes directly by their related function [18,42]. In AGT, selection was particularly suggested to work on the promoter that contains rs699 and on SNPs in high LD with it. This selection was attributed to altered requirements for the human to maintain sodium homeostasis [42]. Since the derived allele, that predisposes to salt tolerance, is not yet fixed in non-African populations, the remaining ancestral allele, that predisposes to salt sensitivity, shows ancestral susceptibility to related diseases such as hypertension, preeclampsia [41,42] and CRC.

A possible association with the risk of CRC and signatures of recent selection were also observed for one polymorphism in ENPP1. However, in contrast to the initial hypothesis, the ancestral allele of rs1044498 was associated with a decreased risk of CRC in the Czech population. ENPP1 is a member of the ecto-nucleotide pyrophosphatase/phosphodiesterase (ENPP) family. The encoded protein interacts with the insulin receptor thereby inhibiting subsequent signalling. In previous studies, the ancestral allele of the missense polymorphism rs1044498 (Q121K) has been associated with more vivid insulin receptor binding, stronger inhibition of insulin signalling, insulin resistance, an increased risk of T2D and an increased risk of myocardial infarction (OMIM ID: 608446) [9,44,45]. The associations were most pronounced in cohorts that underwent lifestyle interventions to improve an individual’s weight or cholesterol level [45]. It is possible that the effect of ENPP1 on the risk of metabolic syndrome and the risk of CRC is highly dependent on additional environmental factors or modifiers.

Unfortunately, we were not able to validate our results in the independent German case–control study. Since the Czech and the German populations should not differ significantly in their genetic constitution, differences in nutrition or other environmental factors may contribute to the observed results or the associations may be a chance finding [46,47]. Since the selection of candidate SNPs was based on complex gene-environmental interactions - with the SNP contributing to a phenotype that predisposes to CRC - the detected associations are expected to be weaker than the associations for the original intermediate phenotype. Already in the original reports about the associations of SNPs with nutrition-related diseases, low ORs were detected in most of the cases. As we studied only a few polymorphisms per gene, other polymorphisms with low LD (r2<0.8) or rare SNPs (MAF<0.05) that may contribute to the risk of CRC might have been missed. However, considering previously reported associations of several SNPs in the three described genes with components of the metabolic syndrome, the ancestral nature of the risk alleles, and the detected signatures of selection, a true nature of the modest effects on CRC risk in the Czech population cannot be excluded [48]. Especially the close resemblance of the detected associations and function of SNPs in AGT and CYP3A7 may indicate a true effect of the polymorphisms on CRC susceptibility.


Our study showed evidence of association of the ancestral alleles of polymorphisms in AGT and CYP3A7 and the derived allele of a polymorphism in ENPP1 with an increased risk of CRC in Czechs, but not in Germans. The ancestral alleles of these SNPs have previously been associated with the nutrition-related diseases hypertension (AGT and CYP3A7) and insulin resistance (ENPP1). Future studies may shed light on the complex genetic and environmental interactions between different types of nutrition-related diseases. The application of additional selection criteria, such as ancestral susceptibility, signatures of selection, or pathway membership, might help to narrow down the numerous published polymorphisms and to find the most promising candidates for association studies. Large study populations that provide the possibility to define large subgroups with specific pre-diagnostic features may be used to review the actual function of such polymorphisms and may provide further insights into the evolution of common complex diseases.

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

The concept and design of the experiments was conceived by SH, AF, and KH. The experiments which provided the basis to this study were performed by SH. Statistical analyses in the association studies and population genetics were performed by SH, MB, AR, RH and AF. The sample collection and database management for the Czech study population was organized and performed by BP, AN, LV, JN and PV. The sample collection and database management for the DACHS study population was organized by MH, HB and JC-C. The manuscript was written by SH and AF. All authors revised the manuscript and contributed to the discussion of the results. The final manuscript was read and approved by all authors.


This study was supported by the German National Genome Research Network (NGFN-Plus). In the Czech Republic, the study was supported by Grant agency of the Czech Republic (GACR) [CZ:GACR:GA P304/10/1286 and P304/12/1585]. The DACHS Study was supported by grants from the German Research Council (Deutsche Forschungsgemeinschaft, grant numbers BR 1704/6-1, BR 1704/6-3, BR 1704/6-4 and CH 117/1-1), and the German Federal Ministry of Education and Research (grant numbers 01KH0404 and 01ER0814).


  1. Ferlay J, Shin HR, Bray F, Forman D, Mathers C, Parkin DM: Estimates of worldwide burden of cancer in 2008: GLOBOCAN 2008.

    Int J Cancer 2010, 127(12):2893-2917. OpenURL

  2. Jemal A, Siegel R, Ward E, Hao Y, Xu J, Murray T, Thun MJ: Cancer statistics, 2008.

    CA Cancer J Clin 2008, 58(2):71-96. OpenURL

  3. Laland KN, Odling-Smee J, Myles S: How culture shaped the human genome: bringing genetics and the human sciences together.

    Nat Rev Genet 2010, 11(2):137-148. OpenURL

  4. Bernstein CN, Blanchard JF, Kliewer E, Wajda A: Cancer risk in patients with inflammatory bowel disease: a population-based study.

    Cancer 2001, 91(4):854-862. OpenURL

  5. Huxley RR, Ansary-Moghaddam A, Clifton P, Czernichow S, Parr CL, Woodward M: The impact of dietary and lifestyle risk factors on risk of colorectal cancer: a quantitative overview of the epidemiological evidence.

    Int J Cancer 2009, 125(1):171-180. OpenURL

  6. Il'yasova D, Colbert LH, Harris TB, Newman AB, Bauer DC, Satterfield S, Kritchevsky SB: Circulating levels of inflammatory markers and cancer risk in the health aging and body composition cohort.

    Cancer Epidemiol Biomarkers Prev 2005, 14(10):2413-2418. OpenURL

  7. Bellisari A: Evolutionary origins of obesity.

    Obes Rev 2008, 9(2):165-180. OpenURL

  8. Chen R, Corona E, Sikora M, Dudley JT, Morgan AA, Moreno-Estrada A, Nilsen GB, Ruau D, Lincoln SE, Bustamante CD, et al.: Type 2 diabetes risk alleles demonstrate extreme directional differentiation among human populations, compared to other diseases.

    PLoS Genet 2012, 8(4):e1002621. OpenURL

  9. Di Rienzo A, Hudson RR: An evolutionary framework for common diseases: the ancestral-susceptibility model.

    Trends Genet 2005, 21(11):596-601. OpenURL

  10. Hancock AM, Witonsky DB, Ehler E, Alkorta-Aranburu G, Beall C, Gebremedhin A, Sukernik R, Utermann G, Pritchard J, Coop G, et al.: Colloquium paper: human adaptations to diet, subsistence, and ecoregion are due to subtle shifts in allele frequency.

    Proc Natl Acad Sci USA 2010, 107(Suppl 2):8924-8930. OpenURL

  11. Coop G, Pickrell JK, Novembre J, Kudaravalli S, Li J, Absher D, Myers RM, Cavalli-Sforza LL, Feldman MW, Pritchard JK: The role of geography in human adaptation.

    PLoS Genet 2009, 5(6):e1000500. OpenURL

  12. Fay JC, Wu CI: Hitchhiking under positive Darwinian selection.

    Genetics 2000, 155(3):1405-1413. OpenURL

  13. Oleksyk TK, Smith MW, O'Brien SJ: Genome-wide scans for footprints of natural selection.

    Philos Trans R Soc Lond B Biol Sci 2010, 365(1537):185-205. OpenURL

  14. Tajima F: Statistical method for testing the neutral mutation hypothesis by DNA polymorphism.

    Genetics 1989, 123(3):585-595. OpenURL

  15. Voight BF, Kudaravalli S, Wen X, Pritchard JK: A map of recent positive selection in the human genome.

    PLoS Biol 2006, 4(3):e72. OpenURL

  16. The National Center for Biotechnology Information.


  17. Barrett JC, Fry B, Maller J, Daly MJ: Haploview: analysis and visualization of LD and haplotype maps.

    Bioinformatics 2005, 21(2):263-265. OpenURL

  18. Thompson EE, Kuttab-Boulos H, Witonsky D, Yang L, Roe BA, Di Rienzo A: CYP3A variation and the evolution of salt-sensitivity variants.

    Am J Hum Genet 2004, 75(6):1059-1069. OpenURL

  19. Kathiresan S, Willer CJ, Peloso GM, Demissie S, Musunuru K, Schadt EE, Kaplan L, Bennett D, Li Y, Tanaka T, et al.: Common variants at 30 loci contribute to polygenic dyslipidemia.

    Nat Genet 2009, 41(1):56-65. OpenURL

  20. Willer CJ, Sanna S, Jackson AU, Scuteri A, Bonnycastle LL, Clarke R, Heath SC, Timpson NJ, Najjar SS, Stringham HM, et al.: Newly identified loci that influence lipid concentrations and risk of coronary artery disease.

    Nat Genet 2008, 40(2):161-169. OpenURL

  21. The.Wellcome.Trust CCC: Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls.

    Nature 2007, 447(7145):661-678. OpenURL

  22. Groop L: From fused toes in mice to human obesity.

    Nat Genet 2007, 39(6):706-707. OpenURL

  23. Bacci S, Ludovico O, Prudente S, Zhang YY, Di Paola R, Mangiacotti D, Rauseo A, Nolan D, Duffy J, Fini G, et al.: The K121Q polymorphism of the ENPP1/PC-1 gene is associated with insulin resistance/atherogenic phenotypes, including earlier onset of type 2 diabetes and myocardial infarction.

    Diabetes 2005, 54(10):3021-3025. OpenURL

  24. Yanagiya T, Tanabe A, Iida A, Saito S, Sekine A, Takahashi A, Tsunoda T, Kamohara S, Nakata Y, Kotani K, et al.: Association of single-nucleotide polymorphisms in MTMR9 gene with obesity.

    Hum Mol Genet 2007, 16(24):3017-3026. OpenURL

  25. Lo JC, Zhao X, Scuteri A, Brockwell S, Sowers MR: The association of genetic polymorphisms in sex hormone biosynthesis and action with insulin sensitivity and diabetes mellitus in women at midlife.

    Am J Med 2006, 119(9 Suppl 1):S69-S78. OpenURL

  26. Benzinou M, Walley A, Lobbens S, Charles MA, Jouret B, Fumeron F, Balkau B, Meyre D, Froguel P: Bardet-Biedl syndrome gene variants are associated with both childhood and adult common obesity in French Caucasians.

    Diabetes 2006, 55(10):2876-2882. OpenURL

  27. Meyre D, Delplanque J, Chevre JC, Lecoeur C, Lobbens S, Gallina S, Durand E, Vatin V, Degraeve F, Proenca C, et al.: Genome-wide association study for early-onset and morbid adult obesity identifies three new risk loci in European populations.

    Nat Genet 2009, 41(2):157-159. OpenURL

  28. Altshuler DM, Gibbs RA, Peltonen L, Dermitzakis E, Schaffner SF, Yu F, Bonnen PE, de Bakker PI, Deloukas P, The.International.HapMap3.Consortium, et al.: Integrating common and rare genetic variation in diverse human populations.

    Nature 2010, 467(7311):52-58. OpenURL

  29. The.International.HapMap.Consortium: The International HapMap Project.

    Nature 2003, 426(6968):789-796. OpenURL

  30. de Bakker PI, Yelensky R, Pe'er I, Gabriel SB, Daly MJ, Altshuler D: Efficiency and power in genetic association studies.

    Nat Genet 2005, 37(11):1217-1223. OpenURL

  31. Ramensky V, Bork P, Sunyaev S: Human non-synonymous SNPs: server and survey.

    Nucleic Acids Res 2002, 30(17):3894-3900. OpenURL

  32. Pechlivanis S, Bermejo JL, Pardini B, Naccarati A, Vodickova L, Novotny J, Hemminki K, Vodicka P, Forsti A: Genetic variation in adipokine genes and risk of colorectal cancer.

    Eur J Endocrinol 2009, 160(6):933-940. OpenURL

  33. Brenner H, Chang-Claude J, Seiler CM, Rickert A, Hoffmeister M: Protection from colorectal cancer after colonoscopy: a population-based, case-control study.

    Ann Intern Med 2011, 154(1):22-30. OpenURL

  34. Sainz J, Rudolph A, Hein R, Hoffmeister M, Buch S, von Schonfels W, Hampe J, Schafmayer C, Volzke H, Frank B, et al.: Association of genetic polymorphisms in ESR2, HSD17B1, ABCB1, and SHBG genes with colorectal cancer risk.

    Endocr Relat Cancer 2011, 18(2):265-276. OpenURL

  35. Emigh TH: A Comparison of Tests for Hardy-Weinberg Equilibrium.

    Biometrics 1980, 36(4):627-642. OpenURL

  36. Cornfield J: A Method of Estimating Comparative Rates from Clinical Data - Applications to Cancer of the Lung, Breast, and Cervix.

    J Natl Cancer I 1951, 11(6):1269-1275. OpenURL

  37. Southam L, Soranzo N, Montgomery SB, Frayling TM, McCarthy MI, Barroso I, Zeggini E: Is the thrifty genotype hypothesis supported by evidence based on confirmed type 2 diabetes- and obesity-susceptibility variants?

    Diabetologia 2009, 52(9):1846-1851. OpenURL

  38. Sjoblom T, Jones S, Wood LD, Parsons DW, Lin J, Barber TD, Mandelker D, Leary RJ, Ptak J, Silliman N, et al.: The consensus coding sequences of human breast and colorectal cancers.

    Science 2006, 314(5797):268-274. OpenURL

  39. Wood LD, Parsons DW, Jones S, Lin J, Sjoblom T, Leary RJ, Shen D, Boca SM, Barber T, Ptak J, et al.: The genomic landscapes of human breast and colorectal cancers.

    Science 2007, 318(5853):1108-1113. OpenURL

  40. Yu W, Gwinn M, Clyne M, Yesupriya A, Khoury MJ: A navigator for human genome epidemiology.

    Nat Genet 2008, 40(2):124-125. OpenURL

  41. Inoue I, Nakajima T, Williams CS, Quackenbush J, Puryear R, Powers M, Cheng T, Ludwig EH, Sharma AM, Hata A, et al.: A nucleotide substitution in the promoter of human angiotensinogen is associated with essential hypertension and affects basal transcription in vitro.

    J Clin Invest 1997, 99(7):1786-1797. OpenURL

  42. Nakajima T, Wooding S, Sakagami T, Emi M, Tokunaga K, Tamiya G, Ishigami T, Umemura S, Munkhbat B, Jin F, et al.: Natural selection and population history in the human angiotensinogen gene (AGT): 736 complete AGT sequences in chromosomes from around the world.

    Am J Hum Genet 2004, 74(5):898-916. OpenURL

  43. Nebert DW, Dalton TP: The role of cytochrome P450 enzymes in endogenous signalling pathways and environmental carcinogenesis.

    Nat Rev Cancer 2006, 6(12):947-960. OpenURL

  44. McAteer JB, Prudente S, Bacci S, Lyon HN, Hirschhorn JN, Trischitta V, Florez JC: The ENPP1 K121Q polymorphism is associated with type 2 diabetes in European populations: evidence from an updated meta-analysis in 42,042 subjects.

    Diabetes 2008, 57(4):1125-1130. OpenURL

  45. Müssig K, Heni M, Thamer C, Kantartzis K, Machicao F, Stefan N, Fritsche A, Haring HU, Staiger H: The ENPP1 K121Q polymorphism determines individual susceptibility to the insulin-sensitising effect of lifestyle intervention.

    Diabetologia 2010, 53(3):504-509. OpenURL

  46. Lao O, Lu TT, Nothnagel M, Junge O, Freitag-Wolf S, Caliebe A, Balascakova M, Bertranpetit J, Bindoff LA, Comas D, et al.: Correlation between genetic and geographic structure in Europe.

    Curr Biol 2008, 18(16):1241-1248. OpenURL

  47. Nelis M, Esko T, Magi R, Zimprich F, Zimprich A, Toncheva D, Karachanak S, Piskackova T, Balascak I, Peltonen L, et al.: Genetic structure of Europeans: a view from the North-East.

    PLoS One 2009, 4(5):e5472. OpenURL

  48. Calle EE, Kaaks R: Overweight, obesity and cancer: epidemiological evidence and proposed mechanisms.

    Nat Rev Cancer 2004, 4(8):579-591. OpenURL

Pre-publication history

The pre-publication history for this paper can be accessed here: