In this study we perform a genome-wide association study (GWAS) for bovine milk fatty acids from summer milk samples. This study replicates a previous study where we performed a GWAS for bovine milk fatty acids based on winter milk samples from the same population. Fatty acids from summer and winter milk are genetically similar traits and we therefore compare the regions detected in summer milk to the regions previously detected in winter milk GWAS to discover regions that explain genetic variation in both summer and winter milk.
The GWAS of summer milk samples resulted in 51 regions associated with one or more milk fatty acids. Results are in agreement with most associations that were previously detected in a GWAS of fatty acids from winter milk samples, including eight ‘new’ regions that were not considered in the individual studies. The high correlation between the –log10(P-values) and effects of SNPs that were found significant in both GWAS imply that the effects of the SNPs were similar on winter and summer milk fatty acids.
The GWAS of fatty acids based on summer milk samples was in agreement with most of the associations detected in the GWAS of fatty acids based on winter milk samples. Associations that were in agreement between both GWAS are more likely to be involved in fatty acid synthesis compared to regions detected in only one GWAS and are therefore worthwhile to pursue in fine-mapping studies.
Keywords:Milk fatty acids; Dairy; Genome-wide association
Dairy producers are looking for ways to optimize bovine milk fat composition for human health, and to improve physical and functional properties of milk. Increasing the knowledge about the synthesis of milk fatty acids, by unravelling the genetic background of milk fatty acids, can aid in modifying bovine milk fat composition. Polymorphisms in diacylglycerol-O-acyltransferase 1 (DGAT1) and stearoyl-CoA desaturase 1 (SCD1) are known to have an effect on milk fatty acids, e.g. the DGAT1 K232A polymorphism explains 40% of the genetic variation of C16:0  and the SCD1 A293V polymorphism explains 23% of the genetic variation of C16:1 . However, there is still a considerable amount of genetic variation in milk fat composition that has not been assigned to polymorphisms.
Chromosomal regions associated with milk fatty acids can be detected by screening the whole genome in a genome-wide association study (GWAS). In GWAS studies many thousands of single nucleotide polymorphisms (SNPs) are being tested for associations. In general it is expected that only a small proportion of the SNPs will have a true association and only those that have an effect that is large enough will be significant. Setting a significance threshold is finding the balance between limiting the number of false positives and maintaining sufficient power. Replication of results in independent samples is a strategy to separate false positives from true associations [3,4].
In a previous GWAS we identified interesting regions of the bovine genome associated with milk fatty acids from winter milk samples . To our knowledge, at present that is the only GWAS on bovine milk fatty acids. Not many populations that are large enough for GWAS have been phenotyped for milk fatty acids, because accurate measurement of milk fatty acids using gas chromatography is expensive and time consuming. In addition, genotyping a large number of individuals for a large numbers of SNPs is costly. The population that was used for our previous GWAS, based on milk samples taken in winter, has been phenotyped for milk fatty acids in a second milk sample which was taken in summer. Repeating the GWAS for milk fatty acids based on winter samples using the summer samples can confirm the previously detected associations and result in new associations.
In this study we performed a GWAS for fatty acids based on summer milk samples. This study repeats our previous GWAS for fatty acids based on winter milk samples from the same population and largely the same animals. Fatty acids from summer and winter milk are genetically similar traits and we therefore compared the regions detected in summer milk to the regions previously detected in winter milk to confirm associations.
The phenotypes of the winter milk samples were described earlier in Bouwman et al., the phenotypes of the summer milk samples will be described in detail, as well as the differences between winter and summer samples.
The fat composition of summer milk samples from 1,795 first-lactation Dutch Holstein Friesian cows was available for this study. The cows were housed on 383 commercial farms throughout the Netherlands. At least three cows were sampled per farm. The cows were between 97 and 335 days in lactation when the summer samples were taken. The pedigree of the cows was provided by CRV (Cooperative cattle improvement organization, Arnhem, the Netherlands) and consisted of 26,300 animals.
Milk fat composition was measured by gas chromatography as described in Stoop et al. . Many fatty acids were measured, but only the major fatty acids are reported here: even-chain saturated fatty acids C4:0 to C18:0, even-chain (cis9) monounsaturated fatty acids C10:1 to C18:1, and the polyunsaturated fatty acid C18:2cis9,trans11 (CLA). These fatty acids made up 88% of the total milk fat. The fatty acids are expressed in terms of weight-proportion of total fat weight (w/w%).
The winter and summer milk samples were taken from the same cows during the same lactation. The moment of sampling resulted in some differences between the two samples. Winter milk samples were taken from February to March 2005, when Dutch cows are mainly kept indoors and fed silage. Summer milk samples were taken from May to June 2005, when Dutch cows are often grazing for at least some part of the day. Some cows sampled in winter were not in lactation anymore during summer, therefore additional cows were sampled in summer to assure at least 3 cows per herd. The cows were on average 167 days (63-282) in lactation when the winter samples were taken, and 247 days (97-335) in lactation when the summer samples were taken.
Statistical analysis of phenotypes
Variance components, heritabilities, and correlations between the fatty acids from summer and winter milk samples were estimated using a bivariate animal model in ASReml :
where y was the phenotype; μ was the overall mean; dim was the covariate describing the effect of days in milk; afc was the covariate describing the effect of age at first calving; season was the fixed effect of the class of calving season (June-Aug 2004, Sept-Nov 2004, or Dec 2004-Jan 2005); scode was the fixed effect accounting for differences in genetic level between proven sire daughters and test-sire daughters, proven sires are selected and therefore their daughters might have a different genetic level than daughters of test sires; herd was the random effect of herd, distributed as N(0, I σherd2), with identity matrix I and herd variance σherd2; animal was the random additive genetic effect, distributed as N(0, A σa2), with the additive genetic relationship matrix A and the additive genetic variance σa2; and e was the random residual, distributed as N(0, I σe2), with identity matrix I and residual variance σe2.
Blood samples were collected from the cows for DNA isolation. Of the 1,795 cows phenotyped for the summer milk sample 1,656 were successfully genotyped for 50,855 single nucleotide polymorphisms (SNPs) using a custom Infinium Array (Illumina, San Diego, CA, USA) designed by CRV. The assumed map positions of the SNPs were based on the bovine genome assembly BTAU 4.0 . The average distance between SNPs was 52,452 bp. Of the 50,855 SNPs, 591 SNPs were located on the X chromosome, and 776 SNPs could not be mapped to any of the Bos taurus (BTA) chromosomes and were assigned to BTA 0. The SNPs on BTA 0 and the X chromosome were included in the study. Single nucleotide polymorphisms with a genotyping rate < 80% (n=392), monomorphic SNPs (n=236), and SNPs with 1-9 observations for one of the genotype classes (SNPs with such low number of observations in one of the genotype classes were excluded from further analyses to reduce the number of spurious associations) (n=5,646) were discarded from the original SNP set, resulting in the final marker set of 44,581 SNPs used for the summer GWAS.
Genomic DNA of the cows was isolated from whole blood samples of the cows. Blood samples were collected in accordance with the guidelines for the care and use of animals as approved by the ethical committee on animal experiments of Wageningen University (protocol: 200523.b).
Genome-wide association based on summer milk samples
For the GWAS based on the summer milk samples, both phenotype and genotype data were available for 1,656 individuals. A single SNP GWAS was performed using an univariate animal in ASReml that was the same as model 1 but extended with a fixed effect for the SNP genotype.
The genome-wide FDR was based on the P-values from the animal model using the R package ‘qvalue’ . A genome-wide FDR was calculated for each trait individually. Associations with a genome-wide FDR<0.05 were considered significant.
SNPs significant for one trait located close to each other were termed a “region”. All significant SNPs in a region might be associated with the same causal mutation. A region was defined as follows: it started at the first significant SNP on a chromosome that was followed by an additional significant SNP within 10 Mbp; the region was extended as long as another significant SNP occurred within 10 Mbp from the previous one and ended at the last significant SNP that was not followed by another significant SNP within the next 10 Mbp. Thus, each region contained at least two SNPs significant for the trait. More than one region could be present on the same chromosome when there were groups of significant SNPs located within 10 Mbp from each other but further than 10 Mbp from the other region(s) on that chromosome.
The genetic variance explained by a SNP was calculated from the estimated genotype effects from the statistical model and the observed genotype frequencies. The result was expressed as a percentage of the total additive genetic variance obtained from model 1. These percentages can be overestimated due to the so called Beavis effect, especially when the effect of a SNP is small . For each trait, the proportion of genetic variance explained by the most significant SNP in a region was reported. Note that the most significant SNP in a region can differ between traits.
Comparison of summer and winter GWAS results
The GWAS results from the winter milk samples from Bouwman et al.  and from the summer milk samples were compared to each other. A total of 1,564 cows were studied in both the summer and the winter GWAS, 92 cows were only studied in the summer GWAS, and 142 only in the winter GWAS.
In Bouwman et al.  a two-step single SNP approach was used for the GWAS of fatty acids from winter milk samples. Due to computation time, in the study by Bouwman et al.  only regions which showed significant associations in analyses using a general linear model were re-analysed using an animal model to account for all relations between the animals. Here we used a one-step approach using the animal model only. To make results from the previous study and the present study comparable, the GWAS based on winter milk samples was redone using an animal model for all SNPs. The Pearson correlation between the -log10(P-values) of the general linear model for winter milk samples and the animal model for winter milk samples was 0.95, which indicates that the general linear model used correctly identified the regions of interest and that the results for winter milk samples of Bouwman et al.  are comparable with the results presented in the current study.
For the individual GWAS studies for winter and summer milk samples a FDR<0.05 was used. A FDR<0.05 is stringent, especially when looking for agreement of results. Therefore, a SNP was qualified as associated with both summer and winter milk fatty acids when the FDR threshold was smaller than 0.20 in both GWAS studies. We choose a FDR threshold of 20%, because the FDR of agreement between the two GWAS studies would then be 4% (20%*20%) if the studies were independent . SNPs in agreement were reported when at least more than one SNP was in agreement between the summer and winter GWAS in a region (were region was defined as above) or when the SNP was in agreement between the summer and winter GWAS for more than one trait.
Table 1 shows that the phenotypic variation of milk fatty acids was larger in summer than in winter. Summer milk samples contained more long chain fatty acids (C18:0, C18:1, and CLA) and less C16:0 than winter milk samples (Table 1). Phenotypic correlations between fatty acids of summer and winter milk samples ranged between 0.36 and 0.67 (Table 2). Genetic correlations between fatty acids of summer and winter milk samples ranged between 0.77 and 1.00 (Table 2). Genetic correlations between summer and winter samples of C4:0, C6:0, C12:0, C18:0, C10:1, C12:1, C14:1, C16:1, and C18:1 (ranging between 0.90-1) were not significantly different from one (Table 2). The genetic correlations for C8:0, C10:0, C14:0, C16:0, and CLA were significantly different from one but showed strong positive correlations (0.77-0.94, Table 2). Herd correlations between fatty acids of summer and winter milk samples ranged between 0.16 and 0.54 (Table 2).
Table 1. Mean (in w/w%), phenotypic variance (σp2 = σa2 + σherd2 + σe2), intra-herd heritability, and proportion of variance due to herdfor the fatty acids of summer and winter milk samples, with their standard errors in subscript
Table 2. Phenotypic (rp), additive genetic (ra), herd (rherd), and residual correlation (re) between winter and summer milk samples, with their standard errors (se)
Genome-wide association based on summer milk samples
Figure 1 shows the genome-wide plots of -log10(P-values) of the GWAS of the fatty acids based on the summer milk samples. In total, 51 regions were associated with one or more fatty acids. Table 3 gives all detected regions and the percentage of the total additive genetic variation explained by the most significant SNP in that region for each of the fatty acids (for more detailed information about those most significant SNPs see Additional file 1). The most significant SNPs per region explained 2.2% up to 50.1% of the total additive genetic variation (Table 3). When all these most significant SNPs per region per fatty acid were analysed simultaneously they explained between 5.5% for C4:0 and 92.5% for C16:1 (Table 3). Three regions with major effects were associated with multiple fatty acids: BTA 14, BTA 19, and BTA 26. First we will describe the results for these three regions with major effects and then for the other regions associated with more than one fatty acid. Regions associated with only one fatty acid are given in Table 3 but will not be described here.
Additional 1. Table S1. Most significant SNP per trait for each region significantly associated with fatty acids of the summer milk sample (corresponding to table 3), SNP position, significance level and the percentage of total additive genetic variance explained by the SNP.
Format: DOCX Size: 28KB Download file
Figure 1. Genome-wide association plots for bovine milk fatty acids of summer milk samples. Genome-wide plots of -log10 (P-values) (y-axis) for association of SNPs with saturated fatty acids (A) and unsaturated fatty acids (B). The genomic position is represented along the x-axis and chromosome numbers are given on the x-axis. The dashed horizontal lines represent the 0.05 false discovery rate thresholds. The orange squares represent the significant SNPs (FDR<0.05). The y-axis are cut off at -log10(P-value) of 15.
Table 3. Regions significantly associated with fatty acids of the summer milk samples and the percentage of the total additive genetic variance explained by the most significant SNP in that region
The association detected on BTA 14, between 0 and 18.9 Mbp (region 14a), with C6:0, C8:0, C14:0, C16:0, C16:1, C18:1, and CLA was most significant for three SNPs located in the DGAT1 gene at 0.4 Mbp (including the two SNPs underlying the DGAT1 K232A dinucleotide polymorphism). The DGAT1 SNPs explained 8.1-50.1% of the total additive genetic variation of these fatty acids (Table 3). The association detected on BTA 14 with C10:0 and C10:1 was also significant for the SNPs in the DGAT1 gene. However, for these fatty acids the SNPs in the DGAT1 gene were not the most significant ones in this region on BTA 14. The most significant SNPs were located at 3.6 Mbp for C10:0 and 3.0 Mbp for C10:1. The association detected on BTA 14 with C12:1 was not significant for the SNPs in the DGAT1 gene (-log10(P-value)=1.89). The association detected on BTA 14 with C12:1 was most significant for a SNP located at 3.2 Mbp. After correcting C10:0, C10:1, and C12:1 for the effect of the DGAT1 K232A polymorphism these most significant SNPs remained significant. The linkage disequilibrium (LD) between these SNPs and the DGAT1 K232A SNPs was moderate (r2 = 0.14-0.34).
The association detected on BTA 19, between 37.3-62.3 Mbp (region 19b), with C8:0, C10:0, C12:0, C14:0, and C16:0 was most significant around 46 Mbp for C10:0, C12:0, and C16:0; around 52 Mbp for C14:0; and around 58 Mbp for C8:0. The most significant SNPs explained 4.3-12.3% of the total additive genetic variation of these fatty acids (Table 3).
The association detected on BTA 26, between 1.4 and 39.0 Mbp, with C10:0, C10:1, C12:1, C14:1, and C16:1 was most significant near the SCD1 gene. The SCD1 gene is not mapped on the BTAU 4.0 , but is mapped at 21 Mbp on BTA 26 according to the UMD 3.0 map . Two SNPs located in the SCD1 gene (including the SCD1 A293V polymorphism) were the most significant SNPs for the traits associated with the region on BTA 26. The SCD1 SNPs explained 4.5% of the genetic variation of C10:0 and 15-46.4% of the total additive genetic variation of the medium chain unsaturated fatty acids (Table 3).
Beside these three regions with major effects, nine additional regions were associated with more than one fatty acid (Table 3). Region 5c was associated with C6:0, C8:0, C10:0, C14:0, C10:1, C14:1, C16:1, and C18:1. Region 10b was associated with C10:0, and C12:0. Region 11b was associated with C4:0, and C6:0. The region on BTA 13 was associated with C6:0, C8:0, and C10:0. On BTA 17 there were three regions associated with multiple traits: region 17a between 15.0 and 23.9 Mbp was associated with C6:0, C8:0, and C16:1; region 17c between 28.6 and 34.3 Mbp was associated with C6:0, C8:0, and C10:0; region 17d between 49.6 and 68.7 Mbp was associated with C14:1, and C16:1. The region on BTA 20 was associated with C6:0, C18:0, and C16:1. The region on BTA 28 was associated with C8:0, and C10:0.
Some SNPs located on BTA 0 were also significant. Blasting these SNPs against the UMD 3.0 map showed that they were mainly located in regions that already showed significant effects, such as region 14a, 19b and 26.
Comparison of summer and winter GWAS results
Figure 2 shows two -log10(P-values) for each SNP, one for the summer (significant SNPs are orange squares in Figure 2) and one for the winter (significant SNPs are blue triangles in Figure 2) GWAS. The -log10(P-values) of SNPs that had a FDR<0.20 in both the winter and summer GWAS are indicated with green addition signs and show the regions that were found for both samples. Table 4 gives an overview of the regions associated with the summer milk fatty acids that were in agreement with the previous study of winter milk fatty acids. Only the regions that showed agreement between the summer and winter GWAS for more than one SNP or more than one trait are reported in Table 4, resulting in 34 regions.
Figure 2. Results from winter and summer genome-wide association studies for bovine milk fatty acids combined in Manhattan-plots. Genome-wide plots of -log10 (P-values) (y-axis) for association of SNPs with saturated fatty acids (A) and unsaturated fatty acids (B). The genomic position is represented along the x-axis and chromosome numbers are given on the x-axis. The orange squares represent SNPs only significant for the summer milk samples (FDRsummer<0.05, FDRwinter≥0.20). Blue triangles represent SNPs only significant for the winter milk samples (FDRwinter<0.05, FDRsummer≥0.20). Green addition signs represent SNPs that were detected significant in both milk samples (FDR<0.20 in both studies). Note that each SNP is represented twice in this figure, once at the -log10(P-value) for the summer GWAS and once at the -log10(P-value) for the winter GWAS. The y-axis are cut off at -log10(P-value) of 15.
Table 4. Regions associated (FDR<0.20) with milk fatty acids in both the summer and winter GWAS
Three regions with major effects, BTA 14, 19, and 26, were found for both summer and winter milk fatty acids. These regions were highly significant in the GWAS based on winter milk samples and were therefore expected to be found for the summer milk samples too. More interesting are the additional regions that were found in both GWAS studies and especially the eight regions (1, 2a, 3, 5a, 10, 14b, 17c, and 24 (Table 4)) that were not reported for the individual studies based on winter or on summer milk samples because their FDR was between 0.05 and 0.20. In some regions agreement between the summer and winter GWAS was based on a single SNP but in other regions agreement was based on multiple SNPs. Also, some regions were associated with multiple fatty acids. We will report here the regions found to be associated with more than two fatty acids in both GWAS studies: regions 5c, 6, 13, 17b, and 27 (Table 4).
On BTA 5 the associations in region 5c with C6:0, C8:0, C10:0, C10:1, and C18:1 were in agreement with the winter GWAS . There are no obvious candidate genes located in this region. In both GWAS studies this region was also associated with C14:0, but different SNPs were significant in the two studies (see Figure 2). The agreement between summer and winter GWAS for C14:1 seems to be in a separate region, region 5d at 108.8 Mbp.
On BTA 6 the associations with C6:0, C8:0, and C16:1 (Table 4) were in agreement with the winter GWAS . This region contains the candidate gene peroxisome proliferator-activated receptor gamma, coactivator 1 alpha (PPARGC1A). Our SNP set contained 10 SNPs located in PPARGC1A, however, none of these SNPs were significant in winter nor in summer, but SNPs around (73,865bp before and 797,923bp after) the gene showed association in both the summer and winter GWAS. In the winter GWAS the region was also associated with C12:1 and C14:1, but this was not the case in the summer GWAS.
On BTA 13 the associations with C6:0, C8:0, C10:0, and C10:1 (Table 4) were in agreement with the winter GWAS . The region on BTA 13 was associated with short chain fatty acids. A possible candidate gene in this region is acyl-CoA synthetase short-chain family member 2, which activates acetate for de novo fatty acid synthesis . In the winter GWAS the region was also associated with C14:1 and C16:1, but this was not the case in the summer GWAS.
On BTA 27 the associations with C12:0, C14:1, and C16:1 (Table 4) were in agreement with the winter GWAS . This region contains the candidate gene 1-acylglycerol-3-phosphate O-acyltransferase 6, which is involved in attaching fatty acids on the second position of the triglyceride backbone. In the winter GWAS this region was also associated with C14:0, C16:0, C10:1, and C12:1, but this was not the case in the summer GWAS.
Besides agreement of association for both summer and winter milk fatty acids it is also interesting to see how well the significance levels and effects of SNPs from the summer and winter GWAS correlate. High correlation implies that the effect of the QTL is similar in winter as in summer and, thus, that there is no genotype by season interaction for the regions that were found in both GWAS studies. Winter and summer -log10(P-values) of SNPs that had a FDR<0.20 in both GWAS studies are plotted in Figure 3A and showed a correlation of 0.89. Winter and summer additive SNP effects of SNPs that had a FDR<0.20 in both GWAS studies, expressed in phenotypic standard deviation, are plotted in Figure 3B and show a correlation of 0.97.
Figure 3. Significance level (A) and additive SNP effects expressed in phenotypic standard deviation(B) of the SNPs that were found significant in both the summer and winter GWAS (FDR<0.20).
The aim of this study was to perform a GWAS of bovine fatty acids based on summer milk samples and to compare them to previous results of a GWAS of fatty acids based on winter milk samples . For this GWAS we used different milk samples from largely the same set of cows with the same genotypes at a different stage of the same lactation. The main difference between the two seasons was the herd management, including feeding. In winter, all herds were kept indoors and fed silage, while in summer about half of the herds were grazing outside for at least part of the day and a few other herds were fed fresh grass. Diets of cows including fresh grass are known to alter milk fat composition e.g. [13,14]. This was reflected in our results: summer milk contained more long chain fatty acids and less C16:0 compared to winter milk. The phenotypic correlations for the different fatty acids between summer and winter samples ranged between 0.36-0.67 (Table 2), indicating that the summer samples provide additional information compared to the winter samples. Genetic correlations between summer and winter samples of C4:0, C6:0, C12:0, C18:0, C10:1, C12:1, C14:1, C16:1, and C18:1 (ranging between 0.90-1, Table 2) showed that these fatty acids are genetically the same trait in summer and winter . The genetic correlations for C8:0, C10:0, C14:0, C16:0, and CLA were significantly different from one  but showed strong positive correlations (0.77-0.94, Table 2), suggesting that also for these fatty acids summer and winter samples have most genetic variation in common. It is important here to consider that strong positive genetic correlations are required to ensure that the traits have the same genetic background. However, for the summer milk sample to provide additional information to our previous GWAS phenotypic correlations should be weak.
This GWAS of fatty acids based on summer milk samples shows agreement with most associations detected in our previous GWAS of fatty acids based on winter milk samples . Three regions with major effects detected in the winter GWAS  were also found in the summer GWAS. On BTA 14 a dinucleotide polymorphism in DGAT1 is causing the major effect. The DGAT1 K232A polymorphism is known to be associated with fat content and composition, so our results are in line with other studies [1,16,17]. The rather large region associated with the short and medium chain saturated milk fatty acids on BTA 19 confirm previous linkage studies [18-20]. Several candidate genes related to fat synthesis are located in this region, e.g. ATP citrate lyase, sterol regulatory element-binding transcription factor 1, signal transducer and activator of transcription 5A, growth hormone, and fatty acid synthase. There might be more than one QTL in this region given the size of this region and the many candidate genes, but the actual polymorphism(s) causing the effect(s) has not yet been identified. On BTA 26 a polymorphism in SCD1 is causing the major effect. The gene SCD1 is known to be associated with medium-chain unsaturated fatty acids, so our results are in line with other studies [16,21-24].
Our results from both GWAS studies also suggest that there are additional QTL on BTA 14 besides DGAT1 that were associated with fatty acids. These additional QTL on BTA 14 were located at 3.0-3.8 Mbp, 32.7 Mbp, 40.8-50.9 Mbp, and 73.0-76.5 Mbp, and confirm detected QTL for milk production traits in linkage analyses reviewed in . Candidate genes for these regions might be corticotropin releasing hormone at 30.5 Mbp, fatty acid binding protein 5 at 41.9 Mbp, and fatty acid binding protein 4 at 42.0 Mbp.
The three highly significant regions with major effects mentioned above were expected to be found in both GWAS studies. More interesting are the additional regions that were found in both GWAS studies such as the regions associated with more than two fatty acids: region 5c, 6, 13, 17b, and 27. Also worthwhile to mention are the ‘new’ regions that had a suggestive FDR between 5-20% and were not considered in the individual studies based on winter or on summer milk samples only, but became of interest because they were found in both studies. There are eight ‘new’ regions like this: region 1, 2a, 3, 5a, 10, 14b, 17c, and 24 (Table 4).
There are two different confirmation strategies regarding GWAS: replication and validation. Replication studies are meant to confirm that the actual association is a true association and should therefore be based on samples from the same population with minimal systematic differences [26,27]. Validation studies are meant to see if the association can be generalized over different populations and should therefore be based on samples from a different population, where this population can be different concerning genetic background, phenotype definition, sampling strategy, and time point of investigation [26,27]. A correctly performed replication is more likely to be successful in finding the same association again than validation, however, when an association is validated the associated SNP is probably closer to the actual polymorphism. In literature, replication and validation are often used interchangeably which complicates the interpretation of the results, especially when a study is called replication study but population, phenotype or study design are too different from the original study to be a replication study. Our GWAS has elements of a replication as well as of a validation study; it met criteria for a replication such as sufficient sample size, phenotypes were measured using the same method, same set of markers and a very similar population was used. It also met criteria for validation because the phenotypes were measured at different time points. The difference in season of measuring the phenotype provides additional information. Ideally an independent population of cows should have been sampled, but this was practically not feasible.
Replication of the study with largely the same set of animals and the same genotypes led to minor differences in LD and allele frequencies between studies, therefore it is more likely to confirm previously detected results . However, spurious associations due to population structure or genotyping errors are more likely to be detected twice using the same set of animals and genotypes.
Agreement between the two GWAS studies was based on a FDR threshold of 20% in each study. If the winter and summer GWAS would be independent, the FDR of a region found in both studies would be 4% (20%*20%) . A FDR of 4% gives enough reason to investigate such a region further. Lowering the threshold from 5% to 20% FDR resulted in the eight ‘new’ regions mentioned above, besides the regions that were already discovered in one of the individual studies and were in agreement with the other.
Even though we used largely the same set of animals, same genotypes, same phenotype measurement and a lower threshold for agreement between summer and winter GWAS not all regions were found in both summer and winter GWAS. This can be due to genotype by season interaction, due to lack of power (false negative QTL) or because these QTL were false positive QTL. It is not possible to determine which of these three reasons apply. However, the genetic correlations indicated that fatty acids are genetically similar traits in summer and winter, which suggests that genotype by season interaction may have only a small effect on the results. So lack of agreement between summer and winter GWAS is either due to lack of power or false positives.
This GWAS of fatty acids based on summer milk samples is in agreement with most associations that were previously detected in a GWAS of fatty acids based on winter milk samples. Lowering the FDR threshold from 5% in individual studies to 20% for agreement between both the summer and winter GWAS led to eight ‘new’ regions that were not considered in the individual studies, but had a suggestive FDR between 5-20% in both studies. It is more likely that genomic regions are involved in fatty acid synthesis when associations are found in both summer and winter GWAS compared to regions detected in only one GWAS. Detected associations that were in agreement between summer and winter GWAS are therefore worthwhile to pursue in fine-mapping studies.
ACB carried out the analysis, prepared and drafted the manuscript. MHPWV helped to draft and edit the manuscript. JAMA initiated and established the overall project design. HB participated in the design of the study, the coordination and helped to draft the manuscript. All authors read and approved the final manuscript.
This study is part of the Dutch Milk Genomics Initiative and the project ‘Melk op Maat’, funded by Wageningen University (the Netherlands), the Dutch Dairy Association (NZO, Zoetermeer, the Netherlands), the cooperative cattle improvement organization CRV (Arnhem, the Netherlands), the Dutch Technology Foundation (STW, Utrecht, the Netherlands), the Dutch Ministry of Economic Affairs (The Hague, the Netherlands) and the Provinces of Gelderland and Overijssel (Arnhem, the Netherlands).
Duchemin S, Bovenhuis H, Stoop WM, Bouwman AC, Van Arendonk JAM, Visker MHPW: Genetic correlation between composition of bovine milk fat in winter and summer, and DGAT1 and SCD1 by season interactions.
J Dairy Sci 2012.
Conte G, Mele M, Chessa S, Castiglioni B, Serra A, Pagnacco G, Secchiari P: Diacylglycerol acyltransferase 1, stearoyl-CoA desaturase 1, and sterol regulatory element binding protein 1 gene polymorphisms and milk fatty acid composition in Italian Brown cattle.
Grisart B, Coppieters W, Farnir F, Karim L, Ford C, Berzi P, Cambisano N, Mni M, Reid S, Simon P, et al.: Positional Candidate Cloning of a QTL in Dairy Cattle: Identification of a Missense Mutation in the Bovine DGAT1 Gene with Major Effect on Milk Yield and Composition.
Mele M, Conte G, Castiglioni B, Chessa S, Macciotta NPP, Serra A, Buccioni A, Pagnacco G, Secchiari P: Stearoyl-Coenzyme A Desaturase Gene Polymorphism and Milk Fatty Acid Composition in Italian Holsteins.
Moioli B, Contarini G, Avalli A, Catillo G, Orrù L, De Matteis G, Masoero G, Napolitano F: Short Communication: Effect of Stearoyl-Coenzyme A Desaturase Polymorphism on Fatty Acid Composition of Milk.
Schennink A, Heck JML, Bovenhuis H, Visker MHPW, van Valenberg HJF, van Arendonk JAM: Milk Fatty Acid Unsaturation: Genetic Parameters and Effects of Stearoyl-CoA Desaturase (SCD1) and Acyl CoA: Diacylglycerol Acyltransferase 1 (DGAT1).