Analyses of genome-wide association follow-up study for calving traits in dairy cattle
1 Department of Molecular Biology and Genetics, Aarhus University, P.O. Box 50, Tjele, DK-8830, Denmark
2 VikingGenetics, Ebeltoftvej 16, Assentoft, Randers, SØ, DK-8960, Denmark
3 Department of Animal Breeding and Genetics, Swedish University of Agricultural Sciences, P.O. Box 7070, Uppsala, 750 07, Sweden
BMC Genetics 2012, 13:71 doi:10.1186/1471-2156-13-71
Published: 14 August 2012
Results obtained from different genome-wide association studies in cattle often disagree markedly, and there are multiple reasons for this disagreement. In particular, the presence of false positives means that detected QTL need to be validated before they are incorporated or weighted in selection decisions or studied further to identify the causal gene. In dairy cattle progeny testing schemes, new data are routinely accumulated that can be used to validate previously discovered associations. However, such data are not an independent sample, and the sample size may not give enough power to validate previous discoveries. Here we compared two strategies for validating previously detected QTL when new data are added from the same study population. First, we compared analyzing a combined dataset (COMB), including all data presently available, with analyzing only a validation dataset (VAL), i.e. a new dataset not previously analyzed, as an independent replication. Secondly, we tested whether SNPs detected in the reference dataset (REF), i.e. the previously analyzed dataset consisting of older bulls, could be confirmed in the VAL dataset.
The combined (COMB) dataset, which had nearly twice the sample size of either subset, allowed the detection of far more significant associations than the two smaller subsets. The number of significant SNPs in REF (older bulls) was about four times higher than in VAL (younger bulls), although the two had similar sample sizes (2,219 and 2,039, respectively). In the COMB dataset, a total of 424 SNP-trait combinations on 22 chromosomes, involving 284 unique SNPs, showed genome-wide significant association. In the REF dataset 101 associations (73 unique SNPs) and in the VAL dataset 24 associations (18 unique SNPs) were genome-wide significant. Sixty-eight percent of the SNPs in the REF dataset could be confirmed in the VAL dataset. Out of 469 unique SNPs showing chromosome-wide significant association with calving traits in the REF dataset, 321 could be confirmed in the VAL dataset at P < 0.05.
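The confirmation step can be illustrated with a minimal sketch. It assumes that a SNP counts as confirmed when its p-value in the validation data falls below a nominal threshold (0.05, as stated above). The function name, the SNP labels and the toy p-values are hypothetical and are not taken from the study; only the counts 469 and 321 come from the text.

```python
def confirm_snps(ref_significant, val_pvalues, alpha=0.05):
    """Return the REF-significant SNPs whose VAL p-value is below alpha.

    ref_significant: iterable of SNP identifiers significant in REF.
    val_pvalues: dict mapping SNP identifier to its p-value in VAL.
    SNPs missing from val_pvalues are treated as not confirmed (assumption).
    """
    return [snp for snp in ref_significant
            if val_pvalues.get(snp, 1.0) < alpha]


# Hypothetical toy data, for illustration only.
ref_hits = ["snp_a", "snp_b", "snp_c"]
val_p = {"snp_a": 0.01, "snp_b": 0.20, "snp_c": 0.04}
confirmed = confirm_snps(ref_hits, val_p)  # snp_a and snp_c pass, snp_b does not

# Confirmation rate from the counts reported in the text.
n_ref_significant = 469
n_confirmed = 321
reported_rate = n_confirmed / n_ref_significant
print(f"{reported_rate:.1%}")  # 68.4%
```

The ratio 321/469 is about 68.4%, which agrees with the sixty-eight percent reported above.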
The choice of follow-up strategy for GWAS in cattle will depend on the aim of the study. If the aim is to discover novel QTL, analysis of the COMB dataset is recommended, whereas if the aim is to identify the causal mutation underlying a QTL, confirmation of the discovered SNPs is necessary to avoid following up a false positive.