This article is part of the supplement: Proceedings of the 15th European workshop on QTL mapping and marker assisted selection (QTLMAS)
Linear models for breeding values prediction in haplotype-assisted selection - an analysis of QTL-MAS Workshop 2011 Data
- Equal contributors
Department of Genetics, Wrocław University of Environmental and Life Sciences, Kożuchowska 7, Wrocław 51-631, Poland
BMC Proceedings 2012, 6(Suppl 2):S11 doi:10.1186/1753-6561-6-S2-S11
Published: 21 May 2012
The aim of this study was to estimate haplotype effects and then to predict breeding values using linear models. Haplotype-based analysis avoids losing information due to linkage disequilibrium between single markers. It also requires fewer explanatory variables in the linear model, which makes the estimation more reliable.
Different methods and criteria for marker and haplotype selection were considered. First, markers with a minor allele frequency (MAF) lower than 5% were excluded from the data set. Then, SNPs in complete linkage disequilibrium were selected. The next step was to construct haplotypes from the selected SNPs and to estimate their frequencies. Haplotypes with a frequency lower than 1% were not considered in further analysis. The chosen haplotypes were used as explanatory variables in the linear models for breeding value prediction. Linear models with fixed and random haplotype effects, as well as an animal model, were tested.
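The selection steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes genotypes coded as 0/1/2 allele counts, approximates r2 by the squared correlation of genotype counts between adjacent SNPs, and assumes that haplotypes are already phased and given as strings. All function names are hypothetical.

```python
from collections import Counter

import numpy as np


def filter_by_maf(genotypes, threshold=0.05):
    """Drop SNPs whose minor allele frequency is below `threshold`.

    genotypes: (n_individuals, n_snps) array of 0/1/2 allele counts (assumed coding).
    Returns the filtered matrix and the indices of the retained columns.
    """
    p = genotypes.mean(axis=0) / 2.0      # frequency of the counted allele
    maf = np.minimum(p, 1.0 - p)          # minor allele frequency
    keep = maf >= threshold
    return genotypes[:, keep], np.flatnonzero(keep)


def complete_ld_blocks(genotypes, tol=1e-12):
    """Group adjacent SNPs whose genotype-based r^2 equals 1 (up to `tol`).

    Assumption: r^2 is approximated by the squared Pearson correlation of
    allele counts; every column must be polymorphic (apply the MAF filter first).
    """
    n_snps = genotypes.shape[1]
    if n_snps == 0:
        return []
    blocks, current = [], [0]
    for j in range(1, n_snps):
        r = np.corrcoef(genotypes[:, j - 1], genotypes[:, j])[0, 1]
        if r * r >= 1.0 - tol:
            current.append(j)
        else:
            if len(current) > 1:
                blocks.append(current)
            current = [j]
    if len(current) > 1:
        blocks.append(current)
    return blocks


def common_haplotypes(haplotypes, min_freq=0.01):
    """Return the observed frequency of each haplotype with frequency >= min_freq.

    haplotypes: iterable of phased haplotype strings (assumed to be given).
    """
    counts = Counter(haplotypes)
    total = sum(counts.values())
    return {h: c / total for h, c in counts.items() if c / total >= min_freq}
```

In this sketch the thresholds default to the 5% MAF and 1% haplotype frequency criteria stated above. Estimating haplotype frequencies from unphased genotypes would need an additional phasing step that is not shown here.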
After applying the MAF criterion, 1206, 1189, 1249, 1288 and 1167 markers remained on chromosomes 1, 2, 3, 4 and 5, respectively. In total, 409 subsets of SNPs with r² = 1 were found. A total of 1476 haplotypes of different lengths were inferred. The frequencies of 817 haplotypes were higher than 1%: 184 on the first chromosome, 172 on the second, 131 on the third, 146 on the fourth and 184 on the fifth. Haplotype effects estimated with the random models were comparable, and these models gave more precise predictions for individuals with unknown phenotypes. A few haplotypes with large effects were found when their effects were defined as fixed in the linear model. The correlations between the predicted and the true breeding values were not high. This may be due to the selection criteria imposed on the genotype data, which substantially reduced the number of markers.
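The difference between treating haplotype effects as fixed or as random can be illustrated with a short Python sketch. This is not the model fitted in the study. It assumes a design matrix X holding the number of copies (0, 1 or 2) of each haplotype per individual, and a known variance ratio lam = residual variance / haplotype variance. The function names and the value of lam are hypothetical.

```python
import numpy as np


def fit_fixed(X, y):
    """Ordinary least squares: intercept plus fixed haplotype effects."""
    W = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(W, y, rcond=None)
    return beta[0], beta[1:]


def fit_random(X, y, lam):
    """Random haplotype effects via mixed model equations.

    Solves (W'W + P) b = W'y, where W = [1, X] and P penalises only the
    haplotype effects by lam (assumed known variance ratio).
    """
    n, p = X.shape
    W = np.column_stack([np.ones(n), X])
    penalty = np.diag([0.0] + [lam] * p)   # the intercept is not shrunk
    sol = np.linalg.solve(W.T @ W + penalty, W.T @ y)
    return sol[0], sol[1:]


def predict(X_new, intercept, effects):
    """Predicted value as the intercept plus the sum of haplotype effects."""
    return intercept + X_new @ effects
```

With lam > 0 the random model shrinks the estimated effects towards zero, whereas the fixed model leaves them unshrunk, so a rare haplotype can receive a large estimated effect under the fixed model. In practice the variance ratio would be estimated from the data rather than supplied by hand.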
Although not many markers were considered in the study, the results show that the implemented approach is promising. The haplotype approach made it possible to avoid the high-dimensional models that arise when single SNPs are used.