Institute for Bee Research Hohen Neuendorf, 16540 Hohen Neuendorf, Germany

Institute of Mathematics, Freie Universitaet Berlin, Berlin, Germany

Leibniz Institute for Farm Animal Biology, 18196 Dummerstorf, Germany

Abstract

Background

The honey bee is an economically important species. With the honey bee population declining rapidly, improved methods of genetic evaluation are needed. In this study, we investigated the applicability of the unified approach and its impact on the accuracy of estimated breeding values for maternally influenced traits, using a simulated honey bee dataset. Because only a limited number of individuals in a honey bee population can be genotyped, the unified approach can be an efficient strategy to increase genetic gain and to estimate breeding values more accurately. We calculated the accuracy of estimated breeding values for two evaluation approaches: the unified approach and the traditional pedigree based approach. We analyzed the effects of different heritabilities and of the genetic correlation between direct and maternal effects on the accuracy of direct, maternal and overall estimated breeding values (the overall value being the sum of the maternal and direct breeding values). The genetic and reproductive biology of the honey bee was accounted for by considering colony structure, uncertain paternity, overlapping generations and polyandry. In addition, we used a modified numerator relationship matrix and a realistic honey bee genome.

Results

For all values of heritability and correlation, the accuracy of overall estimated breeding values increased significantly with the unified approach. The increase in accuracy was always higher for the case when there was no correlation as compared to the case where a negative correlation existed between maternal and direct effects.

Conclusions

Our study shows that the unified approach is a useful methodology for genetic evaluation in honey bees, and can contribute immensely to the improvement of traits of apicultural interest such as resistance to

Background

A colony trait (e.g. honey and wax production) in the honey bee is comparable to maternally influenced traits in mammals such as birth and weaning weight; thus, it can be partitioned into the additive genetic effect of the queen (maternal genetic effects) and the additive genetic effects of the progeny workers (direct genetic effects). The queen mediates its effect through heritable characters like egg laying rate or pheromone production in the hive whereas workers affect a trait through their hoarding behaviour or production of and responsiveness to pheromones. Until now, genetic evaluation in the honey bee has been implemented using a pedigree based BLUP-animal model with maternal and direct genetic effects

We performed a simulation study to investigate the impact of the unified approach on the accuracy of estimated breeding values in honey bees. Similar to the case of the honey bee, other species also have a situation where genetic evaluation needs to account for maternal effects and uncertain paternity, e.g. weaning weight in beef cattle is a maternally influenced trait. Besides, cows can be exposed to more than one male in a herd and pasture paddock within the same breeding season, thus generating uncertainty on paternity assignments and adversely affecting the accuracy of breeding value predictions

Methods

This study consisted of two main steps. In the first step, a dataset was simulated for a honey bee population, which involved the modelling and simulation of the population structure, genome, correlation between maternal and direct effects, heritability, true breeding values, genetic, phenotypic and residual variances. The simulated dataset was close to realistic scenarios and in agreement with the genetic and reproductive peculiarities of the honey bee. In the second step, genetic evaluation was performed using the unified approach and the traditional pedigree based BLUP approach.

Population structure

Base population in linkage disequilibrium

A random mating population was simulated for 1000 generations to obtain a base population in mutation-drift equilibrium with linkage disequilibrium (LD)

Mating and selection scheme

Five additional overlapping generations were simulated from the base population. Each of the generations consisted of 500 potential-dam queens and 250 drone-producing queens. From these 500 potential-dam queens, 10% were randomly selected as dam queens. The 50 selected dam queens produced 500 potential-dam queens (Figure


**Selection scheme.** In each generation, 10% of the potential-dam queens were randomly selected to serve as parents. These selected queens produced the potential-dam queens and drone-producing queens for the next generation.
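The random selection step can be sketched in a few lines of Python. This is a minimal illustration only: the queen identifiers, the function name `select_dam_queens` and the fixed seed are assumptions made for the example and are not taken from the simulation software used in the study.

```python
import random

def select_dam_queens(candidates, fraction=0.10, rng=None):
    """Randomly pick a fraction of the potential-dam queens as dam queens."""
    rng = rng or random.Random()
    n_selected = round(len(candidates) * fraction)
    # Sampling without replacement: every candidate has the same chance.
    return rng.sample(candidates, n_selected)

rng = random.Random(42)  # fixed seed (hypothetical) so the sketch is repeatable
potential_dams = [f"Q{idx:03d}" for idx in range(500)]  # hypothetical queen IDs
dam_queens = select_dam_queens(potential_dams, rng=rng)
print(len(dam_queens))  # 50 dam queens out of 500 candidates
```

Because selection is random rather than based on breeding values, the selected dams are an unbiased sample of the candidates in each generation.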

Population characteristics specific to the honey bee

To construct a population similar to that used in the genetic evaluation program of honey bees, we constructed a dummy sire and an average worker (representing direct effects) in the pedigree. Generations following the base population were overlapping and mating was polyandrous as in the normal breeding population. These characteristics are described in more detail in the following section.

Construction of a dummy sire and an average worker

As a consequence of polyandry in honey bees, offspring have an unclear paternal descent. To overcome the problem of representing the paternal descent, Bienefeld et al.


**A pedigree diagram.** In the pedigree diagram, the expanded rectangular box shows a dummy sire which consists of 10 drone-producing sister queens represented by smaller circles. Each drone-producing sister queen contributes two drones which are represented by the smaller square boxes. All drone-producing sister queens comprising a dummy sire have a common dam queen and dummy sire, thus are related as sisters. The pedigree shows that mating takes place between overlapping generations.

A colony is formed by a queen and its progeny comprising several thousand workers. Since it is impossible to include all workers of a colony for genetic evaluation, an average worker was constructed that represented all workers of a colony. It was assumed that one average worker existed for each potential-dam queen/dam queen in the pedigree.

Modelling polyandry and overlapping generations

In each generation, 50 dam queens and 25 dummy sires were randomly selected as mating partners (Figure). Dam queens were taken from the $n^{th}$ generation and queens constituting a dummy sire were taken from the $(n-1)^{th}$ generation, i.e. one generation preceding the dam queens (Figure


**The mating scheme.** Dummy sires connected to queens by dashed arrows are the mating partners. Q-b are queens belonging to the base population and Q-1 to Q-5 are queens belonging to generations 1–5. Each dummy sire is equivalent to 10 drone-producing sister queens; in total, 25 dummy sires therefore represent 250 queens. Each generation consisted of queens, a corresponding average worker for each queen to represent the colony structure, and drone-producing sister colonies in the form of dummy sires. All generations following the base population were overlapping.

Pedigree, phenotypic and genomic information

A phenotypic value in the honey bee represents an observation for the whole colony and thus, cannot be decomposed into individual phenotypic values of a queen and an average worker. Therefore, both the queen and the average worker of a colony were assigned the same colony phenotypic value. It was assumed that pedigree records were available for all generations; phenotypes were available for all dam queens (and the corresponding average worker) in the base generation and all potential-dam queens (and the corresponding average worker) in all but the last generation. Genotyping information was available for all dam queens in the base generation and all potential-dam queens.

Genome

We simulated a realistic genomic dataset which helped to assess the impact and applicability of the unified approach to the honey bee. A diploid genome consisting of 16 linkage groups was simulated for every queen

| **Chromosome** | **Chromosome length (in base-pairs)** | **Number of SNP** |
|---|---|---|
| Chromosome 1 | 29 893 408 | 14 137 |
| Chromosome 2 | 15 549 267 | 6 335 |
| Chromosome 3 | 13 234 341 | 7 119 |
| Chromosome 4 | 12 718 334 | 5 589 |
| Chromosome 5 | 14 363 272 | 6 330 |
| Chromosome 6 | 18 472 937 | 7 877 |
| Chromosome 7 | 13 219 345 | 5 973 |
| Chromosome 8 | 13 546 544 | 6 235 |
| Chromosome 9 | 11 120 453 | 5 578 |
| Chromosome 10 | 12 965 953 | 5 068 |
| Chromosome 11 | 14 726 556 | 6 957 |
| Chromosome 12 | 11 902 654 | 5 812 |
| Chromosome 13 | 10 288 499 | 5 082 |
| Chromosome 14 | 10 253 655 | 4 874 |
| Chromosome 15 | 10 167 229 | 3 879 |
| Chromosome 16 | 7 207 165 | 3 155 |
| Total | 219 629 612 | 100 000 |

Correlation between maternal and direct effects

Studies in honey bees […]. The correlation between maternal and direct effects ($r_{qw}$) was obtained by estimating the correlation between the maternal and direct true breeding values.

True breeding values and phenotypic values

Maternal and direct true breeding values were simulated for all dam queens of the base population and all potential-dam queens from generations 1–5. True breeding values for maternal ($TBV_q$) and direct effects ($TBV_w$) for a queen were calculated using the formulas

$$TBV_q^{i} = \sum_{j} Q_q^{ij}\,\alpha^{j} \quad \text{and} \quad TBV_w^{i} = \sum_{k} Q_w^{ik}\,\alpha^{k},$$

where $TBV_q^{i}$ and $TBV_w^{i}$ are the maternal and direct true breeding values for the $i^{th}$ queen, respectively. $Q_q^{ij}$ and $Q_w^{ik}$ are the QTL genotypes of the $i^{th}$ queen at the $j^{th}$ and $k^{th}$ QTL controlling the maternal and direct effects, respectively, and take a value of 1 or −1 for the homozygous genotypes or 0 for the heterozygous genotype. $\alpha^{j}$ and $\alpha^{k}$ are the allele substitution effects at the $j^{th}$ and $k^{th}$ QTL.
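The calculation of a true breeding value as a sum over QTL of genotype code times allele substitution effect can be illustrated as follows. This is a minimal sketch with made-up genotypes and effects; the function name `true_breeding_values` and all numbers are assumptions for illustration and do not come from the simulated dataset.

```python
import numpy as np

def true_breeding_values(qtl_genotypes, substitution_effects):
    """Per queen: sum over QTL of genotype code (1, 0, -1) times allele substitution effect."""
    qtl_genotypes = np.asarray(qtl_genotypes, dtype=float)
    substitution_effects = np.asarray(substitution_effects, dtype=float)
    # Rows are queens, columns are QTL; the matrix-vector product does the summation.
    return qtl_genotypes @ substitution_effects

# Hypothetical toy data: 3 queens, 4 QTL for the maternal effect.
genotypes_maternal = np.array([[ 1, 0, -1,  1],
                               [ 0, 0,  1, -1],
                               [-1, 1,  1,  0]])
effects_maternal = np.array([0.5, -0.2, 0.1, 0.3])

# Hypothetical toy data: the same 3 queens, 2 QTL for the direct effect.
genotypes_direct = np.array([[ 1, -1],
                             [ 0,  1],
                             [-1,  0]])
effects_direct = np.array([0.4, 0.2])

tbv_maternal = true_breeding_values(genotypes_maternal, effects_maternal)  # [0.7, -0.2, -0.6]
tbv_direct = true_breeding_values(genotypes_direct, effects_direct)        # [0.2, 0.2, -0.4]
tbv_overall = tbv_maternal + tbv_direct  # overall value is the sum of both components
```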

The overall true breeding value of a queen was the sum of its maternal and direct true breeding values. The phenotype of each queen was obtained by adding the overall true breeding value of a queen to a residual value drawn from a normal distribution $N(0, \sigma_e^2)$. The choice of the value for the residual variance ($\sigma_e^2$) is explained in a later section.

Genetic variance

Variance and covariance of maternal and direct effects

Variances of maternal ($\sigma_q^2$) and direct ($\sigma_w^2$) effects were obtained by calculating the variance of the simulated maternal and direct true breeding values, respectively. The covariance between maternal and direct effects ($\sigma_{qw}$) was obtained by calculating the covariance between the maternal and direct true breeding values.

Total genetic variance

Usually a breeding value is defined as twice the expected deviation of an individual's progeny from the mean, or twice the ‘transmitting ability’ of an individual […] the total genetic variance ($\sigma_g^2$) would become $\sigma_q^2 + 0.25\sigma_w^2 + \sigma_{qw}$ (the latter from $2 \times 1 \times 0.5 \times \sigma_{qw}$). However, for the sake of easy comparison and interpretation, the overall breeding value was taken as the sum of the direct and maternal breeding values of a queen, and the total genetic variance was taken as the sum of the variance of maternal effects, the variance of direct effects and twice the covariance between them, which can be expressed as $\sigma_g^2 = \sigma_q^2 + \sigma_w^2 + 2\sigma_{qw}$.
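Both expressions for the total genetic variance follow from the variance of a weighted sum of two correlated variables. As a brief worked step, writing $u_q$ and $u_w$ for the maternal and direct breeding values of a queen (these two symbols are introduced here only for illustration):

```latex
\begin{aligned}
\operatorname{Var}(u_q + u_w)
  &= \sigma_q^2 + \sigma_w^2 + 2\,\sigma_{qw}, \\
\operatorname{Var}\!\left(u_q + \tfrac{1}{2}\,u_w\right)
  &= \sigma_q^2 + \tfrac{1}{4}\,\sigma_w^2 + 2 \times 1 \times \tfrac{1}{2} \times \sigma_{qw}
   = \sigma_q^2 + 0.25\,\sigma_w^2 + \sigma_{qw}.
\end{aligned}
```

The first line corresponds to the definition adopted in this study; the second reproduces the alternative in which the direct effect enters with a weight of 0.5.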

Phenotypic variance, residual variance and maternal and direct heritability

A colony trait in honey bees is determined by the heritability of maternal ($h_m^2$) and direct ($h_d^2$) effects. In our study, we simulated fixed maternal heritabilities of 0.15, 0.25 and 0.35 (e.g. honey yield, hygienic behaviour). The maternal heritability can be expressed as the ratio of the variance of maternal effects to the phenotypic variance and is given as follows:

$$h_m^2 = \frac{\sigma_q^2}{\sigma_p^2}$$

After rearranging, we get $\sigma_p^2 = \sigma_q^2 / h_m^2$. The phenotypic variance ($\sigma_p^2$) was obtained from this expression, and the residual variance ($\sigma_e^2$) was obtained by subtracting the total genetic variance ($\sigma_g^2$) from the phenotypic variance ($\sigma_p^2$), i.e. $\sigma_e^2 = \sigma_p^2 - \sigma_g^2$. The ratio of the variance of direct effects to the phenotypic variance provided a measure of the heritability of direct effects, as given below.

$$h_d^2 = \frac{\sigma_w^2}{\sigma_p^2}$$
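The chain from maternal variance and target maternal heritability to phenotypic variance, residual variance and direct heritability can be sketched numerically. The input variances below are hypothetical, and the covariance is derived here from a given correlation for convenience, whereas in the study it was computed from the simulated true breeding values.

```python
import math

def variance_components(var_maternal, var_direct, corr, h2_maternal):
    """Derive phenotypic and residual variance and direct heritability
    from the maternal and direct variances and a target maternal heritability."""
    cov = corr * math.sqrt(var_maternal * var_direct)      # sigma_qw
    var_genetic = var_maternal + var_direct + 2.0 * cov    # sigma_g^2
    var_phenotypic = var_maternal / h2_maternal            # sigma_p^2 = sigma_q^2 / h_m^2
    var_residual = var_phenotypic - var_genetic            # sigma_e^2
    if var_residual <= 0.0:
        raise ValueError("genetic variance exceeds phenotypic variance")
    return {
        "var_genetic": var_genetic,
        "var_phenotypic": var_phenotypic,
        "var_residual": var_residual,
        "h2_direct": var_direct / var_phenotypic,           # h_d^2
    }

# Hypothetical variances, no correlation, maternal heritability of 0.15.
components = variance_components(var_maternal=1.0, var_direct=1.08,
                                 corr=0.0, h2_maternal=0.15)
print(round(components["h2_direct"], 3))  # 0.162
```

With these inputs the direct heritability is slightly above the maternal heritability simply because the assumed direct variance is slightly larger than the maternal variance.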

Table

| **Simulated $h_m^2$** | **Corr$_{(m,d)}$** | **Achieved $h_d^2$** |
|---|---|---|
| 0.150 | 0 | 0.162 (0.005) |
| 0.150 | −0.46 | 0.155 (0.005) |
| 0.250 | 0 | 0.270 (0.008) |
| 0.250 | −0.46 | 0.259 (0.008) |
| 0.350 | 0 | 0.377 (0.011) |
| 0.350 | −0.46 | 0.362 (0.011) |

$h_m^2$, Maternal heritability; $h_d^2$, Direct heritability; Corr$_{(m,d)}$, Correlation between maternal and direct effects.

Estimation of breeding values

A BLUP-animal model with maternal and direct effects was used:

$$\mathbf{y} = \mathbf{Xb} + \mathbf{Z}_1\mathbf{u}_1 + \mathbf{Z}_2\mathbf{u}_2 + \mathbf{e}$$

where $\mathbf{y}$ is a vector of records of the colonies, $\mathbf{b}$ is a vector of fixed effects, $\mathbf{u}_1$ is a vector of random direct effects, $\mathbf{u}_2$ is a vector of random maternal effects, $\mathbf{e}$ is a vector of random residual effects, $\mathbf{X}$ is an incidence matrix relating observations to the corresponding environment, and $\mathbf{Z}_1$ and $\mathbf{Z}_2$ are the incidence matrices relating observations to the corresponding direct effects and maternal effects, respectively.
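A model of this form is commonly solved through Henderson's mixed model equations. The sketch below is illustrative only: it assumes that the direct and maternal effects jointly have covariance `kron(G0, A)`, uses an ordinary diploid toy pedigree rather than the honey bee specific relationship matrix, and all names and numbers (`solve_maternal_mme`, `G0`, `var_e`, the records) are hypothetical.

```python
import numpy as np

def solve_maternal_mme(y, X, Z1, Z2, A_inv, G0, var_e):
    """Solve the mixed model equations for y = Xb + Z1 u1 + Z2 u2 + e.

    G0 is the 2x2 (co)variance matrix of direct (first) and maternal (second)
    effects; var([u1; u2]) is assumed to be kron(G0, A)."""
    W = np.hstack([X, Z1, Z2])
    n_fixed = X.shape[1]
    n_animals = A_inv.shape[0]
    lhs = (W.T @ W) / var_e
    # Add the inverse genetic covariance to the random-effect block only.
    lhs[n_fixed:, n_fixed:] += np.kron(np.linalg.inv(G0), A_inv)
    rhs = (W.T @ y) / var_e
    solution = np.linalg.solve(lhs, rhs)
    b_hat = solution[:n_fixed]
    u1_hat = solution[n_fixed:n_fixed + n_animals]
    u2_hat = solution[n_fixed + n_animals:]
    return b_hat, u1_hat, u2_hat

# Toy diploid pedigree (hypothetical): 1 and 2 are unrelated founders,
# 3 and 4 are their full-sib offspring.
A = np.array([[1.0, 0.0, 0.5, 0.5],
              [0.0, 1.0, 0.5, 0.5],
              [0.5, 0.5, 1.0, 0.5],
              [0.5, 0.5, 0.5, 1.0]])
y = np.array([10.2, 9.1, 11.0, 10.4])   # one record per individual
X = np.ones((4, 1))                     # overall mean as the only fixed effect
Z1 = np.eye(4)                          # each record maps to its own direct effect
Z2 = np.zeros((4, 4))
Z2[2, 1] = 1.0                          # records 3 and 4 share dam 2
Z2[3, 1] = 1.0
G0 = np.array([[1.08, -0.48],
               [-0.48, 1.00]])          # direct variance, covariance, maternal variance
var_e = 5.5

b_hat, u_direct, u_maternal = solve_maternal_mme(
    y, X, Z1, Z2, np.linalg.inv(A), G0, var_e)
overall_ebv = u_direct + u_maternal     # selection criterion: sum of both components
```

Replacing `A_inv` by the inverse of a combined pedigree-genomic matrix leaves the rest of the system unchanged, which is why the two evaluation approaches can be compared on the same model.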

Estimation of breeding values was done using the following two approaches: (1) the traditional BLUP approach (PED_BLUP) based on a numerator relationship matrix (**A**) constructed from pedigree information and (2) the unified approach (UNI_BLUP) based on a combined relationship matrix (**H**) constructed from pedigree and genomic information.

Relationship matrix constructed from pedigree data

Elements of the numerator relationship matrix ($\mathbf{A}$) were calculated according to the method proposed by Bienefeld et al. […]$_{p}$) of 0.367 to account for polyandry. This value is currently used for Germany-wide genetic evaluation of the honey bee populations where all mating sites are managed according to unified guidelines (with respect to number of drone-producing colonies and their relationship). The details for constructing the $\mathbf{A}$ matrix recursively are given in the Additional file […] $\mathbf{A}$ matrix for all 5275 individuals in the pedigree. The $\mathbf{A}$ matrix was partitioned into $\mathbf{A}_{11}$, $\mathbf{A}_{22}$, $\mathbf{A}_{12}$ and $\mathbf{A}_{21}$, where subscripts 1 and 2 represent genotyped and non-genotyped individuals, respectively. The inverse of the partitioned $\mathbf{A}$ matrix can be expressed as

**Details for constructing the honey bee’s numerator relationship matrix recursively.**


Relationship matrix constructed from pedigree and genomic data

In the honey bee pedigree, a dummy sire and an average worker each represent a group of individuals; thus, individual genotyping data cannot be obtained for them. Moreover, it is not possible to obtain genotyping information from all queens in the population. The unified approach is advantageous for honey bees because genomic information for genotyped queens can be integrated with pedigree information from genotyped as well as non-genotyped individuals, resulting in a combined relationship matrix $\mathbf{H}$. A genomic matrix ($\mathbf{G}$) was constructed for the 2550 queens with genotyping data. Different methods have been developed to derive the $\mathbf{G}$ matrix […] $\mathbf{G}$ matrix was obtained from $\mathbf{ZZ}' / \left(2\sum_i p_i(1 - p_i)\right)$, where $\mathbf{Z}$ is equal to $\mathbf{M} - \mathbf{P}$, $\mathbf{M}$ is the matrix specifying marker alleles inherited by each individual and $\mathbf{P}$ is equal to $2(p_i - 0.5)$ with $p_i$ being the frequency of the second allele at locus $i$ […] $\mathbf{G}$ matrix ($\mathbf{G}_w$) was constructed using a weighting factor ($w$) […] $\mathbf{G}_w = w\mathbf{G} + (1 - w)\mathbf{A}_{11}$. Christensen and Lund

The inverse of the combined relationship matrix ($\mathbf{H}^{-1}$), described by Legarra et al.
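The construction of the genomic matrix, its weighted version and the inverse of the combined matrix can be sketched as follows. This is a minimal illustration under stated assumptions: markers are coded −1/0/1, genotyped individuals are ordered first, the inverse uses the form commonly published for the unified (single-step) approach, $\mathbf{H}^{-1} = \mathbf{A}^{-1}$ plus $\mathbf{G}_w^{-1} - \mathbf{A}_{11}^{-1}$ added to the genotyped block, and the weight of 0.95, the toy genotypes and the diploid toy pedigree are hypothetical values not taken from the study.

```python
import numpy as np

def genomic_matrix(M, p):
    """G = ZZ' / (2 * sum p_i (1 - p_i)) with Z = M - P and P = 2 (p_i - 0.5)."""
    M = np.asarray(M, dtype=float)
    p = np.asarray(p, dtype=float)
    Z = M - 2.0 * (p - 0.5)              # P is broadcast over the rows of M
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))

def weighted_genomic_matrix(G, A11, w):
    """G_w = w G + (1 - w) A11, blending genomic and pedigree relationships."""
    return w * G + (1.0 - w) * A11

def combined_inverse(A, G_w, n_genotyped):
    """Inverse of H, assuming the genotyped individuals come first in A."""
    H_inv = np.linalg.inv(A)
    A11_inv = np.linalg.inv(A[:n_genotyped, :n_genotyped])
    H_inv[:n_genotyped, :n_genotyped] += np.linalg.inv(G_w) - A11_inv
    return H_inv

# Hypothetical toy data: 3 genotyped individuals, 5 markers coded -1/0/1.
M = np.array([[ 1, 0, -1,  0,  1],
              [ 0, 0,  1, -1, -1],
              [-1, 1,  0,  0,  1]])
p = (M + 1).mean(axis=0) / 2.0           # observed frequency of the second allele

# Hypothetical diploid pedigree relationships; individual 4 is not genotyped.
A = np.array([[1.0, 0.0, 0.5, 0.5],
              [0.0, 1.0, 0.5, 0.5],
              [0.5, 0.5, 1.0, 0.5],
              [0.5, 0.5, 0.5, 1.0]])
A11 = A[:3, :3]

G = genomic_matrix(M, p)
G_w = weighted_genomic_matrix(G, A11, w=0.95)   # w = 0.95 is an assumed value
H_inv = combined_inverse(A, G_w, n_genotyped=3)
```

When allele frequencies are estimated from the genotyped individuals themselves, $\mathbf{G}$ alone is singular; blending it with $\mathbf{A}_{11}$ is one way to make it invertible, which is why the weighted matrix rather than $\mathbf{G}$ enters the inverse.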

Simulated values for the genetic and residual variance were used for estimating the breeding values. For both approaches, statistics for the achieved heritability of direct effects and accuracies for the overall, maternal and direct estimated breeding values were based on 20 replicated simulations. The accuracy was reported as a correlation between the estimated and true breeding values
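The accuracy measure and its summary over replicates can be sketched as below. This is an illustration, not the code used in the study; the function names and the example numbers are assumptions, and the standard error is computed here as the standard deviation across replicates divided by the square root of their number.

```python
import numpy as np

def accuracy(estimated, true):
    """Pearson correlation between estimated and true breeding values."""
    return float(np.corrcoef(estimated, true)[0, 1])

def mean_and_se(values):
    """Mean and standard error of a statistic across replicated simulations."""
    values = np.asarray(values, dtype=float)
    return float(values.mean()), float(values.std(ddof=1) / np.sqrt(values.size))

# Hypothetical breeding values for five queens in one replicate.
true_bv = np.array([0.9, -0.3, 0.1, 1.2, -0.8])
estimated_bv = np.array([0.5, -0.1, 0.3, 0.7, -0.2])
replicate_accuracy = accuracy(estimated_bv, true_bv)

# Hypothetical accuracies from three replicates.
mean_acc, se_acc = mean_and_se([0.45, 0.47, 0.49])
```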

Results

Accuracy of the overall estimated breeding values

In honey bee breeding programs, queens are selected on their overall breeding value, which is the sum of the maternal and direct estimated breeding values. Therefore, in this study we report the accuracy of the overall estimated breeding values. Table

| **$h_m^2$** | **Method** | **Corr$_{(m,d)}$** | **Accuracy for JQ (SE)** | **Accuracy for AQ (SE)** |
|---|---|---|---|---|
| **0.15** | UNI | 0 | 0.468 ^{a,b,c,d} (0.010) | 0.661 ^{a,b,c,d} (0.005) |
| | PED | 0 | 0.363 (0.017) | 0.603 (0.007) |
| | UNI | −0.46 | 0.381 ^{a,b,c,d} (0.021) | 0.555 ^{a,b,c,d} (0.010) |
| | PED | −0.46 | 0.295 (0.023) | 0.489 (0.009) |
| **0.25** | UNI | 0 | 0.542 ^{a,b,c,e} (0.009) | 0.756 ^{a,b,c,e} (0.006) |
| | PED | 0 | 0.420 (0.015) | 0.710 (0.008) |
| | UNI | −0.46 | 0.449 ^{a,b,c} (0.018) | 0.640 ^{a,b,c,e} (0.009) |
| | PED | −0.46 | 0.348 (0.021) | 0.577 (0.008) |
| **0.35** | UNI | 0 | 0.604 ^{a,b,d,e} (0.009) | 0.832 ^{a,b,d,e} (0.008) |
| | PED | 0 | 0.467 (0.012) | 0.800 (0.010) |
| | UNI | −0.46 | 0.498 ^{a,b,d} (0.017) | 0.700 ^{a,b,d,e} (0.008) |
| | PED | −0.46 | 0.388 (0.019) | 0.642 (0.008) |

$h_m^2$, Maternal heritability; Corr$_{(m,d)}$, Correlation between maternal and direct effects; JQ, juvenile queens; AQ, all queens.

Significant difference in accuracy with P < 0.05 (Welch’s t-test) between: ^{a}UNI_BLUP and PED_BLUP; ^{b}no correlation and negative correlation for UNI_BLUP; ^{c}heritabilities 0.15 and 0.25 for UNI_BLUP; ^{d}heritabilities 0.15 and 0.35 for UNI_BLUP; ^{e}heritabilities 0.25 and 0.35 for UNI_BLUP.

For juvenile queens, the accuracy of overall estimated breeding values was significantly higher with the UNI_BLUP approach (P < 0.05) as compared to the PED_BLUP approach for all values of heritability and correlation between maternal and direct effects. The increase in accuracy by UNI_BLUP was approximately 0.1 (or 29%) for most of the cases.
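A comparison of this kind can be reproduced approximately from the reported summary statistics. The sketch below computes Welch's t statistic and the Welch-Satterthwaite degrees of freedom for the juvenile-queen accuracies at a maternal heritability of 0.15 and no correlation (0.468 versus 0.363). It assumes that the values in parentheses (0.010 and 0.017) are standard errors of the mean over the 20 replicates; the function name is hypothetical.

```python
import math

def welch_from_summary(mean_a, se_a, mean_b, se_b, n_a, n_b):
    """Welch's t statistic and degrees of freedom from means and standard errors."""
    var_a = se_a ** 2   # squared standard error equals s^2 / n
    var_b = se_b ** 2
    t_stat = (mean_a - mean_b) / math.sqrt(var_a + var_b)
    df = (var_a + var_b) ** 2 / (var_a ** 2 / (n_a - 1) + var_b ** 2 / (n_b - 1))
    return t_stat, df

# UNI_BLUP versus PED_BLUP, juvenile queens, h_m^2 = 0.15, no correlation.
t_stat, df = welch_from_summary(0.468, 0.010, 0.363, 0.017, n_a=20, n_b=20)
print(round(t_stat, 2), round(df, 1))  # roughly 5.32 and 30.7
```

A t statistic of this size on about 31 degrees of freedom lies well beyond the two-sided 5% critical value of roughly 2.04, in line with the significance reported for this comparison.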

Similar to juvenile queens, the accuracy of overall estimated breeding values for all queens was higher with the UNI_BLUP approach (P < 0.05) than the PED_BLUP for all values of heritability and correlation between maternal and direct effects. The percentage increase in accuracy for the case of no correlation between maternal and direct effects at maternal heritabilities of 0.15, 0.25 and 0.35 was approximately 9.6%, 6.5% and 4.0%, respectively. In case of a negative correlation of −0.46, the percentage increase in accuracy was approximately 13.5%, 10.9% and 9.0% at maternal heritabilities of 0.15, 0.25 and 0.35, respectively.
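The percentage increases quoted above can be checked directly against the reported accuracies for all queens. The short sketch below does this; the dictionary layout and function name are illustrative choices, while the accuracies are the tabulated UNI_BLUP and PED_BLUP values.

```python
def percent_increase(uni, ped):
    """Relative gain of UNI_BLUP over PED_BLUP, in percent."""
    return 100.0 * (uni - ped) / ped

# (maternal heritability, correlation) -> (UNI_BLUP accuracy, PED_BLUP accuracy), all queens
all_queens = {
    (0.15, 0.0): (0.661, 0.603),
    (0.25, 0.0): (0.756, 0.710),
    (0.35, 0.0): (0.832, 0.800),
    (0.15, -0.46): (0.555, 0.489),
    (0.25, -0.46): (0.640, 0.577),
    (0.35, -0.46): (0.700, 0.642),
}

gains = {key: round(percent_increase(uni, ped), 1)
         for key, (uni, ped) in all_queens.items()}
print(list(gains.values()))  # [9.6, 6.5, 4.0, 13.5, 10.9, 9.0]
```

The relative gain shrinks as heritability rises and is larger under the negative correlation, mainly because the pedigree based accuracy that serves as the denominator is lower in those scenarios.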

From these results we can conclude that the UNI_BLUP approach performed better than the PED_BLUP and the accuracy of overall estimated breeding values increased considerably with the UNI_BLUP approach.

Accuracy of the maternal and direct estimated breeding values

Table

| **$h_m^2$** | **Method** | **Corr$_{(m,d)}$** | **Accuracy of direct EBV for JQ (SE)** | **Accuracy of maternal EBV for JQ (SE)** | **Accuracy of direct EBV for AQ (SE)** | **Accuracy of maternal EBV for AQ (SE)** |
|---|---|---|---|---|---|---|
| **0.15** | UNI | 0 | 0.323 ^{a,b,c,d} (0.015) | 0.279 ^{a,b,c,d} (0.015) | 0.446 ^{a,b,c,d} (0.011) | 0.420 ^{a,b,c,d} (0.009) |
| | PED | 0 | 0.227 (0.023) | 0.225 (0.019) | 0.406 (0.012) | 0.381 (0.010) |
| | UNI | −0.46 | 0.115 ^{b,d} (0.023) | 0.127 ^{b} (0.031) | 0.225 ^{b,d} (0.018) | 0.223 ^{b,d} (0.013) |
| | PED | −0.46 | 0.059 (0.025) | 0.103 (0.030) | 0.186 (0.016) | 0.208 (0.012) |
| **0.25** | UNI | 0 | 0.373 ^{a,b,c} (0.016) | 0.330 ^{a,b,c,e} (0.014) | 0.510 ^{b,c,e} (0.013) | 0.482 ^{a,b,c,e} (0.010) |
| | PED | 0 | 0.268 (0.021) | 0.260 (0.018) | 0.474 (0.013) | 0.447 (0.011) |
| | UNI | −0.46 | 0.154 ^{b} (0.023) | 0.154 ^{b} (0.031) | 0.272 ^{b} (0.017) | 0.257 ^{b} (0.014) |
| | PED | −0.46 | 0.085 (0.025) | 0.125 (0.030) | 0.231 (0.016) | 0.240 (0.013) |
| **0.35** | UNI | 0 | 0.418 ^{a,b,d} (0.017) | 0.371 ^{a,b,d,e} (0.014) | 0.566 ^{b,d,e} (0.015) | 0.527 ^{b,d,e} (0.011) |
| | PED | 0 | 0.307 (0.018) | 0.287 (0.017) | 0.538 (0.015) | 0.496 (0.013) |
| | UNI | −0.46 | 0.186 ^{a,b,d} (0.024) | 0.173 ^{b} (0.031) | 0.308 ^{b,d} (0.017) | 0.280 ^{b,d} (0.015) |
| | PED | −0.46 | 0.110 (0.025) | 0.138 (0.030) | 0.268 (0.016) | 0.258 (0.014) |

$h_m^2$, Maternal heritability; Corr$_{(m,d)}$, Correlation between maternal and direct effects; EBV, estimated breeding value; JQ, juvenile queens; AQ, all queens.

Significant difference in accuracy with P < 0.05 (Welch’s t-test) between: ^{a}UNI_BLUP and PED_BLUP; ^{b}no correlation and negative correlation for UNI_BLUP; ^{c}heritabilities 0.15 and 0.25 for UNI_BLUP; ^{d}heritabilities 0.15 and 0.35 for UNI_BLUP; ^{e}heritabilities 0.25 and 0.35 for UNI_BLUP.

Effect of correlation and heritability

Both low heritability and negative correlation contribute to a lower genetic variance, which leads to a decrease in accuracy. The accuracy of overall estimated breeding values was lower under a negative correlation than when maternal and direct effects were uncorrelated (Table

The accuracy of maternal and direct estimated breeding values (Table

Discussion

The study provided comparative insight into genetic evaluation performed using: (1) the traditional BLUP approach based on pedigree data and (2) the unified approach based on both pedigree and marker data. In this study, we investigated the accuracy of overall, direct and maternal estimated breeding values as well as the influence of heritability of the trait and the genetic correlation between maternal and direct effects on the accuracy.

It has been reported in honey bees that most economically important traits have low to medium heritability

Unlike previous studies **G** matrix. Likewise, Christensen and Lund

In our study, the accuracies of maternal and direct estimated breeding values for the pedigree based approach (PED_BLUP) with maternal and direct heritability of 0.15 were 0.38 and 0.41 at no correlation and 0.21 and 0.19 at a correlation of −0.46, respectively. In an earlier pedigree based study by Roehe and Kennedy

A complexity associated with the estimation of breeding values for maternally influenced traits is that negative correlation between maternal and direct effects severely impedes the response to selection

Conclusions

To compare genetic evaluation methods based on the unified approach and the pedigree based approach, we modelled a complex scenario that took into consideration varying heritability and correlation between maternal and direct genetic effects, uncertain paternity and other genetic and reproductive peculiarities of the honey bee. To the best of our knowledge, this is the first study that describes the use of molecular marker data for genetic evaluation in honey bees by employing the unified approach. The study provides background knowledge about the simulation of genomic and pedigree datasets in honey bees for genetic evaluation; it can therefore serve as a framework for future studies. Studies in other species

Abbreviations

BLUP: Best Linear Unbiased Prediction; LD: Linkage Disequilibrium; MAF: Minor Allele Frequency; QTL: Quantitative Trait Loci; SNP: Single Nucleotide Polymorphism

Competing interest

The authors declare that they have no competing interests.

Authors’ contributions

PG conducted the study and wrote the manuscript. NR and KB conceived the study, participated in discussions and helped to draft the manuscript. AS and TC participated in discussions and helped to draft the manuscript. All authors read and approved the final manuscript.

Acknowledgements

The work was a part of the project "Marker assisted selection of Varroa-tolerant honey bees" financially supported by the German Federal Ministry of Food, Agriculture and Consumer Protection (BMELV) [2808HS009].