Department of Epidemiology, Harvard School of Public Health, Boston, MA 02115, USA

Department of Zoology, University of Washington, Seattle, WA 98195, USA

Department of Biology, Emory University, Atlanta, GA 30322, USA

Abstract

Background

Doherty and Zinkernagel, who discovered that antigen presentation is restricted by the major histocompatibility complex (MHC, called HLA in humans), hypothesized that individuals heterozygous at particular MHC loci might be more resistant to particular infectious diseases than the corresponding homozygotes because heterozygotes could present a wider repertoire of antigens. The superiority of heterozygotes over either corresponding homozygote, which we term allele-specific overdominance, is of interest both as an immunological hypothesis and as a proposed mechanism for the maintenance of MHC diversity.

Methods

An algebraic model is used to define the expected population-wide findings of an epidemiologic study of HLA heterozygosity and disease outcome as a function of allele-specific effects and population genetic parameters of the study population.

Results

We show that overrepresentation of HLA heterozygotes among individuals with favorable disease outcomes (which we term population heterozygote advantage) need not reflect allele-specific overdominance.

Conclusion

To demonstrate allele-specific overdominance for specific infections in human populations, improved analytic tools and/or larger studies (or studies in populations with limited HLA diversity) are necessary.

Background

The role of genetics in modulating the immune response to infectious diseases is a topic of longstanding interest among epidemiologists, clinicians, population geneticists, and immunologists.

The relationship between MHC genotype and infectious disease resistance can take several forms. Most simply, individual MHC alleles may be especially effective, or especially ineffective, at presenting antigens from particular infections, so that carrying one or two copies of a given MHC allele might predispose an infected individual to a more or less favorable disease outcome. A second, distinct but compatible hypothesis was suggested by Doherty and Zinkernagel soon after their discovery of MHC restriction: since each MHC allele provides an infected individual with the ability to present a particular set of antigens, individuals who are heterozygous (say, genotype XY) at a particular MHC locus may mount a more vigorous immune response to a given infection, resulting in a better outcome, than individuals who are homozygous for either of the corresponding alleles (XX or YY).

Determining whether either or both of these mechanisms operates for a particular disease is of interest for a variety of basic and applied purposes. Epidemiologically, HLA genotype partially accounts for inter-patient variation in disease severity or progression rates for such long-term viral infections as HIV, human T-cell lymphotropic virus, type 1 (HTLV-1), and hepatitis B and C.

Of the two ways described above in which MHC genotype may affect disease outcome, the first – association between a particular allele and disease outcome – has been repeatedly documented in human populations by various methods of genetic epidemiology.

As an alternative to directly examining the hypothesis of allele-specific overdominance, several investigators have compared the infectious disease outcomes of heterozygotes at a given HLA locus, as a group, to the outcomes of homozygotes at the same locus, as a group. In many cases, heterozygotes as a group have shown better infectious disease outcomes (slower disease progression or more rapid clearance of viral infection) than homozygotes as a group.

In this report, we show that although population heterozygote advantage is compatible with allele-specific overdominance, it is also compatible with the opposite; i.e., population heterozygote advantage could arise in a population in which every heterozygote had a

In the first section below, we outline a general model for the relationship between genotype at a particular HLA locus and infectious disease outcome (which we dichotomize into favorable and unfavorable), and we use the model to define the conditions under which population heterozygote advantage is expected. We show that, depending on allele frequencies at the HLA locus of interest, population heterozygote advantage may occur when there is allele-specific overdominance, but may also occur under other conditions, including allele-specific underdominance. Two examples are given to illustrate the reasons for the lack of concordance between population and allele-specific effects. In the Discussion, we suggest some possible approaches for estimating allele-specific effects.

General Model

To determine the precise conditions under which population heterozygote advantage will be observed, we consider a general model that predicts the expected outcome of a comparison between heterozygotes and homozygotes in an epidemiological study as a function of (i) the frequencies of resistant and susceptible alleles at a particular locus and (ii) the relationship between genotype at that locus and phenotype. Note that this is

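The model is summarized in the table below. Suppose there are m resistant alleles R_{1}, R_{2}, ..., R_{m} with frequencies p_{1}, p_{2}, ..., p_{m} in the population, and n susceptible alleles S_{1}, S_{2}, ..., S_{n} with frequencies q_{1}, q_{2}, ..., q_{n}. Let Π = Σ p_{i}^{2} and Θ = Σ q_{i}^{2} denote the population frequencies of resistant and susceptible homozygotes, respectively.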

Frequency and disease risk of 5 classes of genotypes under the model.

| Class | Genotype | Frequency | Probability of Favorable Outcome |
| --- | --- | --- | --- |
| Susceptible homozygotes | S_{i}S_{i} | Θ | |
| Heterozygote of susceptible alleles | S_{i}S_{j}, i ≠ j | q^{2} - Θ | |
| Heterozygote with one R and one S allele | R_{i}S_{j} | 2pq | |
| Heterozygote of resistant alleles | R_{i}R_{j}, i ≠ j | p^{2} - Π | |
| Resistant homozygotes | R_{i}R_{i} | Π | |

Notation: Frequency of allele R_{i} is p_{i}; frequency of allele S_{i} is q_{i}. p = Σ p_{i}; q = Σ q_{i} = 1 - p.
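The class frequencies in the table can be sketched numerically. The following is a minimal illustration, assuming random mating (Hardy-Weinberg proportions), which the tabulated frequencies imply; the function name, dictionary keys, and the example allele frequencies are hypothetical and not taken from the paper.

```python
def genotype_class_frequencies(resistant_freqs, susceptible_freqs):
    """Return the population frequencies of the five genotype classes.

    Assumes random mating. Argument and key names are illustrative only.
    """
    p = sum(resistant_freqs)      # total frequency of resistant alleles
    q = sum(susceptible_freqs)    # total frequency of susceptible alleles
    if abs(p + q - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")

    big_pi = sum(f * f for f in resistant_freqs)    # Pi: resistant homozygotes
    theta = sum(f * f for f in susceptible_freqs)   # Theta: susceptible homozygotes

    return {
        "SS_hom": theta,            # S_i S_i
        "SS_het": q * q - theta,    # S_i S_j, i != j
        "RS_het": 2 * p * q,        # R_i S_j
        "RR_het": p * p - big_pi,   # R_i R_j, i != j
        "RR_hom": big_pi,           # R_i R_i
    }


# Hypothetical locus: three resistant alleles and two susceptible alleles.
classes = genotype_class_frequencies([0.2, 0.2, 0.1], [0.3, 0.2])
total = sum(classes.values())
for name, freq in classes.items():
    print(f"{name}: {freq:.2f}")
```

The five frequencies sum to 1, which is a quick sanity check that the classes partition the population.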

We assume that SS homozygotes have a probability

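Under these assumptions, we can calculate the probability of a favorable disease course for homozygotes (the first and fifth classes in the table) and for heterozygotes (the 2^{nd}, 3^{rd}, and 4^{th} classes in the table). Each is the frequency-weighted average of the class-specific probabilities; equation 1 gives the heterozygote probability and equation 2 the homozygote probability.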

The relative risk (RR) of a favorable outcome for a heterozygote compared to a homozygote is defined as the ratio of the heterozygote probability to the homozygote probability (equation 3).
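As a rough sketch of this relative risk, the function below averages class-specific probabilities of a favorable outcome within heterozygotes and within homozygotes and takes their ratio. It assumes one probability per genotype class, which may be more general than the parametrization used in the paper; all names (`heterozygote_relative_risk`, `fav`, the class keys) and the numeric values are hypothetical.

```python
def heterozygote_relative_risk(p, big_pi, theta, fav):
    """Relative risk of a favorable outcome, heterozygotes vs. homozygotes.

    p       total frequency of resistant alleles
    big_pi  frequency of resistant homozygotes (Pi)
    theta   frequency of susceptible homozygotes (Theta)
    fav     dict mapping each genotype class to an assumed probability
            of a favorable outcome (illustrative parametrization)
    """
    q = 1.0 - p
    het_freq = {
        "SS_het": q * q - theta,
        "RS_het": 2 * p * q,
        "RR_het": p * p - big_pi,
    }
    hom_freq = {"SS_hom": theta, "RR_hom": big_pi}

    def weighted_mean(freq):
        # Frequency-weighted average of the class-specific probabilities.
        return sum(freq[c] * fav[c] for c in freq) / sum(freq.values())

    fav_het = weighted_mean(het_freq)
    fav_hom = weighted_mean(hom_freq)
    return fav_het / fav_hom, fav_het, fav_hom


# Hypothetical case: two equally common alleles, resistance fully dominant.
fav = {"SS_hom": 0.3, "SS_het": 0.3, "RS_het": 0.7, "RR_het": 0.7, "RR_hom": 0.7}
rr, fav_het, fav_hom = heterozygote_relative_risk(p=0.5, big_pi=0.25, theta=0.25, fav=fav)
print(f"heterozygotes {fav_het:.2f}, homozygotes {fav_hom:.2f}, RR {rr:.2f}")
```

In this hypothetical case the relative risk exceeds 1 although no heterozygote does better than its more resistant homozygote.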

Population heterozygote advantage corresponds to RR > 1. Various formulations for relative risk (or odds ratio, used in case-control studies, such as

Figure

**Population heterozygote advantage as a function of allele-specific effects and allele frequencies.** Parameter regions in which heterozygotes will on average have a higher probability of a favorable disease outcome than homozygotes (regions of population heterozygote advantage) are shown in black. Population heterozygote advantage occurs when diversity of resistant alleles is sufficiently high and diversity of susceptible alleles is sufficiently low, i.e., toward the bottom right of the parameter space in each panel of the figure. Different panels indicate various assumptions about the genotype-specific relative risks

Figure

The converse is also true, though only in what seem to be very special circumstances. That is, even when allele-specific overdominance holds, it is possible that heterozygotes on average will do worse than homozygotes, so population heterozygote advantage will not be observed. This occurs with genotype frequencies sufficiently far toward the top left of Figure

Although neither population heterozygote advantage nor allele-specific overdominance implies the other, the two phenomena are of course related. Specifically, the conditions to observe population heterozygote advantage are broadest when allele-specific overdominance holds and become narrower as the underlying genetics becomes more "different" from overdominance of resistance (dominance -> additivity -> recessiveness -> underdominance).
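The converse case described above can be illustrated with a small numerical sketch. It assumes a hypothetical locus with one resistant allele at 50% frequency and ten susceptible alleles at 5% each, and hypothetical outcome probabilities under which every heterozygote does better than both of its corresponding homozygotes. None of these values come from the paper.

```python
def relative_risk(p, big_pi, theta, fav):
    """Heterozygote vs. homozygote relative risk of a favorable outcome."""
    q = 1.0 - p
    # (class frequency, assumed probability of a favorable outcome)
    het = [
        (q * q - theta, fav["SS_het"]),
        (2 * p * q, fav["RS_het"]),
        (p * p - big_pi, fav["RR_het"]),
    ]
    hom = [(theta, fav["SS_hom"]), (big_pi, fav["RR_hom"])]

    def weighted_mean(rows):
        return sum(w * f for w, f in rows) / sum(w for w, _ in rows)

    return weighted_mean(het) / weighted_mean(hom)


# Allele-specific overdominance: each heterozygote beats both homozygotes.
fav = {
    "SS_hom": 0.30,
    "SS_het": 0.35,  # better than either susceptible homozygote (0.30)
    "RS_het": 0.75,  # better than both the R homozygote (0.70) and S homozygotes (0.30)
    "RR_het": 0.75,  # unused here: a single resistant allele has no R heterozygotes
    "RR_hom": 0.70,
}

p = 0.5                      # one resistant allele at 50%
big_pi = 0.5 ** 2            # 0.25
theta = 10 * 0.05 ** 2       # 0.025, ten susceptible alleles at 5% each

rr = relative_risk(p, big_pi, theta, fav)
print(f"RR = {rr:.3f}")
```

Under these assumed values the relative risk falls below 1: most homozygotes carry the resistant allele, while many heterozygotes carry only susceptible alleles, so the group comparison masks the overdominance.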

Two Examples

The intuition behind our result that population heterozygote advantage need not reflect allele-specific overdominance can be seen in two simple examples.

Example 1: Suppose that there are two alleles, each at 50% frequency in the population, one (R) conferring resistance and one (S) conferring susceptibility in the homozygous state. Population heterozygote advantage will be observed if the probability of a favorable outcome for heterozygotes is greater than the arithmetic mean of the probabilities of favorable outcomes for RR homozygotes and for SS homozygotes. In this situation, population heterozygote advantage does not require overdominance, but merely allele effects that are more than additive (partial dominance). Dominance of particular HLA alleles conferring resistance, and/or recessiveness of susceptibility (poor outcome) alleles, have been documented for schistosomiasis

When allele frequencies are equal, as in Example 1, partial dominance is sufficient to create population heterozygote advantage.
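Example 1 reduces to a comparison between the heterozygote probability and the arithmetic mean of the two homozygote probabilities. A minimal sketch follows; the function name and the outcome probabilities (0.3, 0.5, 0.6, 0.7) are hypothetical values chosen for illustration.

```python
def example1_relative_risk(fav_ss, fav_rs, fav_rr):
    """Two alleles at 50% each: genotypes SS, RS, RR occur at 25%, 50%, 25%.

    All heterozygotes are RS; homozygotes are half SS and half RR.
    """
    fav_het = fav_rs
    fav_hom = (fav_ss + fav_rr) / 2   # arithmetic mean of the homozygotes
    return fav_het / fav_hom


additive = example1_relative_risk(0.3, 0.5, 0.7)           # RS exactly intermediate
partial_dominance = example1_relative_risk(0.3, 0.6, 0.7)  # RS above the midpoint
recessive = example1_relative_risk(0.3, 0.3, 0.7)          # RS equals SS

print(f"additive {additive:.2f}, partial dominance {partial_dominance:.2f}, "
      f"recessive {recessive:.2f}")
```

With equal allele frequencies, the additive case gives a relative risk of 1, any degree of dominance pushes it above 1, and recessive resistance pushes it below 1.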

Example 2: Suppose there are 10 alleles, each with frequency 5%, each conferring resistance to a particular disease, and one allele, with frequency 50%, conferring susceptibility. In the notation of the table, p = 0.5; Π = 10(0.05)^{2} = 0.025; Θ = (0.5)^{2} = 0.25. In this case, >90% of homozygotes in the population will be homozygous for a susceptible allele, since the frequency of SS homozygotes is Θ = 0.25, while the frequency of RR homozygotes is Π = 0.025. In contrast, 100% of heterozygotes will have at least one resistant allele. In epidemiological terms, this is a form of confounding, in which possession of a resistant allele is positively associated with heterozygosity (the exposure of interest) and positively associated with having a favorable disease course (the outcome of interest). Because of this confounding, under some parameter values, population heterozygote advantage can occur even when heterozygotes are not at an advantage relative to their corresponding homozygotes. Specifically, population heterozygote advantage may be observed when resistance is additive (heterozygotes have risks equal to the average of the risks of the corresponding homozygotes), when resistance is recessive, or even when it is underdominant (heterozygotes have higher risk of disease than either corresponding homozygote).

Continuing this example, suppose that the resistant alleles are recessive to the susceptible one, so that individuals with one or two copies of the susceptible allele have favorable outcomes with probability .3 and individuals with two resistant alleles have favorable outcomes with probability .7. SR heterozygotes make up 2pq = 0.5 = 50% of the population, and RR heterozygotes make up p^{2} - Π = 0.225 = 22.5% of the population. The probability of a favorable outcome for heterozygotes will be the weighted average of the probabilities of a favorable outcome for SR and RR heterozygotes (using equation 1):

[0.5(0.3) + 0.225(0.7)] / (0.5 + 0.225) = 0.3075 / 0.725 ≈ 0.424

For homozygotes (using equation 2), the probability of a favorable outcome will be the weighted average of the probabilities for SS homozygotes (25% of the population) and for RR homozygotes (2.5% of the population):

[0.25(0.3) + 0.025(0.7)] / (0.25 + 0.025) = 0.0925 / 0.275 ≈ 0.336

Thus, heterozygotes on average will be 1.26 times as likely as homozygotes to have a favorable outcome, even though each heterozygote has the same outcome as if s/he were homozygous for the worse of the two alleles s/he carries. In epidemiological terms, heterozygotes would have a relative risk of 0.87 of an unfavorable outcome compared to homozygotes.
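The arithmetic of Example 2 can be checked with a few lines of code. The sketch below uses only the numbers stated in the example (ten resistant alleles at 5%, one susceptible allele at 50%, favorable-outcome probabilities .3 and .7); the variable names are illustrative.

```python
# Allele-frequency summaries for Example 2.
p = 0.5                        # total frequency of the ten resistant alleles
q = 1 - p                      # frequency of the single susceptible allele
big_pi = 10 * 0.05 ** 2        # RR homozygotes: 0.025
theta = 0.5 ** 2               # SS homozygotes: 0.25

# Heterozygote classes (there are no SS heterozygotes with one S allele).
freq_rs_het = 2 * p * q        # 0.5
freq_rr_het = p * p - big_pi   # 0.225

# Recessive resistance: any S allele gives .3, two R alleles give .7.
fav_with_s = 0.3
fav_two_r = 0.7

fav_het = (freq_rs_het * fav_with_s + freq_rr_het * fav_two_r) / (freq_rs_het + freq_rr_het)
fav_hom = (theta * fav_with_s + big_pi * fav_two_r) / (theta + big_pi)

rr_favorable = fav_het / fav_hom
rr_unfavorable = (1 - fav_het) / (1 - fav_hom)

print(f"heterozygotes: {fav_het:.3f}")        # 0.424
print(f"homozygotes:   {fav_hom:.3f}")        # 0.336
print(f"RR favorable:   {rr_favorable:.2f}")  # 1.26
print(f"RR unfavorable: {rr_unfavorable:.2f}")  # 0.87
```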

Both of these examples were chosen for the purposes of clarity, rather than for precise reflection of the allele frequencies in real populations; in particular, few if any populations have a single HLA allele with a frequency of 50%. Moreover, we have simplified the effects of alleles into two categories, resistant and susceptible (R and S), which are simply relative notions. In fact, real alleles will likely have a spectrum of effects, ranging from highly resistant, to highly susceptible, with some alleles having "no effect." Note, however, that "no effect" is also a relative term, and refers to an allele whose effect on disease outcome is close to the population average.

Discussion

The foregoing examples show that the finding of population heterozygote advantage, as in the infectious disease studies cited, does not support an inference of allele-specific overdominance, the condition of primary interest as an immunological hypothesis and a mechanism for the maintenance of MHC diversity. Put another way, population heterozygote advantage may appear due to a combination of the two distinct mechanisms we defined in the Introduction: the protective or detrimental effects of particular alleles (R and S alleles in our model), and the effects of heterozygosity itself. The effects of R and S alleles appear as effects of heterozygosity vs. homozygosity because heterozygotes and homozygotes will in general carry different distributions of S and R alleles; thus, in an analysis that fails to condition on the alleles carried, heterozygosity is confounded with the alleles carried.

One advantage of correctly separating the effects of individual alleles from the effects of heterozygosity conditional on those alleles is that each of these measures is a characteristic of an individual, rather than a population. For example, if genotypes XX, XY and YY have relative risks 0.6, 2.1, and 1 for clearance of a viral infection, then this should hold true regardless of who else is in the population. In contrast, we have shown that population heterozygote advantage depends not only on the effects of individual genotypes on disease outcome, but also on allele frequencies. Therefore, even if biological and epidemiologic mechanisms were identical in two populations, but allele frequencies differed in those two populations, it would be perfectly reasonable to find population heterozygote advantage in one but not the other.
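To make the dependence on allele frequencies concrete, the sketch below applies the same genotype effects to two hypothetical populations. It assumes recessive resistance with the probabilities used in Example 2 (.3 with any susceptible allele, .7 with two resistant alleles) and random mating; the function name and the second population's allele frequencies are illustrative assumptions.

```python
def population_rr(resistant_freqs, susceptible_freqs, fav_with_s=0.3, fav_two_r=0.7):
    """Heterozygote vs. homozygote relative risk of a favorable outcome.

    Assumes recessive resistance: genotypes carrying any susceptible allele
    have probability fav_with_s; genotypes with two resistant alleles have
    probability fav_two_r.
    """
    p = sum(resistant_freqs)
    q = sum(susceptible_freqs)
    big_pi = sum(f * f for f in resistant_freqs)
    theta = sum(f * f for f in susceptible_freqs)

    het_with_s = (q * q - theta) + 2 * p * q   # SS and RS heterozygotes
    het_two_r = p * p - big_pi                 # RR heterozygotes
    fav_het = (het_with_s * fav_with_s + het_two_r * fav_two_r) / (het_with_s + het_two_r)
    fav_hom = (theta * fav_with_s + big_pi * fav_two_r) / (theta + big_pi)
    return fav_het / fav_hom


# Population A: ten resistant alleles at 5% each, one susceptible allele at 50%.
rr_a = population_rr([0.05] * 10, [0.5])
# Population B: one resistant and one susceptible allele, each at 50%.
rr_b = population_rr([0.5], [0.5])

print(f"population A: RR = {rr_a:.2f}")
print(f"population B: RR = {rr_b:.2f}")
```

With identical genotype effects, population A shows population heterozygote advantage and population B shows the reverse, purely because of how the alleles are distributed.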

The problem we have described with measuring population heterozygote advantage is not in principle limited to susceptibility/resistance studies of

We have shown that when allele-specific overdominance exists, it will often be manifest as population heterozygote advantage, but that a finding of population heterozygote advantage may be consistent with other patterns of allele-specific effects; for example, when resistant and susceptible alleles are equally common and equally diverse, population heterozygote advantage will occur if allele-specific effects are additive, dominant or overdominant. It is difficult to generalize, without doing a specific epidemiologic study, about how the prevalence and diversity of R and S alleles would be likely to occur in a given population. O'Brien et al.

Our results do not deny that allele-specific overdominance at HLA exists in human populations with respect to infectious disease resistance, but simply raise doubts about the reliability of the major evidence that has been adduced in support of this phenomenon. Existing data suggest that allele-specific overdominance may exist in some animal infectious diseases, but also that simple dominance and other outcomes are commonly observed. In some such studies, F_{1} offspring of two genetically divergent parent strains were more resistant than either parent strain, but there seems to be no demonstration that heterozygosity at an MHC locus is responsible.

From the perspective of the population genetics debate concerning the role of overdominance in maintaining polymorphism at the MHC, we should note that this mechanism requires allele-specific overdominance for total fitness, not for resistance to individual diseases. As noted by Doherty and Zinkernagel

Simple and accurate methods exist to determine for a single pair of alleles how the three possible genotypes (2 homozygotes and one heterozygote) affect disease outcome, and these methods have been used frequently in the literature on autoimmunity and HLA.

Competing interests

None declared.

Authors' contributions

The methodological principles in this report grew out of discussions held by the three authors. ML formulated the problem and drafted the manuscript, which was edited and revised by CB and RA. All authors read and approved the final manuscript.

Appendix

In this appendix we prove that the relative risk defined in Equation (3) is increasing in p and Θ, and decreasing in Π.

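As a preliminary result that will be useful later, note that p^{2} - Π ≥ 0 and q^{2} - Θ ≥ 0, because p^{2} - Π is the sum of the cross terms p_{i}p_{j} over i ≠ j, and q^{2} - Θ is the corresponding sum of q_{i}q_{j}.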

**Proof that the relative risk is increasing in p.** We must show that

By definition,

**Proof that the relative risk is decreasing in Π.** Since

This is justified as follows: The first term is positive by definition. Inside the curly brackets in the penultimate line above, we have the sum of two terms, each of which itself is the product of a nonpositive term and a nonnegative term; thus, the curly brackets are nonpositive. Specifically, [q^{2} - Θ] is nonnegative as noted at the beginning of the appendix. The term in curly brackets is nonpositive, and the term outside is positive, so the whole expression is nonpositive, as stated.

It is apparent by inspection that the denominator of the relative risk, the probability of a favorable outcome for homozygotes, is increasing in Π. Since the numerator of the relative risk is nonincreasing in Π, and the denominator of the relative risk is increasing, the relative risk is decreasing in Π.

The argument that the relative risk is increasing in Θ is exactly symmetric to the argument for Π.
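The monotonicity claims can be spot-checked numerically. The sketch below is not a proof; it evaluates the relative risk on a few grid points, assuming hypothetical outcome probabilities with recessive resistance (.3 for genotypes carrying a susceptible allele, .7 for genotypes with two resistant alleles) and grid values chosen so that Π ≤ p^{2} and Θ ≤ q^{2} hold throughout.

```python
def relative_risk(p, big_pi, theta, fav_ss=0.3, fav_rs=0.3, fav_rr=0.7):
    """Heterozygote vs. homozygote relative risk of a favorable outcome.

    Illustrative parametrization: SS heterozygotes share fav_ss with SS
    homozygotes, and RR heterozygotes share fav_rr with RR homozygotes.
    """
    q = 1.0 - p
    het_num = (q * q - theta) * fav_ss + 2 * p * q * fav_rs + (p * p - big_pi) * fav_rr
    het_den = 1.0 - theta - big_pi     # total frequency of heterozygotes
    hom_num = theta * fav_ss + big_pi * fav_rr
    hom_den = theta + big_pi           # total frequency of homozygotes
    return (het_num / het_den) / (hom_num / hom_den)


def strictly_increasing(values):
    return all(a < b for a, b in zip(values, values[1:]))


grid = (0.05, 0.10, 0.15, 0.20, 0.25)
rr_by_p = [relative_risk(p, 0.05, 0.05) for p in (0.3, 0.4, 0.5, 0.6, 0.7)]
rr_by_pi = [relative_risk(0.5, big_pi, 0.15) for big_pi in grid]
rr_by_theta = [relative_risk(0.5, 0.15, theta) for theta in grid]

print("increasing in p:    ", strictly_increasing(rr_by_p))
print("decreasing in Pi:   ", strictly_increasing(rr_by_pi[::-1]))
print("increasing in Theta:", strictly_increasing(rr_by_theta))
```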

Acknowledgements

This work was supported in part by National Institutes of Health grants R01AI48935 to ML and R29GM54268 to RA. The authors thank the Helen Riaboff Whiteley Center at Friday Harbor for providing a beautiful and conducive environment for beginning this work. Dr. Brian Greenwood, Megan McCloskey, and Dr. John Mittler are thanked for comments on early versions of the MS, and Dr. Wayne Potts is thanked for a particularly helpful referee's report. R.H. Lipsitch is thanked for comments during the final revision.
