Altruistic behavior is defined as helping others at a cost to oneself and a lowered fitness. The lower fitness implies that altruists should be selected against, which is in contradiction with their widespread presence is nature. Present models of selection for altruism (kin or multilevel) show that altruistic behaviors can have ‘hidden’ advantages if the ‘common good’ produced by altruists is restricted to some related or unrelated groups. These models are mostly deterministic, or assume a frequency dependent fitness.
Evolutionary dynamics is a competition between deterministic selection pressure and stochastic events due to random sampling from one generation to the next. We show here that an altruistic allele extending the carrying capacity of the habitat can win by increasing the random drift of “selfish” alleles. In other terms, the fixation probability of altruistic genes can be higher than those of a selfish ones, even though altruists have a smaller fitness. Moreover when populations are geographically structured, the altruists advantage can be highly amplified and the fixation probability of selfish genes can tend toward zero. The above results are obtained both by numerical and analytical calculations. Analytical results are obtained in the limit of large populations.
The theory we present does not involve kin or multilevel selection, but is based on the existence of random drift in variable size populations. The model is a generalization of the original Fisher-Wright and Moran models where the carrying capacity depends on the number of altruists.
Keywords:Frequency independent fitness; Genetic drif; Fixation probabilities; Non-structured populations
Light production in Vibrio fischeri[1,2], siderophore production in Pseudomonas aeruginosa, invertase enzyme production in Saccharomyces cerevisiae, stalk formation by Dictyostelium discoideum, [2,5] are but a few examples of individuals in a community who help others at their own cost by devoting part of their resources to this task. This behavior has been termed “altruistic”. From the evolutionary point of view, altruists have a lower fitness than other individuals in the community who don’t help, but are recipient of the benefits produced by altruists. Through this paper, we will call these latter individuals ‘selfish’.
From the inception of evolution theory, the problem of the existence of altruists has been puzzling: how can a mutant with lower fitness prevail? And how does a community of altruists resist the spread of selfish allele (see  for a historical perspective)? In the last 40 years many models have emerged to explain the apparent contradiction between the smaller fitness of altruists and their widespread presence in various communities (for a review, see [7,8]). It is shown in these models that the actual fitness of an altruistic gene can be increased by other factors such as ‘common good’ restricted to kin (inclusive fitness [9,10]), or advantages conferred at another level of selection (group or multilevel selection [11,12]). These models which can be formulated through the Price equation have seen various generalizations and they are sometimes widely debated (see  and the numerous replies it has elicited).
The above models are either deterministic, i.e. populations change their size exactly according to their relative fitness, or involve frequency dependent fitness [14,15]. We show here that another possibility exists: an altruistic individual can produce a common good benefiting everybody in the community regardless of its nature (altruistic or selfish) and therefore increasing the carrying capacity of the habitat. Even though selfish individuals have always a higher fitness, genetic drift effects can favor the altruists.
It was established by the founding fathers of Population Genetics that a mutation that confers a relative fitness 1 + s does not automatically spread and take over the whole community, but has only a higher probability, called the fixation probability, to do so [16-18]. For a community of fixed size N of haploid individuals, the fixation probability π of a mutant appearing at one copy, for small selection pressure Ns < <1, is
The fixation probability is composed of two terms: even in the absence of selection, the population will become homogenic; in this neutral case, all individuals at generation zero have an equal probability 1/N of becoming fixed. When a beneficial mutation is present, the fixation probability of its carrier is increased by the relative excess fitness.
For populations of fixed size, as can be seen from expression (1) or the more precise expression (10) obtained by Kimura  and Moran , the fixation probability is a monotonically increasing function of the sole relative fitness. In the competition between alleles, arguments based on fitness parameter alone or the fixation probability lead to the same conclusions . However, if population size is not fixed, the fixation probability π, which takes into account both randomness due to finite size and selection, can lead to other conclusions than the fitness parameter alone.
Consider an altruistic gene that by some means (production of a ‘common good’, limited grazing of natural resources, …) allows the carrying capacity to increase: if the community were composed only of altruists its population size would be Nf; if it were composed only of selfish individuals the population size would be Ni (Ni < Nf) (Figure 1a). The production of common good decreases the relative fitness of altruists by s.
Figure 1. Variable carrying capacity.
(a): A community where the carrying capacity is an increasing function of altruist number, varying from Ni when the population is composed only of selfish individuals to Nf (Nf > Ni) when only altruists are present. (b) Two examples of random walks describing the stochastic behavior a such a system (transition probabilities 4–7), where m,n are the number of selfish and altruistic individuals. Red line: loss of S’s; Blue line: loss of A. A Moran process in this scheme corresponds to a random walk constrained to remain on an anti-diagonal line.
Consider now the fixation probability πA of one altruist mutant appearing in a community of Ni selfish individuals. A crude use of expression (1) shows that . On the other hand, the fixation probability πs of one selfish individual appearing in a community of Nf altruists is . We see that if
i.e. the cost to the altruist is smaller than the benefits in term of relative population increase, then an altruist has a larger fixation probability than a selfish one, even though its relative fitness is smaller. The relative advantage of a selfish mutant is compensated by the increased ‘random noise’ to which it is exposed. Note that in a deterministic model of the above process, the A always lose, since S individuals always increase their proportion.
The above argument will be refined in the following. In the next section, we formulate precisely the stochastic process of altruism outlined above by generalizing the Moran model for non-structured, well mixed populations and we show that altruists can indeed be favored in their competition with selfish individuals. We outline the amplification of this advantage in geographically structured, viscous populations in the third section. The final section is dedicated to concluding remarks.
Results and discussion
Stochastic model for altruism
The fundamental aspects of population genetics were clarified in the framework of the classical Fisher-Wright (FW) stochastic model of non-overlapping generations or its continuous time alternative introduced by Moran . Moran and FW are equivalent in the limit of large populations, where both are well approximated by the same diffusion equation . These are the simplest models that capture the key elements of population genetics (genetic drift, fixation probability, fixation time,…) with the fewest possible ingredients.
In the Moran model, a population of size N is composed of two types of individual, say A and S. Empty spots are created randomly with fixed rate α, increasing the carrying capacity by unity. Once an empty spot has been created, it will be colonized by the progeny of either an A or an S individual according to their proportion in the community. In order to keep the population constant, Moran added the constraint that the colonization of a new spot be followed immediately by the death of an individual in the community, restoring the population size to N. Moran is therefore a simultaneous model of duplication and annihilation; the transition probability densities for the A to increase or decrease their number n by one individual are
where m is the number of S individuals and c is the ‘cost’: 1/ c is the relative fitness of the A and c > 1 indicates a selective disadvantage. W+ stands for the probability density that the new spot is colonized by an A and death occurs among the S. In principle, a similar set of equations must be written for the S individuals; however, as the population size is fixed, , the quantity m in eq.(3) can be replaced by N-n and the whole stochastic process treated as a one dimensional random walk for the A.
We generalize this model by including two ingredients. First, the fixed size constraint can be relaxed and we let N vary between two bounds Ni and Nf: empty spots are created-colonized and individuals die, without these two events necessarily succeeding each other. More importantly, in order to include the effects of altruists, we suppose that the rate of creation of empty spots is proportional to the number of altruists and is equal to αn; in contrast, the death rate is proportional to the number of S individuals and is equal to α m. This is the simplest hypothesis that implies that the increase in the carrying capacity of the habitat is proportional to the number of altruists (see also Methods, mean field approximation).
The stochastic model that captures all these features is a two dimensional random walk with the following transition probability densities (Figure 1b):
Consider for example the first two lines of the above equations, which are about birth events: the factor is the relaxation of Moran constraints and insures that population size remains below Nf; the factor αn accounts for the fact that empty spot creations are proportional to the number of A; finally, once a birth event has occurred, the probability for it to be an A or an S is proportional to the number of the corresponding sub-populations present at this time. The last two lines, which govern population decrease, are similar: the factor ensures that population size remains above Ni; the factor αm is the death rate (population decrease) for everybody due to the presence of selfish individuals. The cost of altruism is included in these equations: the proportion of A is , but once a death event has occurred, the probability for it to be an A is:
if c > 1. The results below don’t change significantly if the cost of altruism is included in other rates. For example, a higher probability for an S to reproduce, or any combination that favors S over A. Note that if the increase/decrease rates were independent of m and n, we recover the Moran model by setting , in which case each birth/death is succeeded by a death/birth event (see Methods, relation to Moran model).
The above rates ensure that if A are lost ( n = 0), the population size tends toward Ni and if S are lost ( m = 0), it tends toward Nf. Note that in the mean field approximation of the above process where fluctuations are neglected and the deterministic limit is taken, the A are always eliminated if c > 1 (see Methods, mean field approximation).
In finite size populations however, fluctuations play an important role. The focus of this paper is the computation of the fixation probability of the above process and the probability that altruists or selfish mutants take over the community. The fixation probability π( k) of a general stochastic process beginning with the initial state k and fixing either to k0 or is the solution of Kolmogorov backward equation which is a linear set of equations 
where the sum is over all the states q attainable from the state k with transition probabilities . For one dimensional, one step processes such as Moran, k = n and the solution of the linear system is easily obtained :
where is the proportion of the A. The approximation corresponds to the Kimura solution obtained through a backward diffusion equation  and . Expression (1) is the first order expansion of the above expression in s.
For the two dimensional process (4–7) where is the initial number of the S and A, no closed form solution can be obtained. We can however solve equation (8) numerically by standard linear solvers or else resort to a Gillespie algorithm  to solve the stochastic equations (4–7) directly. Both these methods are used in this paper and the analytical approximations obtained below are compared to them.
For large populations, we use the usual diffusion equation approximation of eq.(8) [19,22]. For weak selection pressure, the diffusion approximation error for the simple Moran process is ; for more general cases, the validity of the approximation has been discussed by Zhou and Qian . Setting and denoting π( x, y) the fixation probability for the initial composition ( x, y), the diffusion equation reads:
and . This is a complicated elliptic partial differential equation. In the absence of selection (c = 1) however, the trivial neutral solution is which as expected, is just the proportion of altruists. Building upon this solution, and denoting for the proportion of altruists and , we can check that to the first order of perturbation , the solution reads
and γ is a numerical coefficient: . The first order perturbation solution (12), which was derived for small selection pressures , proves in fact to be an excellent approximation for selection pressure as high as , (Figure 2).
Figure 2. Fixation probabilities.
Comparison of analytical solution (12) (solid lines) to numerical solution of eq.(8) for increasing selection pressure indicated by the arrows: . Nf = 100, Ni = 90.
The general solution (12) allows for the computation of the fixation probability of one individual introduced into a community of the other type. To the first order of perturbation in , the fixation probabilities πA of one A introduced in a community of S reads:
and the fixation probabilities πs of one S introduced in a community of A is
Figure (3a) shows the evolution of these probabilities as a function of selection pressure for various Ni and Nf. Equations (1314) show that the condition for the altruist to be favored, , is simply
where and and we have kept only the leading terms. is the equilibrium relative excess cost of altruism at which A and S individuals become equivalent. Figure 3b shows the excellent agreement between the above results and exact numerical results. Altruists have a selective advantage if the selection pressure against them, i.e. the combined effect of fitness and population size, is smaller than the relative increase in population size. Unlike a Hamilton rule, criterion (15) is a finite size effect and is of purely stochastic nature : because of the demographic effect, selfish mutants are submitted to a higher stochastic noise than altruist; this can be sufficient to prevent them from prevailing. Note that the above computations were performed for the limiting case of weak selection (Nfs < < 1), which is considered by most, but not all, scientists, to be the relevant limit of evolutionary dynamics [26,27]. Direct numerical resolution of eq. (8) shows however that an equilibrium excess fitness exists even at high selection pressure, given a high enough relative increase in population size.
Figure 3. Criterion for Altruists selection. ( a) fixation probabilities πs (red squares) and πA (blue circles) as a function of selection pressure , for Nf = 100 and . Values are obtained by numerically solving eq.(8). Solid lines are theoretical values (eqs. 13,14). Increasing Ni are indicated by the arrow. (b) equilibrium selection pressure for which πA = πs for multiple combinations of and , as a function of relative population increase, obtained numerically. The solid line is the theoretical value (eq.15).
Geographically structured populations
The altruists’ advantage can be enhanced for large structured populations [28-31]. Geographically structured populations can be modeled as divided into colonies that exchange migrants . The Moran model on graph is a non trivial problem ; we restrict our treatment here to the simplest case where the migration time scale is small compared to fixation time of one mutant (viscous populations): a migrant is either lost or fixed before a new migration event happens. The argument we develop below is similar to the two level model of Traulsen and Nowak . Consider a one dimensional community subdivided into M colonies (Figure 4), exchanging migrants with neighboring patches at rate m. As the migration event is rare, these colonies are fixed either into an A or S state. The probability density per unit time pSA for an S colony on the border to become an A colony is to receive one migrant from the neighboring A colony multiplied by the probability that this mutant gets fixed:
Figure 4. Geographically structured populations.
Geographically structured population where patches can exchange migrants. For low migration rates, the border between A and S domains can be modeled as a biased random walk.
Similarly the probability density for an A colony on the border to become S is
Therefore, the movement of the border itself can be considered a biased random walk. The probability ΠA for an altruist mutant to take over the whole community is thus the probability for a mutant to take over one colony and then for this colony to take over the whole community:
where . If the criterion (15) is satisfied, then obviously r < 1 and for large number of communities M > > 1,
On the other hand, the probability Πs for a selfish mutant to be fixed is
and for M > > 1: once altruists dominate, the chances for a selfish mutant to invade the community is close to zero! Increased random noise due to production of common good and a small migration rate are an efficient way of keeping selfishness in check.
The above computation concerns the low migration limit. In the high migration limit, the community is non-structured and its effective size is . Criterion (15) shows that in this regime, altruists cannot emerge; this is indeed equivalent to the deterministic case where emergence of altruists calls for other mechanisms. Between these two regimes of high and low migration rate, there is a rich interval where migration rate is a key ingredient in the competition between altruists and cheaters.
The main concepts of Population Genetics were clarified in the framework of the original model of Fisher-Wright and Moran (FWM). These models introduced the key ingredient of population size and its role in the randomness of selection. It became clear in the 1920-30’s that a beneficial mutation does not spread automatically to the whole population, but has to overcome the “random noise” of population sampling over generations. The idea that random noise plays also a role for the selection of altruism has been introduced in two kind of models, which have a marked difference with the model we present here. The first class of models, formulated mostly through evolutionary game theory formalism, concerns fixed size populations, where the transition rates are frequency dependent : the fitness of an A individual can be superior to the fitness of an S individual if the number of A individuals already present is high enough. It can then be shown, upon very general conditions, that the fixation probability of altruists can become superior to that of selfish ones. These models can be seen as the generalization of Hamilton’s original idea, where “altruistic” help is restricted to genetically related individuals, even though Traulsen  has argued that the underlying mathematics is fundamentally different. The second class of models concerns group (or multilevel) selection. It has been shown  that the fixation probability of altruists can be higher than those of selfish ones, if the population is structured into groups and the splitting of one group leads to the elimination of another. It has also recently been noticed that random noise in a growing population can favor altruists during a transient period .
The model we present here is not frequency dependent: an A individual has always a lesser chance of reproducing than an S individual; the mean field description of this model has only one stable fixed point which corresponds to the disappearance of altruists. Moreover, The mechanism we propose is for non-structured populations, even though the altruist effect can be amplified when the population is structured into groups with small migration rate between groups. Imagine a group of M islands composed only of altruists and another group of M islands composed only of selfish individuals. Introduce one S mutant in each island of the first group and one A mutant in each islands of the second group. After some time, the number of islands in the first group is increased if the criterion (15) is satisfied.
In summary, we have shown, by a slight generalization of the Moran model, that in finite size populations, the fixation probability of altruists can be higher than that of selfish ones, even though their fitness is lower and their emergence is ‘forbidden’ by a Hamilton rule. We have also shown that in large, structured populations, and in the limit of small migration rate, the same arguments hold. Production of the ‘common good’ and increase in the carrying capacity of the habitat increase the random noise for selfish individuals and can therefore favor altruists.
The aim of the present article is not to contest the merits of kin/group selection models which have been investigated during the last forty years with a large number of case studies. We believe we are providing an alternative way of thinking about altruism which is complementary to the above models and which restores the key ingredients of population genetic to this topic.
Diffusion equation derivation
In the discrete backward Kolmogorov eq. (8) set and q all the states reachable from k, i.e. all states of the form ( m ± 1, n) and m, n ± 1. The equations read
For large populations , we set , and develop the above expression to the second order in (Kramers-Moyal expansion). Combining all the resulting terms leads to the partial differential equation (11). It is fruitful to express this equation in terms of total relative population and proportion of altruists ; the inside domain shown in Figure 1 then maps into the rectangle, where . In these coordinates, the diffusion equation reads:
Mean field approximation
In the deterministic approximation, fluctuations are neglected. Denoting by m and n the ensemble average of the number of S and A individuals, their deterministic evolution equation reads:
It is more fruitful to write directly the evolution of the proportion of A-individuals . Using the expression for transition probabilities (4–7), we have
where and . It is then obvious that for , . In the deterministic model, A-individuals always disappear.
The equation for total population reads
for , the stationary solution of this equation, assuming that μ is held constant is
which shows that the increase in carrying capacity of the habitat η - k, at small selection pressure, is mostly proportional to the number of A-individuals. A closer look at the above equations (16,18) shows that is the only stable fixed point when c > 1.
Relation to Moran model
In a simple model where population size is variable, but birth and death rates are independent of the number of altruists and selfish individuals, a constant α will replace (αn) and (α m) in equations (4–7). In the case where , the stochastic movement pictured in Figure 1b reduces to a movement on the anti-diagonal staircase: births and deaths occur only when the total population is respectively equal to Ni and Nf. The analog of the Moran process is obtained by computing the two steps transition probabilities . If , this implies first the birth of one individual of one type and then the death of an individual of the other type. Combining the rates given by eqs.(4–7) where birth and death rates are constant, we obtain
The same expression is obtained if .
Numerical resolution of fixation probabilities
Two different kinds of numerical resolution were used to check the validity of our analytical results on the fixation probabilities: A Gillespie stochastic algorithm and direct resolution of eq. (8).
The stochastic equations given by the rates (4–7) can be seen as 2 chemical reactions for the species :
which we solve by the classical Gillespie algorithm  written in C++. We are interested here only in the fixation probability and not in the fixation time; the program can therefore be accelerated by computing only the nature of the event that occurs at each turn (and not its time of occurrence). In general, to solve for the fixation probability, R = 106 stochastic trajectories are generated.
Equation (8) constitutes a linear system and can therefore be solved by standard numerical packages. For the present case however, the unknowns, i.e. the fixation probabilities π( m,n) don’t constitute a vector but a second rank tensor; the tensor formed from the rates W is of rank 4. To adapt our linear system to standard linear solvers, we have to re-index the unknowns and decrease their rank by one: . We have chosen the following scheme, which corresponds to a sequential scanning of the anti-diagonal lines (Figure 5a):
where . The points belong to the interior of the trapezoid , n ≥ 1, m ≥ 1.
Figure 5. Tensor reindexation.
(a) To each 2 d index ( m,n), a new 1 d index k is associated by scanning sequentially the anti-diagonal lines. ( b) The re-indexation transforms the tensorial equation (8) into a normal linear system , where πk are the unknowns.
The re-indexation transforms the equation (8) into a normal linear system
where I(k) designates the 1 d indexes of the four nearest neighbors of the point ( m,n), where . The above equations can be written in standard matrix notation
where are the unknowns. is a sparse matrix, which apart from the diagonal elements, has at most four non-zero elements per line: if k is the image of element ( m,n), then only if k′ is the image of one of the four nearest neighbors of ( m,n), in which case its value is given according to rates (4–7). The right hand side vector Bk is a sparse vector provided by the limit conditions : if k′, one of the 4 nearest neighbors of the element k belongs to the border m = 0, then and the corresponding is transferred to the right hand side to constitute the vector Bk. Note that because we index the interior of the trapezoid, the index k itself can never belong to the border.
Once the linear system (20) has been constituted, it can be solved by any linear solver. We have used the commercial package matlab for these manipulations.
BH designed the reasearch, performed the numerical work and wrote the article. BH and MV performed the analytical computations. Both authors read and approved the final manuscript.
We are grateful to O. Rivoire and E. Geissler for fruitful discussions.
I. J Theor Biol 1964, 7:1-16. Publisher Full Text
Nature 1964, 201(4924):1145-1147. Publisher Full Text
Annu Rev Ecol Syst 1970, 1:1-18. Publisher Full Text
Annu Rev Ecol Syst 1983, 14:159-187. Publisher Full Text
J Phys Chem 1977, 81(25):2340-2361. Publisher Full Text
New J Phys 2011, 13:073020. Publisher Full Text