Department of Life Sciences and Biotechnology, University of Ferrara, via Borsari 46, Ferrara I-44121, Italy

National Institute for Mathematical and Biological Synthesis (NIMBios), The University of Tennessee, Knoxville, TN 37996, USA

Institute for Maternal and Child Health, IRCCS, University of Trieste, via dell’Istria 65, Trieste I-34137, Italy

School of Biology, Scottish Oceans Institute, University of St Andrews, St Andrews, Fife KY16 8LB, UK

School of Environmental Sciences, University of East Anglia, Norwich Research Park, Norwich NR4 7TJ, UK

Abstract

Background

Demographic bottlenecks can severely reduce the genetic variation of a population or a species. Establishing whether low genetic variation is caused by a bottleneck or a constantly low effective number of individuals is important to understand a species’ ecology and evolution, and it has implications for conservation management. Recent studies have evaluated the power of several statistical methods developed to identify bottlenecks. However, the false positive rate, i.e. the rate with which a bottleneck signal is misidentified in demographically stable populations, has received little attention. We analyse this type of error (type I) in forward computer simulations of stable populations having greater than Poisson variance in reproductive success (i.e., variance in family sizes). The assumption of Poisson variance underlies bottleneck tests, yet it is commonly violated in species with high fecundity.

Results

With large variance in reproductive success (V_{k} ≥ 40, corresponding to a ratio between effective and census size smaller than 0.1), tests based on allele frequencies, allelic sizes, and DNA sequence polymorphisms (heterozygosity excess, M-ratio, and Tajima’s

Conclusions

Our results suggest caution when interpreting the results of bottleneck tests in species showing high variance in reproductive success. Particularly in species with high fecundity, computer simulations are recommended to confirm the occurrence of a population bottleneck.

Background

Demographic fluctuations, including changes in population size and growth rate, are common events in natural populations. Severe population size declines (bottlenecks), however, may have detrimental consequences including increased inbreeding, decreased adaptive potential, increased disease susceptibility, lowered fecundity, and disruption in expression of quantitative traits

Bottlenecks may leave a population genetic signature, such as decreases in number of alleles and heterozygosity, and loss of rare alleles N_{c}) is known, cryptic bottlenecks (change in effective size, N_{e}, without change in N_{c}) may occur; and (3) bottleneck outcomes are highly stochastic, meaning that genetic diversity following the bottleneck is somewhat unpredictable even when the demographic history is known

Previous investigations have demonstrated that the statistical power of these tests is highest when the bottleneck is severe or prolonged, and when many loci are used. In addition, factors such as the mutation model and the rate of post-bottleneck recovery may also play an important role

A lack of statistical power in bottleneck tests may result in an underestimation of the extinction risk. On the other hand, identifying bottleneck signatures when they have not occurred may represent a complementary risk

Investigations of type I error of bottleneck detection methods are few, and have mostly concerned mutation models in microsatellite markers. For example, the probability of type I error can be substantial or extreme (from 40 to 100%) when the wrong mutation model is assumed or when multi-step mutations occur N_{e}

Here we focus on type I error rates that may arise in bottleneck tests when the variance of reproductive success (hereafter V_{k}) is larger than the Poisson variance assumed by simple models underlying the bottleneck detection methods. Larger than Poisson V_{k} could cause strong intergenerational genetic drift, because it introduces additional stochasticity (e.g. unaccounted loss of alleles) when only few parents contribute to the next generation. Large V_{k} reduces the N_{e}/N_{c} ratio, which may explain N_{e}/N_{c} in the order of 10^{-2} - 10^{-3} observed in many amphibians, fish, marine invertebrates, and plants. Even more extreme N_{e}/N_{c} ratios, as low as 10^{-3} to 10^{-5}, have been reported for lobster, cod, red drum, and oyster V_{k}

Theoretically, the relationship between V_{k} and the N_{e}/N_{c} ratio has been derived under different models V_{k} can be converted into a predictable reduction in N_{e}. However, the effect of V_{k} on the shape of a coalescent tree and on the relationship between different genetic diversity measures (which are the basis for bottleneck testing) has not been investigated V_{k} will show signature of small but constant size, or whether large V_{k} results in a false signal of a genetic bottleneck. Here we investigate this question for different combinations of N_{e} and V_{k} values, using simulated data to estimate type I errors in two tests commonly applied to microsatellite data to detect bottlenecks, the M-ratio V_{k} on the Tajima’s V_{k} = 2.

Methods

Genetic variation data were generated by simulating demographically stable populations with different effective size (N_{e}) and different variance in reproductive success (V_{k}). For each combination of parameters, 100 replicates were generated. Each data set, consisting of 15 microsatellite markers, was analysed with the M-ratio and the heterozygosity excess tests, and with the MSVAR method. The fraction of replicates significantly supporting a bottleneck can be considered as an estimate of the FPR (false positive rate), i.e., the type I error rate. Then a smaller set of simulations was used to analyse two additional markers (microsatellite loci with constrained allelic range and DNA sequence polymorphisms).
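The FPR estimate described above is simply a proportion of significant replicates. A minimal Python sketch follows; the function name and the example p-values are hypothetical and serve only to illustrate the calculation, they are not results from the study.

```python
def false_positive_rate(p_values, alpha=0.05):
    """Fraction of replicates from a stable population that reject the null at level alpha."""
    if not p_values:
        raise ValueError("need at least one replicate")
    rejections = sum(1 for p in p_values if p < alpha)
    return rejections / len(p_values)


# Hypothetical p-values from six replicate data sets of a demographically stable population.
example_p = [0.01, 0.20, 0.03, 0.60, 0.049, 0.75]
fpr = false_positive_rate(example_p)
print(f"FPR = {fpr:.2f}")
```

With 100 replicates per parameter combination, as in the study, each FPR value is resolved in steps of 0.01.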

Generating the primary set of synthetic data

The software simuPOP V_{k}) can be simulated straightforwardly. We analysed 16 combinations of N_{e} (50, 500, 2500 and 5000) and V_{k} (2, 40, 400, and 2000). Population size was assumed to be constant, and the mean number of offspring per mating was always equal to two. In order to obtain the same N_{e} for different V_{k} values, the census sizes required in the simulations were computed using the approximate relationship N_{e}/N_{c} = 4/(V_{k} + 2). With V_{k} = 2, family sizes were Poisson distributed (as assumed by most population genetics models) and the ratio N_{e}/N_{c} = 1. For larger V_{k}, we used a modified gamma distribution of family sizes with decimal values rounded down to the nearest integer (resulting in a discrete distribution approximating a negative binomial; see the figure showing the distribution of offspring per parent). V_{k} values of 40, 400, and 2000 correspond approximately to N_{e}/N_{c} equal to 0.1, 0.01, and 0.002, respectively.
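The approximate relationship N_{e}/N_{c} = 4/(V_{k} + 2) can be inverted to obtain the census size needed for a target effective size. The sketch below tabulates this for the 16 parameter combinations; the function names are illustrative, and rounding to the nearest integer is an assumption, since the text does not state how census sizes were rounded.

```python
def ne_nc_ratio(vk):
    """Approximate N_e/N_c ratio for a variance in reproductive success vk."""
    return 4.0 / (vk + 2.0)


def census_size(ne, vk):
    """Census size giving effective size ne under N_e/N_c = 4/(vk + 2)."""
    # Rounding to the nearest integer is an assumption made for this sketch.
    return round(ne * (vk + 2) / 4)


for ne in (50, 500, 2500, 5000):
    for vk in (2, 40, 400, 2000):
        print(f"Ne={ne:5d}  Vk={vk:5d}  Ne/Nc={ne_nc_ratio(vk):.4f}  Nc={census_size(ne, vk)}")
```

For V_{k} = 40 the exact ratio is 4/42, about 0.095, which the text rounds to 0.1.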

**An example of the distribution of offspring per parent in the simulations.** The three panels correspond to the distributions obtained in simulations with V_{k} = 2 (top), V_{k} = 40 (middle), and V_{k} = 400 (bottom).
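To give a feel for such skewed family-size distributions, the sketch below draws offspring numbers from a gamma distribution with mean 2 and variance V_{k}, rounded down to integers. This is an assumption-laden illustration: the exact modification of the gamma distribution used in the simulations is not described here, and plain flooring lowers the mean slightly below 2. The function name and the moment-matching parameterisation are choices made for this example.

```python
import math
import random
import statistics


def sample_family_sizes(vk, n_parents, mean_offspring=2.0, rng=None):
    """Draw floored gamma-distributed family sizes with target mean and variance vk."""
    rng = rng or random.Random()
    # Moment matching for a gamma distribution: mean = shape * scale, variance = shape * scale**2.
    shape = mean_offspring ** 2 / vk
    scale = vk / mean_offspring
    return [math.floor(rng.gammavariate(shape, scale)) for _ in range(n_parents)]


rng = random.Random(1)
sizes = sample_family_sizes(vk=40, n_parents=50_000, rng=rng)
childless = sum(1 for s in sizes if s == 0) / len(sizes)
print("mean:", statistics.mean(sizes))
print("variance:", statistics.pvariance(sizes))
print("fraction of parents with no offspring:", childless)
```

Most parents leave no offspring while a few leave very many, which is the pattern the figure panels for large V_{k} illustrate.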

Fifteen neutral, independent microsatellites evolving under a strict stepwise mutation model with mutation rate ^{-4} were considered. Mutation-drift equilibrium was obtained by running simulations for a number of generations that depended on N_{e}, starting from individuals with a Dirichlet distribution of allele frequencies. After verifying that the population had reached a stable equilibrium, as confirmed by the convergence of the number of alleles (K), the expected heterozygosity (H_{e}), and the inbreeding coefficient F_{is}, 50 individuals were randomly sampled and analysed using ARLEQUIN v3.5
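The summary statistics monitored for convergence can be computed per locus from diploid genotypes. The following sketch uses the standard unbiased gene diversity estimator and F_{is} = 1 - H_{o}/H_{e}; it is an illustration, not the ARLEQUIN implementation, and the function name and genotype data are hypothetical.

```python
from collections import Counter


def locus_summary(genotypes):
    """Return (K, He, Ho, Fis) for one locus given a list of (allele1, allele2) tuples."""
    alleles = [a for genotype in genotypes for a in genotype]
    n = len(alleles)  # number of gene copies (2 x number of individuals)
    counts = Counter(alleles)
    k = len(counts)  # number of distinct alleles
    sum_p2 = sum((c / n) ** 2 for c in counts.values())
    he = n / (n - 1) * (1 - sum_p2)  # unbiased expected heterozygosity
    ho = sum(1 for a, b in genotypes if a != b) / len(genotypes)  # observed heterozygosity
    fis = 1 - ho / he if he > 0 else float("nan")
    return k, he, ho, fis


# Hypothetical genotypes (allele sizes in repeat units) for four individuals.
k, he, ho, fis = locus_summary([(10, 12), (10, 12), (10, 10), (12, 14)])
print(k, round(he, 3), ho, round(fis, 3))
```

A negative F_{is}, as in this toy example, means observed heterozygosity exceeds the Hardy-Weinberg expectation.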

Additional simulations

Some specific situations were investigated using additional simulations. First, we simulated microsatellite markers where the maximum number of alleles is limited to five, to represent expressed (EST) microsatellites, which tend to have a limited allelic range; a restricted allele range may affect the M-ratio. Second, we simulated DNA sequences of 500 base pairs evolving under an infinite sites mutation model with mutation rate ^{-7} per site per generation. These simulations were conducted to understand whether the spurious signal of a bottleneck produced by V_{k} > 2 is specific to microsatellite markers, or whether a similar signal would be found when Single Nucleotide Polymorphisms (SNPs) are considered.

Bottleneck tests

Microsatellite data were analysed first with the commonly used M-ratio test V_{k} = 2. Also we set the parameters N_{e} and M-ratio_{ft} test the approach based on the fixed threshold, and M-ratio_{sim} test the approach that uses simulations to compute the critical value. The heterozygosity excess test is based on a relationship between heterozygosity and number of alleles, which is predicted to deviate from theoretical expectations after a bottleneck because the former decreases more slowly than the latter. Statistical significance for this test is computed using the Wilcoxon’s signed rank test to compare the expected heterozygosity calculated from the data (H_{e}) to an expected heterozygosity based on the number of alleles present (H_{a}). H_{a} is computed by simulation using the program BOTTLENECK
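For readers unfamiliar with the M-ratio, the statistic is usually defined per locus as the number of alleles divided by the allelic size range in repeat units plus one, then averaged over loci. The sketch below follows that standard definition and compares the mean with the fixed threshold of 0.68 used in the M-ratio_{ft} approach; the function names and allele sizes are hypothetical, and this is not the code used in the study.

```python
def m_ratio(allele_sizes, repeat_length=1):
    """M = k / r for one locus: k distinct alleles, r = size range in repeat units + 1."""
    sizes = sorted(set(allele_sizes))
    k = len(sizes)
    r = (sizes[-1] - sizes[0]) // repeat_length + 1
    return k / r


def mean_m_ratio(loci, repeat_length=1):
    """Average the per-locus M-ratio across loci."""
    values = [m_ratio(sizes, repeat_length) for sizes in loci]
    return sum(values) / len(values)


FIXED_THRESHOLD = 0.68  # commonly used critical value (the fixed-threshold approach)

# Hypothetical allele sizes (in repeat units) observed at two loci.
loci = [[10, 11, 12, 15], [20, 21, 22]]
mean_m = mean_m_ratio(loci)
print("mean M-ratio:", round(mean_m, 3), "below threshold:", mean_m < FIXED_THRESHOLD)
```

Gaps in the allele size distribution (here, the missing sizes 13 and 14 at the first locus) are what push M below one.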

We also performed a more sophisticated analysis which is frequently used to detect changes in population size

| N_{e} | V_{k} | N_{e}/N_{c} | H_{e} (SD) | K (SD) | F_{is} (SD) | M-ratio (SD) | %P | FPR: M-ratio_{ft} | FPR: M-ratio_{sim} | FPR: Het excess | FPR: MSVAR |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 50 | 2 | 1 | 0.11 (0.17) | 1.53 (0.59) | 0.00 (0.09) | 1.00 (0.03) | 48 | 0.01 | 0.02 | 0.01 | 0.00 |
| 50 | 40 | 0.1 | 0.07 (0.14) | 1.30 (0.52) | -0.03 (0.12) | 1.00 (0.00) | 27 | 0.0 | 0.04 | 0.02 | 0.00 |
| 50 | 400 | 0.01 | 0.05 (0.14) | 1.24 (0.45) | -0.01 (0.11) | 1.00 (0.03) | 23 | 0.01 | 0.04 | 0.10 | 0.00 |
| 50 | 2000 | 0.002 | 0.07 (0.13) | 1.25 (0.35) | -0.15 (0.17) | 0.97 (0.06) | 25 | 0.00 | 0.17 | 0.11 | 0.00 |
| 500 | 2 | 1 | 0.44 (0.16) | 3.08 (0.72) | -0.02 (0.12) | 1.00 (0.03) | 100 | 0.0 | 0.09 | 0.04 | 0.00 |
| 500 | 40 | 0.1 | 0.42 (0.20) | 2.74 (0.81) | -0.07 (0.21) | 0.98 (0.07) | 96 | 0.03 | 0.36 | 0.32 | 0.62 |
| 500 | 400 | 0.01 | 0.43 (0.23) | 2.91 (1.10) | -0.17 (0.29) | 0.87 (0.18) | 89 | 0.21 | 1.00 | 0.53 | 0.97 |
| 500 | 2000 | 0.002 | 0.44 (0.21) | 3.17 (1.20) | -0.19 (0.31) | 0.71 (0.21) | 88 | 0.43 | 1.00 | 0.54 | 1.00 |
| 2500 | 2 | 1 | 0.71 (0.06) | 6.3 (1.3) | 0.01 (0.05) | 0.95 (0.08) | 100 | 0.0 | 0.03 | 0.06 | 0.06 |
| 2500 | 40 | 0.1 | 0.69 (0.1) | 5.7 (1.8) | -0.08 (0.11) | 0.89 (0.13) | 100 | 0.07 | 0.51 | 0.20 | 0.66 |
| 2500 | 400 | 0.01 | 0.64 (0.09) | 4.5 (1.2) | -0.19 (0.13) | 0.82 (0.15) | 99 | 0.35 | 1.00 | 0.39 | 0.99 |
| 2500 | 2000 | 0.002 | 0.61 (0.12) | 4.2 (1.4) | -0.20 (0.12) | 0.69 (0.18) | 99 | 0.49 | 1.00 | 0.42 | 1.00 |
| 5000 | 2 | 1 | 0.76 (0.08) | 7.70 (1.60) | -0.016 (0.08) | 0.94 (0.09) | 100 | 0.0 | 0.05 | 0.07 | 0.14 |
| 5000 | 40 | 0.1 | 0.72 (0.09) | 6.06 (1.76) | -0.11 (0.17) | 0.81 (0.19) | 100 | 0.23 | 0.93 | 0.22 | 0.97 |
| 5000 | 400 | 0.01 | 0.66 (0.13) | 4.80 (1.51) | -0.22 (0.16) | 0.68 (0.23) | 100 | 0.50 | 1.00 | 0.40 | 1.00 |
| 5000 | 2000 | 0.002 | 0.67 (0.11) | 4.90 (1.66) | -0.24 (0.14) | 0.66 (0.20) | 99 | 0.58 | 1.00 | 0.43 | 1.00 |

Mean values of summary statistics (with standard deviations) across 100 replicates are given. The last four columns report the rate of false positives (FPR = type I error) estimated as the fraction of replicates with an M-ratio smaller than the commonly used threshold of 0.68 (M-ratio_{ft}), with an M-ratio smaller than the critical value computed by simulation using the same parameter θ = 4N_{e}μ (M-ratio_{sim}), where a significant (P < 0.05) heterozygosity excess was detected using the program BOTTLENECK, and where a significant difference between ancestral and current population size is detected by MSVAR, respectively. N_{e} = effective population size; N_{c} = census population size; H_{e} = expected heterozygosity; K = number of alleles; F_{is} = 1 - H_{o}/H_{e}, where H_{o} is the observed heterozygosity; M = M-ratio; %P = fraction of replicates producing a polymorphic locus. The starting values, in the log_{10} scale, for the mean and variance of the prior distributions in MSVAR are as follows: ancestral size (3,1), current size (3,1), mutation rate (-3.3,1), time since the decline (2,0.5). Means and variances (and their means and variances) of the hyperprior distributions used in MSVAR are as follows: ancestral size (3,1,0,0.5), current size (3,1,0,0.5), mutation rate (-3.3,0.25,0,0.5), time since the decline (2,0.5,0,0.5).

DNA sequences were analysed with the Tajima’s

Results

Primary set of simulations

As expected, the average level of genetic variation (expected heterozygosity, H_{e}, and number of alleles, K) increases with N_{e}. The average H_{e} observed for V_{k}=2 is similar to theoretical predictions N_{e} values 50, 500, and 5000. The number of alleles does not have a simple expectation under the single-step mutation model, but the observed values are compatible with other results. As V_{k} increases, we observe a trend of decreased genetic variation within each set of simulations with the same N_{e}, and this effect is stronger for _{e}. For V_{k} > 2, populations also appear to deviate from the Hardy-Weinberg equilibrium, with larger observed than expected heterozygosity and consequent negative values of the estimated inbreeding coefficient.
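One standard equilibrium expectation for heterozygosity under the stepwise mutation model is H = 1 - 1/sqrt(1 + 2θ), with θ = 4N_{e}μ. The sketch below evaluates it for the simulated effective sizes. Two assumptions are made here and should be read as such: that this formula is the kind of theoretical prediction referred to above, and that μ = 5 × 10^{-4}, a value chosen only because it matches the MSVAR prior mean of -3.3 on the log_{10} scale; the mutation rate itself is not fully legible in the text.

```python
import math


def expected_h_smm(ne, mu):
    """Equilibrium expected heterozygosity under a strict stepwise mutation model."""
    theta = 4 * ne * mu
    return 1 - 1 / math.sqrt(1 + 2 * theta)


ASSUMED_MU = 5e-4  # assumption for this sketch, not a value confirmed by the text

for ne in (50, 500, 2500, 5000):
    print(f"Ne={ne:5d}  expected H = {expected_h_smm(ne, ASSUMED_MU):.3f}")
```

Under these assumptions the expectations can be compared directly with the H_{e} column of the V_{k} = 2 rows in the table of summary statistics.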

The false positive rate (FPR) clearly increases with V_{k}. With V_{k}=2, FPR for the M-ratio_{ft} test is either 1% or 0% (probably indicating that this criterion is too conservative) and it varies between 2% and 9% using the M-ratio_{sim} test. For the heterozygosity excess test, the FPR with V_{k}=2 is around the nominal 5% or less, and varies between 0% and 14% for the MSVAR analysis (this analysis being more permissive with large values of genetic variation). Very different results are obtained for V_{k} > 2 (Table N_{e} equal or larger than 500 (i.e., when level of polymorphisms is not too low). All or almost all replicates analysed with the M-ratio_{sim} test or with the MSVAR analysis support a bottleneck when V_{k} ≥ 400 and N_{e} ≥ 500. When the more conservative M-ratio_{ft} test or the heterozygosity excess test are applied, the FPR decreases, but never below 21%. For V_{k}=40, i.e., when the ratio between effective and census size is equal to 0.1, FPR can reach values as high as 93% or 97% in the M-ratio_{sim} test and the MSVAR analysis, respectively. Furthermore, we observe a general trend of FPR to increase with N_{e} (Table _{e} (which is decreasing when N_{e} increases), deserves further investigation. In summary, with high variance in reproductive success, the M-ratio and heterozygosity excess tests produce many false positives, and the probability of detecting a spurious bottleneck signal tends to increase with increasing effective population size. MSVAR results are in general similar to those obtained with the M-ratio_{sim} test.

**The false positive rate (FPR) as a function of the ratio N_{e}/N_{c} under different statistical approaches.** FPR refers to simulations with N_{e} = 2500.

**The false positive rate (FPR) as a function of the effective population size N_{e} under different statistical approaches.** FPR refers to simulations with V_{k} = 40.

Additional simulations

V_{k}
V_{k} was increased from 2 to 400. This increase in FPR is similar to that observed in the simulations with size-unconstrained loci. However, none of the replicates with high V_{k} with constrained loci produced small and significant M-ratios. The likely explanation is that a reduced allelic range prevents the opening of gaps in the allelic size distribution. In other words, the M-ratio test does not tend to suggest a false signal of a bottleneck when analysing size-constrained EST microsatellites.

N_{e} V_{k} V_{k} = 2 in case of constant population size and absence of natural selection, is shifted towards positive values, with a mean of 1.24. The FPR, i.e. the fraction of values significantly larger than 0, is 37%. Thus, the Tajima’s V_{k} >> 2.
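Tajima’s statistic contrasts the mean number of pairwise differences with the number of segregating sites, so an excess of intermediate-frequency variants gives positive values and an excess of rare variants gives negative ones. The sketch below implements the standard formula on aligned sequences; the function name and the toy alignments are hypothetical and are not data from the simulations.

```python
import math
from itertools import combinations


def tajimas_d(sequences):
    """Tajima's D for a list of aligned, equal-length sequences (needs at least 4)."""
    n = len(sequences)
    if n < 4:
        raise ValueError("the variance term is zero for fewer than 4 sequences")
    # Number of segregating sites.
    s = sum(1 for site in zip(*sequences) if len(set(site)) > 1)
    if s == 0:
        return float("nan")
    # Mean number of pairwise differences.
    pairs = list(combinations(sequences, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


balanced = ["AAAA", "AAAA", "TTTT", "TTTT"]    # variants at intermediate frequency
singletons = ["AAAA", "AAAA", "AAAA", "TTTT"]  # variants carried by one sequence only
print("balanced:", tajimas_d(balanced))
print("singletons:", tajimas_d(singletons))
```

A shift towards positive values, as reported above, corresponds to the first pattern: too many intermediate-frequency variants relative to the standard neutral expectation.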

Discussion

In many organisms with high fecundity, the contribution of each individual or pair to the next generation can be highly skewed, with few “winners” (i.e. those who produce many offspring) and many “losers” who do not contribute to the gene pool of the next generation. Under this scenario of Sweepstakes Reproductive Success (SRS) V_{k}) is larger than assumed by the Wright-Fisher model. Population genetics theory predicts that the ratio of N_{e} (the effective population size) over N_{c} (the census population size) rapidly decreases from one as V_{k} increases. The SRS model is thus considered a likely explanation for the empirical observation that many marine organisms have much lower genetic variation (and therefore N_{e}) than predicted by their very large N_{c}

While the negative relationship between genetic variation and V_{k} is well known, the effect of V_{k} on the shape of the gene genealogy reconstructed from a sample of DNA fragments is still unclear. It is possible that large V_{k} values may introduce distortions in this genealogy, in turn distorting the relationships between genetic variation measures. This is relevant as many statistical analyses for identifying deviations from neutrality and demographic stability assume V_{k}=2 and are based on the relationships between genetic variation measures.

We addressed this question by comparing simulated datasets of single populations with different V_{k} values. Specifically we estimated the impact of large V_{k} on the results from four statistical tests commonly used to detect population size variation: the M-ratio test, the heterozygote excess test, a test derived from a Bayesian estimate of ancient and current population sizes, and the Tajima’s V_{k}=2. Rejection of this hypothesis may be interpreted as population decline, but may be also due to large V_{k} in isolated, demographically stable populations. This is relevant in conservation genetics as violation of the assumption of low V_{k} made by these tests can produce incorrect inference, and may suggest incorrect management interventions.

Our simulations show that high V_{k} can strongly increase the rate of false positives (FPR = type I error = incorrect inference of population decline) for all the tests. Further, the larger V_{k}, the larger the rate. FPR is also dependent, to some extent, on N_{e} (and thus the level of genetic variation), but this relationship appears test-specific. Based on our results, it appears that the MSVAR method is most prone to errors, followed by the M-ratio with the critical threshold computed by simulations (M-ratio_{sim}). The heterozygote excess test and the M-ratio with the traditional threshold are less prone to false positives when V_{k} is large and may be preferred if the goal is to reduce type I errors when evidence of large V_{k} is available. The results we obtained also show that high V_{k} could lead to wrong conclusions when the aim of the analysis is to identify signatures of selection. In particular, the negative F_{is} values and positive Tajima’s V_{k} could be misinterpreted as signals of balancing selection.

When V_{k} is large, a large fraction of siblings is observed every generation. In coalescent terms, several lineages merge in one generation going back in time, producing many short external branches in the gene genealogy and therefore a deviation from the standard Kingman coalescent V_{k} can generate large FPR. We also note that the constant population size scenario we simulated appears similar, in its effects, to a scenario of a recent and extreme bottleneck in an additional way, with a small recent effective size producing negative

Due to different parameterization of the model of the biological system, our results are not directly comparable with the genetic prediction of recent theoretical models of populations with skewed offspring number and overlapping generations V_{k}), the chances to obtain star-like genealogies and excess of rare alleles, i.e., signatures of population expansion, are increased compared to the V_{k} = 2 case; this is opposite to the result obtained in our study. A possible explanation for the discrepancy is the fact that our simulations considered non-overlapping generations, and overlap in generations may provide a buffer against the effects of drift and consequent high allele sharing caused by high V_{k}. Additional efforts should be dedicated to making the results produced by theoretical models with multiple mergers and those obtained in our study comparable.

Practical applications

Certainly, our results suggest that the genetic signature of a bottleneck should be interpreted with caution when found in species known to have moderate to large variance in offspring number (as for example in the killer whale,

An alternative to using the standard bottleneck tests for species with large V_{k} is using computer simulations V_{k} effects on the population genetic signal and, more specifically, generating species-specific null distributions of the bottleneck tests (such as the M-ratio statistic) more appropriate for V_{k} larger than 2. Simulating stable populations, and populations with different intensities of demographic decline, can allow statistical comparison to the observed data (with or without formal approaches like Approximate Bayesian Computation,
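The idea of a species-specific null distribution can be sketched as a Monte Carlo test: the observed statistic is compared with values of the same statistic computed on simulated stable populations that share the species’ V_{k}. In the example below the null values are hypothetical placeholders, not simulation output, and the function name is illustrative.

```python
def lower_tail_p_value(observed, null_values):
    """Monte Carlo p-value: probability of a statistic as small as the observed one."""
    if not null_values:
        raise ValueError("the null distribution is empty")
    as_extreme = sum(1 for value in null_values if value <= observed)
    # The +1 terms count the observed data set as one draw from the null.
    return (1 + as_extreme) / (1 + len(null_values))


# Hypothetical M-ratios from simulated stable populations with large V_k.
null_m = [0.60, 0.65, 0.70, 0.72, 0.75, 0.78, 0.80, 0.85, 0.90]
observed_m = 0.62
p_value = lower_tail_p_value(observed_m, null_m)
print(f"p = {p_value:.2f}")
```

An observed M-ratio of 0.62 would fall below the fixed 0.68 threshold, yet it is unremarkable against this illustrative null, which is the kind of discrepancy such simulations are meant to expose.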

The high FPR we uncover may not present a problem for studies that detect a bottleneck by comparing temporal samples, as comparing a modern sample to museum or ancient samples V_{k} should not be expected to arise because large V_{k} should affect diversity in both samples. However, this assumes that V_{k} is constant through time. If census size decreases, V_{k} may change through time V_{k} on temporal comparisons.

Finally, considering that our simulations assumed non-overlapping generations, and also considering that effect of drift decreases proportionally to the number of generations that overlap

Conclusions

We have shown that high reproductive variance increases the rate of false positives in four widely used bottleneck detection tests. Failing to detect a genuine bottleneck is widely acknowledged as harmful in conservation. However, given the limited resources and myriad of necessary conservation actions that are required to protect vulnerable species and populations

Competing interests

The authors declare no competing interests.

Authors’ contributions

MM, CVO, and GB conceived and designed the study, MM and AB performed simulations and data analysis, SH drafted the manuscript and GB and OEG worked on it. All authors examined data, discussed results, contributed to manuscript revision and approved the final draft.

Authors’ information

All authors are interested in the demographic and genetic dynamics of small or isolated populations, and in the development and testing of statistical approaches to infer population processes from genetic variation data.

Acknowledgements

Funding was provided by the University of Ferrara, Italy. CvO was funded by the Earth and Life Systems Alliance (ELSA), Norwich Research Park, UK. We thank Lorenzo Zane and Richard Nichols for helpful discussions. GB thanks Camila Mazzoni and Simone Sommer for their hospitality at the Berlin Center for Genomics in Biodiversity Research during the revision of this paper.