Abstract
Background
Plotless density estimators are those that are based on distance measures rather than counts per unit area (quadrats or plots) to estimate the density of some usually stationary event, e.g. burrow openings, damage to plant stems, etc. These estimators typically use distance measures between events and from random points to events to derive an estimate of density. The error and bias of these estimators for the various spatial patterns found in nature have been examined using simulated populations only. In this study we investigated eight plotless density estimators to determine which were robust across a wide range of data sets from fully mapped field sites. They covered a wide range of situations including animal damage to rice and corn, nest locations, active rodent burrows and distribution of plants. Monte Carlo simulations were applied to sample the data sets, and in all cases the error of the estimate (measured as relative root mean square error) was reduced with increasing sample size. The method of calculation and ease of use in the field were also used to judge the usefulness of the estimator. Estimators were evaluated in their original published forms, although the variable area transect (VAT) and ordered distance methods have been the subjects of optimization studies.
Results
An estimator that was a compound of three basic distance estimators was found to be robust across all spatial patterns for sample sizes of 25 or greater. The same field methodology can be used with either the basic distance formula or the formula of the Kendall-Moran estimator. The Kendall-Moran formula may reduce error for sample sizes less than 25, but it gives no improvement for larger sample sizes. The variable area transect (VAT) method performed moderately well, is easy to use in the field, and its calculations are easy to undertake.
Conclusion
Plotless density estimators can provide an estimate of density in situations where it would not be practical to lay out a plot or quadrat and can in many cases reduce the workload in the field.
Background
Plotless density estimators are those that are based on distance measures rather than counts per unit area (quadrats or plots) to estimate the density of some fixed event, e.g. burrow openings, damage to plant stems, etc. They can provide an estimate of density in situations where it would not be practical to lay out a plot or quadrat, e.g. difficult terrain, crops, or situations where a low impact is required. These techniques make certain assumptions about the spatial distribution of the event; in the worst case they assume that the event is randomly distributed, a situation that occurs infrequently in nature. Other techniques permit greater degrees of nonrandomness. It is therefore important to understand when a given plotless density estimator is robust to departures from randomness.
An evaluation of which plotless density estimator (PDE) is suitable for a given field situation requires examination of fully enumerated field populations and is ideally suited to computer simulation. Inferences about PDEs using simulated populations [1] are limited because field data rarely consists of a single type of spatial pattern. Instead natural populations tend to occur as a mixture of spatial patterns at various levels of intensity and grain (intensity is the variability in pattern seen from place to place and grain is an expression of the amount of spacing between them, [2]). Some plotless density estimators are better at handling departures from randomness due to the intensity and grain of the overall spatial pattern.
Methods
Estimation Methods Used
We selected the eight best estimators from the 24 evaluated by [1] and tested them using seventeen fully enumerated field data sets. In the discussion that follows, the closest individual (CI) is the individual that is closest to the random sample point, and this individual can have a nearest neighbor (NN). The closest individual to the NN is referred to as the second nearest neighbor (2NN). One or more of the following distances need to be measured, depending on the estimator: from the i^{th} random point to the first, second or third closest individual; from the closest individual to the first or second nearest neighbor; and from the baseline of a transect of width w to the g^{th} event, such that all g events are within the transect. Estimators used in this study (Table 1) comprise four general types: basic distance; Kendall-Moran; ordered distance and angle order; and variable area transect. The quadrat method was included to check that the simulation routines were working correctly (see Additional file 1), not as an explicit test of this method, which has been done elsewhere [1,3]. No attempt was made to optimize the dimensions of the quadrat or the VAT. The latter has been dealt with explicitly elsewhere [4].
Table 1. Summary of estimators used, their formulae and main reference.
Additional file 1. Complete results from all simulations.
Basic distance estimators
Kendall-Moran estimators
Figure 1. Schematic representation of how KM2P and BDAV3 are implemented in the field. Shading shows the search area less intersection used in the calculation of KM2P. R – the random sample point; CI – closest individual; NN – nearest neighbor; 2NN – second nearest neighbor; R_{(1)i} = the distance from the i^{th} sample point to the CI; H_{(1)i} = the distance from the i^{th} CI to its NN; H_{(2)i} = the distance from the NN to the 2NN at the i^{th} random point.
Ordered distance and angle order methods
Variable area transect method
Simulation Study Design and Data Sets
Eight plotless density estimators (Table 1) were examined in the present study using 5000 Monte Carlo simulations. The simulation program was written in Fortran 77, and each simulation was a specific combination of a spatial data set and a sample size (10, 25, 50 or 100 samples per simulation). The uniform random number generator, UNIF [11], was used to locate sampling points and, where required, the VNORM routine [11] was used to convert uniform random numbers to normal random numbers to generate the synthetic data sets used for comparison with natural data sets (see below). The input for each simulation included: the name of the data file containing the location of all events as XY coordinates on a Cartesian plane; the number of samples to be taken; the sizes of the VAT width and quadrat; an output file specification; and the number of simulations to be performed. These inputs were provided within a batch-processing environment, so the simulations could be left to run unattended. The output file, one for each data set, comprised the estimated density, relative bias and relative root mean square error for each estimator.
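The sampling loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' Fortran 77 program: the function names are hypothetical, Python's `random` module stands in for UNIF, and the estimator shown is a simple closest-individual form, N / (π Σ R_i²), which is the maximum-likelihood estimator under a Poisson pattern and is assumed here only for illustration (Table 1 should be consulted for the eight estimators actually tested).

```python
import math
import random

def closest_distance(point, events):
    """Distance from a random sample point to the closest event."""
    px, py = point
    return min(math.hypot(ex - px, ey - py) for ex, ey in events)

def closest_individual_density(distances):
    """Illustrative estimator (an assumption): N / (pi * sum of R_i^2)."""
    n = len(distances)
    return n / (math.pi * sum(r * r for r in distances))

def run_simulations(events, bounds, n_samples, n_sims, rng):
    """Return one density estimate per simulation.

    bounds = (xmin, xmax, ymin, ymax) is the region in which sample
    points are placed; events is a list of (x, y) coordinates.
    """
    xmin, xmax, ymin, ymax = bounds
    estimates = []
    for _ in range(n_sims):
        distances = []
        for _ in range(n_samples):
            point = (rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))
            distances.append(closest_distance(point, events))
        estimates.append(closest_individual_density(distances))
    return estimates
```

A run corresponding to one data set and one sample size would then be, for example, `run_simulations(events, bounds, 25, 5000, random.Random(0))`, with the resulting list of estimates passed on to the summary statistics.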
Natural data sets
Seventeen data sets (Table 2) were obtained from unpublished studies by the authors and colleagues; they included animal damage to rice and corn, bird nest locations, active rodent burrows and the distribution of plants. Densities ranged from 0.06 m^{-2} (bee-eater nest sites) to 19.3 m^{-2} (damaged sugar cane). A boundary strip of 10% of the length and width of the extent of the population of points was used to remove the bias associated with sampling close to the edge of the study area.
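One way to apply such a boundary strip is sketched below. It assumes, as an interpretation rather than a documented detail of the original program, that random sample points are confined to the inner region while distances are still measured to events anywhere in the mapped area. The function names and the `strip_fraction` parameter are hypothetical.

```python
import random

def inner_region(events, strip_fraction=0.10):
    """Bounding box of the events, shrunk by a strip on every side.

    The strip is strip_fraction of the extent in x and in y.
    Returns (xmin, xmax, ymin, ymax).
    """
    xs = [x for x, _ in events]
    ys = [y for _, y in events]
    dx = strip_fraction * (max(xs) - min(xs))
    dy = strip_fraction * (max(ys) - min(ys))
    return (min(xs) + dx, max(xs) - dx, min(ys) + dy, max(ys) - dy)

def random_sample_point(region, rng):
    """Uniform random point inside the inner region."""
    xmin, xmax, ymin, ymax = region
    return (rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))
```

Keeping the events in the strip available as potential nearest neighbors is what reduces the edge bias: a sample point near the inner boundary can still "see" events just outside it.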
Table 2. Description of data sets used and density of the event.
For ground or cliff nesting birds the density of nest sites provides important information on the number of breeding females or pairs. Two data sets were used, with densities of 0.06 m^{-2} (bee-eater) and 3.2 m^{-2} (Alaskan waterfowl nests).
Burrowing species such as gophers and rabbits can be monitored through the presence of active burrows. Two data sets of a population of pocket gophers measured in two successive years were used to demonstrate the application of PDEs as a suitable method for monitoring populations.
The use of PDEs for monitoring damage to crops was examined using data sets for corn and rice in the Philippines, and sugar cane in Hawaii.
The remaining data set is from a coastal sand island, north of Brisbane, Australia. Grass trees, Xanthorrhoea sp., grow in heath communities inland from the foredunes. Unlike the crop data sets these are naturally occurring communities.
Simulated data sets
Five data sets whose spatial characteristics were predetermined were also included for comparison. The artificial data sets (where n is the number of individuals and λ is the density, m^{-2}) had distributions that were Poisson (n = 100, λ = 1), uniform – regular lattice (n = 100, λ = 1), hexagonal – regular triangular (n = 100, λ = 0.9), first-order clumped (n = 100, λ = 1.1, number of offspring per parent (nop) = 10, clump radius (cr) = 0.5 m) and second-order clumped (n = 100, λ = 2.1, nop = 10, cr = 0.5 m). The Poisson or random pattern was created by generating the required number of random coordinates within the designated area. The uniform data set was generated by first dividing the area into a grid of rectangles equal in number to the population size; one population member was then randomly located within each grid cell. The hexagonal pattern was generated so that population members were located at the vertices of a lattice of equilateral triangles. For the clumped data sets, the required number of clump centers was randomly created within the designated area. In addition to the clump center point, offspring for the clumps were located within a designated radius from the parent, using coordinates randomly generated from a standard bivariate normal distribution. For the second-order clumping, the individuals in the clump are used as parent points. The two individuals of the subclumps include the parent plus offspring points, which are randomly generated from the standard bivariate normal distribution. The radius for the subclump is limited to half that for the clump. The second-order clumping approximates the situation that can occur with rodent damage in field crops.
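The four basic generators described above might be sketched as follows. The function names are hypothetical, and two details are assumptions rather than facts from the text: normal offsets are scaled by `sigma` (defaulting to half the clump radius) and offsets falling outside the radius are redrawn. Second-order clumping is not shown; it could be approximated by applying the clumped generator again with each first-order individual as a parent and half the radius.

```python
import math
import random

def poisson_pattern(n, width, height, rng):
    """n independent uniform random points in a width-by-height area."""
    return [(rng.uniform(0, width), rng.uniform(0, height)) for _ in range(n)]

def uniform_pattern(nx, ny, width, height, rng):
    """One randomly placed point in each cell of an nx-by-ny grid."""
    cw, ch = width / nx, height / ny
    return [((i + rng.random()) * cw, (j + rng.random()) * ch)
            for i in range(nx) for j in range(ny)]

def hexagonal_pattern(nx, ny, spacing):
    """Vertices of a lattice of equilateral triangles with side `spacing`."""
    row_height = spacing * math.sqrt(3) / 2
    return [(i * spacing + (j % 2) * spacing / 2, j * row_height)
            for j in range(ny) for i in range(nx)]

def clumped_pattern(n_parents, nop, radius, width, height, rng, sigma=None):
    """Random parents, each followed by `nop` offspring within `radius`.

    Offspring offsets are bivariate normal (scale sigma, an assumption)
    and are redrawn if they fall outside the clump radius. Offspring of
    parents near the edge may fall outside the area.
    """
    sigma = radius / 2 if sigma is None else sigma
    points = []
    for px, py in poisson_pattern(n_parents, width, height, rng):
        points.append((px, py))
        placed = 0
        while placed < nop:
            dx, dy = rng.gauss(0, sigma), rng.gauss(0, sigma)
            if math.hypot(dx, dy) <= radius:
                points.append((px + dx, py + dy))
                placed += 1
    return points
```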
Statistics
The relative root mean square error (RRMSE) was used as the basis of comparisons between the different PDEs [1,12], where I is the number of simulations (5000), D_{est }is the estimated density and λ is the true density in the population, such that:
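Written out with the symbols just defined, and assuming the conventional definition of a relative root mean square error, the statistic takes the form below. The subscript i on D_{est}, indexing the individual simulations, is notation added here.

```latex
\[
\mathrm{RRMSE} \;=\; \frac{1}{\lambda}
\sqrt{\frac{1}{I}\sum_{i=1}^{I}\left(D_{est,i}-\lambda\right)^{2}}
\]
```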
In addition, relative bias (RBIAS) shows the bias relative to the true density and the direction of that bias such that:
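Both summary statistics can be computed from the list of per-simulation estimates as in the sketch below. It assumes the conventional definitions, RBIAS as the mean of (D_est − λ)/λ and RRMSE as the root mean square of the same relative errors; the function names are hypothetical.

```python
import math

def rbias(estimates, true_density):
    """Relative bias: mean of (D_est - lambda) / lambda over simulations.

    Positive values indicate overestimation, negative underestimation.
    """
    n = len(estimates)
    return sum((d - true_density) / true_density for d in estimates) / n

def rrmse(estimates, true_density):
    """Relative root mean square error over simulations."""
    n = len(estimates)
    mse = sum((d - true_density) ** 2 for d in estimates) / n
    return math.sqrt(mse) / true_density
```

For example, estimates of 1.0 and 3.0 against a true density of 2.0 give a relative bias of zero but an RRMSE of 0.5, which is why the two statistics are reported side by side.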
The R index, [13], was calculated for all data sets (Table 3) including examples of simulated distributions such that:
Table 3. R index, standard error of expected mean, s, and z statistic [13] for the data sets used. When the pattern is entirely random R = 1, if the events are uniform then R > 1 (R = 2.149 for a perfect hexagonal uniform distribution) and conversely when the population of events is clumped R < 1 (R approaches 0 for maximally clumped distribution). The z test statistic considers the null hypothesis that the spatial distribution is random. Data sets comparable to those generated in [1] in italics.
where R_{O }is the average observed nearest neighbor distance, r_{i }is the nearest neighbor distance to the i^{th }sample point and n is number of nearest neighbor distances measured;
where R_{E }is the expected nearest neighbor distance for a random pattern of events;
R was calculated for the complete data set less a 10% buffer. When the pattern is entirely random R = 1; if the events are uniform then R > 1 (R = 2.149 for a perfect hexagonal uniform distribution); conversely, when the population of events is clumped R < 1 (R approaches 0 for maximally clumped distributions). A z test statistic was calculated to measure the difference between the observed and expected mean nearest neighbor distances (R_{O} and R_{E}), i.e. it tests the null hypothesis that the spatial distribution is random.
where se is the standard error of R_{E}
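For reference, the quantities described in this subsection are conventionally written as follows in the nearest neighbor method of [13]. This is a restatement under the assumption that the standard forms were used, with λ the density of events and n the number of nearest neighbor distances measured.

```latex
\[
R = \frac{R_O}{R_E}, \qquad
R_O = \frac{1}{n}\sum_{i=1}^{n} r_i, \qquad
R_E = \frac{1}{2\sqrt{\lambda}}
\]
\[
z = \frac{R_O - R_E}{se}, \qquad
se = \frac{0.26136}{\sqrt{n\lambda}}
\]
```

As a check on these forms, a hexagonal lattice with spacing s has density 2/(√3 s²), so R_E ≈ 0.4653 s and R = s / R_E ≈ 2.149, the value quoted above.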
A Spearman (rank) correlation coefficient was calculated between the log of (λ) and the log of D_{est }for AO3Q, BDAV3, KM2P and VAT across all natural data sets.
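A rank correlation of this kind can be computed without external libraries, as in the sketch below. The function names are hypothetical, and ties are handled by average ranks, which is an assumption about a detail the text does not state.

```python
import math

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[pos]]):
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in order[pos:end + 1]:
            ranks[k] = mean_rank
        pos = end + 1
    return ranks

def spearman(x, y):
    """Spearman coefficient: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)

def log_log_spearman(true_density, est_density):
    """Rank correlation between log(lambda) and log(D_est)."""
    return spearman([math.log(v) for v in true_density],
                    [math.log(v) for v in est_density])
```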
Results and Discussion
Interpretation of the performance of estimators based on relative root mean square error (RRMSE) (Table 4) and relative bias (RBIAS) (Table 5) was undertaken for estimators that were ranked highly by [1] (Table 1) for the natural and simulated data sets described in Tables 2 and 3. Complete results of the simulations are provided in Additional file 1.
Table 4. Mean relative root mean square error for 10, 25, 50 and 100 samples/simulation for each density estimator and each spatial pattern for the natural data sets (see Table 3)
Table 5. Mean relative bias for 10, 25, 50 and 100 samples/simulation for each density estimator for each spatial pattern (see Table 3)
An ideal estimator is one that is robust across many spatial patterns, i.e. RRMSE and RBIAS are low, and for which the amount of fieldwork required can be minimized or at least undertaken efficiently. Basic distance estimators were largely dismissed by [1] because they showed poor performance for clumped data sets. In this study, however, they performed much better than most other methods, with the exception of the angle-order estimators (Table 4). Across all data sets the compound estimator, BDAV3 (Figure 1), was the best-ranked method for sample sizes greater than 10 and performed well in terms of bias. BDAV3 was less suited to Poisson distributions. For these distributions the Kendall-Moran estimator (KM2P) was ranked first when sample size was 10 or 25, and the variable area transect (VAT) method was ranked first for sample sizes of 50 or 100. The highest ranked estimators for the clumped distribution were the two angle-order estimators, AO3Q (Figure 2) and AO2Q. The VAT performed moderately well overall and is far easier to implement in many situations.
Figure 2. Schematic representation of how AO3Q is implemented in the field. The order of the quadrants is arbitrary. In practice much time is spent deciding which is the third closest individual and in which quadrant an individual lies. R_{(3)ij} = the distance from the i^{th} sample point to the third CI for the j^{th} quadrant.
Absolute relative bias (i.e. regardless of sign) for the AO and BD estimators was an order of magnitude smaller than for the others for clumped data sets. However, AO estimators showed higher positive bias for Poisson data sets, compared with the near-zero bias of the others. In uniform data sets the OD and VAT estimators showed an RBIAS close to zero.
BDAV3 and KM2P use the same field methodology; however, data processing is much simpler for BD than for KM estimators. These estimators use information from the closest individual, the distance to its nearest neighbor and the distance to the second nearest neighbor, which may help to explain why they are robust across all spatial patterns studied here compared with estimators such as AO that rely on information derived from the closest individual.
Although the calculation for KM2P looks simple (Table 1, Figure 1), delineating search areas has to be done algorithmically when the number of samples is realistically large, and this difficulty needs to be considered beforehand. The KM calculation is suggested when the distribution is likely to be uniform. The AO3Q formula is simple to calculate, and the method is suited to situations where movement and visibility are good; it may not be suitable, for example, in crops where excessive movement would cause damage. The estimator with the lowest RRMSE for each data type for a sample size of 50 was: uniform – OD3C, Poisson – VAT, clumped – AO3Q, overall – BDAV3.
For uniform patterns the OD3C, VAT or KM2P methods were the most suitable; of these, the method of searching in VAT is the simplest to implement. The fieldwork required for BDAV3 and KM2P is the same, and although BDAV3 is much easier to calculate it is less able to cope with uniform data sets. The required sample size should be selected on a case-by-case basis using a pilot study. Accuracy will improve with larger sample sizes, and techniques that minimize the variance, such as stratified sampling and randomization, should be employed.
The VAT method would seem the most straightforward to utilize in most field situations, and under optimized sampling constraints the method holds promise for row crops [14]. In comparisons between the known density and the mean estimated density (Figure 3), the VAT had the lowest correlation coefficient of the four estimators tested in this way, although this was still 0.95. This suggests that ranking solely on RRMSE might lead one to favor methods that are difficult to implement in the field.
Figure 3. Correlation between mean density estimate against known density for all data sets. Line shows complete agreement between known and estimated density. Spearman's correlation coefficient shown in parentheses. Symbols denote spatial pattern of data set: Uniform – filled circle, Poisson – filled triangle, Clumped – open circle.
Furthermore, the present study aimed to examine PDE methods as originally presented, without attempting to improve performance through optimizing procedures. Thus we examined VAT sampling using g = 3. The number of individuals for which to search has been optimized, with substantial improvements in estimation quality for g ≥ 5 [4,10,14]. Other than the KM2P estimator, most other PDE forms offer an opportunity to improve estimation by optimizing the number of population members for which to search. [15] examined this for ordered distance estimation using simulated data sets, similar to the approach taken by [4]. Angle-order methods could be optimized for the number of individuals to search for in each sector, and for the number of sectors into which the search area around the random sampling point is divided.
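The role of g in VAT sampling can be illustrated with a short sketch. It assumes the form commonly given for the variable area transect estimator, (gN − 1) / (w Σ l_i), where l_i is the length searched from the i^{th} random point until the g^{th} event is found in a strip of width w; this should be checked against Table 1. The function names and the choice of a strip centered on the start point and searched in the +x direction are hypothetical.

```python
def vat_search_length(start, events, width, g):
    """Length searched along +x from `start` until the g-th event.

    Only events inside the strip of the given width, centered on the
    start point's y coordinate, are counted. Returns None if fewer
    than g events lie in the strip.
    """
    x0, y0 = start
    half = width / 2
    along = sorted(ex - x0 for ex, ey in events
                   if ex >= x0 and abs(ey - y0) <= half)
    if len(along) < g:
        return None  # transect ran out of events before finding g of them
    return along[g - 1]

def vat_density(lengths, width, g):
    """Assumed VAT estimator: (g*N - 1) / (w * sum of search lengths)."""
    n = len(lengths)
    return (g * n - 1) / (width * sum(lengths))
```

Raising g lengthens each search but pools more events per sample point, which is the trade-off examined in the optimization studies cited above.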
When damage is the event to be estimated and is caused by an animal that invades a crop or forestry coupe, it is usual to find the damage along the edge. Figures 4a-d show the diversity of spatial patterns exhibited in the data sets. Figure 4a shows the distribution of pocket gopher burrows with a uniform distribution, while Figure 4b shows an aggregated nesting pattern of waterfowl. Figure 4c shows a random pattern of rodent damage in rice, while Figure 4d shows highly clumped damage within a cornfield.
Figure 4. Examples of diversity of spatial patterns found. (a) uniform distribution of pocket gopher burrows; (b) aggregated nesting pattern of waterfowl; (c) random pattern of rodent damage in rice; (d) highly clumped damage within a cornfield.
Typically the data sets of damage were clumped; however, random and uniform patterns were also found for data sets that mapped the distribution of burrows or nest sites. It is a characteristic of field data that the spatial pattern can vary within the study area. This was demonstrated by recalculating the R index for regions within the Corn 2 data set (Figure 5, Table 6). It is therefore advisable to investigate the spatial pattern present as part of any preliminary study, using either the R index [13] or the Hopkins and Skellam index [16], with blocking to detect regions of clumping, as it is this spatial pattern that causes the greatest problems with many estimators. The latter index is probably more applicable for field studies as it does not require an estimate of density beforehand. Where clumping is present, angle-order methods should be used.
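Recalculating the R index block by block, as suggested above, might look like the following sketch for a fully mapped data set. It assumes the standard nearest neighbor forms (R = R_O / R_E, R_E = 1 / (2√λ), se = 0.26136 / √(nλ)), takes the density of each block from its own count and area, and ignores edge effects at block boundaries. All function names are hypothetical.

```python
import math

def nearest_neighbor_distances(events):
    """Distance from each event to its nearest other event."""
    out = []
    for i, (x, y) in enumerate(events):
        out.append(min(math.hypot(x - u, y - v)
                       for j, (u, v) in enumerate(events) if j != i))
    return out

def clark_evans(events, area):
    """Return (R, z) for the events mapped within the given area."""
    n = len(events)
    density = n / area
    r_obs = sum(nearest_neighbor_distances(events)) / n
    r_exp = 1 / (2 * math.sqrt(density))
    se = 0.26136 / math.sqrt(n * density)
    return r_obs / r_exp, (r_obs - r_exp) / se

def r_index_by_block(events, bounds, nx, ny):
    """R and z for each cell of an nx-by-ny grid over bounds.

    bounds = (xmin, xmax, ymin, ymax). Blocks with fewer than two
    events are skipped because no nearest neighbor distance exists.
    """
    xmin, xmax, ymin, ymax = bounds
    bw, bh = (xmax - xmin) / nx, (ymax - ymin) / ny
    blocks = {}
    for x, y in events:
        i = min(int((x - xmin) / bw), nx - 1)
        j = min(int((y - ymin) / bh), ny - 1)
        blocks.setdefault((i, j), []).append((x, y))
    return {key: clark_evans(pts, bw * bh)
            for key, pts in blocks.items() if len(pts) >= 2}
```

Blocks whose R falls well below 1 would flag the clumped regions that cause the greatest problems for many estimators.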
Conclusion
Plotless density estimators can provide an estimate of density in situations where it would not be practical to lay out a plot or quadrat and can in many cases reduce the workload in the field.
Authors' contributions
NAW ran the simulations and, with RME and HWK, drafted and finalised the manuscript. RTS developed the original Fortran code. All authors read and approved the final manuscript.
Acknowledgements
The authors wish to thank L. F Pank, R M Anthony and E Benigo for providing some of the field data sets and R K Schumacher and P Hallgren for their helpful comments on an earlier draft of the manuscript. The authors wish to thank the three anonymous referees for their comments and suggestions. This work was originally supported by the Queensland University of Technology.
References

Engeman R, Sugihara R, Pank L, Dusenberry W: A comparison of plotless density estimators using Monte Carlo simulation.
Ecology 1994, 75:1769-1779.

Steinke I, Hennenberg KJ: On the power of plotless density estimators for statistical comparisons of plant populations.
Can J Bot 2006, 84(3):421-432.

Engeman R, Sugihara : Optimization of variable area transect sampling using Monte Carlo simulation.

Kendall M, Moran P: Geometrical Probability. London: Griffin; 1963.

James I: A computer study of corrected density estimators for distance sampling of nonrandom populations. In Diploma of agricultural science. Massey University, Palmerston North, New Zealand; 1971.

Morisita M: A new method for the estimation of density by spacing method applicable to nonrandomly distributed populations.
Physiol Ecol 1957, 7:134-144.
[In Japanese. Available as Forest Service translation number 11116, USDA Forest Service, Washington, D.C., USA]

Pollard J: On distance estimators of density in randomly distributed forests.
Biometrics 1971, 27:991-1002.

Parker K: Density estimation by variable area transect.
J Wildl Manag 1979, 43:484-492.

Engeman RM, Nielson RM, Sugihara RT: Evaluation of optimized variable area transect sampling using totally enumerated field data sets.
Environmetrics 2005, 16(7):767-772.

Bratley P, Fox B, Schrage L: A guide to simulation. New York: Springer-Verlag; 1983.

Patil S, Burnham K, Konover J: Nonparametric estimation of plant density by the distance method.
Biometrics 1979, 35:597-604.

Clark P, Evans F: Distance to nearest neighbor as a measure of spatial relationships.
Ecology 1954, 35:445-453.

Engeman R, Sterner R: A comparison of potential labor-saving sampling methods for assessing large mammal damage in corn.
Crop Prot 2002, 21:101-105.

Nielson R, Sugihara R, Boardman T, Engeman RM: Optimization of ordered distance sampling.
Environmetrics 2004, 15:119-128.

Hopkins B, Skellam J: A new method for determining the distribution pattern of plant individuals.

Seber G: The Estimation of Animal Abundance and Related Parameters. 2nd edition. London: Griffin; 1982.