Abstract
Background
Network analyses of plant-animal interactions hold valuable biological information. They are often used to quantify the degree of specialization between partners, but are usually based on qualitative indices such as 'connectance' or number of links. These measures ignore interaction frequencies or sampling intensity, and strongly depend on network size.
Results
Here we introduce two quantitative indices using interaction frequencies to describe the degree of specialization, based on information theory. The first measure (d') describes the degree of interaction specialization at the species level, while the second measure (H_{2}') characterizes the degree of specialization or partitioning among two parties in the entire network. Both indices are mathematically related and derived from Shannon entropy. The species-level index d' can be used to analyze variation within networks, while H_{2}' as a network-level index is useful for comparisons across different interaction webs. Analyses of two published pollinator networks identified differences and features that have not been detected with previous approaches. For instance, plants and pollinators within a network differed in their average degree of specialization (weighted mean d'), and the correlation between specialization of pollinators and their relative abundance also differed between the webs. Rarefied sampling effort in both networks and null model simulations suggest that H_{2}' is not affected by network size or sampling intensity.
Conclusion
Quantitative analyses reflect properties of interaction networks more appropriately than previous qualitative attempts, and are robust against variation in sampling intensity, network size and symmetry. These measures will improve our understanding of patterns of specialization within and across networks from a broad spectrum of biological interactions.
Background
The degree of specialization of plants or animals has been studied and debated extensively, and a continuum from complete specialization to full generalization can be found in various systems [1-6]. In general, two levels of specialization measures may be distinguished: first, the characterization of focal species and, second, the degree of specialization of an entire interaction network, representing an assemblage of species and their interaction partners (e.g. food webs, mutualistic networks, predator-prey relationships). When interactions are considered as an ecological niche, the first level describes the niche breadth of a species and the second level the degree of niche partitioning across species. While the species level is more straightforward in its biological interpretation, analyses at the network level can be useful for comparisons across different types of networks. Such analyses have been performed to compare plant-pollinator webs versus plant-seed disperser webs [4,5], different plant-pollinator networks along geographic gradients [1,7,8], or food webs of variable size [9,10]. Entire network analyses are also used to study patterns on a community level such as coevolutionary adaptations [3], ecosystem stability or resilience [11-14].
Quantifying specialization at the species level
Specialization or generalization of interactions is most commonly characterized as the number of partners (or 'links'), e.g. the number of pollinator species visiting a flowering plant species or the number of food plant families a herbivore feeds upon. In this qualitative approach, interactions between a consumer and a resource species are only scored in a binary way as 'present' or 'absent', ignoring any distinction between strong interactions and weak or occasional ones. For example, binary representations of interactions do not distinguish a scenario where 99% of the individuals of a herbivore species feed on a single plant species only, but occasionally an individual is found on another plant, from a different scenario where a herbivore regularly feeds on both food plants. The problem is analogous to the measurement of biodiversity either as crude species richness or as a more elaborate diversity index including relative abundances [15]. Several approaches have thus been used to directly include variation in interaction frequencies (i.e., their evenness) in characterizing the diversity of partners, e.g. Simpson's diversity index for pollinators [16,17] or Lloyd's index for host specificity [18]. Alternatively, other studies indirectly controlled for abundance or sampling intensity using rarefaction methods [13,19]. Correspondingly, Bersier and coworkers [20] have suggested quantifying the diversity of biomass flows in food webs using a Shannon diversity measure. Niche breadth theory provides several additional indices that include some measure of resource frequency or resource use intensity [21], which can be viewed in analogy to 'partner diversity' in the context of association networks. However, Hurlbert [22] emphasized that not only proportional utilization, but also the proportional availability of each niche should be taken into account.
A species that uses all niches in the same proportion as their availability in the environment should be considered more opportunistic than a species that uses rare resources disproportionately more. If variation in resource availability is large, diversity-based measures that ignore this availability may be highly misleading [22,23]. Several niche breadth measures thus combine proportional resource utilization with proportional resource availability [22-24]. These concepts have rarely been applied in the context of species interaction networks, e.g. plant-pollinator webs, where binary data are more common than quantitative webs.
Quantifying specialization at the community level
The measurement used most commonly to characterize community-wide specialization is the 'connectance' index (C) [1,4,8-10,25-27]. C is defined as the proportion of the actually observed interactions to all possible interactions. Consider a contingency table showing the association between two parties, with r rows (e.g., plant species) and c columns (e.g., pollinators). Connectance is defined as C = I/(r·c), with I being the total number of non-zero elements in the matrix. Therefore, like the number of partners or links (L) described above, C uses only binary information and ignores interaction strength. C is directly related to the mean number of links (\bar{L}) of plant species or pollinator species as C = \bar{L}_{plants}/c = \bar{L}_{poll}/r.
This measure, \bar{L}, has also been used to compare networks [1,3,7,8,28]. Recently, it has been suggested to use \bar{L} instead of C to characterize networks [29]. However, note that comparisons across networks of different size (number of species) are problematic, since \bar{L}, unlike C, is not scaled according to the number of available partners (see also [2,10]). A given \bar{L} in a small network may represent a larger proportion of available partners than the same value of \bar{L} in a large network.
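The binary measures described above can be sketched in a few lines of Python. This is only an illustration: the function names and the small visit matrix `web` are hypothetical and not taken from the source; the formulas are C = I/(r·c) and the relation between C and the mean number of links stated in the text.

```python
def count_links(matrix):
    """Number of non-zero cells (I) in an r x c interaction matrix."""
    return sum(1 for row in matrix for a in row if a > 0)


def connectance(matrix):
    """C = I / (r * c): realized links as a proportion of possible links."""
    r, c = len(matrix), len(matrix[0])
    return count_links(matrix) / (r * c)


def mean_links_rows(matrix):
    """Unweighted mean number of links per row species (e.g. plants)."""
    return count_links(matrix) / len(matrix)


def mean_links_cols(matrix):
    """Unweighted mean number of links per column species (e.g. pollinators)."""
    return count_links(matrix) / len(matrix[0])


# Hypothetical visit counts: 3 plant species (rows) x 4 pollinator species (columns).
web = [
    [10, 0, 2, 0],
    [0, 5, 0, 0],
    [1, 0, 0, 7],
]

C = connectance(web)                # 5 links out of 12 possible cells
L_plants = mean_links_rows(web)     # equals c * C
L_poll = mean_links_cols(web)       # equals r * C
```

Note that the cell with 10 visits and the cell with a single visit each count as exactly one link, which is the loss of information the text criticizes.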
Analyses based on binary data, both at the species and the community level, have obvious shortcomings, since they are highly dependent on sampling effort, decisions about which species to include, and the size of investigated networks. Several authors thus emphasized the need to move beyond binary representations of interactions to quantitative measures involving some measure of interaction strength [4,20,27,29-32]. A way to at least partly overcome these deficiencies is to cut off all rare species or weak interactions below a frequency threshold [3,9,33,34] or to control for sampling effort in null models [7,8,13,19,25,35]. However, for interaction webs where more detailed information is available, simplification to binary data as in C or \bar{L} remains unsatisfactory. Conveniently, the observed interaction frequency may represent a meaningful surrogate for interaction strength, at least in pollination and seed-dispersal systems as shown by Vázquez et al. [30] (see also [16]). Incorporating interaction frequency or even a direct measure of interaction strength in a network measure of specialization would thus represent the important progress frequently called for.
A severe additional problem of connectance is that its lower and upper constraints are not scale-invariant [25], which limits its use for comparisons across networks. The minimum possible value (C_{min}) to maintain at least one link per species declines in a hyperbolic function with the number of interacting species, since C_{min} = max(r, c)/(r·c), and an upper limit (C_{max}) may be constrained by, or a function of, total sampling effort. Across networks, C decays strongly with network size, which has been debated in detail in the context of food web analysis [9,10,26,27,36,37]. The strong relationship between C and network size generates a problem for disentangling any biologically meaningful effect from this mathematically inherent scale dependence. For instance, network comparisons may focus on residual variation in C after an average effect of network size has been controlled for [1,4], or C could be rescaled to account for this size effect (see [25,36]). For natural networks of similar size, the range of actual C values is typically very narrow [4], thus other structural forces may be poorly detectable.
The objective of this paper is to develop and discuss specialization measures that are based on frequency data and thus account for sampling intensity, and that overcome the problem of scale dependence. We then test these approaches by evaluating the effect of sampling effort and scale dependence on a published natural pollination network, and on randomly generated associations as a null model. We differentiate between species-level measures of specialization, useful to investigate variability among species within a web, and a single network-wide measure that can be used for comparisons across networks.
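The hyperbolic decline of the lower bound can be made concrete with a short calculation. The sketch below simply evaluates C_{min} = max(r, c)/(r·c) for a few matrix sizes; the function name and the chosen sizes are illustrative assumptions, not part of the source.

```python
def min_connectance(r, c):
    """Smallest connectance that still leaves every species with at least one link."""
    if r < 1 or c < 1:
        raise ValueError("r and c must be positive")
    return max(r, c) / (r * c)


# Square matrices of growing size: C_min falls as 1/n.
square_sizes = [2, 5, 10, 50, 200]
c_min_square = [min_connectance(n, n) for n in square_sizes]

# A strongly asymmetric matrix (200 rows, 20 columns).
c_min_rect = min_connectance(200, 20)

for n, value in zip(square_sizes, c_min_square):
    print(f"{n} x {n}: C_min = {value:.4f}")
```

Because the floor itself moves with r and c, two networks with identical C can sit at very different distances from their respective minima.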
Results
Patterns in two pollinator networks
Two selected plant-pollinator networks (British meadows studied by Memmott [32], Argentinean forests studied by Vázquez and Simberloff [33]) differ markedly in their degree of specialization when quantitative analyses are applied. The qualitative network index, connectance, is similar in both interaction webs (British web: C = 0.15, Argentinean web: C = 0.13). However, frequencies of pollinator visits are much more evenly distributed in the British community than in the Argentinean example. In the British web, the interaction between a dipteran species and Leontodon hispidus was the most frequent one, representing 6% of the total 2183 interactions observed. In the Argentinean network, visits of Aristotelia chilensis by a colletid bee species alone represented 20% of the 5285 interactions. Interactions between the top five plant and top five pollinator species made up 44% of the interactions in the British web, but 74% in the Argentinean web. This difference in the heterogeneity of interaction frequencies is not evident in measures based on binary information such as number of links (L) or connectance (C). In contrast, the degree of specialization shown by the frequency-based index H_{2}' (standardized two-dimensional Shannon entropy, see Methods: Network-level index) is much lower in the British community (H_{2}' = 0.24) than in the Argentinean community (H_{2}' = 0.63).
The variation of species-level specialization measures (standardized Kullback-Leibler distance, d') holds valuable information for the structural properties of a network (see Methods: Species-level index). The British pollination web is dominated by highly generalized pollinators (low d', both in terms of individuals as well as species), while putative specialists are represented by very few individuals and species (Fig. 1A). In contrast, most pollinators in the Argentinean web are moderately generalized to specialized, with the second highest level of specialization found in the most common species (Fig. 1B). Consequently, the weighted mean degree of specialization is much lower in the former web (<d'_{poll}> = 0.16) than in the latter (<d'_{poll}> = 0.54). The relationship between specialization of species i (d'_{i}) and its interaction frequency (A_{i}) across the pollinator species differs between the two webs. In the British web, d'_{i} and A_{i} were not correlated significantly (Spearman's r_{s} = 0.08, p = 0.46), while a highly positive correlation was found in the Argentinean web (r_{s} = 0.65, p < 0.0001). Note that assigning any specialization index to a species i that is represented by only a single individual may be problematic. However, significances in the above correlations remain unaffected when pollinators with one single interaction are excluded. From the plants' point of view, the species in Memmott's web are also more generalized in terms of their pollinator spectrum (Fig. 1C) than the plants studied by Vázquez and Simberloff (Fig. 1D). The respective weighted means are <d'_{plants}> = 0.27 and <d'_{plants}> = 0.53. No significant correlation was found between the plants' frequency and specialization in either web (both p ≥ 0.16). Interestingly, plants were on average more specialized than pollinators in the British web (<d'_{plants}> > <d'_{poll.}>), but not in the Argentinean web.
This distinction is not found when only the weighted mean number of links (L) is examined, since <L_{plants}> is much greater than <L_{poll.}> in both networks. The difference in <L> may be driven by the highly asymmetrical matrix architecture in both webs, where the number of pollinator species greatly exceeds the number of plant species. The unweighted mean \bar{L} is even directly linked to the matrix architecture (i.e., number of rows and columns, r and c) by a constant (connectance C), since \bar{L}_{r} = c·C and \bar{L}_{c} = r·C. In contrast, the matrix asymmetry does not affect d' (see also below, Null model patterns).
Figure 1. Patterns within pollinator networks. Frequency distribution of the species-level specialization index (d') for pollinators and plants from two published networks, one from Britain [32] and one from Argentina [33]. Bars show the number of individuals in each category (label '0' defines 0.00 ≤ d' < 0.05, etc.). Bars are separated for different species, and total number of species in each category is given on top. Arrows indicate cases where bars are invisible due to low numbers of individuals.
Simulation of sampling effort
In order to test whether specialization estimates are dependent on sampling and scale effects, we simulated a decreased sampling intensity in both networks using rarefaction (see Methods: Simulation of sampling effort and matrix architecture). In both networks, H_{2}' is robust and already very well estimated by a small fraction of the interactions sampled (Fig. 2). The coefficient of variation of H_{2}' remains below 5% from about half of the total number of visits onwards in the British web and even at one-tenth of the total sampling effort of the Argentinean web. The estimation of connectance (C) is also relatively stable at least in the Argentinean web, although it shows a positive trend across sampling effort in the British web (Fig. 2). These findings suggest that network-wide measures of specialization, particularly H_{2}', do not necessarily require a very large or even complete association matrix, but can also be very well estimated from a smaller representative subset as long as there is no systematic sampling bias.
Figure 2. Sampling effect in pollinator networks. Rarefaction of sampling effort in a British and an Argentinean pollination web [32,33]. Two network-level measures of specialization, the frequency-based specialization index (H_{2}') and the 'connectance' index (C), are shown for networks in which the total number of interactions (m) has been reduced by randomly deleting interactions. Black dots show the effect of sampling effort for the original association matrix, gray dots the effect for a null model, i.e. five networks in which partners were randomly associated (same row and column totals as in the original matrix).
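Rarefaction by randomly deleting interactions, as described for Fig. 2, can be sketched as follows. This is one plausible reading and not the authors' program: each observed interaction is treated as a separate event, and a subset of k events is drawn without replacement. The function name `rarefy`, the fixed seed and the small matrix are hypothetical.

```python
import random


def rarefy(matrix, k, rng):
    """Keep k randomly chosen interaction events; return the reduced matrix."""
    # Expand counts into one (row, column) record per observed interaction.
    events = [
        (i, j)
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
        for _ in range(count)
    ]
    if not 0 <= k <= len(events):
        raise ValueError("k must lie between 0 and the total number of interactions")
    kept = rng.sample(events, k)  # sampling without replacement
    reduced = [[0] * len(matrix[0]) for _ in matrix]
    for i, j in kept:
        reduced[i][j] += 1
    return reduced


# Hypothetical web with m = 25 interactions, rarefied to 12.
web = [
    [10, 0, 2, 0],
    [0, 5, 0, 0],
    [1, 0, 0, 7],
]
rng = random.Random(42)  # fixed seed only to make the illustration repeatable
half_web = rarefy(web, 12, rng)
```

Rare species can drop out entirely at low k, leaving empty rows or columns; whether such species are removed before computing an index is a separate decision that this sketch does not make.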
Null model patterns
The degree of specialization can be further characterized by comparison with a null model. The null model used here is that each species has a fixed total number of interactions (given by the observed association matrix), but interactions are assigned randomly. In the above pollinator networks, random associations yield a specialization index H_{2}' that remains close to zero for almost the entire range of sampling intensity, while connectance (C) shows a positive trend over the total number of interactions (m) (Fig. 2). Therefore, H_{2}' derived from real networks may typically be clearly distinguished from this null model, while the comparison of C is complicated by scale dependence and the relatively large values yielded by the null model.
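A null matrix of this kind, with every row and column total held fixed but partners assigned at random, can be generated by a simple label shuffle. The sketch below is an assumption about how such a matrix could be produced, not the algorithm used in the source (which is described in its Methods); the function name and the totals are hypothetical.

```python
import random


def random_web(row_totals, col_totals, rng):
    """Random association matrix that preserves all row and column totals."""
    if sum(row_totals) != sum(col_totals):
        raise ValueError("row and column totals must sum to the same m")
    # One label per interaction on each side, e.g. [0, 0, 0, 1, 1, 2, ...].
    row_labels = [i for i, total in enumerate(row_totals) for _ in range(total)]
    col_labels = [j for j, total in enumerate(col_totals) for _ in range(total)]
    rng.shuffle(col_labels)  # random pairing of the two sides
    matrix = [[0] * len(col_totals) for _ in row_totals]
    for i, j in zip(row_labels, col_labels):
        matrix[i][j] += 1
    return matrix


# Hypothetical marginal totals (m = 25 on both sides).
row_totals = [12, 5, 8]
col_totals = [11, 5, 2, 7]
null_web = random_web(row_totals, col_totals, random.Random(1))
```

Shuffling only one side is sufficient, because pairing a fixed list with a uniformly shuffled list already yields a uniformly random matching of interaction events.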
Simulations of artificially generated random associations (see Methods: Simulation of sampling effort and matrix architecture) confirm that the network-level specialization index H_{2}' is largely unaffected by network size (Fig. 3A), network architecture (Fig. 3B) or total number of interactions (m) for a fixed matrix size (Fig. 3C). For random associations as shown here, H_{2}' is usually close to zero. Connectance values (C) of random matrices show the known hyperbolic function over the number of associated species (Fig. 3A), change with matrix asymmetry (Fig. 3B) and increase strongly with increasing m (Fig. 3C). For specialization measures at the species level, the average number of links per species (\bar{L}) increases strongly with network size, number of available partners, and m (Fig. 3). While other niche breadth measures may also show some variation across different network scales (not shown), the weighted mean Kullback-Leibler distance <d'> is hardly affected by network size, network asymmetry, and number of interactions (Fig. 3). Both H_{2}' and d' may thus be appropriate for comparisons across matrices of different scale.
Figure 3. Simulated random networks. Behavior of specialization measures in simulated random networks. Each point represents one matrix with random associations, based on specific row and column totals that follow a lognormal distribution. The size of square matrices in (A) increased from 2 × 2 to 200 × 200. In (B), only the number of rows changed, while the number of columns was fixed at 20; rectangular matrices thus increased from 2 × 20 to 200 × 20. In (C), the network size was fixed at 20 × 20. The total number of interactions (m) increased with matrix size in (A), where each species had on average 20 individuals. In (B), m was fixed at 4000, resulting in a reduced interaction density for larger matrices. In (C), m increased from 20 to 4000. The index H_{2}' and connectance C are specialization measures of the whole matrix and thus reciprocal, while the average number of links (\bar{L}) and the weighted mean standardized Kullback-Leibler distance (<d'>) are given for all columns (rows give a similar pattern).
Discussion
Properties of specialization measures
The suggested indices, d' and H_{2}', quantify the degree of specialization of elements within an interaction network and of the entire network, respectively. While the number of links (L) and connectance (C) represent species-level and community-level measures of interactions based on binary data, respectively, d' and H_{2}' represent corresponding measures for frequency-based data. The need to include information on interaction strength or interaction frequency into network analyses has been emphasized by various authors [4,20,27,30,31,38]. Parallel to earlier advances in diversity measures compared to species richness, quantitative network measures account for the heterogeneity in link strength rather than assigning equal weights to every link. Moreover, we have shown that d' and H_{2}' are largely robust against variation in matrix size, shape, and sampling effort. In several cases, C may be strongly affected by sampling effort [25,27], while H_{2}' remained largely unchanged in simulations of random associations over a range of network sizes, variable network asymmetries, and number of interactions. This scale invariance suggests that both d' and H_{2}' can be used directly for comparisons across different networks, while comparisons of L and C are more problematic [1,35].
Quantitative methods like the indices suggested here also allow a more detailed analysis of interaction patterns within and across networks. Fruitful areas include comparisons of networks across different interaction types [4], biogeographical gradients [1], biodiversity and land use gradients [13], robustness of networks against extinction risks [39], asymmetries between plants and animals [38], and relationships between specialization and abundance [35]. While a comparison of the average number of partners between plants versus animals is solely dependent on the matrix architecture (i.e., the number of rows r versus columns c, since \bar{L}_{plants} = c·C and \bar{L}_{poll} = r·C), this limitation does not apply to d'. In the two selected pollinator webs, plants are either similarly or more specialized than pollinators in regard to weighted mean d'. This allows a scale-independent evaluation of asymmetries in the degree of specialization between partners (see also [38]). Moreover, Vázquez and Aizen [35] noted that the number of links of a species (L_{i}) is strongly positively correlated with its overall frequency (A_{i}) in five pollination networks including the datasets analyzed above. They argued that this apparent higher generalization of common plants and common pollinators may be largely explained by null models, calling for an improved measurement of specialization. Our results for the correlation between d'_{i} and A_{i} in two pollinator webs suggest that the relationship between specialization and abundance may be more variable, and even positive as in the Argentinean network.
Caveats
Some problems apply to any measure of network analyses including the proposed indices. Measures of specialization mostly ignore phylogenetic relationships or ecological similarity within an association matrix. For example, a plant species that is pollinated by multiple moth species may be unsuitably regarded as more generalized than a plant pollinated by few insect species comprising several different orders [40]. In addition, the fact that herbivores are commonly specialized on host plant families rather than species may skew network patterns if not carefully accounted for. A first approach to investigate such effects may be to compare the level of specialization after a stepwise reduction of the matrix by pooling species to higher taxonomic units, such as genera, families, and orders. For known phylogenies, more advanced techniques for analyses with a particular evolutionary focus are available [41-43]. Another deficiency may be that species or their partners are all given the same individual 'weight' in the analyses, whether they may be small bees or large bats visiting a small herb with little nectar or a mass flowering tree. Null models as in the calculation for both C and H_{2}' imply that all individuals can be shifted around between resources in the same way, irrespective of their size or non-fitting parameters. The role of 'forbidden links' as constraints to network analyses has been discussed elsewhere [44,45]. Similarly, calculations of d' or other niche breadth measures are based on the implicit assumption that each species adjusts its interactions according to the availability of partners (niches), irrespective of morphological or behavioral constraints. Moreover, if data are collected from a large heterogeneous habitat or over a prolonged time period, calculations of the degree of specialization may be severely constrained by the spatiotemporal overlap or non-overlap between partners for other reasons than resource preferences, e.g. when not all species are able to reach all sites in the same way, or when some resources and consumers have asynchronous phenologies. Consequently, network analyses as suggested here will be most useful to study resource-consumer partitioning within a short time frame and limited spatial scale.
For both indices d' and H_{2}', we proposed above to use the total number of interactions for each species as a measure of partner availability (q_{j}) and as constraint for standardization (fixed row and column totals). It may be debated whether independent measures of plant and animal abundances could be more appropriate than using interaction frequency data as such. However, apart from the fact that such abundance data barely exist for most networks, note that the actual number of interactions often more suitably reflects resource availability and consumer activity than an independent measure of species abundance. For instance, a flower of one species may have a much higher nectar production than another and consequently receive a higher number of visitors, while the local abundance of the plant species does not reflect such differences in resource quality and/or quantity. Both d' and H_{2}' thus focus on the actual partitioning between the interacting species. In studies where detailed knowledge or theoretical assumptions about resources (availability and quality) or consumers (activity density and consumption rate) are available or under experimental control, such data may be incorporated into the analysis (defining q_{j} and constraints) instead of interaction frequencies. The constraint of fixed row and column totals has been debated elsewhere in the context of species co-occurrence patterns, where it was found to be most appropriate in null model comparisons, although critics have argued earlier that these marginals themselves may already reflect competitive interactions ([46] and references therein). Any approach to compare networks based on fixed marginals for standardization will fail to detect potentially meaningful patterns displayed by these architectural features, namely the number of resource and consumer species and the heterogeneity of total interaction frequencies.
This network architecture may already be shaped by past competitive interactions or indicate fundamental constraints, a largely unexplored hypothesis that merits additional investigations.
It should also be emphasized that analyses of frequency data may be susceptible to pseudoreplication of repeated associations of the same individuals or close associations derived from a single dispersal event (e.g. a social insect colony, aggregating individuals, multiple offspring from a single egg cluster, or monospecific plant clusters). These may lead to an overestimation of specialization. To be more meaningful on a population level, frequency analyses should thus be based on spatially independent association replicates. Note that all species-wise specialization measures such as d' are sensitive to the behavior of the other species. Any systematic sampling bias (e.g. a taxonomic focus within a guild) will therefore affect the conclusions of comparisons within or across networks.
Conclusion
In accordance with previous calls [4,20,27,30,31,38], we suggest that the explicit inclusion of frequency data represents an important step forward in network analyses, as too many assumptions are implicit in any measure based on binary representation. Most notably, connectance and 'number of partners' imply an equal availability of all partners, an unlikely scenario. Qualitative indices are not robust against sampling effort. In contrast, the proposed quantitative measures based on interaction frequencies explicitly account for this source of variation. Our study suggests that d' and H_{2}' represent scale-independent and meaningful indices to characterize specialization on the level of single species and the entire network, respectively. These novel indices allow us to investigate patterns within and across networks that have not been detected with qualitative measures, such as correlations with species frequencies, network size, and asymmetries in specialization between partners. Recently, Bascompte et al. [38] showed that the incorporation of frequency data may unveil pervasive asymmetries within networks. Particularly since Vázquez et al. [30] demonstrated that interaction frequencies in plant-pollinator and plant-seed disperser systems often correlate with the magnitude of mutualistic services for the plant (although variation in pollinator effectiveness can be important, see [47]), an increased collection of frequency data and appropriate quantitative analyses would greatly benefit future network studies.
Methods
Species-level index
As a species-level measure of 'partner diversity', we propose the Kullback-Leibler distance (or Kullback-Leibler divergence, relative entropy) in a standardized form (d'). Coming from information theory, this index quantifies the difference between two probability distributions [48]. While the standardized Hurlbert's and Smith's measures of niche breadth could be used alternatively [21,22,24], d' has some advantages in the context of networks. While all three indices regard an exclusive pairing between two species as a high degree of specialization as long as interactions between the two partners are infrequent, Hurlbert's and Smith's indices show an undesired trend towards full generalization when the number of interactions between the two partners increases, although this should be considered a stronger indication of specialization (see below, Properties of alternative niche breadth measures). The interaction between two parties is commonly displayed in an r × c contingency table, with r rows representing one party such as flowering plant species, and c columns representing the other party such as pollinator species. In each cell, the frequency of interaction between plant species i and pollinator species j (or another useful measure of interaction strength) is given as a_{ij} (Table 1).
Instead of frequencies (a_{ij}), each interaction can be assigned a proportion of the total (m) as
$$p_{ij} = \frac{a_{ij}}{m}$$
Table 1. Elements in a species association matrix. Interaction frequencies (a_{ij}) between c animal and r plant species and their respective totals (rows: A_{i}, columns: A_{j}, total elements: m).
Let p'_{ij} be the proportion of the number of interactions (a_{ij}) in relation to the respective row total (A_{i}), and q_{j} the proportion of all interactions by partner j in relation to the total number of interactions (m). Thus,
$$p'_{ij} = \frac{a_{ij}}{A_{i}}, \qquad q_{j} = \frac{A_{j}}{m}$$
To quantify the specialization of a species i, the following index d_{i} is suggested. This d_{i} is related to Shannon diversity, similar to an index recently suggested to characterize biomass flow diversity in food webs [20]. However, an appropriate index in this context should not only consider the diversity of partners, but also their respective availability (see [22]). Consequently, the following index compares the distribution of the interactions with each partner (p'_{ij}) to the overall partner availability (q_{j}). The Kullback-Leibler distance for species i is denoted as
$$d_{i} = \sum_{j=1}^{c} p'_{ij} \ln \left( \frac{p'_{ij}}{q_{j}} \right)$$
which can be normalized as
$$d'_{i} = \frac{d_{i} - d_{min}}{d_{max} - d_{min}}$$
The theoretical maximum is given by d_{max }= ln (m/A_{i}), and the theoretical minimum (d_{min}) is zero for the special case where all p'_{ij }= q_{j}. However, a realistic d_{min }may be constrained at some value above zero given that p'_{ij }and q_{j }are calculated from discrete integer values (a_{ij}). To take this into account, d_{min }is more suitably computed algorithmically as in a program available from the authors and online [49], providing all d' for a given matrix. This standardized KullbackLeibler distance (d') ranges from 0 for the most generalized to 1.0 for the most specialized case. Thus, d' can be interpreted as deviation of the actual interaction frequencies from a null model which assumes that all partners are used in proportion to their availability. An average degree of specialization among the species of a party can be presented as a weighted mean of the standardized index, e.g. <d'_{i}> for pollinators as
While <d'_{i}> usually differs from <d'_{j}>, the weighted means of the non-standardized Kullback-Leibler distances are the same for both parties, hence <d_{i}> = <d_{j}>.
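The species-level calculation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: d_{min} is set to its theoretical value of zero rather than computed algorithmically as in the program cited above [49], the function name `species_specialization` and the small matrix `web` are hypothetical, and every row and column is assumed to contain at least one interaction.

```python
import numpy as np

def species_specialization(a):
    """Return (d, d_prime) for each row of the interaction matrix a.

    Assumption: d_min is taken as its theoretical value of zero, not the
    algorithmically computed minimum described in the text.
    """
    a = np.asarray(a, dtype=float)
    m = a.sum()                      # total number of interactions
    A_i = a.sum(axis=1)              # row totals
    q_j = a.sum(axis=0) / m          # partner availability
    p = a / A_i[:, None]             # p'_ij, proportions within each row
    # Cells with p'_ij = 0 contribute nothing to the sum (0 * ln 0 := 0).
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = np.where(p > 0, p * np.log(p / q_j), 0.0)
    d = terms.sum(axis=1)            # non-standardized Kullback-Leibler distance
    d_max = np.log(m / A_i)          # theoretical maximum ln(m / A_i)
    d_prime = np.divide(d, d_max, out=np.zeros_like(d), where=d_max > 0)
    return d, d_prime

# Hypothetical 3 x 3 matrix with one exclusive interaction (last row/column).
web = np.array([[10, 15, 0],
                [17, 11, 0],
                [0,  0,  5]])
d, d_prime = species_specialization(web)

# Weighted mean of d', each species weighted by its total A_i.
mean_d_prime = np.sum(d_prime * web.sum(axis=1)) / web.sum()
```

With d_{min} = 0 the species in the last row reaches d' = 1.0, because its only partner interacts with no other species.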
Network-level index
The following network-wide measure is based on the bipartite representation of a two-mode network of interactions such as plant-animal or other resource-consumer interactions where members of each party interact with members of the other party but not among themselves (unlike many food webs). The two-dimensional Shannon entropy (termed H_{2} in order to avoid confusion with the common one-dimensional H) is obtained as
$$H_{2} = -\sum_{i=1}^{r}\sum_{j=1}^{c} p_{ij} \ln p_{ij}.$$
H_{2} decreases with higher specialization. This measure is closely related to the weighted mean of the non-standardized Kullback-Leibler distance of all species, since
<d_{i}> = <d_{j}> = H_{2max} - H_{2}
(see below, Relationship between d_{i }and H_{2}). H_{2 }can be standardized between 0 and 1.0 for extreme specialization versus extreme generalization, respectively, when its minimum and maximum values (H_{2min }and H_{2max}) are known. H_{2min }and H_{2max }can be calculated for given constraints. The constraints used here are the maintenance of the total number of interactions of each species, thus all row and column totals, A_{i }and A_{j}, being fixed (see also [46]). Alternative constraints may be defined depending on the knowledge of the system studied.
H_{2} reaches its theoretical maximum where each p_{ij} equals its expected value from a random interaction matrix (q_{i}·q_{j}), such that
$$H_{2max} = -\sum_{i=1}^{r}\sum_{j=1}^{c} q_{i} q_{j} \ln (q_{i} q_{j}),$$
while its theoretical minimum (H_{2min}) may be close to zero depending on the matrix architecture. Like for d_{min} above, H_{2max }and H_{2min }are constrained by the fact that they are derived from integer values. A program implementing a heuristic solution to obtain H_{2max }and H_{2min}, and to perform the entire analysis is available from the authors or online [49].
The degree of specialization is obtained as a standardized entropy on a scale between H_{2min} and H_{2max} as
$$H_{2}' = \frac{H_{2max} - H_{2}}{H_{2max} - H_{2min}}.$$
Consequently, H_{2}' ranges between 0 and 1.0 for extreme generalization and specialization, respectively.
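A minimal Python sketch of the network-level index follows. It is not the heuristic of the program cited above [49]: here H_{2max} is taken from the non-integer expected values A_{i}·A_{j}/m, and H_{2min} is approximated by a simple greedy packing of interactions into as few cells as possible. Both choices, and the function names `entropy` and `h2_prime`, are assumptions made for illustration. Because the greedy packing is not guaranteed to find the true minimum, values slightly above 1.0 are possible.

```python
import numpy as np

def entropy(a):
    """Two-dimensional Shannon entropy H_2 of a frequency matrix."""
    p = np.asarray(a, dtype=float)
    p = p[p > 0] / p.sum()           # proportions p_ij of non-empty cells
    return float(-(p * np.log(p)).sum())

def h2_prime(a):
    """Standardized H_2' with row and column totals held fixed."""
    a = np.asarray(a, dtype=float)
    rows, cols = a.sum(axis=1), a.sum(axis=0)

    # Assumed H_2max: every cell at its expected value A_i * A_j / m.
    h2_max = entropy(np.outer(rows, cols))

    # Assumed H_2min: greedily fill the cell that can take the most
    # interactions until all row and column totals are used up.
    packed = np.zeros_like(a)
    r, c = rows.copy(), cols.copy()
    while r.sum() > 0:
        cap = np.minimum.outer(r, c)             # capacity left in each cell
        i, j = np.unravel_index(np.argmax(cap), cap.shape)
        packed[i, j] += cap[i, j]
        r[i] -= cap[i, j]
        c[j] -= cap[i, j]
    h2_min = entropy(packed)

    return (h2_max - entropy(a)) / (h2_max - h2_min)
```

For a matrix in which every species has a single exclusive partner, `h2_prime` returns 1.0; for a matrix with identical frequencies in all cells it returns 0.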
Comparison with random associations
H_{2} can be tested against a null model of random associations (H_{2ran}). A number of random permutations of the matrix can be performed using an r × c randomization algorithm (also available at [49]). The probability (p-value) that the observed H_{2} is more specialized than expected by random associations is simply given as the proportion of values obtained for H_{2ran} that are equal to or smaller than H_{2} (a lower entropy indicating a higher specialization), a common procedure in randomization statistics [25,50]. H_{2ran} is usually only slightly smaller than H_{2max}. Previously, permutations of r × c contingency tables often used a different test statistic instead of H_{2} [25,51,52]:
$$T = \sum_{i=1}^{r}\sum_{j=1}^{c} a_{ij} \ln a_{ij}.$$
The relationship between T and H_{2} is described by a constant, the total number of interactions (m), as T = m·ln m - m·H_{2}. Consequently, both methods yield exactly the same p-values.
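The randomization test can be illustrated as follows. The sketch does not use the r × c algorithm of the cited program [49]; instead it shuffles the partner labels of the individual interactions, which also keeps all row and column totals fixed. The function names are hypothetical. The p-value counts random matrices whose entropy is at least as low as the observed one, which is equivalent to counting random matrices whose T is at least as large.

```python
import numpy as np

def h2(a):
    """Two-dimensional Shannon entropy of a frequency matrix."""
    p = np.asarray(a, dtype=float)
    p = p[p > 0] / p.sum()
    return float(-(p * np.log(p)).sum())

def t_statistic(a):
    """T = sum of a_ij * ln(a_ij) over all non-empty cells."""
    a = np.asarray(a, dtype=float)
    a = a[a > 0]
    return float((a * np.log(a)).sum())

def random_web(a, rng):
    """One random association matrix with the same row and column totals."""
    a = np.asarray(a, dtype=int)
    # One label per observed interaction, for rows and for columns.
    row_labels = np.repeat(np.arange(a.shape[0]), a.sum(axis=1))
    col_labels = np.repeat(np.arange(a.shape[1]), a.sum(axis=0))
    rng.shuffle(col_labels)                      # random re-association
    out = np.zeros(a.shape, dtype=int)
    np.add.at(out, (row_labels, col_labels), 1)  # re-tabulate the pairs
    return out

def p_value(a, n_perm=1000, seed=0):
    """Proportion of random webs at least as specialized as the observed one."""
    rng = np.random.default_rng(seed)
    h2_obs = h2(a)
    h2_ran = np.array([h2(random_web(a, rng)) for _ in range(n_perm)])
    return float(np.mean(h2_ran <= h2_obs + 1e-12))
```

Because T = m·ln m - m·H_{2} and m is constant across permutations, ranking the random matrices by `t_statistic` instead of `h2` gives the same p-value.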
Relationship between d_{i }and H_{2}
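In the following we derive the relationship between the individual levels of specialization (d_{i}) and the community level (H_{2}). Since p'_{ij} = p_{ij}/q_{i}, where q_{i} = A_{i}/m is the proportion of all interactions contributed by row i, the non-standardized Kullback-Leibler distance for row i can be rewritten as
$$d_{i} = \sum_{j=1}^{c} p'_{ij} \ln\frac{p'_{ij}}{q_{j}} = \sum_{j=1}^{c} \frac{p_{ij}}{q_{i}} \ln\frac{p_{ij}}{q_{i} q_{j}}.$$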
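The weighted mean of d_{i} for all i rows (each row weighted by q_{i}) yields
$$\langle d_{i} \rangle = \sum_{i=1}^{r} q_{i} d_{i} = \sum_{i=1}^{r}\sum_{j=1}^{c} p_{ij} \ln\frac{p_{ij}}{q_{i} q_{j}} = \sum_{i=1}^{r}\sum_{j=1}^{c} p_{ij} \ln p_{ij} - \sum_{i=1}^{r} q_{i} \ln q_{i} - \sum_{j=1}^{c} q_{j} \ln q_{j}.$$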
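While the first summand in the final equation for <d_{i}> equals -H_{2}, the remaining two summands correspond to the maximum entropy H_{2max}, because
$$H_{2max} = -\sum_{i=1}^{r}\sum_{j=1}^{c} q_{i} q_{j} \ln (q_{i} q_{j}) = -\sum_{i=1}^{r} q_{i} \ln q_{i} - \sum_{j=1}^{c} q_{j} \ln q_{j}.$$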
Therefore,
<d_{i}> = H_{2max} - H_{2}.
The same calculation applies for <d_{j}>, thus <d_{i}> = <d_{j}>. Consequently, the degree of specialization of the entire network (corresponding to the deviation of the networkwide entropy from its maximum value) equals the weighted sum of the specialization of its elements (species).
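The identity can also be checked numerically. The following Python sketch uses a hypothetical matrix and hypothetical function names, and takes H_{2max} as its theoretical value computed from the marginal proportions; it assumes that no row or column is empty.

```python
import numpy as np

def kl_rows(a):
    """Non-standardized Kullback-Leibler distance d_i for every row of a."""
    a = np.asarray(a, dtype=float)
    p = a / a.sum(axis=1, keepdims=True)     # p'_ij
    q = a.sum(axis=0) / a.sum()              # q_j
    safe = np.where(p > 0, p, 1.0)           # avoids log(0); p = 0 zeroes the term
    return (p * np.log(safe / q)).sum(axis=1)

def weighted_mean_d(a):
    """Mean of d_i with each row weighted by q_i = A_i / m."""
    a = np.asarray(a, dtype=float)
    return float((a.sum(axis=1) / a.sum() * kl_rows(a)).sum())

def h2_and_h2max(a):
    """Observed entropy H_2 and theoretical maximum H_2max."""
    a = np.asarray(a, dtype=float)
    p = a[a > 0] / a.sum()
    q_i = a.sum(axis=1) / a.sum()
    q_j = a.sum(axis=0) / a.sum()
    h2 = -(p * np.log(p)).sum()
    h2_max = -(q_i * np.log(q_i)).sum() - (q_j * np.log(q_j)).sum()
    return float(h2), float(h2_max)

web = np.array([[12, 3, 0, 1],
                [4, 9, 2, 0],
                [0, 1, 7, 6]])
mean_rows = weighted_mean_d(web)       # <d_i>
mean_cols = weighted_mean_d(web.T)     # <d_j>, the same calculation on columns
h2, h2_max = h2_and_h2max(web)
```

For any such matrix, `mean_rows`, `mean_cols` and `h2_max - h2` agree up to floating-point error.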
Properties of alternative niche breadth measures
The standardized Hurlbert's (B') and Smith's (FT) measures can be applied widely for niche breadth analysis [21,22,24]. In this context, the Kullback-Leibler distance (d) can be viewed as a modified Shannon-Wiener measure of niche breadth that accounts for niche availabilities. Like the Kullback-Leibler distance, both B' and FT compare the proportional distribution of individuals (p) to the proportional resource availability (q) (here: partner availability). For a certain species i, the two measures are in our notation:
$$B'_{i} = \frac{B_{i} - q_{min}}{1 - q_{min}} \quad \text{with} \quad B_{i} = \frac{1}{\sum_{j=1}^{c} p'^{2}_{ij}/q_{j}}, \quad q_{min} = \min(q_{1}, q_{2}, \ldots, q_{c}), \qquad FT_{i} = \sum_{j=1}^{c} \sqrt{p'_{ij}\, q_{j}}.$$
Each p'_{ij} is the proportion of the number of interactions in relation to the respective row total, and q_{j} is the proportion of all interactions by partner j in relation to the total number of interactions. Thus,
$$p'_{ij} = \frac{a_{ij}}{A_{i}}, \qquad q_{j} = \frac{A_{j}}{m}.$$
Both the standardized Hurlbert's (B') and Smith's (FT) measures range from 0 for the most specialized case to 1.0 for extreme generalization (broadest niche). In the context of niche breadth, it has been shown that the Shannon-Wiener measure is most sensitive, while Hurlbert's and particularly Smith's measure are less sensitive for the selection of rare resources [21] (see also [20]).
For the application in network analyses, however, both B' and FT may show some undesired properties. Generally, B', FT and d' are reasonably well correlated with each other across the species within a network (e.g., r_{s }= 0.49 between d' and B', and r_{s }= 0.36 between d' and FT for the 90 pollinators in the network of Vázquez and Simberloff [33], both p < 0.001). However, differences with d' are substantial when a highly specialized species interacts largely exclusively with a specialized partner, e.g. a specialized pollinator with a plant that is almost exclusively pollinated by this one. Imagine a scenario where one exclusive interaction occurs between a plant species and a pollinator species in a 3 × 3 matrix (Table 2). If the interaction between pollinator sp. 3 and plant sp. 3 is only infrequent (e.g. a_{33 }= 1), all indices show a high degree of specialization (d' = 1.0, B' = 0, FT = 0.14) for both partners. However, as the number of exclusive interactions (a_{33}) increases, the values for both B' and FT of pollinator sp. 3 and plant sp. 3 show a highly undesired change towards generalization, although a higher a_{33 }is intuitively considered as extreme specialization (e.g., for a_{33 }= 50 the values for pollinator sp. 3 are B' = 0.31 and FT = 0.70), while only d' remains unaffected (d' = 1.0). FT is always larger than zero, and B' becomes larger than zero when the specialists interact more frequently than one of the other partners, thus when q_{j }> min(q_{1}, q_{2}, ... q_{c}). Both FT and B' approach a value of 1.0 (maximum generalization) for very large a_{33}. This undesired effect of FT and B' is not restricted to completely exclusive interactions between two partners.
Table 2. Association matrix example. Fictive association matrix between three pollinator species and three plant species. Numbers in each cell are counts of interaction frequencies.
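The behavior described above can be reproduced with a short Python sketch. The 3 × 3 matrix used here is hypothetical and is not claimed to be the matrix of Table 2; B' and FT are implemented in their usual textbook form (standardized Hurlbert's B and Smith's FT, as given for example in [21]), which is an assumption about the exact formulas used; and d' is computed with d_{min} = 0.

```python
import numpy as np

def niche_breadth(a, i):
    """Return (B', FT, d') for the species in row i of matrix a."""
    a = np.asarray(a, dtype=float)
    p = a[i] / a[i].sum()                    # p'_ij for the focal species
    q = a.sum(axis=0) / a.sum()              # partner availability q_j
    B = 1.0 / np.sum(p**2 / q)               # Hurlbert's B
    B_std = (B - q.min()) / (1.0 - q.min())  # standardized B'
    FT = np.sum(np.sqrt(p * q))              # Smith's FT
    used = p > 0
    d = np.sum(p[used] * np.log(p[used] / q[used]))
    d_prime = d / np.log(a.sum() / a[i].sum())   # assumes d_min = 0
    return B_std, FT, d_prime

def exclusive_web(a33):
    """Hypothetical web: species 3 of each party interact only with each other."""
    return np.array([[10, 15, 0],
                     [17, 11, 0],
                     [0,  0, a33]])

rare = niche_breadth(exclusive_web(1), 2)       # infrequent exclusive interaction
frequent = niche_breadth(exclusive_web(50), 2)  # frequent exclusive interaction
```

In this hypothetical example B' rises from 0 to about 0.31 and FT from about 0.14 to about 0.70 as a_{33} increases from 1 to 50, whereas d' stays at 1.0.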
Simulation of sampling effort and matrix architecture
Two published plant-pollinator networks were selected to investigate the behavior of different specialization measures [32,33]. Both articles use their observed interaction matrices as a model to discuss network properties based on the number of links per pollinator or plant species, allowing a comparison of conclusions drawn. Both networks may be compared as they comprise relatively large datasets from temperate ecosystems, reporting interaction frequencies between plants and their floral visitors: the British meadow community studied by Memmott [32] involved 79 pollinator and 25 plant species (2183 pollinator visits observed), the forests in Argentina studied by Vázquez and Simberloff [33] involved 90 pollinator and 14 plant species (5285 visits). The datasets can be obtained from the Interaction Web Database [53]. We simulated a decreased sampling intensity in both networks using a rarefaction method in order to investigate how sampling effort affects the estimation of specialization indices. Real association matrices were reduced by randomly extracting interactions, e.g. from the total of m = 2183 visits in Memmott's web down to m = 5 visits (in steps of five, repeated ten times for each m).
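The rarefaction step can be sketched as follows. The text does not specify whether interactions were extracted with or without replacement; the sketch assumes sampling without replacement, and the function name `rarefy` and the small matrix are hypothetical.

```python
import numpy as np

def rarefy(a, n_keep, rng):
    """Randomly keep n_keep of the observed interactions (without replacement)."""
    a = np.asarray(a, dtype=int)
    # One entry per observed interaction, labelled by its flattened cell index.
    cells = np.repeat(np.arange(a.size), a.ravel())
    kept = rng.choice(cells, size=n_keep, replace=False)
    return np.bincount(kept, minlength=a.size).reshape(a.shape)

rng = np.random.default_rng(42)
web = np.array([[12, 3, 0, 1],
                [4, 9, 2, 0],
                [0, 1, 7, 6]])

# Reduced matrices for m = 5, 10, ..., ten replicates for each m.
reduced = {m: [rarefy(web, m, rng) for _ in range(10)]
           for m in range(5, web.sum() + 1, 5)}
```

Each reduced matrix can then be passed to whichever specialization index is being examined, so that the index can be plotted against m.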
In order to compare the null model characteristics of the specialization measures, we simulated artificial matrices with randomly associated partners and plotted the indices against an increasing number of partners and/or total number of interactions. We assumed that the total frequency of participating species approximates a lognormal distribution, which is typical for biological communities [21,22,24]. All row and column totals were randomly generated from a lognormal distribution (μ = 50, σ = 1) that was scaled to the desired total number of interactions. Ten different combinations of row and column totals were obtained for each matrix size and taken as template to randomly associate the partners five times, thus each matrix size was represented by 50 random associations.
Authors' contributions
NB1 conceived of the study and all authors (NB1, FM, NB2) were involved in designing the methods, analyses, interpretation and drafting the manuscript.
Acknowledgements
We thank Diego Vázquez, Pedro Jordano, Thomas Hovestadt, and Michel Loreau for helpful comments and valuable discussion on earlier versions of this manuscript and the Interaction Web Database [53] for providing the datasets used here.
References

Olesen JM, Jordano P: Geographic patterns in plant-pollinator mutualistic networks.

Novotny V, Basset Y: Host specificity of insect herbivores in tropical forests.
Proc R Soc London Ser B 2005, 272:1083-1090.

Waser NM, Chittka L, Price MV, Williams NM, Ollerton J: Generalization in pollination systems, and why it matters.
Ecology 1996, 77:1043-1060.

Jordano P: Patterns of mutualistic interactions in pollination and seed dispersal: connectance, dependence asymmetries, and coevolution.
Am Nat 1987, 129:657-677.

Bascompte J, Jordano P, Melian CJ, Olesen JM: The nested assembly of plant-animal mutualistic networks.
Proc Natl Acad Sci USA 2003, 100:9383-9387.

Waser NM, Ollerton J, Eds: Plant-pollinator interactions: from specialization to generalization. Chicago: University of Chicago Press; 2006.

Ollerton J, Cranmer L: Latitudinal trends in plant-pollinator interactions: are tropical plants more specialised?
Oikos 2002, 98:340-350.

Devoto M, Medan D, Montaldo NH: Patterns of interaction between plants and pollinators along an environmental gradient.
Oikos 2005, 109:461-472.

Winemiller KO: Must connectance decrease with species richness?
Am Nat 1989, 134:960-968.

Martinez ND: Constant connectance in community food webs.
Am Nat 1992, 139:1208-1218.

May RM: Will a large complex system be stable?
Nature 1972, 238:413-414.

Rejmánek M, Starý P: Connectance in real biotic communities and critical values for stability of model ecosystems.
Nature 1979, 280:311-313.

Vázquez DP, Simberloff D: Ecological specialization and susceptibility to disturbance: conjectures and refutations.
Am Nat 2002, 159:606-623.

Dunne JA, Williams RJ, Martinez ND: Network structure and biodiversity loss in food webs: robustness increases with connectance.
Ecol Lett 2002, 5:558-567.

Magurran AE: Ecological diversity and its measurement. Princeton: Princeton University Press; 1988.

Sahli HF, Conner JK: Characterizing ecological generalization in plant-pollination systems.
Oecologia 2006, 148:365-372.

Parrish JAD, Bazzaz FA: Difference in pollination niche relationships in early and late successional plant communities.
Ecology 1979, 60:597-610.

Basset Y: Diversity and abundance of insect herbivores foraging on seedlings in a rainforest in Guyana.
Ecol Entomol 1999, 24:245-259.

Herrera CM: Plant generalization on pollinators: species property or local phenomenon?

Bersier LF, Banasek-Richter C, Cattin MF: Quantitative descriptors of food-web matrices.

Krebs CJ: Ecological Methodology. Menlo Park: Benjamin Cummings; 1999.

Hurlbert SH: Measurement of niche overlap and some relatives.
Ecology 1978, 59:67-77.

Feinsinger P, Spears EE, Poole RW: A simple measure of niche breadth.
Ecology 1981, 61:27-32.

Smith EP: Niche breadth, resource availability, and inference.
Ecology 1982, 63:1675-1681.

Fonseca CR, Ganade G: Asymmetries, compartments and null interactions in an Amazonian ant-plant community.
J Anim Ecol 1996, 65:339-347.

Kenny D, Loehle C: Are food webs randomly connected?
Ecology 1991, 72:1794-1799.

Goldwasser L, Roughgarden J: Sampling effects and the estimation of food-web properties.
Ecology 1997, 78:41-54.

Vázquez DP, Aizen MA: Asymmetric specialization: a pervasive feature of plant-pollinator interactions.

Kay KM, Schemske DW: Geographic patterns in plant-pollinator mutualistic networks: comment.

Vázquez DP, Morris WF, Jordano P: Interaction frequency as a surrogate for the total effect of animal mutualists on plants.
Ecol Lett 2005, 8:1088-1094.

Borer ET, Anderson K, Blanchette CA, Broitman B, Cooper SD, Halpern BS, Seabloom EW, Shurin JB: Topological approaches to food web analyses: a few modifications may improve our insights.
Oikos 2002, 99:397-401.

Memmott J: The structure of a plant-pollinator food web.
Ecol Lett 1999, 2:276-280.

Vázquez DP, Simberloff D: Changes in interaction biodiversity induced by an introduced ungulate.
Ecol Lett 2003, 6:1077-1083.

Dicks LV, Corbet SA, Pywell RF: Compartmentalization in plant-insect flower visitor webs.
J Anim Ecol 2002, 71:32-43.

Vázquez DP, Aizen MA: Null model analyses of specialization in plant-pollinator interactions.

Auerbach MJ: Stability, probability, and the topology of food webs. In Ecological communities: conceptual issues and the evidence. Edited by Strong DR, Simberloff D, Abele LG, Thistle AB. Princeton: Princeton University Press; 1984:413-436.

Gotelli NJ, Graves GR: Null models in ecology. Washington: Smithsonian Institution; 1996.

Bascompte J, Jordano P, Olesen JM: Asymmetric coevolutionary networks facilitate biodiversity maintenance.
Science 2006, 312:431-433.

Memmott J, Waser NM, Price MV: Tolerance of pollination networks to species extinctions.
Proc R Soc London Ser B 2004, 271:2605-2611.

Johnson SD, Steiner KE: Generalization versus specialization in plant pollination systems.
Trends Ecol Evol 2000, 15:140-143.

Symons FB, Beccaloni GW: Phylogenetic indices for measuring the diet breadths of phytophagous insects.
Oecologia 1999, 119:427-434.

Webb CO, Ackerly DD, McPeek MA, Donoghue MJ: Phylogenies and community ecology.
Annu Rev Ecol Syst 2002, 33:475-505.

Novotny V, Basset Y, Miller SE, Weiblen GD, Bremer B, Cizek L, Drozd P: Low host specificity of herbivorous insects in a tropical forest.
Nature 2002, 416:841-844.

Jordano P, Bascompte J, Olesen JM: Invariant properties in coevolutionary networks of plant-animal interactions.
Ecol Lett 2003, 6:69-81.

Vázquez DP: Degree distribution in plant-animal mutualistic networks: forbidden links or random interactions?
Oikos 2005, 108:421-426.

Gotelli NJ: Null model analysis of species co-occurrence patterns.
Ecology 2000, 81:2606-2621.

Fenster CB, Armbruster WS, Wilson P, Dudash MR, Thomson JD: Pollination syndromes and floral specialization.
Annu Rev Ecol Evol Syst 2004, 35:375-403.

Montecarlo statistics on RxC matrices [http://itb.biologie.hu-berlin.de/~nils/stat/]

Manly B: Randomization, bootstrap and Monte Carlo methods in biology. London: Chapman and Hall; 1997.

Blüthgen N, Verhaagh M, Goitía W, Blüthgen N: Ant nests in tank bromeliads – an example of nonspecific interaction.
Insect Soc 2000, 47:313-316.

Patefield WM: An efficient method of generating random RxC tables with given row and column totals.
Appl Stat 1981, 30:91-97.

Interaction web database [http://www.nceas.ucsb.edu/interactionweb/]