Department of Animal Ecology and Tropical Biology, University of Würzburg, Biozentrum, Am Hubland, 97074 Würzburg, Germany

Institute of Theoretical Biology, Humboldt University, 10115 Berlin and Institute of Molecular Neurobiology, Free University of Berlin, 14195 Berlin, Germany

Abstract

Background

Network analyses of plant-animal interactions hold valuable biological information. They are often used to quantify the degree of specialization between partners, but usually based on qualitative indices such as 'connectance' or number of links. These measures ignore interaction frequencies or sampling intensity, and strongly depend on network size.

Results

Here we introduce two quantitative indices using interaction frequencies to describe the degree of specialization, based on information theory. The first measure (d') describes the specialization of individual species, and the second (H_{2}') characterizes the degree of specialization or partitioning among two parties in the entire network. Both indices are mathematically related and derived from Shannon entropy. The species-level index d' can be used to analyze variation within networks, while H_{2}' as a network-level index is useful for comparisons across different interaction webs. Analyses of two published pollinator networks identified differences and features that have not been detected with previous approaches. For instance, plants and pollinators within a network differed in their average degree of specialization (weighted mean d'). Rarefaction of both networks and null model simulations suggest that H_{2}' is not affected by network size or sampling intensity.

Conclusion

Quantitative analyses reflect properties of interaction networks more appropriately than previous qualitative attempts, and are robust against variation in sampling intensity, network size and symmetry. These measures will improve our understanding of patterns of specialization within and across networks from a broad spectrum of biological interactions.

Background

The degree of specialization of plants or animals has been studied and debated extensively, and a continuum from complete specialization to full generalization can be found in various systems.

Quantifying specialization at the species level

Specialization or generalization of interactions is most commonly characterized as the number of partners (or 'links'), e.g. the number of pollinator species visiting a flowering plant species or the number of food plant families a herbivore feeds upon. In this qualitative approach, interactions between a consumer and a resource species are only scored in a binary way as 'present' or 'absent', ignoring any distinction between strong interactions and weak or occasional ones. For example, binary representations of interactions do not distinguish a scenario where 99% of the individuals of a herbivore species feed on a single plant species, with only an occasional individual found on another plant, from a scenario where a herbivore regularly feeds on both food plants. The problem is analogous to measuring biodiversity either as crude species richness or as a more elaborate diversity index that includes relative abundances.
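The contrast between binary and frequency-based scoring can be illustrated with a minimal Python sketch. The two count vectors are made-up numbers chosen to mirror the herbivore example, and Shannon diversity is used here only as a familiar stand-in for a frequency-based measure; it is not one of the indices proposed in this paper.

```python
import math

def number_of_links(frequencies):
    """Binary measure: count partners with at least one recorded interaction."""
    return sum(1 for f in frequencies if f > 0)

def shannon_diversity(frequencies):
    """Frequency-based measure: Shannon diversity of the partner distribution."""
    total = sum(frequencies)
    return -sum((f / total) * math.log(f / total) for f in frequencies if f > 0)

# Hypothetical counts of herbivore individuals on two food plants.
occasional = [99, 1]   # nearly all individuals on one plant
regular = [50, 50]     # both plants used regularly

for name, counts in (("occasional", occasional), ("regular", regular)):
    print(name, number_of_links(counts), round(shannon_diversity(counts), 3))
```

Both herbivores have two links, yet their diversity values differ by more than an order of magnitude, which is the distinction a binary score cannot express.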

Quantifying specialization at the community level

The measurement used most commonly to characterize community-wide specialization is the 'connectance' index (_{plants}/_{poll}/

This measure,

Analyses based on binary data, both at the species and the community level, have obvious shortcomings, since they are highly dependent on sampling effort, on decisions about which species to include, and on the size of the investigated networks. Several authors have thus emphasized the need to move beyond binary representations of interactions to quantitative measures that incorporate interaction strength.

A severe additional problem of connectance is that its lower and upper constraints are not scale-invariant. The minimum possible value (C_{min}) to maintain at least one link per species declines in a hyperbolic function with the number of interacting species, since C_{min} = max(r, c)/(r·c) in a matrix of r × c species. The upper limit (C_{max}) may be constrained by, or a function of, total sampling effort. Across networks,
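The scale dependence of the lower bound can be made concrete with a short Python sketch. It assumes the usual definition of connectance as the proportion of realized links among all r·c possible links, and a lower bound of max(r, c)/(r·c) when every species keeps at least one link. The function names are illustrative only.

```python
def connectance(matrix):
    """Proportion of realized links among all possible links in an r x c matrix."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    links = sum(1 for row in matrix for a in row if a > 0)
    return links / (n_rows * n_cols)

def min_connectance(n_rows, n_cols):
    """Lowest connectance that still gives every species at least one link."""
    return max(n_rows, n_cols) / (n_rows * n_cols)

# The lower bound shrinks hyperbolically as square networks grow.
for size in (2, 20, 200):
    print(size, min_connectance(size, size))
```

For square networks the bound equals 1/size, so a connectance of 0.05 is the minimum possible in a 20 × 20 web but ten times the minimum in a 200 × 200 web.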

The objective of this paper is to develop and discuss specialization measures that are based on frequency data and thus account for sampling intensity, and that overcome the problem of scale dependence. We then test these approaches by evaluating the effect of sampling effort and scale dependence on a published natural pollination network, and on randomly generated associations as a null model. We differentiate between species-level measures of specialization, useful to investigate variability among species within a web, and a single network-wide measure that can be used for comparisons across networks.

Results

Patterns in two pollinator networks

Two selected plant-pollinator networks (British meadows studied by Memmott _{2}' (standardized two-dimensional Shannon entropy, see _{2}' = 0.24) compared to the Argentinean community (_{2}' = 0.63).

The variation of species-level specialization measures (standardized Kullback-Leibler distance, _{poll}> = 0.16) than in the latter (<_{poll}> = 0.54). The relationship between specialization of species _{i}_{i}) across the pollinator species differs between the two webs. In the British web, _{i }and _{i }were not correlated significantly (Spearman's _{s }= -0.08, _{s }= 0.65, _{plants}> = 0.27 and <_{plants}> = 0.53. No significant correlation was found between the plants' frequency and specialization in either web (both _{plants}> > <_{poll.}>), but not in the Argentinean web. This distinction is not found when only the weighted mean number of links (_{plants}> is much greater than <_{poll.}> in both networks. The difference in <_{r }= _{c }=

**Patterns within pollinator networks**. Frequency distribution of the species-level specialization index (

Simulation of sampling effort

In order to test whether specialization estimates are dependent on sampling and scale effects, we simulated a decreased sampling intensity in both networks using rarefaction (see _{2}' is robust and already very well estimated by a small fraction of the interactions sampled (Fig. _{2}' remains below 5% from about half of the total number of visits onwards in the British web and even at one-tenth of the total sampling effort of the Argentinean web. The estimation of connectance (_{2}', do not necessarily require a very large or even complete association matrix, but can also be very well estimated from a smaller representative subset as long as there is no systematic sampling bias.
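A reduced sampling intensity can be sketched in Python as a random draw of recorded interactions without replacement. This is one common way to rarefy an association matrix and is an assumption here, not necessarily the exact procedure used for the analyses; the small matrix is made up for illustration.

```python
import random

def rarefy(matrix, n_keep, rng):
    """Keep n_keep interactions drawn at random without replacement."""
    # Expand the matrix into one (row, column) record per observed interaction.
    events = [(i, j)
              for i, row in enumerate(matrix)
              for j, a in enumerate(row)
              for _ in range(a)]
    rarefied = [[0] * len(matrix[0]) for _ in matrix]
    for i, j in rng.sample(events, n_keep):
        rarefied[i][j] += 1
    return rarefied

web = [[21, 5, 0], [23, 4, 0], [0, 0, 1]]   # made-up interaction counts
rng = random.Random(1)                       # fixed seed for repeatability
half = rarefy(web, 27, rng)                  # half of the 54 interactions
print(half)
```

Repeating the draw many times for each value of `n_keep` and recomputing an index on every rarefied matrix gives the kind of curve described above.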

**Sampling effect in pollinator networks**. Rarefaction of sampling effort in a British and an Argentinean pollination web [32,33]. Two network-level measures of specialization – the frequency-based specialization index (_{2}') and the 'connectance' index (

Null model patterns

The degree of specialization can be further characterized by comparison with a null model. The null model used here is that each species has a fixed total number of interactions (given by the observed association matrix), but interactions are assigned randomly. In the above pollinator networks, random associations yield a specialization index _{2}' that remains close to zero for almost the entire range of sampling intensity, while connectance (_{2}' derived from real networks may typically be clearly distinguished from this null model, while the comparison of

Simulations of artificially generated random associations (see _{2}' is largely unaffected by network size (Fig. _{2}' is usually close to zero. Connectance values (_{2}' and

**Simulated random networks**. Behavior of specialization measures in simulated random networks. Each point represents one matrix with random associations, based on specific row and column totals that follow a lognormal distribution. The size of squared matrices in (A) increased from 2 × 2 to 200 × 200. In (B), only the number of rows changed, while the number of columns was fixed at 20, rectangular matrices thus increased from 2 × 20 to 200 × 20. In (C), the network size was fixed at 20 × 20. The total number of interactions (_{2}' and connectance

Discussion

Properties of specialization measures

The suggested indices, _{2}', quantify the degree of specialisation of elements within an interaction network and of the entire network, respectively. While the number of links (_{2}' represent corresponding measures for frequency-based data. The need to include information on interaction strength or interaction frequency into network analyses has been announced by various authors _{2}' are largely robust against variation in matrix size, shape, and sampling effort. In several cases, _{2}' remained largely unchanged in simulations of random associations over a range of network sizes, variable network asymmetries, and number of interactions. This scale invariance suggests that both _{2}' can be used directly for comparisons across different networks, while comparisons of

Quantitative methods like the indices suggested here also allow a more detailed analysis of interaction patterns within and across networks. Fruitful areas include comparisons of networks across different interaction types _{plants }= _{poll }= _{i}) is strongly positively correlated with its overall frequency (_{i}) in five pollination networks including the datasets analyzed above. They argued that this apparent higher generalization of common plants and common pollinators may be largely explained by null models, calling for an improved measurement of specialization. Our results for the correlation between _{i }and _{i }in two pollinator webs suggest that the relationship between specialization and abundance may be more variable, and even positive as in the Argentinean network.

Caveats

Some problems apply to any measure of network analyses including the proposed indices. Measures of specialization mostly ignore phylogenetic relationships or ecological similarity within an association matrix. For example, a plant species that is pollinated by multiple moth species may be unsuitably regarded as more generalized than a plant pollinated by few insect species comprising several different orders _{2}.' imply that all individuals can be shifted around between resources in the same way, irrespective of their size or non-fitting parameters. The role of 'forbidden links' as constraints to network analyses has been discussed elsewhere

For both indices, d' and H_{2}', we proposed above to use the total number of interactions for each species as a measure of partner availability (q_{j}) and as constraint for standardization (fixed row and column totals). It may be debated whether independent measures of plant and animal abundances could be more appropriate than using interaction frequency data as such. However, apart from the fact that such abundance data barely exist for most networks, the actual number of interactions often reflects resource availability and consumer activity more suitably than an independent measure of species abundance. For instance, a flower of one species may have a much higher nectar production than another and consequently receive a higher number of visitors, while the local abundance of the plant species does not reflect such differences in resource quality and/or quantity. Both d' and H_{2}' thus focus on the actual partitioning between the interacting species. In studies where detailed knowledge or theoretical assumptions about resources (availability and quality) or consumers (activity density and consumption rate) are available or under experimental control, such data may be incorporated into the analysis (defining q_{j} and constraints) instead of interaction frequencies. The constraint of fixed row and column totals has been debated elsewhere in the context of species co-occurrence patterns, where it was found to be most appropriate in null model comparisons, although critics have argued earlier that these marginals themselves may already reflect competitive interactions.

It should also be emphasized that analyses of frequency data may be susceptible to pseudoreplication from repeated associations of the same individuals or close associations derived from a single dispersal event (e.g. a social insect colony, aggregating individuals, multiple offspring from a single egg cluster, or monospecific plant clusters). These may lead to an overestimation of specialization. To be more meaningful on a population level, frequency analyses should thus be based on spatially independent association replicates. Note that all species-wise specialization measures such as

Conclusion

In accordance with previous calls _{2}' represent scale-independent and meaningful indices to characterize specialization on the level of single species and the entire network, respectively. These novel indices allow us to investigate patterns within and across networks that have not been detected with qualitative measures such as correlations with species frequencies, network size and asymmetries in specialization between partners. Recently, Bascompte et al.

Methods

Species-level index

As species-level measure of 'partner diversity', we propose the Kullback-Leibler distance (or Kullback-Leibler divergence, relative entropy) in a standardized form (_{ij}, (Table

Instead of frequencies (_{ij}), each interaction can be assigned a proportion of the total (

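Elements in a species association matrix. Interaction frequencies (a_{ij}) between r plant species (rows) and c animal species (columns), with their totals (rows: A_{i}, columns: A_{j}, total elements: m).

| | Animal sp. 1 | sp. 2 | ... | sp. c | Total |
|---|---|---|---|---|---|
| Plant sp. 1 | a_{11} | a_{12} | ... | a_{1c} | A_{i=1} |
| sp. 2 | a_{21} | a_{22} | ... | a_{2c} | A_{i=2} |
| ... | ... | ... | ... | ... | ... |
| sp. r | a_{r1} | a_{r2} | ... | a_{rc} | A_{i=r} |
| Total | A_{j=1} | A_{j=2} | ... | A_{j=c} | m |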

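Let p'_{ij} be the proportion of the number of interactions (a_{ij}) in relation to the respective row total (A_{i}), and q_{j} the proportion of all interactions by partner j in relation to the total number of interactions (m). Thus,

$$p'_{ij} = \frac{a_{ij}}{A_{i}}, \qquad q_{j} = \frac{A_{j}}{m}$$

To quantify the specialization of a species i, the following index d_{i} is suggested. This d_{i} is related to Shannon diversity, similar to an index recently suggested to characterize biomass flow diversity in food webs, but it compares the distribution of the interactions with each partner (p'_{ij}) to the overall partner availability (q_{j}). The Kullback-Leibler distance for species i is

$$d_{i} = \sum_{j=1}^{c} p'_{ij} \ln \frac{p'_{ij}}{q_{j}}$$

which can be normalized as

$$d'_{i} = \frac{d_{i} - d_{min}}{d_{max} - d_{min}}$$

The theoretical maximum is given by d_{max} = ln (m/A_{i}), and the theoretical minimum (d_{min}) is zero for the special case where all p'_{ij} = q_{j}. However, a realistic d_{min} may be constrained at some value above zero given that p'_{ij} and q_{j} are calculated from discrete integer values (a_{ij}). To take this into account, d_{min} is more suitably computed algorithmically as in a program available from the authors and online. The average degree of specialization of one party can be expressed as a weighted mean of the standardized index, e.g. <d'_{i}> for pollinators as

$$\langle d'_{i} \rangle = \frac{1}{m} \sum_{i} d'_{i} \, A_{i}$$

While <d'_{i}> usually differs from <d'_{j}>, the weighted means of the non-standardized Kullback-Leibler distances are the same for both parties, hence <d_{i}> = <d_{j}>.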
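The species-level calculation can be sketched in Python as follows. The sketch assumes d_{i} = Σ_j p'_{ij} ln(p'_{ij}/q_{j}) with p'_{ij} = a_{ij}/A_{i} and q_{j} = A_{j}/m, and it uses the theoretical d_{min} = 0 by default rather than the algorithmic minimum mentioned above, so standardized values may differ slightly from those of the authors' program. The function names and the example matrix are illustrative assumptions.

```python
import math

def kl_distance(matrix, i):
    """Non-standardized Kullback-Leibler distance d_i for the species in row i."""
    m = sum(sum(row) for row in matrix)
    row_total = sum(matrix[i])
    col_totals = [sum(col) for col in zip(*matrix)]
    d = 0.0
    for a_ij, col_total in zip(matrix[i], col_totals):
        if a_ij > 0:                  # cells with p'_ij = 0 contribute nothing
            p = a_ij / row_total      # p'_ij
            q = col_total / m         # q_j
            d += p * math.log(p / q)
    return d

def standardized_d(matrix, i, d_min=0.0):
    """d'_i scaled between d_min (assumed 0 here) and d_max = ln(m / A_i)."""
    m = sum(sum(row) for row in matrix)
    d_max = math.log(m / sum(matrix[i]))   # undefined if one species holds all interactions
    return (kl_distance(matrix, i) - d_min) / (d_max - d_min)

def weighted_mean_d(matrix):
    """Weighted mean of d'_i over all rows, with weights A_i / m."""
    m = sum(sum(row) for row in matrix)
    return sum(standardized_d(matrix, i) * sum(row)
               for i, row in enumerate(matrix)) / m

web = [[21, 5, 0], [23, 4, 0], [0, 0, 1]]   # made-up counts, rows are the focal species
print([round(standardized_d(web, i), 3) for i in range(len(web))])
print(round(weighted_mean_d(web), 3))
```

To obtain the index for the other party, the same functions can be applied to the transposed matrix.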

Network-level index

The following network-wide measure is based on the bipartite representation of a two-mode network of interactions such as plant-animal or other resource-consumer interactions, where members of each party interact with members of the other party but not among themselves (unlike many food webs). The two-dimensional Shannon entropy (termed H_{2} in order to avoid confusion with the common one-dimensional H) is obtained as

$$H_{2} = -\sum_{i=1}^{r} \sum_{j=1}^{c} p_{ij} \ln p_{ij}$$

where p_{ij} = a_{ij}/m is the proportion of each interaction in relation to the total number of interactions.

H_{2} decreases with higher specialization. This measure is closely related to the weighted mean of the non-standardized Kullback-Leibler distance of all species, since

$$\langle d_{i} \rangle = \langle d_{j} \rangle = H_{2max} - H_{2}$$

(see 'Relationship between d_{i} and H_{2}' below). H_{2} can be standardized between 0 and 1.0 for extreme specialization versus extreme generalization, respectively, when its minimum and maximum values (H_{2min} and H_{2max}) are known. H_{2min} and H_{2max} can be calculated for given constraints. The constraints used here are the maintenance of the total number of interactions of each species, thus all row and column totals, A_{i} and A_{j}, being fixed.

H_{2} reaches its theoretical maximum where each p_{ij} equals its expected value from a random interaction matrix (q_{i}·q_{j}, with q_{i} = A_{i}/m and q_{j} = A_{j}/m), such that

$$H_{2max} = -\sum_{i=1}^{r} \sum_{j=1}^{c} q_{i} q_{j} \ln (q_{i} q_{j})$$

while its theoretical minimum (H_{2min}) may be close to zero depending on the matrix architecture. As for d_{min} above, H_{2max} and H_{2min} are constrained by the fact that they are derived from integer values. A program implementing a heuristic solution to obtain H_{2max} and H_{2min}, and to perform the entire analysis, is available from the authors or online.

The degree of specialization is obtained as a standardized entropy on a scale between H_{2min} and H_{2max} as

$$H_{2}' = \frac{H_{2max} - H_{2}}{H_{2max} - H_{2min}}$$

Consequently, H_{2}' ranges between 0 and 1.0 for extreme generalization and specialization, respectively.
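A minimal Python sketch of the network-level calculation is given below. It assumes H_{2} = −ΣΣ p_{ij} ln p_{ij} with p_{ij} = a_{ij}/m, and the theoretical (non-integer) maximum H_{2max} = −ΣΣ q_{i}q_{j} ln(q_{i}q_{j}). Because H_{2min} requires a heuristic search that is not reproduced here, it is passed in as a parameter; all function names are illustrative.

```python
import math

def h2(matrix):
    """Two-dimensional Shannon entropy of the interaction proportions p_ij."""
    m = sum(sum(row) for row in matrix)
    return -sum((a / m) * math.log(a / m) for row in matrix for a in row if a > 0)

def h2_max(matrix):
    """Theoretical maximum: every p_ij equals its random expectation q_i * q_j."""
    m = sum(sum(row) for row in matrix)
    q_rows = [sum(row) / m for row in matrix]
    q_cols = [sum(col) / m for col in zip(*matrix)]
    return -sum(qi * qj * math.log(qi * qj)
                for qi in q_rows for qj in q_cols if qi > 0 and qj > 0)

def h2_prime(matrix, h2_min):
    """Standardized specialization; h2_min must be supplied by the caller."""
    upper = h2_max(matrix)
    return (upper - h2(matrix)) / (upper - h2_min)

# Two extreme 2 x 2 cases with identical row and column totals (made-up data).
exclusive = [[5, 0], [0, 5]]   # each species uses its own partner only
even = [[5, 5], [5, 5]]        # interactions spread evenly
print(h2_prime(exclusive, h2_min=math.log(2)))   # ln 2 is the minimum for these totals
print(h2_prime(even, h2_min=math.log(2)))
```

The exclusive matrix is the most specialized arrangement possible for its totals and the even matrix the most generalized, so the two calls bracket the range of the index.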

Comparison with random associations

_{2 }can be tested against a null model of random associations (_{2ran}). A number of random permutations of the matrix can be performed using a _{2 }is more specialized than expected by random associations is simply given as the proportion of values obtained for _{2ran }that are equal or larger than _{2}, a common procedure in randomization statistics _{2ran }is usually only slightly larger than H_{2min}._{Previously, permutations of r × c contingency tables often used a different test statistics instead of H2 255152:}

The relationship between _{2 }is described by a constant, the total number of interactions (_{2}. Consequently, both methods yield exactly the same
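One simple way to generate random associations with fixed row and column totals is sketched below in Python. Each recorded interaction keeps its row label while the column labels are shuffled among interactions. This is an illustrative assumption and not necessarily the randomization algorithm used by the authors; the function name and example matrix are hypothetical.

```python
import random

def random_association(matrix, rng):
    """Reassign interactions at random while keeping all row and column totals."""
    # One row label and one column label per recorded interaction.
    row_labels = [i for i, row in enumerate(matrix) for _ in range(sum(row))]
    col_labels = [j for j, col in enumerate(zip(*matrix)) for _ in range(sum(col))]
    rng.shuffle(col_labels)   # break the observed pairing of rows and columns
    randomized = [[0] * len(matrix[0]) for _ in matrix]
    for i, j in zip(row_labels, col_labels):
        randomized[i][j] += 1
    return randomized

web = [[21, 5, 0], [23, 4, 0], [0, 0, 1]]   # made-up interaction counts
rng = random.Random(42)                      # fixed seed for repeatability
null_web = random_association(web, rng)
print(null_web)
```

Computing an index on many such matrices yields its null distribution, against which the observed value can be compared.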

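Relationship between d_{i} and H_{2}

In the following we derive the relationship between the individual levels of specialization (d_{i}) and the community level (H_{2}). With p_{ij} = a_{ij}/m, q_{i} = A_{i}/m and q_{j} = A_{j}/m, the non-standardized Kullback-Leibler distance for row i is

$$d_{i} = \sum_{j=1}^{c} p'_{ij} \ln \frac{p'_{ij}}{q_{j}} = \sum_{j=1}^{c} \frac{p_{ij}}{q_{i}} \ln \frac{p_{ij}}{q_{i} q_{j}}$$

because

$$p'_{ij} = \frac{a_{ij}}{A_{i}} = \frac{a_{ij}/m}{A_{i}/m} = \frac{p_{ij}}{q_{i}}$$

The weighted mean of d_{i} for all rows i (weighted by A_{i}) yields

$$\langle d_{i} \rangle = \frac{1}{m} \sum_{i=1}^{r} A_{i} d_{i} = \sum_{i=1}^{r} q_{i} d_{i} = \sum_{i=1}^{r} \sum_{j=1}^{c} p_{ij} \ln \frac{p_{ij}}{q_{i} q_{j}}$$

$$= \sum_{i=1}^{r} \sum_{j=1}^{c} p_{ij} \ln p_{ij} - \sum_{i=1}^{r} q_{i} \ln q_{i} - \sum_{j=1}^{c} q_{j} \ln q_{j}$$

since

$$\sum_{j=1}^{c} p_{ij} = q_{i} \quad \text{and} \quad \sum_{i=1}^{r} p_{ij} = q_{j}$$

While the first summand in the final equation for <d_{i}> equals -H_{2}, the remaining two summands correspond to the maximum entropy H_{2max}, because

$$H_{2max} = -\sum_{i=1}^{r} \sum_{j=1}^{c} q_{i} q_{j} \ln (q_{i} q_{j}) = -\sum_{i=1}^{r} q_{i} \ln q_{i} - \sum_{j=1}^{c} q_{j} \ln q_{j}$$

given that the q_{i} and the q_{j} each sum to 1.

Therefore,

$$\langle d_{i} \rangle = H_{2max} - H_{2}$$

The same calculation applies for <d_{j}>, thus <d_{i}> = <d_{j}>. Consequently, the degree of specialization of the entire network (corresponding to the deviation of the network-wide entropy from its maximum value) equals the weighted sum of the specialization of its elements (species).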
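The identity <d_{i}> = <d_{j}> = H_{2max} − H_{2} can be checked numerically with a short Python sketch. It assumes the non-standardized definitions d_{i} = Σ_j p'_{ij} ln(p'_{ij}/q_{j}), H_{2} = −ΣΣ p_{ij} ln p_{ij} and H_{2max} = −Σ q_{i} ln q_{i} − Σ q_{j} ln q_{j}. The example matrix is made up and the function names are illustrative.

```python
import math

def weighted_mean_kl(matrix):
    """Weighted mean <d_i> of the non-standardized distances over all rows."""
    m = sum(sum(row) for row in matrix)
    col_totals = [sum(col) for col in zip(*matrix)]
    mean = 0.0
    for row in matrix:
        row_total = sum(row)
        d_i = sum((a / row_total) * math.log((a / row_total) / (col_total / m))
                  for a, col_total in zip(row, col_totals) if a > 0)
        mean += (row_total / m) * d_i      # weight q_i = A_i / m
    return mean

def h2(matrix):
    m = sum(sum(row) for row in matrix)
    return -sum((a / m) * math.log(a / m) for row in matrix for a in row if a > 0)

def h2_max(matrix):
    m = sum(sum(row) for row in matrix)
    q_rows = [sum(row) / m for row in matrix]
    q_cols = [sum(col) / m for col in zip(*matrix)]
    return -sum(q * math.log(q) for q in q_rows + q_cols if q > 0)

web = [[21, 5, 0], [23, 4, 0], [0, 0, 1]]        # made-up interaction counts
transposed = [list(col) for col in zip(*web)]     # swaps the two parties
mean_rows = weighted_mean_kl(web)
mean_cols = weighted_mean_kl(transposed)
difference = h2_max(web) - h2(web)
print(mean_rows, mean_cols, difference)
```

The three printed values agree up to floating-point rounding, whichever party is placed in the rows.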

Properties of alternative niche breadth measures

The standardized Hurlbert's (

Each p'_{ij} is the proportion of the number of interactions in relation to the respective row total, and q_{j} is the proportion of all interactions by partner j in relation to the total number of interactions.

Both the standardized Hurlbert's (

For the application in network analyses, however, both _{s }= -0.49 between _{s }= -0.36 between _{33 }= 1), all indices show a high degree of specialization (_{33}) increases, the values for both _{33 }is intuitively considered as extreme specialization (e.g., for _{33 }= 50 the values for pollinator sp. 3 are _{j }> min(_{1}, _{2}, ... _{c}). Both _{33}. This undesired effect of

Association matrix example. Fictive association matrix between three pollinator species and three plant species. Numbers in each cell are counts of interaction frequencies.

| | Pollinator sp. 1 | Pollinator sp. 2 | **Pollinator sp. 3** |
|---|---|---|---|
| Plant sp. 1 | 21 | 5 | 0 |
| Plant sp. 2 | 23 | 4 | 0 |
| **Plant sp. 3** | 0 | 0 | **a_{33}** |

Simulation of sampling effort and matrix architecture

Two published plant-pollinator networks were selected to investigate the behavior of different specialization measures

In order to compare the null model characteristics of the specialization measures, we simulated artificial matrices with randomly associated partners and plotted the indices against an increasing number of partners and/or total number of interactions. We assumed that the total frequency of participating species approximates a lognormal distribution, which is typical for biological communities
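A simplified version of such a simulation can be sketched in Python. Here each species receives a lognormally distributed weight, and every interaction is assigned to a row and a column independently in proportion to these weights. This is an assumption made for brevity: it yields approximately lognormal totals rather than exactly prescribed ones, and species may end up with no interactions. The parameters `sigma` and the seed are hypothetical choices.

```python
import random

def simulate_random_web(n_rows, n_cols, n_interactions, rng, sigma=1.0):
    """Random associations between partners with lognormally distributed weights."""
    row_weights = [rng.lognormvariate(0.0, sigma) for _ in range(n_rows)]
    col_weights = [rng.lognormvariate(0.0, sigma) for _ in range(n_cols)]
    # Rows and columns are drawn independently, so partners are associated at random.
    rows = rng.choices(range(n_rows), weights=row_weights, k=n_interactions)
    cols = rng.choices(range(n_cols), weights=col_weights, k=n_interactions)
    web = [[0] * n_cols for _ in range(n_rows)]
    for i, j in zip(rows, cols):
        web[i][j] += 1
    return web

rng = random.Random(7)                               # fixed seed for repeatability
sim = simulate_random_web(20, 20, 2000, rng)
print(sum(sum(row) for row in sim), len(sim), len(sim[0]))
```

Varying `n_rows`, `n_cols` and `n_interactions` and recomputing the indices on each simulated matrix reproduces the type of comparison described above.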

Authors' contributions

NB1 conceived of the study and all authors (NB1, FM, NB2) were involved in designing the methods, analyses, interpretation and drafting the manuscript.

Acknowledgements

We thank Diego Vázquez, Pedro Jordano, Thomas Hovestadt, and Michel Loreau for helpful comments and valuable discussion on earlier versions of this manuscript and the Interaction Web Database