Open Access Highly Accessed Methodology article

Measuring specialization in species interaction networks

Nico Blüthgen1*, Florian Menzel1 and Nils Blüthgen2

Author Affiliations

1 Department of Animal Ecology and Tropical Biology, University of Würzburg, Biozentrum, Am Hubland, 97074 Würzburg, Germany

2 Institute of Theoretical Biology, Humboldt University, 10115 Berlin and Institute of Molecular Neurobiology, Free University of Berlin, 14195 Berlin, Germany

BMC Ecology 2006, 6:9  doi:10.1186/1472-6785-6-9

The electronic version of this article is the complete one and can be found online at: http://www.biomedcentral.com/1472-6785/6/9


Received: 4 May 2006
Accepted: 14 August 2006
Published: 14 August 2006

© 2006 Blüthgen et al; licensee BioMed Central Ltd.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Background

Network analyses of plant-animal interactions hold valuable biological information. They are often used to quantify the degree of specialization between partners, but usually rely on qualitative indices such as 'connectance' or the number of links. These measures ignore interaction frequencies or sampling intensity, and depend strongly on network size.

Results

Here we introduce two quantitative indices using interaction frequencies to describe the degree of specialization, based on information theory. The first measure (d') describes the degree of interaction specialization at the species level, while the second measure (H2') characterizes the degree of specialization or partitioning among two parties in the entire network. Both indices are mathematically related and derived from Shannon entropy. The species-level index d' can be used to analyze variation within networks, while H2' as a network-level index is useful for comparisons across different interaction webs. Analyses of two published pollinator networks identified differences and features that have not been detected with previous approaches. For instance, plants and pollinators within a network differed in their average degree of specialization (weighted mean d'), and the correlation between specialization of pollinators and their relative abundance also differed between the webs. Rarefied sampling effort in both networks and null model simulations suggest that H2' is not affected by network size or sampling intensity.

Conclusion

Quantitative analyses reflect properties of interaction networks more appropriately than previous qualitative attempts, and are robust against variation in sampling intensity, network size and symmetry. These measures will improve our understanding of patterns of specialization within and across networks from a broad spectrum of biological interactions.

Background

The degree of specialization of plants or animals has been studied and debated extensively, and a continuum from complete specialization to full generalization can be found in various systems [1-6]. In general, two levels of specialization measures may be distinguished: first, the characterization of focal species and, second, the degree of specialization of an entire interaction network, representing an assemblage of species and their interaction partners (e.g. food webs, mutualistic networks, predator-prey relationships). When interactions are considered as an ecological niche, the first level describes the niche breadth of a species and the second level the degree of niche partitioning across species. While the species level is more straightforward in its biological interpretation, analyses at the network level can be useful for comparisons across different types of networks. Such analyses have been performed to compare plant-pollinator webs versus plant-seed disperser webs [4,5], different plant-pollinator networks along geographic gradients [1,7,8], or food webs of variable size [9,10]. Entire network analyses are also used to study patterns at the community level such as coevolutionary adaptations [3], ecosystem stability or resilience [11-14].

Quantifying specialization at the species level

Specialization or generalization of interactions is most commonly characterized as the number of partners (or 'links'), e.g. the number of pollinator species visiting a flowering plant species or the number of food plant families a herbivore feeds upon. In this qualitative approach, interactions between a consumer and a resource species are only scored in a binary way as 'present' or 'absent', ignoring any distinction between strong interactions and weak or occasional ones. For example, binary representations of interactions do not distinguish a scenario where 99% of the individuals of a herbivore species feed on a single plant species, with only an occasional individual found on another plant, from a different scenario where a herbivore regularly feeds on both food plants. The problem is analogous to measuring biodiversity either as crude species richness or as a more elaborate diversity index that includes relative abundances [15]. Several approaches have thus been used to directly include variation in interaction frequencies (i.e., their evenness) in characterizing the diversity of partners, e.g. Simpson's diversity index for pollinators [16,17] or Lloyd's index for host specificity [18]. Alternatively, other studies indirectly controlled for abundance or sampling intensity using rarefaction methods [13,19]. Correspondingly, Bersier and coworkers [20] suggested quantifying the diversity of biomass flows in food webs using a Shannon diversity measure. Niche breadth theory provides several additional indices that include some measure of resource frequency or resource use intensity [21], which can be viewed in analogy to 'partner diversity' in the context of association networks. However, Hurlbert [22] emphasized that not only proportional utilization, but also the proportional availability of each niche should be taken into account. A species that uses all niches in the same proportion as their availability in the environment should be considered more opportunistic than a species that uses rare resources disproportionately more. If variation in resource availability is large, diversity-based measures that ignore this availability may be highly misleading [22,23]. Several niche breadth measures thus combine proportional resource utilization with proportional resource availability [22-24]. These concepts have rarely been applied in the context of species interaction networks, e.g. plant-pollinator webs, where binary webs are more common than quantitative webs.

Quantifying specialization at the community level

The measurement used most commonly to characterize community-wide specialization is the 'connectance' index (C) [1,4,8-10,25-27]. C is defined as the proportion of the actually observed interactions to all possible interactions. Consider a contingency table showing the association between two parties, with r rows (e.g., plant species) and c columns (e.g., pollinators). Connectance is defined as C = I/(r·c), with I being the total number of non-zero elements in the matrix. Therefore, like the number of partners or links (L) described above, C uses only binary information and ignores interaction strength. C is directly related to the mean number of links ($\bar{L}$) of plant species or pollinator species as $C = \bar{L}_{\text{plants}}/c = \bar{L}_{\text{poll}}/r$.

This measure, $\bar{L}$, has also been used to compare networks [1,3,7,8,28]. Recently, it has been suggested to use $\bar{L}$ instead of C to characterize networks [29]. However, note that comparisons across networks of different size (number of species) are problematic, since $\bar{L}$, unlike C, is not scaled according to the number of available partners (see also [2,10]). $\bar{L}$ in a small network may represent a larger proportion of available partners compared to the same value of $\bar{L}$ in a large network.
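The relation between connectance and the mean number of links can be checked numerically. The following is a minimal sketch, assuming a small hypothetical visitation matrix (the counts are illustrative, not data from either published web) with plants as rows and pollinators as columns; the function names are chosen here for illustration.

```python
import numpy as np

def connectance(a):
    """Proportion of non-zero cells in the matrix: C = I / (r * c)."""
    a = np.asarray(a)
    r, c = a.shape
    return np.count_nonzero(a) / (r * c)

def mean_links(a):
    """Unweighted mean number of links per row species and per column species."""
    binary = np.asarray(a) > 0
    return binary.sum(axis=1).mean(), binary.sum(axis=0).mean()

# Hypothetical 3 plants x 4 pollinators matrix of visit counts (illustrative only).
web = np.array([[10, 0, 1, 0],
                [0, 5, 0, 0],
                [2, 0, 0, 7]])

r, c = web.shape
C = connectance(web)
l_plants, l_poll = mean_links(web)

# Both ratios reproduce C, because each is I / (r * c) written differently.
print(C, l_plants / c, l_poll / r)
```

Because the interaction counts enter only through the test `> 0`, changing a cell from 1 to 1000 leaves all three values unchanged, which is the limitation of binary indices discussed in the text.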

Analyses based on binary data – both at the species and the community level – have obvious shortcomings, since they are highly dependent on sampling effort, on decisions about which species to include, and on the size of the investigated networks. Several authors thus emphasized the need to move beyond binary representations of interactions to quantitative measures involving some measure of interaction strength [4,20,27,29-32]. A way to at least partly overcome these deficiencies is to cut off all rare species or weak interactions below a frequency threshold [3,9,33,34] or to control for sampling effort in null models [7,8,13,19,25,35]. However, for interaction webs where more detailed information is available, simplification to binary data as in C or $\bar{L}$ remains unsatisfactory. Conveniently, the observed interaction frequency may represent a meaningful surrogate for interaction strength, at least in pollination and seed-dispersal systems as shown by Vázquez et al. [30] (see also [16]). Incorporating interaction frequency or even a direct measure of interaction strength in a network measure of specialization would thus provide the progress that has frequently been called for.

A severe additional problem of connectance is that its lower and upper constraints are not scale-invariant [25], which limits its use for comparisons across networks. The minimum possible value (Cmin) to maintain at least one link per species declines in a hyperbolic function with the number of interacting species, since Cmin = max(r, c)/(r·c), and an upper limit (Cmax) may be constrained by, or a function of, total sampling effort. Across networks, C decays strongly with network size, which has been debated in detail in the context of food web analysis [9,10,26,27,36,37]. The strong relationship between C and network size generates a problem for disentangling any biologically meaningful effect from this mathematically inherent scale dependence. For instance, network comparisons may focus on residual variation in C after an average effect of network size has been controlled for [1,4], or C could be rescaled to account for this size effect (see [25,36]). For natural networks of similar size, the range of actual C values is typically very narrow [4], thus other structural forces may be poorly detectable.
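The hyperbolic decline of the lower bound can be illustrated directly from the expression Cmin = max(r, c)/(r·c). The sketch below simply evaluates that expression for a few matrix sizes; the function name `c_min` is chosen here for illustration.

```python
def c_min(r, c):
    """Lowest connectance that still leaves every species with at least one link."""
    return max(r, c) / (r * c)

# For square matrices the bound reduces to 1/n, so it shrinks as the web grows.
for n in (2, 20, 200):
    print(n, c_min(n, n))

# For a rectangular matrix the bound is set by the shorter dimension: 1 / min(r, c).
print(c_min(2, 20))
```

A connectance of 0.05 is therefore the minimum possible in a 20 × 20 web but far above the minimum in a 200 × 200 web, which is why raw C values from webs of different size are hard to compare.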

The objective of this paper is to develop and discuss specialization measures that are based on frequency data and thus account for sampling intensity, and that overcome the problem of scale dependence. We then test these approaches by evaluating the effect of sampling effort and scale dependence on a published natural pollination network, and on randomly generated associations as a null model. We differentiate between species-level measures of specialization, useful to investigate variability among species within a web, and a single network-wide measure that can be used for comparisons across networks.

Results

Patterns in two pollinator networks

Two selected plant-pollinator networks (British meadows studied by Memmott [32], Argentinean forests studied by Vázquez and Simberloff [33]) differ markedly in their degree of specialization when quantitative analyses are applied. The qualitative network index, connectance, is similar in both interaction webs (British web: C = 0.15, Argentinean web: C = 0.13). However, frequencies of pollinator visits are much more evenly distributed in the British community than in the Argentinean example. In the British web, the interaction between a dipteran species and Leontodon hispidus was the most frequent one, representing 6% of the total 2183 interactions observed. In the Argentinean network, visits of Aristotelia chilensis by a colletid bee species represented 20% of the 5285 interactions alone. Interactions between the top five plant and top five pollinator species made up 44% of the interactions in the British web, but 74% in the Argentinean web. This difference in the heterogeneity of interaction frequencies is not evident in measures based on binary information such as number of links (L) or connectance (C). In contrast, the degree of specialization shown by the frequency-based index H2' (standardized two-dimensional Shannon entropy, see Methods: Network-level index) is much lower in the British community (H2' = 0.24) compared to the Argentinean community (H2' = 0.63).

The variation of species-level specialization measures (standardized Kullback-Leibler distance, d') holds valuable information for the structural properties of a network (see Methods: Species-level index). The British pollination web is dominated by highly generalized pollinators (low d', both in terms of individuals as well as species), while putative specialists are represented by very few individuals and species (Fig. 1A). In contrast, most pollinators in the Argentinean web are moderately generalized to specialized, with the second highest level of specialization found in the most common species (Fig. 1B). Consequently, the weighted mean degree of specialization is much lower in the former web (<d'poll> = 0.16) than in the latter (<d'poll> = 0.54). The relationship between specialization of species i (d'i) and its interaction frequency (Ai) across the pollinator species differs between the two webs. In the British web, d'i and Ai were not correlated significantly (Spearman's rs = -0.08, p = 0.46), while a highly positive correlation was found in the Argentinean web (rs = 0.65, p < 0.0001). Note that designation of any specialization index to a species i that is only represented by a single individual may be critical. However, significances in the above correlations remain unaffected when pollinators with one single interaction are excluded. From the plants' point of view, the species in Memmott's web are also more generalized in terms of their pollinator spectrum (Fig. 1C) than the plants studied by Vázquez and Simberloff (Fig. 1D). The respective weighted means are <d'plants> = 0.27 and <d'plants> = 0.53. No significant correlation was found between the plants' frequency and specialization in either web (both p ≥ 0.16). Interestingly, plants were on average more specialized than pollinators in the British web (<d'plants> > <d'poll.>), but not in the Argentinean web. 
This distinction is not found when only the weighted mean number of links (L) is examined, since <Lplants> is much greater than <Lpoll.> in both networks. The difference in <L> may be driven by the highly asymmetrical matrix architecture in both webs, where the number of pollinator species greatly exceeds the number of plant species. The unweighted mean $\bar{L}$ is even directly linked to the matrix architecture (i.e., number of rows and columns, r and c) by a constant (connectance C), since $\bar{L}_r = c \cdot C$ and $\bar{L}_c = r \cdot C$. In contrast, the matrix asymmetry does not affect d' (see also below, Null model patterns).

Figure 1. Patterns within pollinator networks. Frequency distribution of the species-level specialization index (d') for pollinators and plants from two published networks, one from Britain [32] and one from Argentina [33]. Bars show the number of individuals in each category (label '0' defines 0.00 ≤ d' < 0.05, etc.). Bars are separated for different species, and total number of species in each category is given on top. Arrows indicate cases where bars are invisible due to low numbers of individuals.

Simulation of sampling effort

To test whether specialization estimates depend on sampling and scale effects, we simulated a decreased sampling intensity in both networks using rarefaction (see Methods: Simulation of sampling effort and matrix architecture). In both networks, H2' is robust and already very well estimated by a small fraction of the interactions sampled (Fig. 2). The coefficient of variation of H2' remains below 5% from about half of the total number of visits onwards in the British web and even at one-tenth of the total sampling effort of the Argentinean web. The estimation of connectance (C) is also relatively stable at least in the Argentinean web, although it shows a positive trend across sampling effort in the British web (Fig. 2). These findings suggest that network-wide measures of specialization, particularly H2', do not necessarily require a very large or even complete association matrix, but can also be very well estimated from a smaller representative subset as long as there is no systematic sampling bias.
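One simple way to simulate reduced sampling effort is to treat every observed interaction as a separate record and keep a random subset of those records. The sketch below follows that idea; it is an illustration under stated assumptions, not the authors' implementation, and the matrix `web` and the function name `rarefy` are hypothetical.

```python
import numpy as np

def rarefy(a, n_keep, rng):
    """Keep n_keep randomly chosen interactions, drawn without replacement."""
    a = np.asarray(a, dtype=int)
    r, c = a.shape
    # One entry per observed interaction, labelled by the flat index of its cell.
    cells = np.repeat(np.arange(r * c), a.ravel())
    kept = rng.choice(cells, size=n_keep, replace=False)
    # Count how many of the kept interactions fall into each cell.
    return np.bincount(kept, minlength=r * c).reshape(r, c)

rng = np.random.default_rng(0)

# Hypothetical matrix with m = 25 interactions (illustrative only).
web = np.array([[10, 0, 1, 0],
                [0, 5, 0, 0],
                [2, 0, 0, 7]])

sub = rarefy(web, 12, rng)  # roughly half of the original sampling effort
print(sub)
```

Repeating the draw many times for each value of `n_keep` and recomputing an index on every rarefied matrix gives the spread of that index at a given sampling effort.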

Figure 2. Sampling effect in pollinator networks. Rarefaction of sampling effort in a British and an Argentinean pollination web [32,33]. Two network-level measures of specialization – the frequency-based specialization index (H2') and the 'connectance' index (C) – are shown for networks in which the total number of interactions (m) has been reduced by randomly deleting interactions. Black dots show the effect of sampling effort for the original association matrix, gray dots the effect for a null model, i.e. five networks in which partners were randomly associated (same row and column totals as in the original matrix).

Null model patterns

The degree of specialization can be further characterized by comparison with a null model. The null model used here is that each species has a fixed total number of interactions (given by the observed association matrix), but interactions are assigned randomly. In the above pollinator networks, random associations yield a specialization index H2' that remains close to zero for almost the entire range of sampling intensity, while connectance (C) shows a positive trend over the total number of interactions (m) (Fig. 2). Therefore, H2' derived from real networks may typically be clearly distinguished from this null model, while the comparison of C is complicated by scale dependence and the relatively large values yielded by the null model.
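A null model of this kind, with row and column totals held fixed and partners associated at random, can be sketched by shuffling the column labels of the individual interactions. This is one simple construction chosen here for illustration; the algorithm actually used for the simulations is described in the Methods and may differ. The matrix `web` and the function name are hypothetical.

```python
import numpy as np

def random_web_fixed_totals(a, rng):
    """Randomly re-associate partners while keeping all row and column totals."""
    a = np.asarray(a, dtype=int)
    r, c = a.shape
    # Each interaction contributes one row label and one column label.
    row_labels = np.repeat(np.arange(r), a.sum(axis=1))
    col_labels = np.repeat(np.arange(c), a.sum(axis=0))
    # Shuffling one side pairs every row label with a random column label.
    rng.shuffle(col_labels)
    null = np.zeros((r, c), dtype=int)
    np.add.at(null, (row_labels, col_labels), 1)
    return null

rng = np.random.default_rng(1)

# Hypothetical observed matrix (illustrative only).
web = np.array([[10, 0, 1, 0],
                [0, 5, 0, 0],
                [2, 0, 0, 7]])

null = random_web_fixed_totals(web, rng)
print(null)
```

Since only the pairing is randomized, each species keeps its observed total number of interactions, and any index computed on `null` reflects what those totals alone would produce.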

Simulations of artificially generated random associations (see Methods: Simulation of sampling effort and matrix architecture) confirm that the network-level specialization index H2' is largely unaffected by network size (Fig. 3A), network architecture (Fig. 3B) or total number of interactions (m) for a fixed matrix size (Fig. 3C). For random associations as shown here, H2' is usually close to zero. Connectance values (C) of random matrices follow the known hyperbolic function over the number of associated species (Fig. 3A), change with matrix asymmetry (Fig. 3B) and increase strongly with increasing m (Fig. 3C). For specialization measures at the species level, the average number of links per species ($\bar{L}$) increases strongly with network size, number of available partners, and m (Fig. 3). While other niche breadth measures may also show some variation across different network scales (not shown), the weighted mean Kullback-Leibler distance <d'> is hardly affected by network size, network asymmetry, and number of interactions (Fig. 3). Both H2' and d' may thus be appropriate for comparisons across matrices of different scale.

Figure 3. Simulated random networks. Behavior of specialization measures in simulated random networks. Each point represents one matrix with random associations, based on specific row and column totals that follow a lognormal distribution. The size of square matrices in (A) increased from 2 × 2 to 200 × 200. In (B), only the number of rows changed, while the number of columns was fixed at 20; rectangular matrices thus increased from 2 × 20 to 200 × 20. In (C), the network size was fixed at 20 × 20. The total number of interactions (m) increased with matrix size in (A), where each species had on average 20 individuals. In (B), m was fixed at 4000, resulting in a reduced interaction density for larger matrices. In (C), m increased from 20 to 4000. The index H2' and connectance C are specialization measures of the whole matrix and thus reciprocal, while the average number of links ($\bar{L}$) and weighted mean standardized Kullback-Leibler distance (<d'>) are given for all columns (rows give a similar pattern).

Discussion

Properties of specialization measures

The suggested indices, d' and H2', quantify the degree of specialization of elements within an interaction network and of the entire network, respectively. While the number of links (L) and connectance (C) represent species-level and community-level measures of interactions based on binary data, respectively, d' and H2' represent corresponding measures for frequency-based data. The need to include information on interaction strength or interaction frequency in network analyses has been emphasized by various authors [4,20,27,30,31,38]. Parallel to earlier advances in diversity measures compared to species richness, quantitative network measures account for the heterogeneity in link strength rather than assigning equal weights to every link. Moreover, we have shown that d' and H2' are largely robust against variation in matrix size, shape, and sampling effort. In several cases, C may be strongly affected by sampling effort [25,27], while H2' remained largely unchanged in simulations of random associations over a range of network sizes, variable network asymmetries, and number of interactions. This scale invariance suggests that both d' and H2' can be used directly for comparisons across different networks, while comparisons of L and C are more problematic [1,35].

Quantitative methods like the indices suggested here also allow a more detailed analysis of interaction patterns within and across networks. Fruitful areas include comparisons of networks across different interaction types [4], biogeographical gradients [1], biodiversity and land use gradients [13], robustness of networks against extinction risks [39], asymmetries between plants and animals [38], and relationships between specialization and abundance [35]. While a comparison of the average number of partners between plants versus animals is solely dependent on the matrix architecture (i.e., the number of rows r versus columns c, since $\bar{L}_{\text{plants}} = c \cdot C$ and $\bar{L}_{\text{poll}} = r \cdot C$), this limitation does not apply to d'. In the two selected pollinator webs, plants are either similarly or more specialized than pollinators in regard to weighted mean d'. This allows a scale-independent evaluation of asymmetries in the degree of specialization between partners (see also [38]). Moreover, Vázquez and Aizen [35] noted that the number of links of a species (Li) is strongly positively correlated with its overall frequency (Ai) in five pollination networks including the datasets analyzed above. They argued that this apparent higher generalization of common plants and common pollinators may be largely explained by null models, calling for an improved measurement of specialization. Our results for the correlation between d'i and Ai in two pollinator webs suggest that the relationship between specialization and abundance may be more variable, and even positive as in the Argentinean network.

Caveats

Some problems apply to any measure used in network analyses, including the proposed indices. Measures of specialization mostly ignore phylogenetic relationships or ecological similarity within an association matrix. For example, a plant species that is pollinated by multiple moth species may be unsuitably regarded as more generalized than a plant pollinated by few insect species comprising several different orders [40]. In addition, the fact that herbivores are commonly specialized on host plant families rather than species may skew network patterns if not carefully accounted for. A first approach to investigate such effects may be to compare the level of specialization after a stepwise reduction of the matrix by pooling species to higher taxonomic units, such as genera, families, and orders. For known phylogenies, more advanced techniques for analyses with a particular evolutionary focus are available [41-43]. Another deficiency may be that species or their partners are all given the same individual 'weight' in the analyses, whether they may be small bees or large bats visiting a small herb with little nectar or a mass flowering tree. Null models as in the calculation for both C and H2' imply that all individuals can be shifted around between resources in the same way, irrespective of their size or non-fitting parameters. The role of 'forbidden links' as constraints to network analyses has been discussed elsewhere [44,45]. Similarly, calculations of d' or other niche breadth measures are based on the implicit assumption that each species adjusts its interactions according to the availability of partners (niches), irrespective of morphological or behavioral constraints. Moreover, if data are collected from a large heterogeneous habitat or over a prolonged time period, calculations of the degree of specialization may be severely constrained by the spatiotemporal overlap or non-overlap between partners for other reasons than resource preferences, e.g. when not all species are able to reach all sites in the same way, or when some resources and consumers have asynchronous phenologies. Consequently, network analyses as suggested here will be most useful to study resource-consumer partitioning within a short time frame and limited spatial scale.
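The stepwise pooling of species into higher taxonomic units mentioned above amounts to summing the matrix columns (or rows) that share a taxon label. A minimal sketch follows, assuming a hypothetical matrix and placeholder genus labels; neither comes from the analyzed webs, and the function name is chosen here for illustration.

```python
import numpy as np

def pool_columns(a, groups):
    """Sum the columns of a that carry the same group label."""
    a = np.asarray(a)
    labels, inverse = np.unique(np.asarray(groups), return_inverse=True)
    pooled = np.stack(
        [a[:, inverse == k].sum(axis=1) for k in range(len(labels))], axis=1
    )
    return labels, pooled

# Hypothetical 3 plants x 4 pollinator species matrix (illustrative only).
web = np.array([[10, 0, 1, 0],
                [0, 5, 0, 0],
                [2, 0, 0, 7]])

# Placeholder genus assignment for the four pollinator species.
genera = ["genus_A", "genus_A", "genus_B", "genus_C"]

labels, pooled = pool_columns(web, genera)
print(labels)
print(pooled)
```

Recomputing a specialization index on `pooled`, and again after pooling to families or orders, shows how much of the apparent specialization depends on the taxonomic resolution of the matrix.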

For both indices d' and H2', we proposed above to use the total number of interactions for each species as a measure of partner availability (qj) and as constraint for standardization (fixed row and column totals). It may be debated whether independent measures of plant and animal abundances could be more appropriate than using interaction frequency data as such. However, despite the fact that such abundance data barely exist for most networks, note that the actual number of interactions often more suitably reflects resource availability and consumer activity than an independent measure of species abundance. For instance, a flower of one species may have a much higher nectar production than another and consequently receive a higher number of visitors, while the local abundance of the plant species does not reflect such differences in resource quality and/or quantity. Both d' and H2' thus focus on the actual partitioning between the interacting species. In studies where detailed knowledge or theoretical assumptions about resources (availability and quality) or consumers (activity density and consumption rate) are available or under experimental control, such data may be incorporated into the analysis (defining qj and constraints) instead of interaction frequencies. The constraint of fixed row and column totals has been debated elsewhere in the context of species co-occurrence patterns, where it was found to be most appropriate in null model comparisons, although critics have argued earlier that these marginals themselves may already reflect competitive interactions ([46] and references therein). Any approach to compare networks based on fixed marginals for standardization will fail to detect potentially meaningful patterns displayed by these architectural features, namely the number of resource and consumer species and the heterogeneity of total interaction frequencies. 
This network architecture may already be shaped by past competitive interactions or indicate fundamental constraints, a largely unexplored hypothesis that merits additional investigations.

It should also be emphasized that analyses of frequency data may be susceptible to pseudoreplication through repeated associations of the same individuals or close associations derived from a single dispersal event (e.g. a social insect colony, aggregating individuals, multiple offspring from a single egg cluster, or monospecific plant clusters). These may lead to an overestimation of specialization. To be more meaningful on a population level, frequency analyses should thus be based on spatially independent association replicates. Note that all species-wise specialization measures such as d' are sensitive to the behavior of the other species. Any systematic sampling bias (e.g. a taxonomic focus within a guild) will therefore affect the conclusions of comparisons within or across networks.

Conclusion

In accordance with previous calls [4,20,27,30,31,38], we suggest that the explicit inclusion of frequency data reflects an important step forward in network analyses, as too many assumptions are implicit in any measure based on binary representation. Most notably, connectance and 'number of partners' imply an equal availability of all partners – an unlikely scenario. Qualitative indices are not robust against sampling effort. In contrast, the proposed quantitative measures based on interaction frequencies explicitly account for this source of variation. Our study suggests that d' and H2' represent scale-independent and meaningful indices to characterize specialization on the level of single species and the entire network, respectively. These novel indices allow us to investigate patterns within and across networks that have not been detected with qualitative measures, such as correlations with species frequencies, network size, and asymmetries in specialization between partners. Recently, Bascompte et al. [38] showed that the incorporation of frequency data may unveil pervasive asymmetries within networks. Particularly since Vázquez et al. [30] demonstrated that interaction frequencies in plant-pollinator and plant-seed disperser systems often correlate with the magnitude of mutualistic services for the plant (although variation in pollinator effectiveness can be important, see [47]), an increased collection of frequency data and appropriate quantitative analyses would greatly benefit future network studies.

Methods

Species-level index

As a species-level measure of 'partner diversity', we propose the Kullback-Leibler distance (or Kullback-Leibler divergence, relative entropy) in a standardized form (d'). Coming from information theory, this index quantifies the difference between two probability distributions [48]. Although the standardized Hurlbert's and Smith's measures of niche breadth could be used alternatively [21,22,24], d' has some advantages in the context of networks. All three indices regard an exclusive pairing between two species as a high degree of specialization as long as interactions between the two partners are infrequent. However, Hurlbert's and Smith's indices show an undesired trend towards full generalization when the number of interactions between the two partners increases, although this should be considered a stronger indication of specialization (see below, Properties of alternative niche breadth measures). The interaction between two parties is commonly displayed in an r × c contingency table, with r rows representing one party such as flowering plant species, and c columns representing the other party such as pollinator species. In each cell, the frequency of interaction between plant species i and pollinator species j (or another useful measure of interaction strength) is given as aij (Table 1).

Table 1. Elements in a species association matrix. Interaction frequencies (aij) between c animal and r plant species and their respective totals (rows: Ai, columns: Aj, total elements: m).

Instead of frequencies (aij), each interaction can be assigned a proportion of the total (m) as

$$p_{ij} = \frac{a_{ij}}{m}, \quad \text{where} \quad m = \sum_{i=1}^{r} \sum_{j=1}^{c} a_{ij}.$$

Let p'ij be the proportion of the number of interactions (aij) in relation to the respective row total (Ai), and qj the proportion of all interactions by partner j in relation to the total number of interactions (m). Thus,

$$p'_{ij} = \frac{a_{ij}}{A_i}, \quad \sum_{j=1}^{c} p'_{ij} = 1, \quad q_j = \frac{A_j}{m}, \quad \text{and} \quad \sum_{j=1}^{c} q_j = 1.$$

To quantify the specialization of a species i, the following index di is suggested. This di is related to Shannon diversity, similar to an index recently suggested to characterize biomass flow diversity in food webs [20]. However, an appropriate index in this context should not only consider the diversity of partners, but also their respective availability (see [22]). Consequently, the following index compares the distribution of the interactions with each partner (p'ij) to the overall partner availability (qj). The Kullback-Leibler distance for species i is denoted as

$$d_i = \sum_{j=1}^{c} \left( p'_{ij} \cdot \ln \frac{p'_{ij}}{q_j} \right)$$

which can be normalized as

$$d'_i = \frac{d_i - d_{\min}}{d_{\max} - d_{\min}}$$

The theoretical maximum is given by dmax = ln (m/Ai), and the theoretical minimum (dmin) is zero for the special case where all p'ij = qj. However, a realistic dmin may be constrained at some value above zero given that p'ij and qj are calculated from discrete integer values (aij). To take this into account, dmin is more suitably computed algorithmically as in a program available from the authors and online [49], providing all d' for a given matrix. This standardized Kullback-Leibler distance (d') ranges from 0 for the most generalized to 1.0 for the most specialized case. Thus, d' can be interpreted as deviation of the actual interaction frequencies from a null model which assumes that all partners are used in proportion to their availability. An average degree of specialization among the species of a party can be presented as a weighted mean of the standardized index, e.g. <d'i> for pollinators as

$$\langle d'_i \rangle = \frac{1}{m} \sum_{i=1}^{r} \left( d'_i \cdot A_i \right)$$

While <d'i> usually differs from <d'j>, the weighted means of the non-standardized Kullback-Leibler distances are the same for both parties, hence <di> = <dj>.
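The species-level calculation can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' program [49]: the function names and the example matrix are hypothetical, every row species is assumed to have at least one interaction without holding all of them, and d_min is set to its theoretical value of zero rather than computed algorithmically for integer data.

```python
import math

def kl_distances(a):
    """Return (d, d_prime) lists for the row species of frequency matrix a."""
    m = sum(sum(row) for row in a)               # total interactions m
    q = [sum(col) / m for col in zip(*a)]        # q_j = A_j / m
    d, d_prime = [], []
    for row in a:
        A_i = sum(row)                           # row total A_i
        # cells with a_ij = 0 contribute nothing to the sum
        d_i = sum((x / A_i) * math.log((x / A_i) / q_j)
                  for x, q_j in zip(row, q) if x > 0)
        d.append(d_i)
        # ASSUMPTION: d_min = 0 (theoretical minimum), so d' = d / d_max
        d_prime.append(d_i / math.log(m / A_i))
    return d, d_prime

def weighted_mean(values, totals):
    """Mean of species-level values weighted by species totals."""
    return sum(v * t for v, t in zip(values, totals)) / sum(totals)

# Hypothetical 3 x 3 web (rows: one party, columns: the other party)
web = [[8, 2, 0],
       [2, 8, 0],
       [0, 0, 5]]
d, d_prime = kl_distances(web)
row_totals = [sum(row) for row in web]
print([round(x, 3) for x in d_prime])
print(round(weighted_mean(d_prime, row_totals), 3))
```

Applying the same function to the transposed matrix gives the indices for the other party. In this example the third row species interacts exclusively with a partner that nobody else uses, so its d equals d_max and its d' is 1.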

Network-level index

The following network-wide measure is based on the bipartite representation of a two mode network of interactions such as plant-animal or other resource-consumer interactions where members of each party interact with members of the other party but not among themselves (unlike many food webs). The two-dimensional Shannon entropy (termed H2 in order to avoid confusion with the common one-dimensional H) is obtained as

$$H_2 = -\sum_{i=1}^{r} \sum_{j=1}^{c} \left( p_{ij} \cdot \ln p_{ij} \right)$$

H2 decreases with higher specialization. This measure is closely related to the weighted mean of the non-standardized Kullback-Leibler distance of all species, since

<di> = <dj> = H2max - H2

(see below, Relationship between di and H2). H2 can be standardized between 0 and 1.0 for extreme specialization versus extreme generalization, respectively, when its minimum and maximum values (H2min and H2max) are known. H2min and H2max can be calculated for given constraints. The constraints used here are the maintenance of the total number of interactions of each species, thus all row and column totals, Ai and Aj, being fixed (see also [46]). Alternative constraints may be defined depending on the knowledge of the system studied.

H2 reaches its theoretical maximum where each pij equals its expected value from a random interaction matrix (qi·qj), such that

$$H_{2\max} = -\sum_{i=1}^{r} \sum_{j=1}^{c} \left( q_i q_j \cdot \ln (q_i q_j) \right)$$

while its theoretical minimum (H2min) may be close to zero depending on the matrix architecture. As for dmin above, H2max and H2min are constrained by the fact that they are derived from integer values. A program implementing a heuristic solution to obtain H2max and H2min, and to perform the entire analysis, is available from the authors or online [49].

The degree of specialization is obtained as a standardized entropy on a scale between H2min and H2max as

$$H_2' = \frac{H_{2\max} - H_2}{H_{2\max} - H_{2\min}}$$

Consequently, H2' ranges between 0 and 1.0 for extreme generalization and specialization, respectively.
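As a rough illustration of the network-level index, the following Python sketch computes H2 and the theoretical H2max from a frequency matrix. The function names are hypothetical, and the heuristic search for the integer-constrained H2min and H2max used by the authors' program [49] is not reproduced. H2min must therefore be supplied by the caller. The default of zero is only a loose lower bound and will generally understate H2'.

```python
import math

def h2(a):
    """Two-dimensional Shannon entropy of frequency matrix a."""
    m = sum(sum(row) for row in a)
    return -sum((x / m) * math.log(x / m) for row in a for x in row if x > 0)

def h2_max(a):
    """Theoretical maximum: every p_ij equals q_i * q_j."""
    m = sum(sum(row) for row in a)
    q_rows = [sum(row) / m for row in a]
    q_cols = [sum(col) / m for col in zip(*a)]
    return -sum(qi * qj * math.log(qi * qj)
                for qi in q_rows if qi > 0
                for qj in q_cols if qj > 0)

def h2_prime(a, h2_min=0.0):
    """Standardized specialization.

    ASSUMPTION: h2_min defaults to 0; pass a better value if one is known.
    """
    top = h2_max(a)
    return (top - h2(a)) / (top - h2_min)

# Hypothetical example: a perfectly compartmented 2 x 2 web
diag = [[5, 0],
        [0, 5]]
print(h2(diag), h2_max(diag))
print(h2_prime(diag, h2_min=math.log(2)))
```

For the diagonal example the row and column totals admit no arrangement with lower entropy than the observed one, so passing H2min = ln 2 yields the fully specialized value.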

Comparison with random associations

H2 can be tested against a null model of random associations (H2ran). A number of random permutations of the matrix can be performed using an r × c randomization algorithm (also available at [49]). The probability (p-value) that the observed H2 is more specialized than expected by random associations is simply given as the proportion of values obtained for H2ran that are equal to or smaller than H2 (lower entropy indicating higher specialization), a common procedure in randomization statistics [25,50]. H2ran is usually only slightly smaller than H2max. Previously, permutations of r × c contingency tables often used a different test statistic instead of H2 [25,51,52]:

$$T = \sum_{i=1}^{r} \sum_{j=1}^{c} \left( a_{ij} \cdot \ln a_{ij} \right)$$

The relationship between T and H2 is described by a constant, the total number of interactions (m), as T = m·ln m - m·H2. Consequently, both methods yield exactly the same p-values.
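A minimal randomization test along these lines might look as follows. It is a sketch, not the program available at [49]: instead of a dedicated r × c algorithm, each interaction is listed as a (row, column) pair and the column labels are shuffled, which also keeps all row and column totals fixed. Because lower entropy means higher specialization, the p-value counts random matrices whose H2 is at most the observed value. Function names, the seed and the number of permutations are arbitrary choices.

```python
import math
import random

def h2(a):
    """Two-dimensional Shannon entropy of frequency matrix a."""
    m = sum(sum(row) for row in a)
    return -sum((x / m) * math.log(x / m) for row in a for x in row if x > 0)

def t_statistic(a):
    """Alternative test statistic T = sum of a_ij * ln(a_ij)."""
    return sum(x * math.log(x) for row in a for x in row if x > 0)

def shuffle_matrix(a, rng):
    """Random association with the same row and column totals as a."""
    rows, cols = [], []
    for i, row in enumerate(a):
        for j, x in enumerate(row):
            rows.extend([i] * x)
            cols.extend([j] * x)
    rng.shuffle(cols)                        # break the row-column pairing
    out = [[0] * len(a[0]) for _ in a]
    for i, j in zip(rows, cols):
        out[i][j] += 1
    return out

def permutation_p_value(a, n_perm=1000, seed=0):
    """Proportion of random matrices at least as specialized as a."""
    rng = random.Random(seed)
    observed = h2(a)
    hits = sum(1 for _ in range(n_perm)
               if h2(shuffle_matrix(a, rng)) <= observed + 1e-12)
    return hits / n_perm

web = [[8, 2, 0],
       [2, 8, 0],
       [0, 0, 5]]                            # hypothetical counts
print(permutation_p_value(web))
```

Since T = m·ln m - m·H2 and m is unchanged by the shuffling, ranking random matrices by T (from large to small) gives the same ordering as ranking them by H2 (from small to large).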

Relationship between di and H2

In the following we derive the relationship between the individual levels of specialization (di) and the community level (H2). The non-standardized Kullback-Leibler distance for row i can be rewritten as

$$d_i = \sum_{j=1}^{c} \left( p'_{ij} \cdot \ln \frac{p'_{ij}}{q_j} \right) = \sum_{j=1}^{c} \left( \frac{p_{ij}}{q_i} \cdot \ln \frac{p_{ij}}{q_i q_j} \right)$$

because $p'_{ij} = p_{ij} / q_i$.

The weighted mean of di for all i rows (each row weighted by qi) yields

$$\begin{aligned}
\langle d_i \rangle = \sum_{i=1}^{r} \left( q_i \cdot d_i \right) &= \sum_{i=1}^{r} \sum_{j=1}^{c} \left( p_{ij} \cdot \ln \frac{p_{ij}}{q_i q_j} \right) \\
&= \sum_{i=1}^{r} \sum_{j=1}^{c} \left( p_{ij} \cdot \ln p_{ij} \right) - \sum_{i=1}^{r} \left( q_i \cdot \ln q_i \right) - \sum_{j=1}^{c} \left( q_j \cdot \ln q_j \right)
\end{aligned}$$

since $\sum_{j=1}^{c} p_{ij} = q_i$ and $\sum_{i=1}^{r} p_{ij} = q_j$.

While the first summand in the final equation for <di> equals -H2, the remaining two summands correspond to the maximum entropy H2max, because

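$$\begin{aligned}
H_{2\max} &= -\sum_{i=1}^{r} \sum_{j=1}^{c} \left( q_i q_j \cdot \ln (q_i q_j) \right) \\
&= -\sum_{i=1}^{r} \left( q_i \cdot \ln q_i \right) \sum_{j=1}^{c} q_j - \sum_{j=1}^{c} \left( q_j \cdot \ln q_j \right) \sum_{i=1}^{r} q_i \\
&= -\sum_{i=1}^{r} \left( q_i \cdot \ln q_i \right) - \sum_{j=1}^{c} \left( q_j \cdot \ln q_j \right)
\end{aligned}$$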

Therefore,

<di> = H2max - H2.

The same calculation applies for <dj>, thus <di> = <dj>. Consequently, the degree of specialization of the entire network (corresponding to the deviation of the network-wide entropy from its maximum value) equals the weighted sum of the specialization of its elements (species).
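The identity can also be checked numerically. The sketch below is an illustration with a hypothetical matrix and function name, using the theoretical H2max without integer constraints. It computes the weighted means of the non-standardized distances for both parties and compares them with H2max - H2.

```python
import math

def entropy_terms(a):
    """Return (<d_i>, <d_j>, H2, H2max) for frequency matrix a."""
    m = sum(sum(row) for row in a)
    p = [[x / m for x in row] for row in a]          # p_ij = a_ij / m
    q_i = [sum(row) for row in p]                    # row proportions
    q_j = [sum(col) for col in zip(*p)]              # column proportions
    h2 = -sum(x * math.log(x) for row in p for x in row if x > 0)
    h2_max = (-sum(q * math.log(q) for q in q_i if q > 0)
              - sum(q * math.log(q) for q in q_j if q > 0))
    # d_i for each row, using p'_ij = p_ij / q_i
    d_rows = [sum((x / q_i[i]) * math.log(x / (q_i[i] * q_j[j]))
                  for j, x in enumerate(row) if x > 0)
              for i, row in enumerate(p)]
    # d_j for each column, using p'_ij = p_ij / q_j
    d_cols = [sum((p[i][j] / q_j[j]) * math.log(p[i][j] / (q_i[i] * q_j[j]))
                  for i in range(len(p)) if p[i][j] > 0)
              for j in range(len(q_j))]
    mean_rows = sum(q * d for q, d in zip(q_i, d_rows))
    mean_cols = sum(q * d for q, d in zip(q_j, d_cols))
    return mean_rows, mean_cols, h2, h2_max

# Hypothetical 3 x 4 web
web = [[5, 1, 0, 2],
       [0, 7, 3, 1],
       [4, 0, 0, 9]]
mean_rows, mean_cols, h2, h2_max = entropy_terms(web)
print(mean_rows, mean_cols, h2_max - h2)
```

The three printed values agree up to floating-point rounding.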

Properties of alternative niche breadth measures

The standardized Hurlbert's (B') and Smith's (FT) measures can be applied widely for niche breadth analysis [21,22,24]. In this context, the Kullback-Leibler distance (d) can be viewed as a modified Shannon-Wiener measure of niche breadth that accounts for niche availabilities. Like the Kullback-Leibler distance, both B' and FT compare the proportional distribution of individuals (p) to the proportional resource availability (q) (here: partner availability). For a certain species i, the two measures are in our notation:

$$B'_i = \frac{\dfrac{1}{\sum_{j=1}^{c} \left( p'^{\,2}_{ij} / q_j \right)} - \min(q_1, q_2, \ldots, q_c)}{1 - \min(q_1, q_2, \ldots, q_c)} \quad \text{and} \quad FT_i = \sum_{j=1}^{c} \sqrt{p'_{ij} \cdot q_j}$$

Each p'ij is the proportion of the number of interactions in relation to the respective row total, and qj is the proportion of all interactions by partner j in relation to the total number of interactions. Thus,

$$p'_{ij} = \frac{a_{ij}}{A_i} \quad \text{and} \quad q_j = \frac{A_j}{m}.$$

Both the standardized Hurlbert's (B') and Smith's (FT) measures range from 0 for the most specialized case to 1.0 for extreme generalization (broadest niche). In the context of niche breadth, it has been shown that the Shannon-Wiener measure is most sensitive, while Hurlbert's and particularly Smith's measure are less sensitive for the selection of rare resources [21] (see also [20]).

For the application in network analyses, however, both B' and FT may show some undesired properties. Generally, B', FT and d' are reasonably well correlated with each other across the species within a network (e.g., rs = -0.49 between d' and B', and rs = -0.36 between d' and FT for the 90 pollinators in the network of Vázquez and Simberloff [33], both p < 0.001). However, differences with d' are substantial when a highly specialized species interacts largely exclusively with a specialized partner, e.g. a specialized pollinator with a plant that is almost exclusively pollinated by this one. Imagine a scenario where one exclusive interaction occurs between a plant species and a pollinator species in a 3 × 3 matrix (Table 2). If the interaction between pollinator sp. 3 and plant sp. 3 is only infrequent (e.g. a33 = 1), all indices show a high degree of specialization (d' = 1.0, B' = 0, FT = 0.14) for both partners. However, as the number of exclusive interactions (a33) increases, the values for both B' and FT of pollinator sp. 3 and plant sp. 3 show a highly undesired change towards generalization, although a higher a33 is intuitively considered as extreme specialization (e.g., for a33 = 50 the values for pollinator sp. 3 are B' = 0.31 and FT = 0.70), while only d' remains unaffected (d' = 1.0). FT is always larger than zero, and B' becomes larger than zero when the specialists interact more frequently than one of the other partners, thus when qj > min(q1, q2, ... qc). Both FT and B' approach a value of 1.0 (maximum generalization) for very large a33. This undesired effect of FT and B' is not restricted to completely exclusive interactions between two partners.

Table 2. Association matrix example. Fictive association matrix between three pollinator species and three plant species. Numbers in each cell are counts of interaction frequencies.
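The contrast between the three indices can be reproduced qualitatively with a short Python sketch. The matrix below is a hypothetical stand-in and not the authors' Table 2, so the printed values are not expected to match the figures quoted in the text. The function names are illustrative, and d' is standardized with d_min = 0.

```python
import math

def niche_indices(a, i):
    """Return (B', FT, d') for row species i of frequency matrix a."""
    m = sum(sum(row) for row in a)
    q = [sum(col) / m for col in zip(*a)]            # partner availability q_j
    A_i = sum(a[i])
    p = [x / A_i for x in a[i]]                      # p'_ij
    B = 1.0 / sum(pj * pj / qj for pj, qj in zip(p, q))
    q_min = min(q)
    B_std = (B - q_min) / (1.0 - q_min)              # standardized Hurlbert
    FT = sum(math.sqrt(pj * qj) for pj, qj in zip(p, q))   # Smith
    d = sum(pj * math.log(pj / qj) for pj, qj in zip(p, q) if pj > 0)
    d_std = d / math.log(m / A_i)                    # ASSUMPTION: d_min = 0
    return B_std, FT, d_std

def example_web(a33):
    """Hypothetical web: species 3 of each party interact only with each other."""
    return [[10, 15, 0],
            [15, 10, 0],
            [0, 0, a33]]

for a33 in (1, 50, 5000):
    B_std, FT, d_std = niche_indices(example_web(a33), 2)
    print(a33, round(B_std, 3), round(FT, 3), round(d_std, 3))
```

In this stand-in matrix, B' and FT of the exclusive species rise towards 1 as a33 grows, while d' stays at 1 throughout.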

Simulation of sampling effort and matrix architecture

Two published plant-pollinator networks were selected to investigate the behavior of different specialization measures [32,33]. Both articles use their observed interaction matrices as a model to discuss network properties based on the number of links per pollinator or plant species, allowing a comparison of conclusions drawn. Both networks may be compared as they comprise relatively large datasets from temperate ecosystems, reporting interaction frequencies between plants and their floral visitors: the British meadow community studied by Memmott [32] involved 79 pollinator and 25 plant species (2183 pollinator visits observed), the forests in Argentina studied by Vázquez and Simberloff [33] involved 90 pollinator and 14 plant species (5285 visits). The datasets can be obtained from the Interaction Web Database [53]. We simulated a decreased sampling intensity in both networks using a rarefaction method in order to investigate how sampling effort affects the estimation of specialization indices. Real association matrices were reduced by randomly extracting interactions, e.g. from the total of m = 2183 visits in Memmott's web down to m = 5 visits (in steps of five, repeated ten times for each m).
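The rarefaction step can be sketched as follows. This is one plausible reading of the procedure rather than the authors' code: each observed interaction is treated as a separate event, n events are drawn without replacement, and the reduced matrix is re-tabulated. Function names, the seed and the small example matrix are hypothetical.

```python
import random

def rarefy(a, n, rng):
    """Draw n interactions without replacement from frequency matrix a."""
    events = [(i, j) for i, row in enumerate(a)
              for j, x in enumerate(row) for _ in range(x)]
    out = [[0] * len(a[0]) for _ in a]
    for i, j in rng.sample(events, n):
        out[i][j] += 1
    return out

def rarefaction_series(a, step=5, repeats=10, seed=0):
    """Reduced matrices for n = step, 2*step, ... up to the total m."""
    rng = random.Random(seed)
    m = sum(sum(row) for row in a)
    return {n: [rarefy(a, n, rng) for _ in range(repeats)]
            for n in range(step, m + 1, step)}

web = [[8, 2, 0],
       [2, 8, 0],
       [0, 0, 5]]                      # hypothetical counts, m = 25
series = rarefaction_series(web)
print(sorted(series))                  # sample sizes covered
print(series[5][0])                    # one reduced matrix with 5 visits
```

Any specialization index can then be evaluated on every reduced matrix and plotted against n. Rows or columns that become empty after subsampling need to be dropped before indices that divide by species totals are computed.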

In order to compare the null model characteristics of the specialization measures, we simulated artificial matrices with randomly associated partners and plotted the indices against an increasing number of partners and/or total number of interactions. We assumed that the total frequency of participating species approximates a lognormal distribution, which is typical for biological communities [21,22,24]. All row and column totals were randomly generated from a lognormal distribution (μ = 50, σ = 1) that was scaled to the desired total number of interactions. Ten different combinations of row and column totals were obtained for each matrix size and taken as a template to randomly associate the partners five times, thus each matrix size was represented by 50 random associations.
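A possible implementation of this null model is sketched below, with several assumptions that go beyond the text. The lognormal parameters are taken to be on the log scale, as in Python's `random.lognormvariate`. Scaling to the desired total is done by allocating the m interactions to species in proportion to the lognormal weights, which guarantees integer totals but may leave some species with zero interactions. Random association is done by shuffling interaction labels. All function names are hypothetical.

```python
import random

def lognormal_totals(n_species, m, rng, mu=50.0, sigma=1.0):
    """Integer species totals summing to m, proportional to lognormal weights."""
    weights = [rng.lognormvariate(mu, sigma) for _ in range(n_species)]
    # ASSUMPTION: scale to m by drawing each interaction's species
    # with probability proportional to its weight
    totals = [0] * n_species
    for s in rng.choices(range(n_species), weights=weights, k=m):
        totals[s] += 1
    return totals

def random_association(row_totals, col_totals, rng):
    """Random matrix with the given row and column totals."""
    rows = [i for i, t in enumerate(row_totals) for _ in range(t)]
    cols = [j for j, t in enumerate(col_totals) for _ in range(t)]
    rng.shuffle(cols)
    out = [[0] * len(col_totals) for _ in row_totals]
    for i, j in zip(rows, cols):
        out[i][j] += 1
    return out

def null_matrices(r, c, m, templates=10, associations=5, seed=0):
    """templates x associations random matrices of size r x c with m interactions."""
    rng = random.Random(seed)
    result = []
    for _ in range(templates):
        row_totals = lognormal_totals(r, m, rng)
        col_totals = lognormal_totals(c, m, rng)
        for _ in range(associations):
            result.append(random_association(row_totals, col_totals, rng))
    return result

mats = null_matrices(r=4, c=6, m=200)
print(len(mats))
```

Under the log-scale reading, μ only sets the overall magnitude of the weights and cancels once they are rescaled to m, so σ is the parameter that shapes the unevenness of the species totals.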

Authors' contributions

NB1 conceived of the study and all authors (NB1, FM, NB2) were involved in designing the methods, analyses, interpretation and drafting the manuscript.

Acknowledgements

We thank Diego Vázquez, Pedro Jordano, Thomas Hovestadt, and Michel Loreau for helpful comments and valuable discussion on earlier versions of this manuscript and the Interaction Web Database [53] for providing the datasets used here.

References

  1. Olesen JM, Jordano P: Geographic patterns in plant-pollinator mutualistic networks. Ecology 2002, 83:2416-2424.

  2. Novotny V, Basset Y: Host specificity of insect herbivores in tropical forests. Proc R Soc London Ser B 2005, 272:1083-1090.

  3. Waser NM, Chittka L, Price MV, Williams NM, Ollerton J: Generalization in pollination systems, and why it matters. Ecology 1996, 77:1043-1060.

  4. Jordano P: Patterns of mutualistic interactions in pollination and seed dispersal: connectance, dependence asymmetries, and coevolution. Am Nat 1987, 129:657-677.

  5. Bascompte J, Jordano P, Melian CJ, Olesen JM: The nested assembly of plant-animal mutualistic networks. Proc Natl Acad Sci USA 2003, 100:9383-9387.

  6. Waser NM, Ollerton J, Eds: Plant-pollinator interactions: from specialization to generalization. Chicago: University of Chicago Press; 2006.

  7. Ollerton J, Cranmer L: Latitudinal trends in plant-pollinator interactions: are tropical plants more specialised? Oikos 2002, 98:340-350.

  8. Devoto M, Medan D, Montaldo NH: Patterns of interaction between plants and pollinators along an environmental gradient. Oikos 2005, 109:461-472.

  9. Winemiller KO: Must connectance decrease with species richness? Am Nat 1989, 134:960-968.

  10. Martinez ND: Constant connectance in community food webs. Am Nat 1992, 139:1208-1218.

  11. May RM: Will a large complex system be stable? Nature 1972, 238:413-414.

  12. Rejmánek M, Starý P: Connectance in real biotic communities and critical values for stability of model ecosystems. Nature 1979, 280:311-313.

  13. Vázquez DP, Simberloff D: Ecological specialization and susceptibility to disturbance: conjectures and refutations. Am Nat 2002, 159:606-623.

  14. Dunne JA, Williams RJ, Martinez ND: Network structure and biodiversity loss in food webs: robustness increases with connectance. Ecol Lett 2002, 5:558-567.

  15. Magurran AE: Ecological diversity and its measurement. Princeton: Princeton University Press; 1988.

  16. Sahli HF, Conner JK: Characterizing ecological generalization in plant-pollination systems. Oecologia 2006, 148:365-372.

  17. Parrish JAD, Bazzaz FA: Difference in pollination niche relationships in early and late successional plant communities. Ecology 1979, 60:597-610.

  18. Basset Y: Diversity and abundance of insect herbivores foraging on seedlings in a rainforest in Guyana. Ecol Entomol 1999, 24:245-259.

  19. Herrera CM: Plant generalization on pollinators: species property or local phenomenon? Am J Bot 2005, 92:13-20.

  20. Bersier LF, Banasek-Richter C, Cattin MF: Quantitative descriptors of food-web matrices. Ecology 2002, 83:2394-2407.

  21. Krebs CJ: Ecological Methodology. Menlo Park: Benjamin Cummings; 1999.

  22. Hurlbert SH: Measurement of niche overlap and some relatives. Ecology 1978, 59:67-77.

  23. Feinsinger P, Spears EE, Poole RW: A simple measure of niche breadth. Ecology 1981, 62:27-32.

  24. Smith EP: Niche breadth, resource availability, and inference. Ecology 1982, 63:1675-1681.

  25. Fonseca CR, Ganade G: Asymmetries, compartments and null interactions in an Amazonian ant-plant community. J Anim Ecol 1996, 65:339-347.

  26. Kenny D, Loehle C: Are food webs randomly connected? Ecology 1991, 72:1794-1799.

  27. Goldwasser L, Roughgarden J: Sampling effects and the estimation of food-web properties. Ecology 1997, 78:41-54.

  28. Vázquez DP, Aizen MA: Asymmetric specialization: a pervasive feature of plant-pollinator interactions. Ecology 2004, 85:1251-1257.

  29. Kay KM, Schemske DW: Geographic patterns in plant-pollinator mutualistic networks: comment. Ecology 2004, 85:875-878.

  30. Vázquez DP, Morris WF, Jordano P: Interaction frequency as a surrogate for the total effect of animal mutualists on plants. Ecol Lett 2005, 8:1088-1094.

  31. Borer ET, Anderson K, Blanchette CA, Broitman B, Cooper SD, Halpern BS, Seabloom EW, Shurin JB: Topological approaches to food web analyses: a few modifications may improve our insights. Oikos 2002, 99:397-401.

  32. Memmott J: The structure of a plant-pollinator food web. Ecol Lett 1999, 2:276-280.

  33. Vázquez DP, Simberloff D: Changes in interaction biodiversity induced by an introduced ungulate. Ecol Lett 2003, 6:1077-1083.

  34. Dicks LV, Corbet SA, Pywell RF: Compartmentalization in plant-insect flower visitor webs. J Anim Ecol 2002, 71:32-43.

  35. Vázquez DP, Aizen MA: Null model analyses of specialization in plant-pollinator interactions. Ecology 2003, 84:2493-2501.

  36. Auerbach MJ: Stability, probability, and the topology of food webs. In Ecological communities: conceptual issues and the evidence. Edited by Strong DR, Simberloff D, Abele LG, Thistle AB. Princeton: Princeton University Press; 1984:413-436.

  37. Gotelli NJ, Graves GR: Null models in ecology. Washington: Smithsonian Institution; 1996.

  38. Bascompte J, Jordano P, Olesen JM: Asymmetric coevolutionary networks facilitate biodiversity maintenance. Science 2006, 312:431-433.

  39. Memmott J, Waser NM, Price MV: Tolerance of pollination networks to species extinctions. Proc R Soc London Ser B 2004, 271:2605-2611.

  40. Johnson SD, Steiner KE: Generalization versus specialization in plant pollination systems. Trends Ecol Evol 2000, 15:140-143.

  41. Symons FB, Beccaloni GW: Phylogenetic indices for measuring the diet breadths of phytophagous insects. Oecologia 1999, 119:427-434.

  42. Webb CO, Ackerly DD, McPeek MA, Donoghue MJ: Phylogenies and community ecology. Annu Rev Ecol Syst 2002, 33:475-505.

  43. Novotny V, Basset Y, Miller SE, Weiblen GD, Bremer B, Cizek L, Drozd P: Low host specificity of herbivorous insects in a tropical forest. Nature 2002, 416:841-844.

  44. Jordano P, Bascompte J, Olesen JM: Invariant properties in coevolutionary networks of plant-animal interactions. Ecol Lett 2003, 6:69-81.

  45. Vázquez DP: Degree distribution in plant-animal mutualistic networks: forbidden links or random interactions? Oikos 2005, 108:421-426.

  46. Gotelli NJ: Null model analysis of species co-occurrence patterns. Ecology 2000, 81:2606-2621.

  47. Fenster CB, Armbruster WS, Wilson P, Dudash MR, Thomson JD: Pollination syndromes and floral specialization. Annu Rev Ecol Evol Syst 2004, 35:375-403.

  48. Kullback S, Leibler RA: On information and sufficiency. Ann Math Stat 1951, 22:79-86.

  49. Montecarlo statistics on RxC matrices [http://itb.biologie.hu-berlin.de/~nils/stat/]

  50. Manly B: Randomization bootstrap and Monte Carlo methods in biology. London: Chapman and Hall; 1997.

  51. Blüthgen N, Verhaagh M, Goitía W, Blüthgen N: Ant nests in tank bromeliads – an example of non-specific interaction. Insect Soc 2000, 47:313-316.

  52. Patefield WM: An efficient method of generating random RxC tables with given row and column totals. Appl Stat 1981, 30:91-97.

  53. Interaction web database [http://www.nceas.ucsb.edu/interactionweb/]