Abstract
Background
When modelling infectious diseases, accurately capturing the pattern of dissemination through space is key to providing optimal recommendations for control. Mathematical models of disease spread in livestock, such as for foot-and-mouth disease (FMD), have done this by incorporating a transmission kernel which describes the decay in transmission rate with increasing Euclidean distance from an infected premises (IP). However, this assumes a homogeneous landscape, and is based on the distance between point locations of farms. Indeed, underlying the spatial pattern of spread are the contact networks involved in transmission. Accordingly, area-weighted tessellation around farm point locations has been used to approximate field-contiguity and simulate the effect of contiguous premises (CP) culling for FMD. Here, geographic data were used to determine contiguity based on distance between premises’ fields and presence of landscape features for two sample areas in Scotland. Sensitivity, positive predictive value, and the True Skill Statistic (TSS) were calculated to determine how point distance measures and area-weighted tessellation compared to the ‘gold standard’ of the map-based measures in identifying CPs. In addition, the mean degree and density of the different contact networks were calculated.
Results
Utilising point distances <1 km and <5 km as a measure for contiguity resulted in poor discrimination between map-based CPs/non-CPs (TSS 0.279–0.344 and 0.385–0.400, respectively). Point distance <1 km missed a high proportion of map-based CPs; <5 km point distance picked up a high proportion of map-based non-CPs as CPs. Area-weighted tessellation performed best, with reasonable discrimination between map-based CPs/non-CPs (TSS 0.617–0.737) and comparable mean degree and density. Landscape features altered network properties considerably when taken into account.
Conclusion
The farming landscape is not homogeneous. Basing contiguity on geographic locations of field boundaries, and including landscape features known to affect transmission in FMD models, is likely to improve the individual farm-level accuracy of spatial predictions in the event of future outbreaks. If a substantial proportion of FMD transmission events are by contiguous spread, and CPs should therefore be assigned an elevated relative transmission rate, the shape of the kernel could be significantly altered, since the ability to discriminate between map-based CPs and non-CPs differs over different Euclidean distances.
Background
Despite implementation of a national livestock movement ban 3 days after the first confirmed case of foot-and-mouth disease (FMD) in the UK in 2001, the disease continued to spread through the farming landscape [1]. Such spread is thought to have occurred mainly by nose-to-nose contact of livestock across shared fence lines and by contaminated fomites carried on people, vehicles, machinery, or blown by wind between premises [1]. Mathematical models were developed in order to capture the likely spread through space, to predict the likely impact of control strategies, and consequently to inform the disease control policies implemented [2,3]. To describe the spatial pattern of spread, a transmission kernel was incorporated into the model. This kernel described the decay in rate of transmission to susceptible livestock premises with increasing Euclidean distance from an infected premises (IP) source (calculated between farm premises point locations). For the model of Keeling et al. this was derived from infection tracing following the livestock movement ban [3]. While this model captured the regional pattern of spread well, accuracy at the individual farm level was low for IPs, with about 12% of reported case premises over the duration of the epidemic being captured by simulations [4]. Although this low accuracy is in part due to stochastic variation, the homogeneity of the landscape assumed by the kernel is also likely to have contributed.
In addition to incorporating space by using the spatial transmission kernel, contiguous premises (CPs) (farm premises neighbouring infected premises which were at highly elevated risk of infection) were modelled by area-weighted tessellation in order to examine the likely effect of culling CPs [3,5]. Area-weighted tessellation uses the known land areas and the known point locations of premises to construct weighted Voronoi polygons around the points. Voronoi polygons are constructed by connecting the perpendicular bisectors of lines between pairs of points, where only the closest bisectors are considered. This results in tessellated polygons, where any point within a polygon will be closer to the point around which the polygon was constructed than to any other. Area-weighting this process means that the square root of the known land area of each point pulls or pushes the perpendicular bisector towards or away from a point, depending on the comparative size of the square root of the paired farm’s area. Contiguity is then based on having a shared polygon edge. This technique was applied to Great Britain’s farm premises, as recorded by the June 2000 agricultural census, to determine which farms were contiguous to other farms, and culling of CPs within model simulations was determined on this basis [3].
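The effect of area-weighting on contiguity can be illustrated with a minimal Python sketch. It is not the construction used by Keeling et al. or in this paper: it assumes a multiplicatively weighted Voronoi diagram (each location is assigned to the farm with the smallest distance divided by the square root of farm area) and approximates the polygons on a regular grid. The function name `weighted_contiguity`, the square study area, and the grid resolution are all hypothetical choices made for illustration.

```python
import math

def weighted_contiguity(farms, size=10.0, cells=100):
    """Approximate area-weighted Voronoi contiguity on a raster.

    farms: list of (x, y, area) tuples inside a size-by-size square.
    Returns the set of index pairs (i, j), i < j, whose regions touch.
    Assumption: weighted distance = Euclidean distance / sqrt(area).
    """
    h = size / cells  # side length of one grid cell
    owner = [[0] * cells for _ in range(cells)]
    for r in range(cells):
        for c in range(cells):
            px, py = (c + 0.5) * h, (r + 0.5) * h  # cell centre
            owner[r][c] = min(
                range(len(farms)),
                key=lambda k: math.hypot(px - farms[k][0], py - farms[k][1])
                / math.sqrt(farms[k][2]),
            )
    pairs = set()
    for r in range(cells):
        for c in range(cells):
            # Compare each cell with its lower and right-hand neighbours.
            for rr, cc in ((r + 1, c), (r, c + 1)):
                if rr < cells and cc < cells and owner[r][c] != owner[rr][cc]:
                    a, b = owner[r][c], owner[rr][cc]
                    pairs.add((min(a, b), max(a, b)))
    return pairs

# Three farms in a row, first with equal areas, then with a small middle farm.
equal = [(2.0, 5.0, 1.0), (5.0, 5.0, 1.0), (8.0, 5.0, 1.0)]
unequal = [(2.0, 5.0, 100.0), (5.0, 5.0, 1.0), (8.0, 5.0, 100.0)]
contig_equal = weighted_contiguity(equal)
contig_unequal = weighted_contiguity(unequal)
```

With equal areas the middle farm separates the outer two, so only the neighbouring pairs are contiguous. When the middle farm is given a much smaller area, its region shrinks and the two outer farms come to share an edge, even though no point location has moved.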
Based in part on the outputs of these models, pre-emptive culling of livestock contiguous to infected premises (IPs), livestock thought to be dangerous contacts (DCs) of an IP, and livestock within 3 km of/local to an IP was performed in 2001 [5]. While this control strategy did eventually bring the epidemic to a halt, it has been suggested that it could have been better targeted to reduce the epidemic duration and impact since it appeared that, as implemented in practice, low risk premises were targeted over higher risk premises [6]. Additionally, heterogeneities in the fragmentation of the livestock farming landscape across the country suggest that some regions did not require pre-emptive culls for disease containment [7]. The epidemic cost the UK economy approximately £6bn [8]. Thus, appropriate control strategies are necessary to reduce any future epidemic’s impact in terms of the number of livestock affected and the cost to the economy. Greater predictive accuracy of mathematical models may increase trust, and consequently compliance with suggested control strategies in practice. The 2001 FMD transmission kernel developed by Keeling and collaborators indicated that approximately 50% of transmission occurred within 3 km of an IP after the implementation of a livestock movement ban [9]; thus local spread is important, but there is a lack of understanding as to how this is related to true contiguity.
While the approximations used in the models will clearly, to some degree, capture the essence of spatial proximity, they are yet to be assessed for their accuracy in this respect. A kernel based on Euclidean distance between point locations not only fails to recognise that farms in reality are areas, but also that the landscape is non-homogeneous and that transmission potential is therefore not equal in all directions. Although area-weighted Voronoi polygons consider farms as areas, these are derived from point locations and therefore may not reflect how farms share boundaries in reality. Additionally, geographical features such as rivers, ditches and railways may act as barriers to transmission, and therefore prevent contiguity in terms of disease transmission [10].
We consider that the level of risk a premises is perceived to be at, based on its point distance from an IP, may be altered by knowing actual premises contiguity, particularly in the case of contact spread diseases such as FMD since the distance between two farm point locations may be considerable despite their fields actually being in contact. Thus, at the extreme end of the spectrum, the decay in risk with increasing Euclidean distance may simply explain the distribution of point distances between actual CPs.
Different methods of incorporating the spatial arrangement of farm premises into mathematical models of infectious diseases among livestock may have considerable impact on predicted epidemic size, distribution, and optimal control strategies. Therefore, this paper aims to compare the properties of the contact networks that arise from the classification of farm premises as being in contact by point distance measures, by Voronoi and area-weighted Voronoi tessellation, and by maps showing the field boundaries of premises and geographical features that surround them. Additionally, how well approximation methods capture farm premises considered to be in contact (the term CP will be used to describe contact) according to field edge distance and presence of geographical features will be assessed. Another measure based solely on distance between the closest field edges of premises will also be added to the comparison as such measures have recently been used in statistical analysis of bovine tuberculosis persistence [11]. Areas in Ayrshire and Aberdeenshire were chosen to evaluate these measures since they are both important livestock farming areas, but with different farm types dominating: Ayrshire consisting mainly of dairy cattle farming, and Aberdeenshire consisting of a mixture of cattle (mainly beef), sheep, pig and crop production [12,13].
Methods
Spatial data were visualised and manipulated in ArcGIS version 9.3 (ESRI, Redlands, CA, USA). Farm premises point locations were obtained from the Animal Health and Veterinary Laboratories Agency (AHVLA). Fields of farm premises were obtained from the Integrated Administration and Control System (IACS) dataset from 2006. The June 2006 Agricultural Census data were matched to the point location data based on the county-parish-holding (CPH) number to select only premises with any cattle, sheep or pigs. A sample study area was then selected within each of Aberdeenshire and Ayrshire based on the point locations of premises being within an area of approximately 15 × 15 km. The point locations of these premises were then matched up with the IACS field data based on the parish-holding (PH) component of the CPH number. The distances between PH-matched point and field locations were calculated using the ArcGIS ‘Generate Near Table’ tool.
Ordnance Survey (OS) MasterMap® Topography Layer data, at a varying scale of 1:1250 to 1:10000, was used to map geographical features. The OS MasterMap® data used for Ayrshire was provided direct from the OS (updated on 23/08/2012), whereas for Aberdeenshire the data was downloaded from EDINA Digimap (EDINA Digimap Ordnance Survey Service <http://edina.ac.uk/digimap>, downloaded March 2012, updated on 08/06/2011). For Ayrshire roads were indicated by topographic lines where DescGroup = “Road Or Track”, and tracks by topographic areas where Theme = “Roads Tracks And Paths”; for Aberdeenshire roads and tracks were indicated by topographic lines where Theme = “Land; Roads Tracks And Paths”. In both sample areas rivers >2 m wide were indicated by sets of double topographic lines where DescGroup = “Inland Water”, and inland water courses ≤2 m wide (henceforth referred to as small rivers/ditches) were indicated by single topographic lines where DescGroup = “Inland Water”. Railways were indicated by topographic lines where Theme = “Rail”.
Defining Contiguous Premises (CPs)
For each of the Aberdeenshire and Ayrshire samples a dataset was then created whereby every premises was paired to every other premises within 7 km of it, in terms of Euclidean distance between point locations. From this dataset each premises pair was then classified as being contiguous or not contiguous according to eight CP approximation definitions:
a) <1 km distance between point locations of premises;
b) <3 km distance between point locations of premises;
c) <5 km distance between point locations of premises;
d) <26 m distance between premises field edges at their closest point;
e) <151 m distance between premises field edges at their closest point;
f) <1 km distance between premises field edges at their closest point;
g) sharing a Voronoi polygon edge;
h) sharing an area-weighted Voronoi polygon edge.
The Voronoi polygons were generated from the point locations in ArcGIS. A wider sample of points was used to create the Voronoi polygons to act as a buffer so that within-sample the polygons were not influenced by edge effects. This dataset was checked for occurrences where point locations were shared by different premises. These could arise where two premises shared the same postcode, and where each premises’ point location was derived from that postcode. Where this happened, the pairs were taken to be CPs with each other, and to have identical other CPs. The area-weighted Voronoi polygons were weighted by known premises area. This was scripted and run in MATLAB (The MathWorks, Inc., Natick, MA, USA). Distances between point locations, field boundaries, and shared Voronoi polygon edges were calculated using the ArcGIS ‘Generate Near Table’ tool.
Maps of IACS and OS MasterMap data were checked visually to assess whether each premises pair actually shared a fence boundary, had fence boundaries separated by <15 m, were separated by a road/track or railway, or were divided by a river or by a small river/ditch. The entire length of each premises boundary was considered. The relative length of each type of separation between premises was not considered, such that if the premises shared a boundary at any point, they were classified as having a shared boundary, regardless of the boundary length. For classification in terms of separation by landscape features, a premises pair would only be classified as such if the entire length of the shared boundary appeared to be separated by this feature. In cases where premises were separated along the entire boundary by more than one type of geographic feature, but where each feature type did not run the entire length of the boundary, the feature with the lowest perceived ‘barrier effect’ was taken to be the feature of separation (small river/ditch < road/track < river). Only one premises pair had a railway line running the entire length of their shared boundary in Ayrshire, and no premises were separated by railway in Aberdeenshire. Thus separation by railways was not included for the purposes of this analysis.
Based on map inspection, nine further definitions of being contiguous were then considered:
(i) having any fields separated up to a maximum distance of 15 m;
(ii) having any fields separated up to a maximum distance of 15 m not including premises divided by a river;
(iii) having any fields separated up to a maximum distance of 15 m not including premises separated by a road/track;
(iv) having any fields separated up to a maximum distance of 15 m not including premises divided by a river or separated by a road/track;
(v) having any fields separated up to a maximum distance of 15 m not including premises divided by a river or small river/ditch or separated by a road/track;
(vi) having any fields separated up to a maximum distance of 15 m not including premises divided by a river or small river/ditch;
(vii) having fields with a shared boundary (i.e. no separation);
(viii) having fields with a shared boundary not including premises divided by a river;
(ix) having fields with a shared boundary not including premises divided by a river or small river/ditch.
The cumulative number of map-based CPs, according to the nine definitions (i–ix) listed above, with 0.25 km increases in Euclidean point distance was calculated.
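The cumulative count can be sketched in Python as follows. This is an illustration only, assuming that each premises pair is available as a hypothetical `(point_distance_km, is_map_cp)` tuple for one map-based definition; the function name and the data are invented for the example and are not taken from the study.

```python
def cumulative_cp_counts(pairs, step_km=0.25, max_km=7.0):
    """Cumulative number of map-based CP pairs by point distance.

    pairs: iterable of (point_distance_km, is_map_cp) tuples.
    Returns a list of (upper_limit_km, cumulative_count), where a pair is
    counted once its point distance is strictly below the upper limit.
    """
    n_bins = int(round(max_km / step_km))
    counts = [0] * n_bins
    for dist, is_cp in pairs:
        if is_cp and dist < max_km:
            counts[int(dist // step_km)] += 1  # bin holding this distance
    result, running = [], 0
    for k, c in enumerate(counts):
        running += c
        result.append(((k + 1) * step_km, running))
    return result

# Hypothetical pairs: distance between point locations, and CP status.
example_pairs = [(0.1, True), (0.3, True), (0.3, False), (0.6, True), (2.0, True)]
curve = cumulative_cp_counts(example_pairs, step_km=0.25, max_km=1.0)
```

Plotting the second element of each tuple against the first gives a curve of the kind summarised per definition and sample area; the point at which it flattens indicates the distance beyond which few additional map-based CPs are found.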
Measuring agreement between the different CP definitions
Symmetric matrices of the premises in the samples were produced for each of the seventeen definitions of contiguity (approximation methods a–h, and map-based methods i–ix) using R version 2.13.2 (R Development Core Team, Vienna, Austria, <http://www.R-project.org/>). Each element took the value 0 or 1 depending on whether the premises pairs were non-contiguous or contiguous under the definition, respectively. Agreement between matrices of different CP definitions was estimated using four measures: concordance, sensitivity (Se), positive predictive value (PPV), and True Skill Statistic (TSS), where:
● Concordance = (TP + TN)/ (TP + FP + FN + TN),
● Se = TP / (TP + FN),
● PPV = TP / (TP + FP),
● TSS = Se + Specificity − 1; where Specificity = TN / (FP + TN), and where TP = true positive, FP = false positive, TN = true negative, FN = false negative.
Concordance, Se and PPV were multiplied by 100 to give a percentage.
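As an illustration of these definitions (not the authors' R code), the four measures can be computed from two symmetric 0/1 matrices with a short Python sketch. The function name `agreement` and the example matrices are hypothetical. Each unordered premises pair is counted once here; counting every pair twice over the full symmetric matrix would give the same ratios.

```python
def agreement(approx, gold):
    """Agreement of an approximation matrix with a 'gold standard' matrix.

    approx, gold: symmetric 0/1 matrices (lists of lists) of equal size.
    Assumes at least one positive and one negative pair in each matrix,
    otherwise a denominator is zero.
    """
    n = len(gold)
    tp = fp = fn = tn = 0
    for i in range(n):
        for j in range(i + 1, n):  # upper triangle: each pair once
            a, g = approx[i][j], gold[i][j]
            if a and g:
                tp += 1
            elif a:
                fp += 1
            elif g:
                fn += 1
            else:
                tn += 1
    se = tp / (tp + fn)
    sp = tn / (fp + tn)
    return {
        "concordance": 100 * (tp + tn) / (tp + fp + fn + tn),
        "Se": 100 * se,
        "PPV": 100 * tp / (tp + fp),
        "TSS": se + sp - 1,  # ranges from -1 to +1
    }

# Four hypothetical premises; gold-standard CP pairs are (0,1) and (1,2).
gold = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0]]
# The approximation identifies (0,1), (0,2) and (2,3) as CP pairs.
approx = [[0, 1, 1, 0], [1, 0, 0, 0], [1, 0, 0, 1], [0, 0, 1, 0]]
scores = agreement(approx, gold)
```

In this toy example one of the two gold-standard pairs is found (Se = 50%), one of the three identified pairs is correct (PPV ≈ 33%), and TSS is 0, i.e. no better than chance at separating CPs from non-CPs.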
Calculating Se of point distance, field edge distance, and tessellation measures against a ‘gold standard’ of map-based contiguity, as defined by field edge separation and landscape features, enabled us to study how many farm premises were missed by the approximation methods that were contiguous under the map-based definitions (by identifying the proportion of map-based CPs that were correctly identified by each method). PPV enabled us to examine how many farm premises the approximation methods picked up that were not actually contiguous, by giving the proportion of approximation method CPs that were contiguous under the map-based definitions. TSS gave an overall assessment of how well the approximation methods discriminated between contiguous and non-contiguous premises pairs as defined by map-based methods.
TSS was used in preference to Kappa as it provides a similar measure of accuracy of the discrimination of two methods for a binary outcome, without being affected by prevalence [14]. This measure, also known as the Hanssen and Kuipers statistic and Youden’s Index, has values ranging from −1 to +1 and has previously been used to assess the accuracy of weather prediction models [15–18].
The methodology used means that there was some room for human error in the classification of contiguity based on presence of landscape features along or between farm premises boundaries. To minimise this, the boundaries of CP pairs were checked twice, and the symmetry of the resulting matrices was verified using the command ‘isSymmetric’ in R, with maps being rechecked in the event of apparent asymmetry.
Network properties of different CP definitions
Network density and mean degree were calculated for a subset of the contiguity definitions. Density was calculated using the ‘igraph’ package in R, and was calculated on the sample premises only. In order to correct for edge effects in the calculation of mean degree, new data sets were created to count all CPs associated with farm premises within the sample, rather than only other premises from within the sample. For field edge based contiguity, all premises with fields listed in IACS with any cattle, sheep or pigs were included (this meant there were some premises within the sample zone not previously included as they did not belong to a point location within the selected area). For point distance based contiguity, all premises with any cattle, sheep or pigs and point locations that matched up to IACS field data were included. Mean degree was also calculated by species kept on holding, for categories containing ≥5 holdings, for all map-based CP definitions and area-weighted tessellation.
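The two network properties can be sketched in Python as below. This is an illustrative reimplementation, not the igraph call used in the study: density is taken as the usual definition for an undirected graph without self-loops (edges present divided by possible edges), and the edge-effect correction is represented by counting each sample premises' neighbours in a wider matrix. The function names and the five-premises example are hypothetical.

```python
def density(adj):
    """Density of an undirected network given a symmetric 0/1 matrix."""
    n = len(adj)
    edges = sum(adj[i][j] for i in range(n) for j in range(i + 1, n))
    return edges / (n * (n - 1) / 2)

def mean_degree(adj_wide, sample):
    """Mean number of CPs per sample premises.

    adj_wide covers sample premises plus surrounding premises, so that
    neighbours lying outside the sample are still counted.
    """
    return sum(sum(adj_wide[i]) for i in sample) / len(sample)

# Five hypothetical premises; premises 0-2 form the sample, 3-4 lie outside.
edges = [(0, 1), (1, 2), (2, 3), (0, 4)]
adj_wide = [[0] * 5 for _ in range(5)]
for i, j in edges:
    adj_wide[i][j] = adj_wide[j][i] = 1

sample = [0, 1, 2]
adj_sample = [[adj_wide[i][j] for j in sample] for i in sample]

sample_density = density(adj_sample)              # 2 of 3 possible edges
corrected_degree = mean_degree(adj_wide, sample)  # counts outside CPs too
naive_degree = mean_degree(adj_sample, range(3))  # ignores outside CPs
```

Here the corrected mean degree is 2.0, whereas restricting the count to within-sample neighbours would give 4/3, which shows why premises near the edge of a sample area would otherwise appear less connected than they are.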
Results
In the Aberdeenshire sample 113 premises points were first selected, but only 107 (94.7%) could be linked to fields within the IACS database. Of these point locations, 98 (91.6%) were sourced from an address match, 6 (5.6%) from a postcode match and 3 (2.8%) from the parish centroid. Four pairs of premises shared identical point locations; three of these were sourced from address matches, and one from a postcode match. For the Ayrshire sample 197 premises points were first selected, of which only 184 (93.4%) could be linked to fields within the IACS database. Of these point locations, 156 (84.8%) were sourced from an address match, 20 (10.9%) from a postcode match and 8 (4.3%) from the parish centroid. Seven pairs and one triplet of premises shared identical point locations. Five of the pairs with identical point locations were sourced from an address match, and one from a postcode match.
In the Aberdeenshire sample, 88.8% (n = 95) of premises point locations were <60 m from their CPH-matched nearest field; 2.8% (n = 3) were separated by 60–1000 m, and the remaining 8.4% (n = 9) by ≥1000 m. In the Ayrshire sample, 83.7% (n = 154) had point locations <60 m from their CPH-matched nearest field, while 8.2% (n = 15) were separated by 60–1000 m, and 8.2% (n = 15) by ≥1000 m. The least accurate of the point location sources was the parish centroid, followed by the postcode. The distribution of the PH-matched point-field distances by the point location information source can be seen in Additional file 1.
Additional file 1. Distribution of distances between point location and nearest field location by point location source.
The majority of premises in the Ayrshire sample kept cattle only (70.1%), and no premises kept any pigs (Table 1). The median area of the farm premises was 73.5 hectares (IQR: 51.9–104.8), with a median of 16 fields (IQR: 11–22) (mean = 17.7). In the Aberdeenshire sample 47.7% of all premises kept cattle and sheep, while just over a third kept cattle only (34.6%), and only six holdings kept pigs (Table 1). The median area of the farm premises was 76.4 hectares (IQR: 40.0–174.0), with a median of 19 fields (IQR: 11–32) (mean = 22.0).
Table 1. Distribution of types of livestock kept on premises in samples
Agreement between the different CP definitions
Considering farms to be contiguous if they lie within 7 km Euclidean distance of one another’s point locations captured 98.1% (153/156) and 97.8% (348/356) of CP premises pairs that were separated by <15 m at their field edges in Aberdeenshire and Ayrshire, respectively. The pattern of map-based CP identification over increasing Euclidean distance between the premises point locations differed slightly between Aberdeenshire and Ayrshire (Figure 1). In Aberdeenshire, the number of map-based CPs identified began to plateau at 2.5 km point distance, such that 88.9% (n = 136) of premises separated by <15 m at their field edges were captured within 2.5 km. In Ayrshire however, the plateau was less distinct, and began at around 3.25 km; 88.8% (n = 309) of premises separated by <15 m at their field edges were captured by this distance.
Figure 1. Number of premises in contact by map-based measures up to 7 km point distance.
Concordance of approximation measures was very high for point distances <1 km, field edge distances <1 km, and Voronoi and area-weighted tessellation for both Aberdeenshire and Ayrshire (all >87% agreement with map-based contiguity measures) (Additional file 2). This however was distinctly biased towards non-contiguous pair agreements (True Negatives).
Additional file 2. Concordance (%) of different definitions of being contiguous for sample areas in Aberdeenshire and Ayrshire.
Sensitivity was therefore calculated to find the proportion of map-based CPs that were correctly identified by the approximation methods. Sensitivity was fairly consistent between map-based contiguity measures. For measures based on point distances, sensitivity was low for <1 km, only reaching >94% at point distances <5 km (Table 2). Ayrshire had a higher average sensitivity at <1 km point distance compared to Aberdeenshire (Ayrshire 33.8%; Aberdeenshire 30.3%), but lower average sensitivity at <3 km point distance (Ayrshire 87.4%; Aberdeenshire 92.0%). Both samples reached an average of about 96% sensitivity at <5 km point distance. The two tessellation methods identified a higher average of map-based CPs in Aberdeenshire (Voronoi tessellation = 73.6%; area-weighted tessellation = 83.4%) than in Ayrshire (Voronoi tessellation = 63.5%; area-weighted tessellation = 68.0%). Field edge distance measures were 100% sensitive by definition (Table 2).
Table 2. Sensitivity (%) of approximation methods versus map-based measures for sample areas in Aberdeenshire and Ayrshire
PPV identified the proportion of approximation method CPs that were CPs under map-based methods, so that a low value indicates that only a low proportion of those identified are map-based CPs. For both samples PPV was consistently low (<50%) through the different map-based CP definitions for point distances <3 km and <5 km, field edge distance <1 km, and Voronoi and area-weighted tessellation (Table 3). For point distances <1 km, Aberdeenshire had a higher average PPV of 55.1% compared to Ayrshire, which had an average PPV of 48.1%. As expected, the highest PPV was for field edge distance <26 m, and this was similar between the two samples (Aberdeenshire range 66.3–93.9%; Ayrshire range 66.9–96.1%).
Table 3. PPV (%) of approximation methods versus map-based measures for sample areas in Aberdeenshire and Ayrshire
The highest TSS scores were found for the field edge distance measures (Table 4). Out of point distance measures, <3 km had the highest TSS score (Aberdeenshire range 0.686–0.712; Ayrshire range 0.662–0.680). Point distances of <5 km and <1 km had average TSS scores of 0.393 and 0.289 in Aberdeenshire and 0.390 and 0.324 in Ayrshire, respectively. Voronoi and area-weighted tessellation had average TSS scores of 0.647 and 0.727 in Aberdeenshire and 0.588 and 0.626 in Ayrshire, respectively.
Table 4. TSS of different definitions of being contiguous for sample areas in Aberdeenshire and Ayrshire
Network properties
The mean degree (i.e. mean number of CPs) was slightly higher in Ayrshire than in Aberdeenshire for all definitions of contact (Table 5). Overall, the mean degree range for the Aberdeenshire sample was 2.67–3.92 and for the Ayrshire sample was 3.21–4.64, for all map-based CP definitions. The mean degree of CPs defined as those <15 m separated at their field boundaries dropped by 1.22 and 1.34 in Aberdeenshire and Ayrshire, respectively, when the presence of all landscape features (rivers, ditches and roads/tracks) was taken to restrict contact (distribution shown in Figure 2). For CPs defined by having a shared boundary, the presence of rivers and ditches reduced the mean degree by 0.40 and 0.51 in Aberdeenshire and Ayrshire, respectively (distribution shown in Figure 2). For the point distance CP definitions, <1 km considerably underestimated mean degree when compared to map-based CP definitions, particularly in Aberdeenshire, whereas <3 km considerably overestimated it, particularly in Ayrshire. Area-weighted tessellation also overestimated mean degree compared to map-based CP definitions, although to a lesser extent than <3 km point distance. Holdings that kept only sheep had a mean degree 0.85–1.52 and 1.13–2.07 lower than holdings that kept cattle only or cattle and sheep, in Aberdeenshire and Ayrshire respectively, across all map-based CP definitions. Area-weighted tessellation (Figure 3) and point distance measures (not shown) did not identify this difference.
Table 5. Network properties according to different contiguity definitions for farm premises in Aberdeenshire and Ayrshire
Figure 2. Frequency distributions of number of neighbours according to different definitions of map-based contiguity.
Figure 3. Mean degree by species kept on holding, under different definitions of contiguity.
Aberdeenshire had a higher density than Ayrshire for each definition except <1 km point distance, for which the two samples were equal (Table 5). The range of density values for all map-based CP definitions was 0.019–0.027 for Aberdeenshire and 0.014–0.021 for Ayrshire. For CPs defined by <1 km point distance, density was 0.012 for both samples. This was only slightly less than for CPs in Ayrshire defined by a shared boundary excluding those with rivers and ditches between. For Aberdeenshire however, this was about half the density of most of the map-based CP definitions. For CPs defined by <3 km point distance, density was quadrupled in Aberdeenshire when compared to <15 m separation of field boundaries, and quintupled in Ayrshire (Table 5). Area-weighted tessellation overestimated density less than <3 km point distance did for both sample networks.
Discussion
The point locations of farm premises were not completely accurate: distances between the CPH-matched point and field locations were ≥1 km in 8.4% and 8.2% of the sample in Aberdeenshire and Ayrshire, respectively. Overall, <3 km point distance had the most balanced identification of map-based CPs and map-based non-CPs when compared to each of the <1 km and <5 km categories, and therefore had the highest TSS score of the point distances.
Point distance measures do not seek to classify premises within any given distance as contiguous; rather, premises are given a weighted level of risk based on their distance from an IP. Comparing these measures against map-based contiguity as if they also defined contiguity does, however, enable us to begin to consider how accounting for map-based contiguity might alter the shape of the transmission kernel. In reality, during the FMD 2001 outbreak, pre-emptive culling was in part determined by identification of CPs on the ground, since they were considered to be at increased risk of becoming infected. Therefore, if contiguous spread does account for a considerable proportion of transmission events, the true CPs of an IP would be subject to an elevated relative rate of transmission, regardless of Euclidean point distance between the premises. This would leave transmission events attributable to routes other than the contiguity-linked routes (e.g. fence line contact, fomites blown between premises) to be captured by the kernel. Crudely, this might be thought of as considering only the relative rate of transmission to map-based non-CPs based on distance between the premises, although in reality map-based CPs would be at risk from these alternative transmission routes as well. Nonetheless this would likely change the shape of the kernel more at small distances than at those further away, since at <1 km point distance, an average of 44.9% and 51.9% of premises pairs were map-based non-CPs in Aberdeenshire and Ayrshire, respectively, but at <5 km these figures were 91.4% and 93.9%, respectively. Indeed, once contiguous transmission is separated out from the kernel, it might be the case that another distance measure such as road distance, as previously considered by Savill et al. [9], better represents the distance-risk relationship for non-contiguous mechanisms of spread.
In both sample areas, Voronoi tessellation had a slightly lower TSS than <3 km point distance. Area-weighted tessellation on the other hand had a slightly higher TSS than <3 km point distance in Aberdeenshire, but a slightly lower TSS in Ayrshire. This suggests that, in terms of discrimination between map-based CPs and non-CPs, <3 km point distance and area-weighted tessellation perform similarly, and that the best option may be determined by the landscape of the area that the method is to be applied to. Voronoi and area-weighted tessellation measures performed better overall in Aberdeenshire than in Ayrshire, with somewhat higher TSS scores like-for-like. This may be attributed to sensitivity being considerably poorer in Ayrshire, such that more map-based CPs were being missed by the tessellations. This in turn was likely to be due to the greater density of farm premises in the sample, leading to a greater distortion of contiguity when tessellating around more tightly packed points. Thus in areas of high livestock farm density, tessellation methods may capture contiguity between farm premises with less accuracy than in lower density areas. While the low levels of accuracy (≈20–25%) reported for predicting culled farms by an adapted version of the Keeling et al. (2001) model [4] are likely due largely to the complex ‘on the ground’ implementation of culling during the 2001 FMD outbreak, the less than perfect performance of area-weighted tessellation in discriminating between map-based CPs and non-CPs may also have been a contributing factor.
The distances used for field edge based measures in this paper have previously been used to analyse the persistence of bovine tuberculosis (bTB) [11]. These definitions were far superior to either point distance or tessellation approximations in identifying map-based CPs in the two samples, as reflected in their consistently high TSS scores (≥0.868). By definition they captured all of the map-based CPs, as these were also calculated based on field edge distance, only using smaller distances of separation. However, PPV indicated that landscape features do interrupt map-based CP boundaries, accounting for up to a 29.4% decrease in PPV when all landscape features were taken into account (for Aberdeenshire, from 93.9% for all separated <15 m at field edges to 66.3% for all separated <15 m at field edges excluding those separated by rivers, roads/tracks, and ditches). While this may vary depending on the area of study and the landscape features considered to have an effect on a particular disease’s transmission, it suggests that the way in which premises are perceived to be connected may be substantially altered after taking such features into account. Indeed, the mechanism of spread of different diseases must be considered when studying the effects of contiguity. For example, the spread of bTB via badger-to-cattle as well as cattle-to-cattle routes means that extended distances between field edges are likely to be appropriate, since badgers can roam freely. However, there is some evidence to suggest that increases in bTB prevalence following repeated badger culling are less marked when topographical features such as rivers and motorways are present [19], as these features act as barriers that isolate badger populations. Such features may therefore be worth incorporating into analyses of bTB in cattle populations, since they are likely to have a knock-on effect.
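The 29.4% figure quoted for Aberdeenshire appears to be a relative decrease rather than a difference in percentage points. A short check of the arithmetic, using only the two PPV values given in the text (the variable names are illustrative):

```python
# PPV values for Aberdeenshire as quoted in the text (percent).
ppv_all_under_15m = 93.9       # all premises separated <15 m at field edges
ppv_excluding_features = 66.3  # excluding rivers, roads/tracks and ditches

# Difference in percentage points.
absolute_drop = ppv_all_under_15m - ppv_excluding_features

# Decrease expressed relative to the starting PPV.
relative_drop = absolute_drop / ppv_all_under_15m * 100

print(round(absolute_drop, 1), round(relative_drop, 1))
```

The absolute difference is 27.6 percentage points, which corresponds to the quoted decrease of 29.4% when expressed relative to the starting PPV of 93.9%.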
Mean degree (i.e. mean number of CPs) and density of map-based CP measures were considerably altered by modifying classification of such CPs by presence of landscape features. When scaled up to a network at the regional or national scale, landscape features could affect contact patterns considerably and therefore potentially also affect transmission of disease through livestock populations. Point distance <1 km created network properties closest to those of map-based CPs, followed by area-weighted tessellation, and then by <3 km point distance. Of note, area-weighted tessellation produced mean degree results in the sample areas (Ayrshire = 6.25; Aberdeenshire = 5.95) similar to that observed over the whole of GB by Keeling et al. [3] (6.5, in supplementary information). Therefore, on balance, area-weighted tessellation appears to be better than <3 km point distance at capturing map-based contiguity: it has similar ability to discriminate between map-based CPs and non-CPs and better ability to estimate network density and mean degree. However, one limitation of area-weighted tessellation is that it does not identify the variations in mean degree under map-based CP definitions by livestock species kept on the holding (and potentially other predictors of degree as well). This is likely to be important given the differences observed in FMD transmissibility between sheep and cattle during the 2001 outbreak [3].
Notably, the different CP measures performed fairly consistently across the two sample areas. However, the Ayrshire sample had a much higher number of farm premises than the Aberdeenshire sample, and this brought to light some differences in the landscapes. Ayrshire had a higher mean degree than Aberdeenshire for map-based CP definitions, indicating that the livestock farming landscape is less fragmented and that farm premises on average have more connections to other farm premises. This reflects what is already known about the different farming landscapes of the two areas: Aberdeenshire’s is largely composed of mixed cropping and livestock, whereas Ayrshire’s is predominantly dairy cattle farming [12]. However, network density is lower in the Ayrshire sample. This is because it has about 72% more farm premises than the Aberdeenshire sample, meaning that the total number of possible connections increases disproportionately relative to the number of connections that actually exist. The proportion of map-based CPs identified was slightly higher in Ayrshire than in Aberdeenshire with <1 km point distance, and slightly lower with <3 km point distance, both of which may also be attributable to the farming landscape being less fragmented and more tightly packed with premises in Ayrshire.
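The way a sample can have a higher mean degree yet a lower network density can be shown with a small sketch, assuming an undirected contiguity network. The premises and link counts below are hypothetical; only the 72% difference in sample size is taken from the text.

```python
def mean_degree(n_premises, n_links):
    """Mean number of contiguous premises per premises (undirected network)."""
    return 2 * n_links / n_premises


def density(n_premises, n_links):
    """Share of all possible premises pairs that are actually contiguous."""
    possible_links = n_premises * (n_premises - 1) / 2
    return n_links / possible_links


# Hypothetical smaller sample: 100 premises, 250 contiguity links.
n_small, links_small = 100, 250
# Hypothetical larger sample with 72% more premises and more links per premises.
n_large, links_large = 172, 516

for label, n, links in (("small", n_small, links_small),
                        ("large", n_large, links_large)):
    print(label, round(mean_degree(n, links), 2), round(density(n, links), 4))
```

Possible links grow roughly with the square of the number of premises, whereas actual links grow roughly in proportion to it, so density (equivalently, mean degree divided by n − 1) falls as the sample gets larger even when each premises has slightly more neighbours.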
In Scotland, Cattle Tracing System (CTS) Links enable premises to move certain livestock freely between paired premises, as described in Orton et al. [20]. Given that, in the largest CTS Link that they identified, the majority of premises were in Scotland [20], knowing which premises are linked to one another is likely to affect the contact structures of the networks considerably when scaling up to the national level. Including this information in future analyses would be greatly beneficial.
This paper has considered field contiguity throughout the analysis. However, ultimately it is livestock-inhabited field contiguity that would be the key measure of interest when looking to incorporate contiguity information into analysis of livestock disease spread between premises. The next step will be to create a reliable automated process so that the examination of contiguity can be extended to larger areas.
Conclusions
This paper has demonstrated that none of the Euclidean point distance, Voronoi tessellation, or area-weighted tessellation measures discriminate particularly well between map-based CPs and non-CPs as identified from premises field boundaries. Therefore, including contiguity as based on field edges rather than on area-weighted tessellation around farm premises point locations may improve model accuracy. Furthermore, taking topographic features into account can have a considerable impact on which premises are considered to be contiguous or non-contiguous, and on the resulting mean degree and network density. Thus, if such features are known to prevent transmission between contiguous premises (as has been demonstrated for rivers and railways for FMD [10]), including this level of detail could likely also improve the individual farm-level accuracy of model predictions.
Abbreviations
bTB: Bovine tuberculosis; CP: Contiguous premises; CTS: Cattle tracing system; DC: Dangerous contact; FMD: Foot-and-mouth disease; IP: Infected premises; PPV: Positive predictive value; Se: Sensitivity; TSS: True skill statistic.
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
JSF contributed to the research design, gathered the map-based contiguity data, conducted the analyses, and drafted the manuscript. TP contributed to the research design, and helped to draft the manuscript. MJT conducted the necessary work to identify contiguous premises on the basis of area-weighted tessellation and helped to draft the manuscript. MEJW conceived the research and research design, and helped to draft the manuscript. All authors have read and approved the final manuscript.
Acknowledgements
This research is funded by the Scottish Government-funded Centre of Expertise in Animal Disease Outbreaks (EPIC). The Scottish Government provided the IACS and Agricultural Census data but was not involved with the analysis or write-up. The point location data were provided by the AHVLA, which was also not involved in the analysis or write-up. We thank Matt Keeling for his advice on coding area-weighted tessellation, and Paul Bessell for his help processing the point location data.
References

Gibbens JC, Sharpe CE, Wilesmith JW, Mansley LM, Michalopoulou E, Ryan JBM, Hudson M: Descriptive epidemiology of the 2001 foot-and-mouth disease epidemic in Great Britain: the first five months.

Ferguson NM, Donnelly CA, Anderson RM: The Foot-and-Mouth Epidemic in Great Britain: pattern of spread and impact of interventions.

Keeling MJ, Woolhouse MEJ, Shaw DJ, Matthews L, Chase-Topping M, Haydon DT, Cornell SJ, Kappey J, Wilesmith J, Grenfell BT: Dynamics of the 2001 UK foot and mouth epidemic: stochastic dispersal in a heterogeneous landscape.

Tildesley MJ, Deardon R, Savill NJ, Bessell PR, Brooks SP, Woolhouse MEJ, Grenfell BT, Keeling MJ: Accuracy of models for the 2001 foot-and-mouth epidemic.

Ferguson NM, Donnelly CA, Anderson RM: Transmission intensity and impact of control policies on the foot and mouth epidemic in Great Britain.

Tildesley MJ, Bessell PR, Keeling MJ, Woolhouse MEJ: The role of pre-emptive culling in the control of foot-and-mouth disease.

Kao RR: The impact of local heterogeneity on alternative control strategies for foot-and-mouth disease.

Thompson D, Muriel P, Russell D, Osborne P, Bromley A, Rowland M, Creigh-Tyte S, Brown C: Economic costs of the foot and mouth disease outbreak in the United Kingdom in 2001.
Revue Scientifique Et Technique De L Office International Des Epizooties 2002, 21(3):675–687.

Savill NJ, Shaw DJ, Deardon R, Tildesley MJ, Keeling MJ, Woolhouse MEJ, Brooks SP, Grenfell BT: Topographic determinants of foot and mouth disease transmission in the UK 2001 epidemic.

Bessell PR, Shaw DJ, Savill NJ, Woolhouse MEJ: Geographic and topographic determinants of local FMD transmission applied to the 2001 UK FMD epidemic.

White PW, Martin SW, Frankena K, O'Keeffe JJ, More SJ, De Jong MCM: How important is "neighbourhood" in the persistence of bovine tuberculosis in Irish cattle herds? In Proceedings of the Society for Veterinary Epidemiology and Preventive Medicine: 28–30 March 2012. Edited by Parkin TDH, Kelly LA. Glasgow: the SVEPM Executive Committee; 2012:273–286.

Holland JP, Morgan-Davies C, Waterhouse T, Thomson S, Midgley A, Barnes A: An analysis of the impact on the natural heritage of the decline in hill farming in Scotland.
A Scottish Natural Heritage Commissioned Report No. 454 2011.
http://www.snh.gov.uk/publicationsdataandresearch/publications/searchthecatalogue/publicationdetail/?id=1793

Thomson S: Foot and mouth disease review: structure of the Scottish livestock industry.
An AA211 Special Study Report for The Scottish Government’s Rural and Environment Research and Analysis Directorate 2008.
http://www.scotland.gov.uk/Publications/2008/06/19154131/0

Allouche O, Tsoar A, Kadmon R: Assessing the accuracy of species distribution models: prevalence, kappa and the true skill statistic (TSS).

McBride JL, Ebert EE: Verification of quantitative precipitation forecasts from operational numerical weather prediction models over Australia.

Saseendran SA, Singh SV, Rathore LS, Das S: Characterization of weekly cumulative rainfall forecasts over meteorological subdivisions of India using a GCM.

Elmore KL, Weiss SJ, Banacos PC: Operational ensemble cloud model forecasts: some preliminary results.

Accadia C, Mariani S, Casaioli M, Lavagnini A, Speranza A: Verification of precipitation forecasts from two limited-area models over Italy and comparison with ECMWF forecasts using a resampling technique.

Woodroffe R, Donnelly CA, Jenkins HE, Johnston WT, Cox DR, Bourne FJ, Cheeseman CL, Delahay RJ, Clifton-Hadley RS, Gettinby G, Gilks P, Hewinson RG, McInerney JP, Morrison WI: Culling and cattle controls influence tuberculosis risk for badgers.

Orton RJ, Bessell PR, Birch CPD, O'Hare A, Kao RR: Risk of foot-and-mouth disease spread due to sole occupancy authorities and linked cattle holdings.