Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, USA

Center for Biotechnology and Interdisciplinary Studies, Rensselaer Polytechnic Institute, New York, USA

Department of Electrical and Computer Engineering, University of Houston, Texas, USA

Abstract

One of the major goals in biomedical image processing is accurate segmentation of networks embedded in volumetric data sets. Biological networks are composed of a meshwork of thin filaments that span large volumes of tissue. Examples of these structures include neurons and microvasculature, which can take the form of both hierarchical trees and fully connected networks, depending on the imaging modality and resolution. Network function depends on both the geometric structure and connectivity. Therefore, there is considerable demand for algorithms that segment biological networks embedded in three-dimensional data. While a large number of tracking and segmentation algorithms have been published, most of these do not generalize well across data sets. One of the major reasons for the lack of general-purpose algorithms is the limited availability of metrics that can be used to quantitatively compare their effectiveness against a pre-constructed ground-truth. In this paper, we propose a robust metric for measuring and visualizing the differences between network models. Our algorithm takes into account both geometry and connectivity to measure network similarity. These metrics are then mapped back onto an explicit model for visualization.

Introduction

Three-dimensional biomedical data sets often contain complex anatomical structures that are difficult to segment and reconstruct. Of particular interest are filament networks embedded in volumetric data. Examples of these include vascular and neuronal networks. With increased use of high-throughput imaging, there has been significant interest in fast and accurate segmentation algorithms for large data sets. However, segmentation of filament networks in microscopy data sets continues to be a difficult problem. While there has been an effort to distribute tracking algorithms both commercially and as open source through software packages such as the Farsight Toolkit

One of the major roadblocks preventing broad use of these algorithms is the inability to compare the effectiveness of filament segmentation results, which can contain multiple geometric and connectivity errors (Figure

**Explicit representation of a neuron model**. (left) The network can be represented as a graph structure, where nodes are end points and branch points. Each fiber is represented by a single edge. (right) The same network is shown with several common errors introduced.

Previous work

There is an extensive body of published work on filament segmentation in biomedical data sets. A review of vessel extraction techniques for vascular trees is given by Kirbas and Quek

While new and improved segmentation techniques are proposed yearly, there are few methods for quantitatively comparing results to an established ground truth. The DIADEM metric

Geometric methods

The most basic approach for comparing an explicit representation is the use of standard geometry metrics, such as those used for validation in surface segmentation and surface simplification (level-of-detail). An overview of these methods as they are applied to geometric models is provided by Luebke

We first consider the mean squared error, which is computed by averaging the square of minimum distances from one geometric model

The Path2Path metric was proposed as a method for comparing the geometric characteristics of a neuron in order to facilitate queries into online neuronal databases. This metric forgoes the standard graph representation of a neuronal tree in favor of a collection of geometric paths that extend from the root node to the end of each neuronal process.

The neurons are then compared by determining the amount of energy required to optimally morph the set of paths in

The geometry metric proposed in this paper is based on a similar principle to the MSE. The lack of commutativity is addressed by computing a bi-directional measurement, which is incorporated into the geometric FPR and FNR. The distance sensitivity is eliminated by scaling the error between

Topological methods

Recent methods proposed for validating tree-like segmentations of vasculature and neurons are based on topological approaches. These methods leverage the hierarchical structure of the model in order to quantify segmentation error. Unlike geometric approaches, these methods account for connectivity and can also incorporate some basic geometric information. The two methods that we address are the constrained Tree Edit Distance (TED)

The constrained TED provides a metric that identifies the number of edits that must be performed on a given model

The DIADEM Metric incorporates geometric characteristics of the two models in order to better map branches in the test case model to corresponding geometry in the ground truth. Branch points and end points, for example, are mapped between the models

One of the fundamental problems with current topological approaches is that they depend on the topology of the input models to be tree-like. Therefore, these methods cannot be directly applied to interconnected networks, such as microvascular networks and large-scale reconstructions of neural networks. In addition, current topological techniques are highly sensitive to errors in connectivity, which are commonly encountered using automated segmentation techniques. This is demonstrated in Figure

**Artifacts encountered when using hierarchical techniques**. (a) A Purkinje cell from the Virtual Neuromorphology Electronic Database is shown with two errors introduced: (1) a gap in a fiber and (2) a geometric distortion. The red region indicates the error evaluated using a hierarchical metric. (b) Our proposed method correctly identifies the gap as a small geometric error corresponding with a single missing connection. (c) Our method can also identify a geometric distortion, even though there is no resulting error in topology. (d) Despite this error, edge mapping is consistent across the test case and ground truth.

The connectivity metric that we propose relies on mapping of branch points and end points between the ground truth and test case, which is similar to the initial approach taken by the DIADEM metric. We then use a graph traversal method to evaluate connectivity locally

Proposed methods

The method that we propose compares both the geometry and connectivity of two interconnected networks. Based on a single parameter σ, which defines the sensitivity of the metric, our algorithm returns four normalized values characterizing the degree of similarity between two input networks. These metrics are then mapped onto the original input models so that differences between the networks can be visualized. In all cases presented here, the parameter σ is set to the mean fiber radius; however, other values can be used. Higher values of σ result in a decrease in the detected error (FNR and FPR).

In the following sections, we describe the input to our proposed algorithm and define the terms used to process network models.

Input models

The most common format for storing traced neurons is the SWC file. SWC files are supported by popular network simulation programs, including NEURON
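
As an illustration of the input format, the following minimal Python sketch reads SWC text into a dictionary of samples. It assumes the conventional seven-column SWC layout (sample id, type, x, y, z, radius, parent id, with a parent of -1 marking a root). The names `SwcSample`, `parse_swc`, and `EXAMPLE` are hypothetical and are not part of NetMets.

```python
from typing import Dict, NamedTuple


class SwcSample(NamedTuple):
    """One row of an SWC file (hypothetical container used for illustration)."""
    ident: int
    kind: int
    x: float
    y: float
    z: float
    radius: float
    parent: int  # -1 marks a root sample


def parse_swc(text: str) -> Dict[int, SwcSample]:
    """Parse SWC text, skipping blank lines and '#' comment lines."""
    samples: Dict[int, SwcSample] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 7:
            raise ValueError(f"expected 7 columns, got {len(fields)}: {line!r}")
        sample = SwcSample(
            int(fields[0]), int(fields[1]),
            float(fields[2]), float(fields[3]), float(fields[4]),
            float(fields[5]), int(fields[6]),
        )
        samples[sample.ident] = sample
    return samples


# A toy tree: one root, a straight segment, and a bifurcation at sample 3.
EXAMPLE = """# toy neuron (made-up data)
1 1 0.0 0.0 0.0 1.0 -1
2 3 1.0 0.0 0.0 0.5 1
3 3 2.0 0.0 0.0 0.5 2
4 3 3.0 1.0 0.0 0.4 3
5 3 3.0 -1.0 0.0 0.4 3
"""

samples = parse_swc(EXAMPLE)
```

Because each sample stores only a parent pointer, a file of this kind describes a tree; interconnected networks need an additional edge list or a different container.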

Terminology

When describing connectivity operations, we use the following terminology:

• A

• An

• A

Note that a fiber can consist of multiple points that describe its geometric shape. While these points are used to evaluate the network geometry, a fiber is represented topologically by a single edge (Figure

• Gaps in fibers.

• Excess edges forming loops or spurs.

Note that a spur connected to correctly segmented geometry is identified as an error even though it is not strictly a topological change.
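
The reduction from sampled points to a graph of nodes and fibers can be sketched as follows. This is a minimal illustration, assuming parent-pointer input such as that found in SWC files: samples with exactly two neighbors are treated as interior fiber points, and every other sample (end point or branch point) becomes a node. The function name `build_fiber_graph` is hypothetical, and closed loops that contain no node are not handled.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def build_fiber_graph(parents: Dict[int, int]) -> Tuple[Set[int], List[List[int]]]:
    """Return (nodes, fibers) from a child -> parent map (-1 for roots).

    Each fiber is the list of sample ids between two nodes, inclusive, so a
    fiber with many sample points still corresponds to a single graph edge.
    """
    adjacency: Dict[int, Set[int]] = defaultdict(set)
    for child, parent in parents.items():
        adjacency[child]  # make sure isolated samples are registered
        if parent != -1:
            adjacency[child].add(parent)
            adjacency[parent].add(child)

    # End points have one neighbor, branch points have three or more.
    nodes = {s for s, nbrs in adjacency.items() if len(nbrs) != 2}

    fibers: List[List[int]] = []
    walked = set()  # (node, first step) pairs that were already traversed
    for start in sorted(nodes):
        for first in sorted(adjacency[start]):
            if (start, first) in walked:
                continue
            path = [start, first]
            prev, cur = start, first
            while cur not in nodes:  # follow interior points to the next node
                nxt = next(n for n in adjacency[cur] if n != prev)
                path.append(nxt)
                prev, cur = cur, nxt
            walked.add((start, first))
            walked.add((cur, prev))  # block the reverse traversal
            fibers.append(path)
    return nodes, fibers


# Toy tree: 1-2-3 with a bifurcation at 3 into 4 and 5.
nodes, fibers = build_fiber_graph({1: -1, 2: 1, 3: 2, 4: 3, 5: 3})
```

In this toy case the five samples reduce to four nodes and three fibers, with sample 2 surviving only as a geometric point on the first fiber.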

Overview

The metric proposed in this paper provides false positive and false negative rates for network geometry and connectivity. The proposed geometry metric integrates a weighted distance function along all curves in a network. We show that this can be evaluated efficiently in

We measure connectivity differences by using geometric information to map between nodes and edges in both networks. We then find a set of edges and nodes common to the ground truth and test case. Excess features are then used to quantify the connectivity differences between the two networks.

Geometry

In this section, we first identify the fundamental problem with applying the standard geometry metrics described previously to network segmentations. We then describe the method used by NetMets to address these issues. Error metrics such as MSE and the Hausdorff distance provide a global measure of model similarity, which is ideal when constructing a mesh based on a source model. However, these techniques are not robust for fibrous models, and often apply excessive penalties for relatively small errors. Given two networks $N_1$ and $N_2$, our proposed algorithm estimates the ratio of the length of fiber in $N_1$ that has no correspondence in $N_2$ to the total fiber length in $N_1$. This estimate is computed by placing an implicit Gaussian envelope around $N_2$ and integrating along the set of curves representing fibers in $N_1$. In order to quantify both missed fibers and false positives, we perform a bi-directional measurement, comparing $N_1$ to $N_2$ as well as comparing $N_2$ to $N_1$.

In the following sections, we describe common geometric errors encountered in network segmentation. We then describe the theory behind our proposed measurement as well as implementation details and methods for improving accuracy.

Common geometry errors

Errors in segmentation consist of both undetected and spurious fibers as well as deformations in fibers. Thinning algorithms

Tracking methods

Geometry metric

Given two networks $N_1$ and $N_2$, our proposed metric returns a value that estimates the ratio between the length of fibers in $N_1$ that do not exist in $N_2$ and the total length of fibers in $N_1$. This is described by the following equation:

where $N_1 \cap N_2$ is the set of fibers common to both networks and integration refers to the length of fiber in the corresponding set. Since this metric does not consider the topological structure of each network, it may be helpful to think of $N_1$ and $N_2$ as the sets of all points that lie on the curves representing the corresponding network. However, it is impractical to evaluate the statement $N_1 \cap N_2$ for an explicit model since it is extremely unlikely that fibers in $N_1$ and $N_2$ will precisely overlap.
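
Written out under assumed notation, with the two networks denoted $N_1$ and $N_2$ and $M$ used as a placeholder name for the metric (neither symbol is confirmed by the surviving text), the ratio described above takes the form:

$$M(N_1, N_2) = \frac{\int_{N_1} ds - \int_{N_1 \cap N_2} ds}{\int_{N_1} ds} = 1 - \frac{\int_{N_1 \cap N_2} ds}{\int_{N_1} ds}$$

The numerator of the first expression is the length of fiber in $N_1$ with no counterpart in $N_2$, so the value is 0 when $N_1$ is fully contained in $N_2$ and 1 when the two networks share no fibers.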

Consider the discretization of Equation 1 onto a three-dimensional grid with a voxel size of some small number

where _{1 }and _{2}, the metric itself becomes unrealistically strict. We overcome this problem by scaling the points in _{1 }by a weighted distance field based on the geometry of _{2}:

where _{2}) is the distance between _{2 }and _{2 }and weighting the points in _{1 }based on their position within this field. The value

An analysis that takes into account both spurious and undetected fibers requires a bidirectional measurement. Since the described metric provides an estimate of the fraction of $N_1$ that is not contained in $N_2$, a bidirectional measurement is used to determine the rate of false positives and false negatives:

where _{FNR }_{FPR }_{GT }_{T }

Note that Equation 3 can be determined for subsets of the network. The metric value for a single point is used for visualization while integration along a single fiber is used to determine weights for the connectivity metric.
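
The bidirectional measurement can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the NetMets implementation: the weight is taken to be a Gaussian of the nearest-point distance, exp(-d²/(2σ²)), fibers are polylines integrated with the trapezoid rule, and the nearest point is found by brute force over sample points. The names `geometry_metric` and `geometric_rates` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
Fiber = Sequence[Point]


def nearest_distance(p: Point, samples: Sequence[Point]) -> float:
    """Brute-force distance from p to the closest sample point."""
    return min(math.dist(p, q) for q in samples)


def geometry_metric(net_a: List[Fiber], net_b: List[Fiber], sigma: float) -> float:
    """Estimate the fraction of fiber length in net_a with no match in net_b."""
    samples_b = [p for fiber in net_b for p in fiber]
    if not samples_b:
        return 1.0  # nothing to match against
    matched = 0.0
    total = 0.0
    for fiber in net_a:
        # Assumed Gaussian weight: 1 on top of net_b, falling to 0 far away.
        weights = [
            math.exp(-nearest_distance(p, samples_b) ** 2 / (2.0 * sigma ** 2))
            for p in fiber
        ]
        for i in range(len(fiber) - 1):
            segment = math.dist(fiber[i], fiber[i + 1])
            matched += 0.5 * (weights[i] + weights[i + 1]) * segment
            total += segment
    return 1.0 - matched / total if total > 0 else 0.0


def geometric_rates(ground_truth: List[Fiber], test_case: List[Fiber],
                    sigma: float) -> Tuple[float, float]:
    """Return (FNR, FPR): missed ground-truth length and spurious test length."""
    fnr = geometry_metric(ground_truth, test_case, sigma)
    fpr = geometry_metric(test_case, ground_truth, sigma)
    return fnr, fpr


# Toy data: the test case recovers one of two parallel ground-truth fibers.
truth = [[(0, 0, 0), (1, 0, 0), (2, 0, 0)], [(0, 5, 0), (1, 5, 0), (2, 5, 0)]]
test = [[(0, 0, 0), (1, 0, 0), (2, 0, 0)]]
fnr, fpr = geometric_rates(truth, test, sigma=0.5)
```

In this toy case half of the ground-truth length has no counterpart in the test case, so the false negative rate is close to 0.5, while every test-case point lies on the ground truth and the false positive rate is 0.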

Implementation

The distance function is evaluated using a nearest-neighbor search. The sample points for $N_2$ are stored in a kd-tree, and the sample points of $N_1$ are used to determine the distance to the closest point on $N_2$. This distance is then used to evaluate the geometry metric (Equation 3).
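
A nearest-neighbor query of this kind can be sketched with a small kd-tree written in plain Python. The sketch below is illustrative only; the names `KdNode`, `build_kdtree`, and `nearest_distance` are hypothetical, and a production implementation would normally rely on an optimized library.

```python
import math
from typing import List, Optional, Tuple

Point = Tuple[float, float, float]


class KdNode:
    """A kd-tree node that splits space at `point` along `axis`."""

    def __init__(self, point: Point, axis: int,
                 left: Optional["KdNode"], right: Optional["KdNode"]):
        self.point = point
        self.axis = axis
        self.left = left
        self.right = right


def build_kdtree(points: List[Point], depth: int = 0) -> Optional[KdNode]:
    """Build a balanced tree by splitting at the median, cycling through x, y, z."""
    if not points:
        return None
    axis = depth % 3
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return KdNode(
        ordered[mid], axis,
        build_kdtree(ordered[:mid], depth + 1),
        build_kdtree(ordered[mid + 1:], depth + 1),
    )


def nearest_distance(node: Optional[KdNode], query: Point,
                     best: float = math.inf) -> float:
    """Return the distance from query to the closest stored point."""
    if node is None:
        return best
    best = min(best, math.dist(query, node.point))
    delta = query[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if delta < 0 else (node.right, node.left)
    best = nearest_distance(near, query, best)
    # Visit the far side only if the splitting plane is closer than the best hit.
    if abs(delta) < best:
        best = nearest_distance(far, query, best)
    return best


# Store the sample points of one network, then query with a point of the other.
tree = build_kdtree([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)])
d = nearest_distance(tree, (1.0, 0.5, 0.0))
```

Pruning the far subtree is what keeps a typical query close to logarithmic in the number of stored points, compared with the linear cost of a brute-force scan.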

Accuracy

The accuracy of the geometry metric is of particular interest since the distance function _{1 }and _{2}.

In the case of a regular sampling interval of

for all

**Regular and Grid-based sampling methods where d is the actual distance and m is the measured distance**. (left) For regular subdivision, the worst-case error in the distance estimate

We place a tighter error bound on _{1 }and _{2 }on a common uniform grid (Figure

for all

Connectivity

Network connectivity is determined by converting each network to a graph based on the previous definitions. A network

**Differences in topology can make network connectivity difficult to evaluate**. (a-b) A single missing node or additional edge can alter the connectivity. (c) Multiple edges in one graph can correspond to a single edge in the other.

Because of this complexity, we inform our connectivity metric using key pieces of geometric information. In particular, our algorithm creates a mapping between detected nodes in the test-case to those in the ground-truth. In the ideal case where all nodes in the ground truth are detected, comparing connectivity is a trivial matter of finding edges that are present in both the ground-truth and test-case. However, this is insufficient when nodes are present in only one network. In particular, undetected or falsely detected nodes cause edges to become subdivided or result in topological changes (Figure

Common connectivity errors

Common connectivity errors include additional edges and gaps. In the case of thinning algorithms, these edges are often due to high-frequency noise, which forms loops or spines on existing fibers. Many segmentation methods produce small gaps in fibers, forming multiple discontinuous segments in place of a single continuous fiber. These errors are difficult to detect geometrically, since disconnected points can occupy the same spatial position in the geometric model. The purpose of the proposed connectivity metric is to give equal weight to graph edges, independent of the length of the associated fibers.

Connectivity metric

The proposed connectivity metric quantifies the quality of a segmentation based on the rate of false positives and false negatives, similar to the proposed method for geometry. The graphs _{T }_{T}_{T}_{GT }_{GT}_{GT}_{T }_{GT}

where

Graph initialization

Each node is initialized with a three-dimensional coordinate from the explicit model. Each node _{T }_{GT }

All colors are initialized to _{i }_{GT}_{T}

and assign a color to _{i }_{j}

The color of each vertex is a unique identifier indicating the nearest node in _{GT}
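
The coloring step can be illustrated with a short Python sketch. The acceptance rule used here is an assumption: a test-case node receives the index of its nearest ground-truth node only if that node lies within a distance σ, and is otherwise left uncolored (`None`). The function name `color_nodes` is hypothetical.

```python
import math
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float, float]


def color_nodes(test_nodes: Sequence[Point], truth_nodes: Sequence[Point],
                sigma: float) -> List[Optional[int]]:
    """Assign each test node the index of its nearest ground-truth node.

    Nodes farther than sigma from every ground-truth node stay uncolored.
    The sigma cutoff is an assumption made for this illustration.
    """
    colors: List[Optional[int]] = []
    for p in test_nodes:
        best_index, best_dist = None, math.inf
        for j, q in enumerate(truth_nodes):
            d = math.dist(p, q)
            if d < best_dist:
                best_index, best_dist = j, d
        colors.append(best_index if best_dist <= sigma else None)
    return colors


truth_nodes = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
test_nodes = [(0.2, 0.0, 0.0), (9.9, 0.1, 0.0), (5.0, 5.0, 0.0)]
colors = color_nodes(test_nodes, truth_nodes, sigma=1.0)
```

Here the first two test nodes inherit the colors of the ground-truth nodes they sit next to, and the third, which has no nearby counterpart, remains uncolored.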

All edges _{T }_{GT }

where |

Core connectivity

We define the core connectivity of a graph _{c }_{c}, E_{c}_{c }_{c }_{c }_{i}, v_{j}_{c }

Given a node _{i}_{i }_{i}_{c }

**Neighborhood search from a single node v (green) to its neighbors**. Color indicates

**Evaluating the core connectivity**. (a) Graph coloring uses a nearest-neighbor search to associate nodes in the test case to those in the ground truth. Nodes with color _{c}_{c}

Comparison

Once the core connectivity is established, the resulting core graphs _{GTc }_{Tc }_{T }_{Tc }_{GT }_{GTc }_{GT }_{GT }_{GTc}

**Evaluating the connectivity metric**. (a) The initial graphs representing $N_1$ and $N_2$ are shown with colored nodes. (b) The core connectivity is computed by combining edges that produce the shortest paths to adjacent colored nodes. (c) The graphs representing core connectivity are then compared to find inconsistencies. (d) Valid connections are then mapped back to the original graphs to determine the false-positive rate and false-negative rate.
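
The steps in this caption can be sketched in Python under simplifying assumptions. The sketch uses a breadth-first search through uncolored nodes to decide which colored nodes are adjacent, rather than a weighted shortest-path search, and it takes the rates to be the fraction of core edges present in one graph but absent from the other. Both choices, and the names `core_edges` and `connectivity_rates`, are assumptions made for illustration.

```python
from collections import deque
from typing import Dict, FrozenSet, Hashable, Optional, Set, Tuple

Graph = Dict[Hashable, Set[Hashable]]
CoreEdges = Set[FrozenSet[int]]


def core_edges(adjacency: Graph, colors: Dict[Hashable, Optional[int]]) -> CoreEdges:
    """Collect pairs of colors joined by a path that crosses only uncolored nodes."""
    edges: CoreEdges = set()
    for start, color in colors.items():
        if color is None:
            continue
        seen = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nbr in adjacency.get(node, ()):
                if nbr in seen:
                    continue
                seen.add(nbr)
                nbr_color = colors.get(nbr)
                if nbr_color is None:
                    queue.append(nbr)  # keep walking through uncolored nodes
                elif nbr_color != color:
                    edges.add(frozenset((color, nbr_color)))  # stop at colored node
    return edges


def connectivity_rates(truth_core: CoreEdges, test_core: CoreEdges) -> Tuple[float, float]:
    """Return (FNR, FPR) as fractions of unmatched core edges."""
    fnr = len(truth_core - test_core) / len(truth_core) if truth_core else 0.0
    fpr = len(test_core - truth_core) / len(test_core) if test_core else 0.0
    return fnr, fpr


# Ground truth: a chain A-B-C with colors 0, 1, 2.
truth_adj = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}
truth_col = {"A": 0, "B": 1, "C": 2}
# Test case: a-u-b is subdivided by an uncolored node u, the b-c link is
# missing, and a spurious edge joins a and c.
test_adj = {"a": {"u", "c"}, "u": {"a", "b"}, "b": {"u"}, "c": {"a"}}
test_col = {"a": 0, "u": None, "b": 1, "c": 2}

fnr, fpr = connectivity_rates(core_edges(truth_adj, truth_col),
                              core_edges(test_adj, test_col))
```

In this toy case the subdivided edge is still matched, the missing link counts as a false negative, and the spurious edge counts as a false positive, giving rates of 0.5 each.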

Implementation

Graph coloring is performed using a nearest-neighbor search, similar to the method described for the geometry metric. This search requires a maximum

The shortest-path algorithm has a time complexity of

Visualization

The geometry and connectivity metrics proposed in this paper provide a global measure for comparing interconnected networks. However, one of the principal advantages of the proposed algorithm is the ability to localize geometric and connectivity errors. If properly visualized, this can allow developers to quickly identify cases where segmentation algorithms fail and provide insight into improving algorithms. In this section, we describe techniques that we have employed to visualize the differences between networks.

Several methods have been proposed for visualizing fiber structures, particularly in the field of diffusion tensor MRI. The most common methods use streamlines and stream tubes

Color mapping

The selection of appropriate color maps for scalar field visualization is a difficult problem, particularly when the scalar field is mapped onto a three-dimensional structure. The rainbow color map (Figure

**Colormapping for visualizing geometric error**. (a) Rainbow colormapping is frequently used to characterize scalar fields and can make prominent errors, such as unsegmented filaments, easy to identify. (b) Blackbody radiation has been shown to provide a better perceptual indication of varying scalar fields; however, this mapping often obscures shading, which provides context for three-dimensional structure. (c) Isoluminant shading overcomes these problems at the expense of lower dynamic range. (d) A diverging color map (default) ranges from cool to warm hues, providing higher dynamic range without obscuring shading.
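
A cool-to-warm mapping of the kind described in panel (d) can be approximated with a simple piecewise-linear interpolation. The sketch below is a simplification: the three anchor colors are illustrative values chosen for this example, the interpolation is done directly in RGB rather than in a perceptual color space, and the name `diverging_color` is hypothetical.

```python
from typing import Tuple

RGB = Tuple[float, float, float]

# Illustrative anchor colors (assumed values, not taken from NetMets).
COOL: RGB = (0.23, 0.30, 0.75)   # low error
MID: RGB = (0.87, 0.87, 0.87)    # neutral midpoint
WARM: RGB = (0.71, 0.02, 0.15)   # high error


def diverging_color(value: float) -> RGB:
    """Map a metric value in [0, 1] to an RGB triple, clamping out-of-range input."""
    t = min(max(value, 0.0), 1.0)
    if t < 0.5:
        start, end, s = COOL, MID, t / 0.5
    else:
        start, end, s = MID, WARM, (t - 0.5) / 0.5
    return tuple(a + (b - a) * s for a, b in zip(start, end))


low = diverging_color(0.0)    # cool hue: strong correspondence
high = diverging_color(1.0)   # warm hue: large error
```

Because the midpoint is a neutral gray, surface shading remains visible across the whole range, which is the property the caption attributes to diverging maps.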

Divergent color mapping (Figure

Geometry

We visualize geometric differences by mapping the geometry metric directly onto the explicit representation of the original models. The global geometry metric described earlier is evaluated by integrating along the curves representing fibers in the network model $N_1$. The resulting value provides a very general measure of the average distance between $N_1$ and another network $N_2$. For visualization, the value at each point on $N_1$ is used to highlight the specific differences in geometry between $N_1$ and $N_2$.

We display this information by extruding a tube along all fibers in $N_1$. A colormap is applied to indicate the value of the weighted distance field of $N_2$ (Equation 3) at each point on $N_1$. This is implemented by storing the value of the weighted distance field at each point in the explicit model:

where _{N}

**Hierarchical proxy network comparison**. Geometric error is shown for the (a) ground truth and (b) test case. Hue indicates the value of the geometry metric, where blue indicates a strong correspondence and red indicates an error. (c-d) Connectivity shows mapped edges rendered in the same color. Undetected nodes are rendered in red and unmapped edges are white.

**Interconnected network representing an organic molecule**. Geometric error is shown for the (a) ground truth and (b) test case, where red indicates significant deviation. (c-d) Connectivity shows mapped edges rendered in the same color. Undetected nodes are rendered in red and unmapped edges are white.

Connectivity

The concept of network connectivity is significantly more abstract, since a one-to-one correspondence between fibers in $N_1$ and $N_2$ often does not exist. For example, subdividing a fiber can result in a single fiber in $N_1$ being mapped to multiple fibers in $N_2$. This can also result in multiple fibers in $N_1$ overlapping when mapped to $N_2$. In the case of spurious or undetected fibers, a mapping between $N_1$ and $N_2$ does not exist.

We use several methods to visualize errors in connectivity. First of all, undetected or spurious nodes are rendered as red spheres, while detected nodes are gray. This simple strategy is used to visualize regions where connectivity errors are frequently made. Where a mapping exists between edges in the ground truth and test case, corresponding edges are color-coded. This is shown for both hierarchical (Figure

Finally, allowing the selection of fibers is an important feature for understanding errors that can occur in connectivity. This is seen in the visualization of cerebellar fibers from the DIADEM data set (Figure

**Segmentation of cerebellar climbing fibers from the DIADEM Challenge data set**

Results

In this section, we demonstrate how the NetMets software can be used to compare explicit interconnected networks in several cases relevant to current research needs. We first show how NetMets can be used to evaluate the performance of an automated segmentation algorithm on a data set distributed as part of the DIADEM Challenge. We then evaluate the performance of the same algorithm on fluorescence microscopy data. Next, we show that NetMets can be useful for comparing different manual tracings of the same network structure. Finally, we demonstrate how the bi-directional measurement used by our proposed metric can be useful in evaluating segmentation effectiveness when only an incomplete ground truth is available.

Evaluating segmentation algorithms

One of the primary motivations for this work is to provide a quantitative method for evaluating the performance of segmentation algorithms as well as an intuitive visualization approach for identifying where segmentation errors arise in the data. We show how NetMets is suited for this task by performing automated segmentation of two data sets. The first data set is a bright-field microscopy image of a series of cerebellar climbing fibers. This data set is distributed through the DIADEM Challenge

Automated segmentation was performed using ridge detection, followed by dilation and topology-preserving thinning

Segmentation of the cerebellar climbing fibers data set produces a reasonable model of the prominent geometric features (Figure

Our second data set consists of a stack of confocal images of an astrocyte network in close proximity to a blood vessel in the mouse brain. We apply the same automated skeletonization algorithm to these images after inverting their intensity. Upon close inspection of the resulting model we find errors similar to those in our previous data set, where small and low-intensity fibers are undetected (Figure

**Validation of an automated segmentation algorithm on an astrocyte network**. (a) A maximum intensity projection of the image stack with (b) close-up. (c) Mapping of the geometric error onto the ground-truth model. The edge mapping and connectivity are shown for both the (d) ground truth and (e) test case.

Comparing manual segmentations

While a significant amount of current research in the area of neuronal segmentation is directed toward fully-automated reconstruction, building an accurate ground-truth can also be a difficult problem. This is particularly true for extremely dense and complex data sets, where manual segmentation is a time-consuming process that introduces fatigue in the experts producing the desired model. Recent work by Helmstaedter et al.

In this section, we demonstrate the use of NetMets for comparing the results of two manually-constructed models from a confocal image stack of mouse brain microglia (Figure

**Comparing two manually-constructed models of microglia**. Both models were traced using Neuromantic by human operators. (a) A maximum intensity projection of the original confocal image stack and (b) a volume visualization of the target microglia. (c-d) The geometric error is shown on both models.

Subgraph comparison

With the development of new high-throughput imaging methods, the size and complexity of data sets can make it impractical to construct a complete ground truth. One possible solution to this problem is to manually label small subsets of the raw data. However, thorough validation of a large data set would require manual labeling of several small subsets to provide a statistically viable sample size. This often makes manual tracing more complex by introducing artificial fiber terminations at the boundaries of these subsets. Since current tools like Neuromantic allow semiautomated tracing, it is often easier to manually trace long fibers than to start and terminate several small ones. It would therefore be convenient to create a ground truth that represents a subset of complete fibers in the data set. However, this would cause properly segmented fibers to be incorrectly labeled as false-positives when there is no corresponding segmentation in the ground-truth. One of the advantages of using a bi-directional measurement like the one we have proposed is that this case can be, to some extent, recognized and corrected.

We demonstrate this by creating an incomplete ground truth for a mouse brain microvascular data set imaged using a high-throughput imaging technique called Knife-Edge Scanning Microscopy (KESM)

**Evaluating a segmentation with an incomplete ground truth**. (a) Volume visualization of raw KESM data showing microvessels in the mouse brain. (b) The complete test case compared to an incomplete ground truth. Red fibers indicate tracked vessels not present in the ground-truth model. (c) Fibers culled by setting a geometric error threshold of 0.9. (d) Geometric error in the ground-truth model. The connectivity graph is shown for the (e) ground-truth and (f) test-case. Red and green arrows indicate breaks in the test-case fibers, resulting in connectivity errors. Black arrow indicates an incorrectly mapped edge.
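
The culling step in panel (c) can be illustrated with a short Python sketch. It assumes that each test-case fiber carries its length and its own geometric error in [0, 1], and that the network-level rate is the length-weighted mean of the per-fiber errors, which holds when the metric is a ratio of integrals along the fibers. The function name `cull_unmatched` and the sample values are hypothetical; only the threshold of 0.9 comes from the caption.

```python
from typing import List, Tuple

# Each entry is (fiber length, per-fiber geometric error in [0, 1]).
FiberError = Tuple[float, float]


def cull_unmatched(fibers: List[FiberError],
                   threshold: float = 0.9) -> Tuple[List[FiberError], float]:
    """Drop fibers whose error exceeds the threshold and recompute the rate.

    Fibers with an error near 1 have essentially no counterpart in the
    (incomplete) ground truth, so they are excluded instead of being counted
    as false positives.
    """
    kept = [(length, error) for length, error in fibers if error <= threshold]
    total = sum(length for length, _ in kept)
    rate = sum(length * error for length, error in kept) / total if total > 0 else 0.0
    return kept, rate


# Made-up values: two fibers overlap the ground truth, one lies outside it.
fibers = [(10.0, 0.05), (5.0, 0.2), (20.0, 0.98)]
kept, fpr_after_culling = cull_unmatched(fibers)
```

Without culling, the long unmatched fiber would dominate the false positive rate even though it may be a correct segmentation of a vessel that was never traced in the ground truth.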

Conclusion and future work

In this paper, we propose robust methods for quantifying and visualizing differences in interconnected fiber networks. This work is motivated by the need to validate segmentation algorithms for interconnected networks in biomedical imaging. As biomedical data sets increase in size and complexity, qualitative comparison has become insufficient to address this issue. The techniques that we propose build on the quantification principles introduced by the DIADEM Challenge

Current advances in high-throughput imaging are motivating research into robust and generalized segmentation algorithms, which are particularly useful in the field of connectomics

In addition, more accurate edge mapping between the ground truth and test cases would be useful for visualization, since one of the more common errors we have found in our tracing examples is a fiber that terminates early, putting it out of range of the corresponding ground-truth end node. While this is taken into account in the geometry metric, it can produce a confusing visualization when exploring the connectivity graph.

Finally, previous methods such as the DIADEM Metric provide advantages that may improve the effectiveness of our proposed algorithms. In particular, the use of fiber length as a geometric measurement can capture errors that are not recognized by our algorithm, such as erroneous fibers that are in close proximity to actual geometry. The NetMets software is available online as open source at

List of abbreviations used

MSE: Mean Squared Error; DIADEM: Digital Reconstruction of Axonal and Dendritic Morphology; FPR: False Positive Rate; FNR: False Negative Rate; TED: Tree Edit Distance; FN: False Negative; FP: False Positive; KESM: Knife-Edge Scanning Microscopy.

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

D. Mayerich implemented the NetMets software and imaged KESM data, C. Bjornsson prepared and imaged the tissue samples and created ground-truth models, J. Taylor created ground-truth models and performed automated segmentation, B. Roysam developed automated segmentation algorithms and metrics.

Acknowledgements

This work was funded in part by the Beckman Institute for Advanced Science and Technology and the Center for Biotechnology and Interdisciplinary Studies.

This article has been published as part of