Department of Biomedical Engineering, Eindhoven University of Technology

Abstract

Background

Techniques for the reconstruction of biological networks based on perturbation experiments often predict direct interactions between nodes that do not actually exist. Transitive reduction removes such relations if they can be explained by an indirect path of influences. The existing algorithms for transitive reduction are sequential and may suffer from prohibitively long run times for large networks. They also exhibit the anomaly that some existing direct interactions are erroneously removed as well.

Results

We develop efficient scalable parallel algorithms for transitive reduction on general purpose graphics processing units for both standard (unweighted) and weighted graphs. Edge weights are regarded as uncertainties of interactions. A direct interaction is removed only if there exists an indirect interaction path between the same nodes which is strictly more certain than the direct one. This is a refinement of the removal condition for the unweighted graphs and avoids to a great extent the erroneous elimination of direct edges.

Conclusions

Parallel implementations of these algorithms can achieve speed-ups of two orders of magnitude compared to their sequential counterparts. Our experiments show that: i) taking into account the edge weights improves the reconstruction quality compared to the unweighted case; ii) it is advantageous not to distinguish between positive and negative interactions since this lowers the complexity of the algorithms from NP-complete to polynomial without loss of quality.

Background

Techniques for the reconstruction of biological networks, such as genetic, metabolic or signaling networks, are used for getting insight into the inner mechanisms of the cell. They are usually based on perturbation experiments, e.g., gene knockouts or knockdowns, in which one or more network nodes (e.g. genes) are systematically perturbed and the effects on the other nodes are observed. More concretely, in the context of genetic regulatory networks with knockout experiments, the nodes of the networks are genes. Each gene is knocked out at least once and the expressions of the other genes are measured for each knockout experiment. The expression change with regard to the unperturbed wild type defines the influence of the knocked out gene on the other genes. Based on that, connections between the genes can be established. Using the difference in the expression between the perturbed and the wild type, weights and signs can be associated with the connections to quantify the influence and indicate over- and under-expression, respectively.
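To make the construction of such a perturbation graph concrete, the following is a minimal sketch; the function name `perturbation_graph`, the data layout, and the cut-off `eps` are our own illustrative assumptions, not the paper's tooling:

```python
def perturbation_graph(wild_type, knockout, eps=1.0):
    """Derive a signed, weighted perturbation graph from knockout data.

    wild_type[j]   - expression of gene j in the unperturbed strain
    knockout[i][j] - expression of gene j when gene i is knocked out
    eps            - assumed minimal change that counts as an interaction
    """
    edges = {}
    n = len(wild_type)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            diff = knockout[i][j] - wild_type[j]
            if abs(diff) >= eps:
                # the weight quantifies the influence; the sign indicates
                # over- (+1) or under- (-1) expression
                edges[(i, j)] = (abs(diff), 1 if diff > 0 else -1)
    return edges
```

For example, if knocking out gene 0 raises the expression of gene 1 from 5 to 9, the sketch records the edge (0, 1) with weight 4 and positive sign.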

An important problem in this kind of network reconstruction is that direct connections between genes might be established that are spurious, i.e., do not exist in reality. We illustrate this with the following example. Suppose that the transcription factor

There are several ways to remove spurious direct relations depending on the representation of the biological networks. For instance, in

Using TR for filtering out spurious connections was proposed by Wagner: an edge (v, v′) is removed if there is an alternative chain of interactions between v and v′. Going back to our example above, TR would mean that the undesired edge between

The first algorithms for TR date from the seventies

In

With regard to the definition of transitive closure for weighted graphs and the general theoretical background, the closest to our work is

Unlike the previous work in

We present parallel versions of the TR algorithms for both unweighted and weighted directed graphs. These algorithms are developed for general purpose graphics processing units (GPUs). GPUs have been extensively used for various applications, including bioinformatics and systems biology. Since GPUs are standard in modern computers, parallel algorithms become an attractive possibility to speed up computations.

The crucial idea of TR on GPUs is to formulate the algorithm in terms of matrix operations. Since GPUs are very efficient in implementing the latter, this results in a remarkable speed-up so that networks consisting of tens of thousands of nodes can be handled within seconds on a standard desktop computer.

Parallel algorithms for computing the transitive closure of a graph, which is closely related to TR, have been developed before, e.g.,

Approach for unweighted graphs

We adopt a graph-theoretic framework to formally represent biological networks. A *directed graph* is a pair G = (V, E) with a set of nodes V = {v_1, …, v_n} and a set of edges E ⊆ V × V. A graph can be represented by its adjacency matrix A, whose elements a_{i,j} have value 1 if there is an edge from node v_i to node v_j, and 0 otherwise.

A *path* P_{ij} from v_i to v_j is a sequence of nodes P_{ij} = (u_0, u_1, …, u_m), where u_0 = v_i, u_m = v_j, and (u_{l−1}, u_l) ∈ E for every l with 1 ≤ l ≤ m. A *cycle* is a path P_{ij} whose first and last node coincide, i.e., v_i = v_j. A path P_{ij} consisting of only the single edge (v_i, v_j) is called trivial. The set of edges occurring in a path P is denoted by Edges(P).

Definition 1 (**Unweighted Transitive Closure and Reduction**)

The *transitive closure* of a graph G = (V, E) is the graph G^T = (V, E^T) with (v_i, v_j) ∈ E^T if and only if there exists a path from v_i to v_j in G. A *transitive reduction* of G is a minimal graph G^t = (V, E^t), i.e., one with the least number of edges, such that (G^t)^T = G^T.

Intuitively, this means that the transitive closure is preserved by the reduction, i.e., no information about reachability is lost. For an acyclic graph, the transitive reduction is unique and can be obtained by removing each edge whose end nodes are also connected by an alternative, longer path.

Transitive reduction of an acyclic graph

**Transitive reduction of an acyclic graph.** The marked edge is removed, since there is an alternative path between its end nodes.

The definition of TR can be extended in a natural way to graphs with cycles. However, the reduced graphs are not unique and in general cannot be generated by deleting edges from the graph (see Figure

Transitive reduction of a cyclic graph

**Transitive reduction of a cyclic graph.** The graphs in **(b)** and **(c)** are both transitive reductions of the graph in **(a)**, since all three graphs have the same transitive closure. One can see that edges that do not exist in the original graph may occur in its transitive reductions.

Extension to weighted graphs

We aim at modelling experiments that use node perturbations, e.g., gene knockouts, for the reconstruction of interactions between nodes. We already saw in the example above that spurious direct interactions are added if there exists an indirect path between two genes. Hence, the outcome of the experiments actually produces a transitive closure of the real (original) network. By applying TR as a kind of inverse operation of the transitive closure, we try to cancel these side effects by removing direct interactions between two nodes if there is an alternative indirect path between them. Finding the TR amounts to reconstructing the network: by removing those direct interactions we usually obtain a good approximation of the real network.

However, sometimes both direct and indirect interactions can exist at the same time. Examples of this are the feed-forward loops that occur in the genetic networks of many organisms

The interaction uncertainties are represented as edge weights, given by a function w : E → [0, 1], e.g., the p-values of the measured interactions. The smaller the weight, the more certain the interaction.

Definition 2 (**(Minimal) Transitive Interaction Uncertainty**)

The *transitive interaction uncertainty* of a path P is tiu(P) = max_{e∈Edges(P)} w(e), i.e., the weight of the most uncertain edge on P. The *minimal transitive interaction uncertainty* of two nodes v_i and v_j is the minimum of tiu(P) over all paths P from v_i to v_j.

Note that if the edge (v_i, v_j) exists, it is itself a (trivial) path from v_i to v_j, so the minimal transitive interaction uncertainty of v_i and v_j is at most w(v_i, v_j); moreover, it is a lower bound on the transitive interaction uncertainty of every individual path between them.

By putting the last two inequalities together, we can refine the edge preservation criterion: an edge is kept if and only if its weight equals the minimal transitive interaction uncertainty of its end nodes, i.e., no strictly more certain indirect path exists.

Definition 3 (**Weighted Transitive Reduction**)

The *weighted transitive reduction* of a graph G = (V, E, w) is the graph G^t = (V, E^t, w^t), where E^t = {(v_i, v_j) ∈ E | w(v_i, v_j) equals the minimal transitive interaction uncertainty of v_i and v_j} and w^t is the restriction of w to E^t.

Informally, edge (v_i, v_j) remains in E^t if and only if its weight/uncertainty equals the minimal transitive interaction uncertainty of all paths between v_i and v_j.

Transitive reduction of a weighted graph

**Transitive reduction of a weighted graph.** The marked edge is removed, since there is an indirect path between its end nodes with a strictly smaller transitive interaction uncertainty.

Of course, there are other possible options to define the path weights based on the edge weights. For instance, summing up the edge weights to obtain the weight of the path is in some cases even a more natural choice than the max-min (weakest link) approach that we use. However, in the case when p-values (or similar metrics in the interval [0,1], e.g., correlation) are used, this is not the best option. Summing up the p-values of all edges in the path can produce a result which is greater than 1, i.e., something which is not a probability. Since a trivial path consisting of only one edge is also a path, it is preferable to have a weight for non-trivial paths that is of the same nature as the edge weight.
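The weakest-link choice can be made concrete in a few lines; the helper name `tiu` is ours, not the paper's:

```python
def tiu(path_weights):
    """Transitive interaction uncertainty of a path (Definition 2):
    the weight of the most uncertain edge on it."""
    return max(path_weights)

# p-values of the edges of a three-edge path
p = [0.4, 0.5, 0.3]
assert tiu(p) == 0.5   # the weakest link; still a value in [0, 1]
assert sum(p) > 1.0    # summing instead would exceed 1: not a probability
```

The two assertions illustrate the argument above: the max-min weight stays in the same range as the edge weights, while the sum does not.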

In general, the refined notion of TR with weights does not entirely resolve the anomaly of removing a direct interaction which exists in the network. One way to further improve the filtering of the edges is to use thresholds. We introduce a *lower threshold* t_{low} determining that any edge with weight at most t_{low} is unconditionally kept in G^t, i.e., regardless of the existence of more certain indirect interactions. In this way we ensure that interactions which are measured with high certainty are not removed from the network. Similarly, we use an *upper threshold* t_{up} such that any edge with weight at least t_{up} is unconditionally removed from the network. Hence, very uncertain connections are always removed from the original graph. Removing edges with weight at least t_{up} is actually independent of the TR concept, and it can be done as a pre- or post-processing step.

This can be shown by reasoning towards a contradiction. Assume that thresholding (TH) and TR are dependent, in other words, that the final result depends on the order in which TH and TR are performed. Say we have a graph G and an edge e = (v, v′) which is not present in either TH(TR(G)) or TR(TH(G)), but is present in the other. Let it not be present in TH(TR(G)) (the case when it is not in TR(TH(G)) is similar). Then, either it was removed by TR (case 1), or by TH (case 2). Case 1: If TR removed e, there is a path P from v to v′ whose transitive interaction uncertainty tiu(P) is strictly smaller than w(e). Since e is still present in TR(TH(G)), TH must have broken every such path, i.e., removed from P an edge with weight at least t_{up}. Therefore, we must have tiu(P) ≥ t_{up}. But then, also w(e) > tiu(P) ≥ t_{up}, in other words, w(e) ≥ t_{up}. But then, it would also be removed when applying TH on G, so e could not be present in TR(TH(G)) either, a contradiction. Case 2: If TH removed e from TR(G), then w(e) ≥ t_{up}, and e is likewise removed when applying TH directly on G, so again it cannot be present in TR(TH(G)), a contradiction.

Using a threshold like t_{low} splits the edge weights into two sets. Within the sets, the difference between values does not play any role. For instance, assuming t_{low} = 0.5, the difference between the edge weights 0.8 and 0.7 is as irrelevant as the difference between 0.16 and 0.06. An analogous remark holds also for t_{up}.

For many graph problems, the unweighted problem is a special case of a more general weighted problem. For example, an algorithm to determine shortest paths in a weighted graph can be used to find shortest paths in unweighted graphs by assigning the same positive weight to every edge. For our problem of (weighted) TR, a similar analogy does not hold, i.e., we cannot use the algorithm for weighted cyclic graphs to calculate the TR of an unweighted cyclic graph. This fact results from the different natures of our definitions: for the weighted case, we choose the greatest weight on a path without considering its length; for the unweighted case, however, we add up the edges of paths to obtain their length (cf. next section). The first approach is not affected by cycles – the transitive interaction uncertainty of a path with cycles is always greater than or equal to that of the same path with all cycles removed; thus, cycles are ignored “automatically” when searching for a path with the minimal transitive interaction uncertainty. In contrast, for the second approach, cycles must be actively detected to ensure that only paths in which no node occurs more than once are considered (the longest simple path problem is NP-hard). The following figure illustrates this.

Transitive reduction of cyclic unweighted graphs needs cycle detection

**Transitive reduction of cyclic unweighted graphs needs cycle detection.**

Implementation

After briefly discussing the emergence of many-core processors and the resulting need for parallelisation of programs, this section presents our parallelised algorithms for transitive reduction: first, for unweighted acyclic graphs and afterwards the extension to weighted cyclic graphs. Finally, we discuss the problems with cycles in unweighted graphs.

Although

The

Algorithm for unweighted acyclic graphs

For obtaining the TR of an unweighted acyclic graph, one could follow the definition directly: the transitive closure G^T would be calculated first. Then, in a second step, all edges (v_i, v_j) are removed for which there is a node v_k with (v_i, v_k) ∈ E^T and (v_k, v_j) ∈ E^T, i.e., there exists an alternative path (v_i, …, v_k, …, v_j).
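This two-step scheme can be sketched as follows; a sequential illustration for acyclic 0/1 adjacency matrices, not the GPU implementation:

```python
def closure_then_reduce(adj):
    """Two-step TR for an acyclic 0/1 adjacency matrix: compute the
    transitive closure t (Warshall's algorithm), then drop every edge
    (i,j) for which some k yields an alternative route i->..->k->..->j."""
    n = len(adj)
    t = [row[:] for row in adj]
    for k in range(n):                      # Warshall transitive closure
        for i in range(n):
            for j in range(n):
                t[i][j] = t[i][j] or (t[i][k] and t[k][j])
    red = [row[:] for row in adj]
    for i in range(n):
        for j in range(n):
            if adj[i][j] and any(t[i][k] and t[k][j]
                                 for k in range(n) if k not in (i, j)):
                red[i][j] = 0   # an alternative path explains the edge
    return red

# triangle 0->1, 1->2, 0->2: the shortcut 0->2 is removed
adj = [[0, 1, 1],
       [0, 0, 1],
       [0, 0, 0]]
assert closure_then_reduce(adj) == [[0, 1, 0],
                                    [0, 0, 1],
                                    [0, 0, 0]]
```

For acyclic graphs an alternative path cannot re-use the edge it is meant to replace, so closure entries can be used directly in the check.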

Algorithm 1 gives a pseudo-code description of our approach. First, the integer-valued adjacency matrix A is read and copied to the GPU (lines 1–2). Then, for every intermediate node v_k in turn, all pairs (v_i, v_j) are checked in parallel: if there is a path from v_i to v_k, i.e., a_{i,k} ≥ 1, and one from v_k to v_j, i.e., a_{k,j} ≥ 1, this gives a path from v_i to v_j, and a_{i,j} is set to two, denoting that there exists an indirect path between v_i and v_j (lines 3–6). Afterwards, all edges with an indirect alternative are removed, i.e., whenever a_{i,j} = 2, by setting a_{i,j} := 0 (lines 7–9). Finally, the transitively reduced matrix is copied back to the host and written to the output (lines 10–11).

Algorithm 1

**Pseudo-code description of parallelised transitive reduction of unweighted acyclic graphs**

1: read A

2: copy A to the GPU

3: **for** k := 1 to n **do sequentially**

4:  **for** all i, j ∈ {1, …, n} **do in parallel**

5:   **if** a_{i,k} ≥ 1 and a_{k,j} ≥ 1 **then**

6:    a_{i,j} := 2

7: **for** all i, j ∈ {1, …, n} **do in parallel**

8:  **if** a_{i,j} = 2 **then**

9:   a_{i,j} := 0

10: copy A to the host

11: write A
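A sequential sketch of Algorithm 1 follows; on the GPU, the two inner loops run as parallel kernels, while here everything is executed in order:

```python
def transitive_reduction_unweighted(a):
    """Algorithm 1 on an integer adjacency matrix (1 = edge, 0 = none).
    Entries reachable through an intermediate node are marked with 2
    and deleted afterwards, so only irreducible edges keep the value 1."""
    n = len(a)
    for k in range(n):          # line 3: outer loop, sequential
        for i in range(n):      # lines 4-6: parallel on the GPU
            for j in range(n):
                if a[i][k] >= 1 and a[k][j] >= 1:
                    a[i][j] = 2             # mark indirect path i -> k -> j
    for i in range(n):          # lines 7-9: delete all marked entries
        for j in range(n):
            if a[i][j] == 2:
                a[i][j] = 0
    return a

# triangle 0->1, 1->2, 0->2: the shortcut 0->2 is removed
assert transitive_reduction_unweighted([[0, 1, 1],
                                        [0, 0, 1],
                                        [0, 0, 0]]) == [[0, 1, 0],
                                                        [0, 0, 1],
                                                        [0, 0, 0]]
```

The check `a[i][k] >= 1` deliberately also accepts the marker value 2, so marked entries still count as paths in later iterations.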

In the parallel loops, n² threads of a kernel containing the loop’s body are created on the GPU. Thus, according to its thread ID, each thread executes the loop’s body for one particular pair of indices (i, j).

Complexity

The algorithm iterates over all n² elements of the matrix n + 1 times: n times in lines 3–6 and finally once in lines 7–9. However, different steps of the same iteration can run in parallel on different processors, so the overall time complexity depends on the number of processors p; in the ideal case p = n², each pass takes constant time. In the sequential model (p = 1), the time complexity becomes O(n³), as for the Floyd-Warshall algorithm. Similarly, the space (memory) complexity is O(n²).

Correctness

We claim that the output matrix is the TR of the input and prove this for the sequential version, i.e., with sequential execution of the parallel loops, first, before considering the specifics of parallelisation. Thereby we can rely on two things: the correctness of the Floyd-Warshall algorithm and the invariant that a_{i,j} = 2 holds before lines 7–9 if and only if there exists an indirect path from v_i to v_j. Hence, whenever an edge has an alternative path between its end nodes, a_{i,j} = 2 and the edge will be deleted.

For the correctness of the parallelised algorithm, we have to show that the steps performed in parallel are independent and do not interfere. For the parallel do-loop in lines 7–9 this is obvious, as each iteration reads from and writes to its individual memory location. Analogously, all iterations of the inner loop of the Floyd-Warshall algorithm in lines 4–6 write to different memory locations. Furthermore, it cannot happen that for a fixed k_0 the elements a_{i,k_0} and a_{k_0,j} read by iteration (i, j) are corrupted by the concurrent iterations (i, k_0) and (k_0, j), respectively: both check for a value ≥ 1 and write the value 2, which is again ≥ 1, so the outcome of the check does not change. Moreover, the GPU guarantees that all threads of the k-th iteration are completed before the (k+1)-th iteration starts.

Algorithm for weighted (cyclic) graphs

In order to obtain the TR of a weighted graph, according to the definition in Section “Background”, all edges (v_i, v_j) whose weight is strictly greater than the minimal transitive interaction uncertainty of v_i and v_j must be removed. Again, we use a variant of the Floyd-Warshall algorithm, whose pseudo-code description can be found in lines 3–7 of Algorithm 2. The weight matrix C is initialised with c_{i,j} = w(v_i, v_j) if the edge (v_i, v_j) exists, and c_{i,j} = ⊤ otherwise, where ⊤ stands for a value greater than every edge weight. Then, for increasing k, the entries c_{i,j}, c_{i,k}, and c_{k,j} contain the minimal transitive interaction uncertainty of all paths that use only intermediate nodes from {v_1, …, v_k}. If the path via v_k is strictly better, c_{i,j} is updated to it (lines 6–7). The figure below gives an example of this computation. Finally, the edges to be removed are deleted by setting c_{i,j} := ⊤ (lines 8–10), and the reduced matrix can be copied to and stored by the host (lines 11–12).

An example how the Floyd-Warshall algorithm calculates the minimal transitive interaction uncertainties

**An example how the Floyd-Warshall algorithm calculates the minimal transitive interaction uncertainties.** Consider order I of the nodes **(b)**: In the first iteration (k = 2), the path 1 →^{0.1} 2 →^{0.3} 3 with transitive interaction uncertainty, i.e., its maximal weight, 0.3 is found and recorded as 1 →^{0.3} 3. Furthermore, the path 1 →^{0.1} 2 →^{0.8} 4 is considered, but as its transitive interaction uncertainty (0.8) is worse than that of the direct edge 1 →^{0.5} 4, nothing changes. During the third iteration (k = 3), the paths 1 →^{0.3} 3 →^{0.2} 4 (which corresponds to the path 1 →^{0.1} 2 →^{0.3} 3 →^{0.2} 4) and 2 →^{0.3} 3 →^{0.2} 4 lead to the entries 1 →^{0.3} 4 and 2 →^{0.3} 4, respectively. In the last iteration nothing changes, and the result is shown in **(d)**. For order II of the nodes **(c)**, the algorithm works similarly. Note that the correctness of the algorithm does not depend on the concrete order of the outer iteration (k).

Algorithm 2.

**Pseudo-code description of parallelised transitive reduction of weighted, potentially cyclic, graphs with thresholds t_{low} and t_{up}**

1: read C

2: copy C to the GPU

3: **for** k := 1 to n **do sequentially**

4:  **for** all i, j ∈ {1, …, n} **do in parallel**

5:   m := max(|c_{i,k}|, |c_{k,j}|)

6:   **if** (c_{i,j} < 0 or c_{i,j} > t_{low}) and m < |c_{i,j}| **then**

7:    c_{i,j} := −m

8: **for** all i, j ∈ {1, …, n} **do in parallel**

9:  **if** c_{i,j} < 0 or c_{i,j} ≥ t_{up} **then**

10:  c_{i,j} := ⊤

11: copy C to the host

12: write C

However, two things are slightly more involved in the presented pseudo-code. First, to save memory, we use just one matrix C, in which the original weight c_{i,j} is overwritten whenever a better indirect path is found. Thus, some information necessary to check the removal condition in later iterations would be lost if entries were simply deleted. Therefore, a removed entry is negated instead: the negative sign marks it for deletion, while its absolute value |c_{i,j}| might be needed for later iterations. If, for example, in the second iteration the edge 3 →^{0.8} 4 was directly removed and not replaced by 3 →^{0.3} 4, the algorithm could not find in the next iteration that the path 1 →^{0.1} 3 →^{0.3} 4 has a better (transitive) interaction uncertainty, namely 0.3, than the direct edge 1 →^{0.5} 4.

The second aspect in the pseudo-code is the incorporation of thresholding. The upper threshold t_{up} is applied in line 9 (condition c_{i,j} ≥ t_{up}), ensuring that all edges with this or a bigger weight are always deleted. Due to the one-matrix representation, the lower threshold t_{low} cannot be applied in this post-processing fashion. Instead, every edge with this or a lower weight is skipped in line 6 (condition c_{i,j} > t_{low}) so that its original weight is preserved in c_{i,j}.
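A sequential sketch of Algorithm 2 including both thresholds can look as follows; ⊤ is modelled as infinity, and the marking by negation mirrors lines 5–10:

```python
TOP = float('inf')  # the "top" value, greater than every weight

def transitive_reduction_weighted(c, t_low, t_up):
    """Algorithm 2 on a weight matrix c (uncertainties; TOP = no edge).
    Entries replaced by a strictly better indirect path are negated so
    that their absolute value stays usable in later iterations."""
    n = len(c)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                m = max(abs(c[i][k]), abs(c[k][j]))
                # protected edges (weight <= t_low) are skipped
                if (c[i][j] < 0 or c[i][j] > t_low) and m < abs(c[i][j]):
                    c[i][j] = -m
    for i in range(n):
        for j in range(n):
            # delete marked entries and very uncertain edges
            if c[i][j] < 0 or c[i][j] >= t_up:
                c[i][j] = TOP
    return c

# edges 1->2 (0.1), 2->3 (0.3), 3->4 (0.2), 1->4 (0.5), 2->4 (0.8):
# the chain through 2 and 3 has uncertainty 0.3, so the direct
# edges 1->4 and 2->4 are removed
c = [[TOP, 0.1, TOP, 0.5],
     [TOP, TOP, 0.3, 0.8],
     [TOP, TOP, TOP, 0.2],
     [TOP, TOP, TOP, TOP]]
r = transitive_reduction_weighted(c, 0.0, 1.0)
assert r[0][3] == TOP and r[1][3] == TOP
assert r[0][1] == 0.1 and r[1][2] == 0.3 and r[2][3] == 0.2
```

With t_low = 0 and t_up = 1 no edge is protected or pre-deleted, so the sketch computes the plain weighted TR of Definition 3.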

Time complexity

The same as for Algorithm 1.

Correctness

We claim that the output of our algorithm is the thresholded weighted TR of the provided input, i.e., that in the end c_{i,j} = w(v_i, v_j) holds if (v_i, v_j) ∈ E^t, and c_{i,j} = ⊤ otherwise. Again, we prove the correctness of the sequential algorithm first, before we argue about the issues that come up with its parallelisation. Furthermore, we rely on the correctness of the Floyd-Warshall algorithm, i.e., that it correctly determines for each pair of nodes the minimum of the maximal weights of all paths between them.

First, we show that **all edges which are not in E^t are removed**.

One might wonder whether the protection of edges by the lower threshold affects the correctness of the Floyd-Warshall algorithm, since a protected edge keeps a weight that may be larger than the minimal transitive interaction uncertainty of its end nodes. Consider, for example, the situation depicted in the figure below with t_{low} = 0.5. The computation remains correct: the Floyd-Warshall algorithm considers all intermediate nodes, so the smaller uncertainty of the alternative path is still propagated via the entries of that path’s inner nodes, and the larger value kept in a protected entry can only make paths through that entry look worse, never better.

**Protecting edges by lower threshold t_{low} = 0.5 does not affect the correctness of the computation of G^t.**

It remains to show that **the edges that are in E^t are preserved** together with their original weights.

Finally, we have to argue that the **parallelisation of the algorithm** does not break its correctness. For the post-processing in lines 8–10 this is definitely the case, since every iteration reads only from and writes only to its individual memory location c_{i,j}. As already discussed for the algorithm for unweighted acyclic graphs, it is crucial that the outer loop of the Floyd-Warshall algorithm over k is executed sequentially. Within a fixed iteration k_0 of the outer loop, every iteration (i, j) writes only to its own memory location c_{i,j} and additionally reads only c_{i,k_0} and c_{k_0,j}. These entries cannot change during the k_0-th outer iteration: an update of c_{i,k_0} would require max(|c_{i,k_0}|, |c_{k_0,k_0}|) < |c_{i,k_0}|, which is impossible. The values of column k_0 and row k_0 therefore remain stable, and the GPU guarantees that all threads of one outer iteration are completed before the next one starts.

Results and discussion

We implemented our unweighted and weighted transitive reduction (TR) algorithms in the tools

Scalability experiments with

In this set of experiments we used networks of various sizes (1,000, 2,500, and 10,000 nodes), with and without weights, as inputs for the TR algorithms. The unweighted networks were generated using the Directed Scale Free Graph algorithm ^{a}. Using moderated

The goal of this set of experiments was to test the scalability of

For each graph size, we considered both acyclic unweighted and cyclic weighted graphs. We tested five implementations of TR algorithms. On acyclic unweighted graphs we applied Wagner’s algorithm

The results of the algorithms are summarized in Tables ^{b}. For each graph the absolute runtimes were measured five times. The results for all runs on all graphs of the same size and type were very similar (with a spread less than 5%) and the averages are shown in Tables

|  | **Unweighted** |  |  | **Weighted** |  |  |
|---|---|---|---|---|---|---|
| **size** | **1,000** | **2,500** | **10,000** | **1,000** | **2,500** | **10,000** |
| **W [sec]** | 2.14 | 34.33 | 2137.18 | NA | NA | NA |
| **STR [sec]** | 1.18 | 18.42 | 1186.39 | 7.52 | 120.21 | 10524.08 |
| **PTR [sec]** | 1.77 | 1.84 | 3.27 | 2.58 | 6.69 | 114.00 |
| **STR vs. W** | 1.82 | 1.86 | 1.80 | NA | NA | NA |
| **PTR vs. W** | 1.21 | 18.67 | 653.05 | NA | NA | NA |
| **PTR vs. STR** | 0.67 | 10.02 | 362.92 | 2.92 | 17.97 | 92.32 |

W = Wagner’s algorithm, STR = sequential TR, PTR = parallel TR

|  | **Unweighted** |  | **Weighted** |  |
|---|---|---|---|---|
| **size** | **1,000** | **2,500** | **1,000** | **2,500** |
| **W [sec]** | 2.46 | 38.93 | NA | NA |
| **STR [sec]** | 1.20 | 18.55 | 5.86 | 91.12 |
| **PTR [sec]** | 1.74 | 1.82 | 2.44 | 6.37 |
| **STR vs. W** | 2.05 | 2.10 | NA | NA |
| **PTR vs. W** | 1.41 | 21.37 | NA | NA |
| **PTR vs. STR** | 0.69 | 10.18 | 2.40 | 14.31 |

W = Wagner’s algorithm, STR = sequential TR, PTR = parallel TR

It is important to note that the structure and the density of an input graph have no significant effect on the performance of the TR algorithms. This can be concluded when comparing the results in the tables above with the additional experiments^{c}. In the case of dense Erdős-Rényi graphs, the average runtimes for the weighted graphs of size 1,000 and 2,500 nodes are 2.50 and 7.36 seconds, which are comparable with their counterparts in Table

Quality experiments with the

In recent years, the Dialogue of Reverse Engineering Assessments and Methods (DREAM)

In our evaluation with the

The network reconstruction was done in two steps, described in more detail in

The output files representing the reconstructed graphs were evaluated using the corresponding Matlab scripts provided by the

In the

The results of our experiments are given in Table

| **Network + reconstruction method** | **TP** | **TN** | **FP** | **FN** | **AUROC** | **AUPR** |
|---|---|---|---|---|---|---|
| **Network 1 (176 edges)** |  |  |  |  |  |  |
| -perturbation graph | 114 | 9467 | 257 | 62 | 0.8851 | 0.5138 |
| -perturbation graph + | 107 | 9532 | 192 | 69 | 0.8848 | 0.5082 |
| -perturbation graph + | 108 | 9547 | 177 | 68 | 0.8854 | 0.5366 |
| -perturbation graph + | 109 | 9582 | 142 | 67 | 0.8857 | 0.5475 |
| **Network 2 (249 edges)** |  |  |  |  |  |  |
| -perturbation graph | 106 | 9389 | 262 | 143 | 0.7877 | 0.3577 |
| -perturbation graph + | 98 | 9411 | 240 | 151 | 0.7871 | 0.3455 |
| -perturbation graph + | 92 | 9473 | 178 | 157 | 0.7874 | 0.3636 |
| -perturbation graph + | 85 | 9516 | 135 | 164 | 0.7874 | 0.3604 |
| **Network 3 (195 edges)** |  |  |  |  |  |  |
| -perturbation graph | 93 | 9446 | 259 | 102 | 0.8490 | 0.3353 |
| -perturbation graph + | 91 | 9451 | 254 | 104 | 0.8488 | 0.3313 |
| -perturbation graph + | 90 | 9543 | 162 | 105 | 0.8495 | 0.3574 |
| -perturbation graph + | 89 | 9563 | 142 | 106 | 0.8496 | 0.3673 |
| **Network 4 (211 edges)** |  |  |  |  |  |  |
| -perturbation graph | 112 | 9403 | 286 | 99 | 0.8474 | 0.3932 |
| -perturbation graph + | 111 | 9418 | 271 | 100 | 0.8474 | 0.3938 |
| -perturbation graph + | 101 | 9510 | 179 | 110 | 0.8478 | 0.4214 |
| -perturbation graph + | 96 | 9538 | 151 | 115 | 0.8477 | 0.4201 |
| **Network 5 (193 edges)** |  |  |  |  |  |  |
| -perturbation graph | 66 | 9230 | 477 | 127 | 0.7667 | 0.1580 |
| -perturbation graph + | 66 | 9230 | 477 | 127 | 0.7667 | 0.1580 |
| -perturbation graph + | 56 | 9409 | 298 | 137 | 0.7665 | 0.1653 |
| -perturbation graph + | 52 | 9495 | 212 | 141 | 0.7666 | 0.1661 |

TP = true positives, TN = true negatives, FP = false positives, FN = false negatives. AUROC = area under the receiver-operator characteristics curve and AUPR = area under the precision-recall curve are computed by the DREAM evaluation scripts.

AUPR (area under the precision-recall curve) and AUROC (area under the receiver-operator curve) are quite standard scoring metrics for binary classifiers, computed using the TP (true positives), TN (true negatives), FP (false positives), and FN (false negatives). For the definitions and a more detailed discussion on the scoring metrics for the

The overall score is computed from p_1 and p_2, the overall pAUPR and pAUROC values, respectively. The latter are obtained as geometric means of the individual pAUPR and pAUROC values of each of the five networks. Intuitively, the pAUPR and pAUROC are the probabilities that a given or better AUPR or AUROC value, respectively, is obtained by chance.

The results with

At first sight, the fact that omitting the signs causes virtually no loss of reconstruction quality may seem paradoxical. However, in many cases the relative loss of information by the omission of the edge signs is compensated by the lower threshold t_{low} in the TR algorithm. We illustrate this with the following example. Consider a subnetwork of three nodes

Conclusions

We presented parallel versions of algorithms for transitive reduction (TR) to reconstruct perturbation networks. The main improvement of our algorithms compared to the existing methods is the speed-up and scalability without loss of reconstruction quality. Moreover, our algorithms are applicable to both weighted and unweighted networks. The gain of the TR is significant since it mostly removes spurious direct interactions, which are overlooked by the first filtering step that produces the so-called perturbation graph.

We implemented our algorithms in the tool. Their worst-case time complexity is O(n³), where n is the number of nodes of the network.

Since the TR algorithms depend on the threshold, fine tuning of this parameter might require several experiments. The advantage of obtaining the results within seconds or, in the worst case, minutes instead of hours can then be very significant. Currently, it takes less than a couple of minutes to process with

Our weighted TR algorithm is independent of the nature of weights. Therefore, instead of correlations which were used for the

Finally, although our approach is already quite effective despite its simplicity, it is worth considering combining it with other reconstruction methods.

Availability and requirements

● **Project name:**

● **Project home page:**

● **Operating system(s):** Linux, Mac OS, Windows

● **Programming language:** C, CUDA

● **Other requirements:** CUDA

● **License:** none

● **Any restrictions to use by non-academics:** none

Endnotes

^{a} A special version of

^{b} The reason for this is that, unfortunately, generating graphs of size 10,000 with

^{c} These additional experimental results can be obtained at

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

DB, WL, and PH conceived the project. DB came up with the idea to use GPUs for TR. All authors contributed to the theoretical part of the paper. DB, MO, and AW drafted the manuscript. MO and AW implemented the TR algorithms, performed the experiments and evaluated them. WL, AW, MO and DB implemented part of the support programs. All authors read and approved the final manuscript.

Acknowledgements

We would like to thank Perry Moerland, Barbera van Schaik, Piet Molenaar, and Mark van den Brand for the inspiring discussions, Steffen Klamt for the feedback on

Partially supported by the project