Detection of the dominant direction of information flow and feedback links in densely interconnected regulatory networks
1 Ariadne Inc., 9430 Key West Ave. Suite 113 Rockville, MD 20850, USA
2 Departamento de Fisica, Universidad de Santiago de Chile, Casilla 302, Correo 2, Santiago, Chile
3 Department of Condensed Matter Physics and Materials Science, Brookhaven National Laboratory, Upton, New York 11973, USA
BMC Bioinformatics 2008, 9:424 doi:10.1186/1471-2105-9-424Published: 8 October 2008
Finding the dominant direction of flow of information in densely interconnected regulatory or signaling networks is required in many applications in computational biology and neuroscience. This is achieved by first identifying and removing links which close up feedback loops in the original network and hierarchically arranging nodes in the remaining network. In mathematical language this corresponds to a problem of making a graph acyclic by removing as few links as possible and thus altering the original graph in the least possible way. The exact solution of this problem requires enumeration of all cycles and combinations of removed links, which, as an NP-hard problem, is computationally prohibitive even for modest-size networks.
We introduce and compare two approximate numerical algorithms for solving this problem: the probabilistic one based on a simulated annealing of the hierarchical layout of the network which minimizes the number of "backward" links going from lower to higher hierarchical levels, and the deterministic, "greedy" algorithm that sequentially cuts the links that participate in the largest number of feedback cycles. We find that the annealing algorithm outperforms the deterministic one in terms of speed, memory requirement, and the actual number of removed links. To further improve a visual perception of the layout produced by the annealing algorithm, we perform an additional minimization of the length of hierarchical links while keeping the number of anti-hierarchical links at their minimum. The annealing algorithm is then tested on several examples of regulatory and signaling networks/pathways operating in human cells.
The proposed annealing algorithm is powerful enough to performs often optimal layouts of protein networks in whole organisms, consisting of around ~104 nodes and ~105 links, while the applicability of the greedy algorithm is limited to individual pathways with ~100 vertices. The considered examples indicate that the annealing algorithm produce biologically meaningful layouts: The function of the most of the anti-hierarchical links is indeed to send a feedback signal to the upstream pathway elements. Source codes of F90 and Matlab implementation of the two algorithms are available at http://www.cmth.bnl.gov/~maslov/programs.htm webcite