Steiner tree methods for optimal sub-network identification: an empirical study
Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms Universitat Bonn, Dahlmannstr 2, 53113 Bonn, Germany
BMC Bioinformatics 2013, 14:144 doi:10.1186/1471-2105-14-144Published: 30 April 2013
Analysis and interpretation of biological networks is one of the primary goals of systems biology. In this context identification of sub-networks connecting sets of seed proteins or seed genes plays a crucial role. Given that no natural node and edge weighting scheme is available retrieval of a minimum size sub-graph leads to the classical Steiner tree problem, which is known to be NP-complete. Many approximate solutions have been published and theoretically analyzed in the computer science literature, but far less is known about their practical performance in the bioinformatics field.
Here we conducted a systematic simulation study of four different approximate and one exact algorithms on a large human protein-protein interaction network with ~14,000 nodes and ~400,000 edges. Moreover, we devised an own algorithm to retrieve a sub-graph of merged Steiner trees. The application of our algorithms was demonstrated for two breast cancer signatures and a sub-network playing a role in male pattern baldness.
We found a modified version of the shortest paths based approximation algorithm by Takahashi and Matsuyama to lead to accurate solutions, while at the same time being several orders of magnitude faster than the exact approach. Our devised algorithm for merged Steiner trees, which is a further development of the Takahashi and Matsuyama algorithm, proved to be useful for small seed lists. All our implemented methods are available in the R-package SteinerNet on CRAN (http://www.r-project.org webcite) and as a supplement to this paper.