CASCADE: a novel quasi all paths-based network analysis algorithm for clustering biological interactions
1 Department of Computer Science and Engineering, State University of New York, Buffalo, NY 14260, USA
2 Department of Pharmaceutical Sciences, State University of New York, Buffalo, NY 14260, USA
BMC Bioinformatics 2008, 9:64 doi:10.1186/1471-2105-9-64
Published: 29 January 2008
Quantitative characterization of the topology of protein-protein interaction (PPI) networks can help elucidate biological functional modules. Here, we present a novel clustering methodology for PPI networks in which the biological and topological influence of each protein on other proteins is modeled using the probability that the series of interactions necessary to link a pair of distant proteins in the network occurs within a time constant (the occurrence probability).
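To make the occurrence probability concrete, the following is a minimal sketch under an assumption that the abstract does not state: each interaction along a path is treated as an independent exponential waiting time with a common rate, so that the time for a series of interactions follows an Erlang distribution. The function name `occurrence_probability` and the parameters `rate` and `time_constant` are hypothetical and chosen only for illustration.

```python
import math


def occurrence_probability(path_length, rate=1.0, time_constant=1.0):
    """Probability that `path_length` sequential interactions all occur
    within `time_constant`.

    Assumption (not taken from the abstract): each interaction is an
    independent exponential waiting time with the given `rate`, so the
    total time is Erlang distributed and this is the Erlang CDF.
    """
    if path_length < 0:
        raise ValueError("path_length must be non-negative")
    if path_length == 0:
        return 1.0  # a protein trivially reaches itself
    x = rate * time_constant
    # Erlang CDF: 1 - sum_{n=0}^{k-1} exp(-x) * x^n / n!
    tail = sum(math.exp(-x) * x ** n / math.factorial(n)
               for n in range(path_length))
    return 1.0 - tail


if __name__ == "__main__":
    # Influence decays as the number of interactions on the path grows.
    for k in range(1, 5):
        print(k, round(occurrence_probability(k), 4))
```

Under this assumption the probability falls monotonically with the number of interactions on the path, which matches the intuition that distant proteins influence each other less than direct neighbors.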
CASCADE selects representative nodes for each cluster and iteratively refines the clusters based on a combination of the occurrence probability and the graph topology between every protein pair. We compare CASCADE with nine competing approaches, assessing the clusters obtained by each technique for enrichment of biological function. On the yeast PPI network dataset, CASCADE generates larger clusters, and the clusters it identifies have p-values for biological function that are approximately 1000-fold better than those of the other methods. An important strength of CASCADE is that the percentage of proteins discarded to create clusters is much lower than for the other approaches, which have an average discard rate of 45% on the yeast PPI network.
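As an illustration of how a cluster can be scored for enrichment of biological function, the sketch below computes a hypergeometric tail p-value. This test is a common choice for functional enrichment; the abstract does not say which test was used, so its use here is an assumption, and the function and parameter names are hypothetical.

```python
from math import comb


def enrichment_p_value(population, annotated, cluster_size, hits):
    """Hypergeometric tail probability of seeing at least `hits` proteins
    with a given functional annotation in a cluster of `cluster_size`
    proteins, when `annotated` of the `population` proteins in the whole
    network carry that annotation.

    Assumption: a hypergeometric test is used; the abstract does not
    specify the enrichment statistic.
    """
    upper = min(annotated, cluster_size)
    total = comb(population, cluster_size)
    # Sum the probabilities of observing hits, hits + 1, ..., upper.
    tail = sum(comb(annotated, i)
               * comb(population - annotated, cluster_size - i)
               for i in range(hits, upper + 1))
    return tail / total


if __name__ == "__main__":
    # Toy numbers, for illustration only (not data from the paper).
    print(enrichment_p_value(population=1000, annotated=50,
                             cluster_size=20, hits=8))
```

A smaller p-value indicates that the shared annotation is less likely to have arisen by chance, which is the sense in which one method's p-values are "better" than another's.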
CASCADE is effective at detecting biologically relevant clusters of interactions.