This article is part of the supplement: Twelfth International Conference on Bioinformatics (InCoB2013): Computational Biology

Open Access Research

PLW: Probabilistic Local Walks for detecting protein complexes from protein interaction networks

Daniel Lin-Kit Wong1*, Xiao-Li Li1*, Min Wu1, Jie Zheng2,3 and See-Kiong Ng1

Author Affiliations

1 Data Analytics Department, Institute for Infocomm Research, Agency for Science, Technology and Research (A*STAR), Singapore 138632, Singapore

2 School of Computer Engineering, Nanyang Technological University, Singapore 639798, Singapore

3 Genome Institute of Singapore, Agency for Science, Technology and Research (A*STAR), Singapore 138672, Singapore

BMC Genomics 2013, 14(Suppl 5):S15  doi:10.1186/1471-2164-14-S5-S15

Published: 16 October 2013

Abstract

Background

Many biological processes are carried out by proteins interacting with each other in the form of protein complexes. However, large-scale detection of protein complexes has remained constrained by experimental limitations. As such, computational detection of protein complexes by applying clustering algorithms to the abundantly available protein-protein interaction (PPI) networks is an important alternative. Yet many current algorithms have overlooked the importance of seed selection: seeds should expand into clusters that neither exclude important proteins nor include many noisy ones, while ensuring a high degree of functional homogeneity amongst the proteins detected for the complexes.

Results

We designed a novel method called Probabilistic Local Walks (PLW), which clusters regions in a PPI network with high functional similarity to find protein complex cores with high precision and efficiency in O(|V| log |V| + |E|) time. A seed selection strategy, which prioritises seeds with dense neighbourhoods, was devised. We defined a topological measure, called common neighbour similarity, to estimate the functional similarity of two proteins given the number of their common neighbours.
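
As a rough illustration of the two ideas above, the following Python sketch scores a pair of proteins by their shared neighbours and ranks candidate seeds by how densely connected their neighbourhoods are. The abstract does not give the exact formulas, so everything here is an assumption: a Jaccard-style ratio stands in for common neighbour similarity, the fraction of realised edges among a protein's neighbours stands in for neighbourhood density, and all function names are hypothetical. It is not the PLW implementation.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {protein: set of neighbours}."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-interactions
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def common_neighbour_similarity(adj, u, v):
    """Assumed stand-in: shared neighbours divided by all neighbours of u and v.

    The paper's own definition may differ; this is a Jaccard-style ratio.
    """
    nu = adj.get(u, set()) - {v}
    nv = adj.get(v, set()) - {u}
    union = nu | nv
    return len(nu & nv) / len(union) if union else 0.0


def neighbourhood_density(adj, u):
    """Assumed stand-in: fraction of possible edges present among u's neighbours."""
    nbrs = adj.get(u, set())
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return links / (k * (k - 1) / 2)


def rank_seeds(adj):
    """Order proteins so that those with denser neighbourhoods come first.

    Ties are broken by higher degree, then by name for determinism.
    """
    return sorted(
        adj,
        key=lambda u: (-neighbourhood_density(adj, u), -len(adj[u]), u),
    )


# Toy network: a triangle A-B-C with a pendant protein D attached to C.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
adj = build_adjacency(edges)
seeds = rank_seeds(adj)
```

In the toy network, A and B sit inside the triangle and so rank ahead of C, whose neighbourhood is diluted by the pendant protein D.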

Conclusions

Our proposed PLW algorithm achieved the highest F-measure (which combines recall and precision) when compared to 11 state-of-the-art methods on yeast protein interaction data, with an improvement of 16.7% over the next highest score. Our experiments also demonstrated that our seed selection strategy is able to increase algorithm precision when applied to three previous protein complex mining techniques.

Availability

The software, datasets and predicted complexes are available at http://wonglkd.github.io/PLW