Email updates

Keep up to date with the latest news and content from BMC Systems Biology and BioMed Central.

Open Access Highly Accessed Methodology article

A new essential protein discovery method based on the integration of protein-protein interaction and gene expression data

Min Li12*, Hanhui Zhang1, Jian-xin Wang1* and Yi Pan12*

Author Affiliations

1 School of Information Science and Engineering, Central South University, Changsha, Hunan 410083, P. R. China

2 Department of Computer Science, Georgia State University, Atlanta, GA 30302-4110, USA

For all author emails, please log on.

BMC Systems Biology 2012, 6:15  doi:10.1186/1752-0509-6-15

Published: 10 March 2012



Identification of essential proteins is always a challenging task since it requires experimental approaches that are time-consuming and laborious. With the advances in high throughput technologies, a large number of protein-protein interactions are available, which have produced unprecedented opportunities for detecting proteins' essentialities from the network level. There have been a series of computational approaches proposed for predicting essential proteins based on network topologies. However, the network topology-based centrality measures are very sensitive to the robustness of network. Therefore, a new robust essential protein discovery method would be of great value.


In this paper, we propose a new centrality measure, named PeC, based on the integration of protein-protein interaction and gene expression data. The performance of PeC is validated based on the protein-protein interaction network of Saccharomyces cerevisiae. The experimental results show that the predicted precision of PeC clearly exceeds that of the other fifteen previously proposed centrality measures: Degree Centrality (DC), Betweenness Centrality (BC), Closeness Centrality (CC), Subgraph Centrality (SC), Eigenvector Centrality (EC), Information Centrality (IC), Bottle Neck (BN), Density of Maximum Neighborhood Component (DMNC), Local Average Connectivity-based method (LAC), Sum of ECC (SoECC), Range-Limited Centrality (RL), L-index (LI), Leader Rank (LR), Normalized α-Centrality (NC), and Moduland-Centrality (MC). Especially, the improvement of PeC over the classic centrality measures (BC, CC, SC, EC, and BN) is more than 50% when predicting no more than 500 proteins.


We demonstrate that the integration of protein-protein interaction network and gene expression data can help improve the precision of predicting essential proteins. The new centrality measure, PeC, is an effective essential protein discovery method.