A kinetic model of the evolution of a protein interaction network
BMC Genomics 2013, 14:172 doi:10.1186/1471-2164-14-172
Published: 14 March 2013
Known protein interaction networks have very particular properties. Old proteins tend to have more interactions than new ones. One of the best statistical descriptors of these properties is the node degree distribution (the distribution of proteins by their number of interactions). It has previously been shown that this distribution is very close to the sum of two distinct exponential components. In this paper, we asked: what are the possible mechanisms of evolution for networks of this type? To answer this question, we tested a simplified kinetic model of the evolution of a protein interactome. The proposed model considers the emergence of new genes and interactions and the loss of old ones. We assumed that there are generally two coexisting classes of proteins. Proteins of the first class are essential only for ecological adaptations and are easily lost when ecological conditions change. Proteins of the second class are essential for basic life processes and, hence, are always effectively protected against deletion. Proteins can move between the two classes in either direction. We also assumed that gene duplication is always related to ecological adaptation and that the new copy of a duplicated gene is not essential. According to this model, all proteins gain new interactions at a rate that increases with the number of interactions they already have (the rich get richer). Proteins can also gain interactions through duplication. Proteins lose their interactions both with and without the loss of partner genes.
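The processes listed above can be illustrated with a small stochastic toy simulation. This is only a sketch under stated assumptions, not the kinetic model analysed in the paper: the function name `simulate_toy_network`, the event probabilities, and the "degree + 1" attachment weight are all hypothetical choices made for illustration, and the loss of single interactions without gene loss is omitted for brevity.

```python
import random
from collections import Counter


def simulate_toy_network(steps=3000, p_new_gene=0.3, p_loss=0.1,
                         p_switch=0.05, seed=1):
    """Toy growth process; every probability here is an illustrative assumption."""
    rng = random.Random(seed)
    # Adjacency as node -> set of neighbours; start with two linked proteins.
    adj = {0: {1}, 1: {0}}
    # True = protected (basic life processes), False = adaptive (easily lost).
    core = {0: True, 1: True}
    next_id = 2
    for _ in range(steps):
        r = rng.random()
        nodes = list(adj)
        if r < p_new_gene:
            # New gene (e.g. a duplicate): enters the adaptive class and links
            # to a partner chosen with weight proportional to degree + 1.
            weights = [len(adj[n]) + 1 for n in nodes]
            partner = rng.choices(nodes, weights=weights)[0]
            adj[next_id] = {partner}
            adj[partner].add(next_id)
            core[next_id] = False
            next_id += 1
        elif r < p_new_gene + p_loss:
            # Gene loss: only adaptive-class proteins can be deleted,
            # and their interactions disappear with them.
            adaptive = [n for n in nodes if not core[n]]
            if adaptive and len(nodes) > 2:
                victim = rng.choice(adaptive)
                for neighbour in adj.pop(victim):
                    adj[neighbour].discard(victim)
                del core[victim]
        elif r < p_new_gene + p_loss + p_switch:
            # Transition between the two classes, in either direction.
            n = rng.choice(nodes)
            core[n] = not core[n]
        else:
            # New interaction between existing proteins ("rich get richer").
            weights = [len(adj[n]) + 1 for n in nodes]
            a, b = rng.choices(nodes, weights=weights, k=2)
            if a != b:
                adj[a].add(b)
                adj[b].add(a)
    return adj, core


def degree_histogram(adj):
    """Count how many proteins have each number of interactions."""
    return Counter(len(neighbours) for neighbours in adj.values())
```

Calling `degree_histogram(simulate_toy_network()[0])` gives the node degree distribution of one simulated network. Because protected proteins are never deleted in this toy, they have more time to accumulate interactions, which is the qualitative effect the model description relies on.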
The proposed model reproduces the main properties of protein-protein interaction networks very well. The oldest part of the interaction network is the most densely connected, and the node degree distribution follows the sum of two shifted power-law functions, which is a theoretical generalization of the previous finding. This distribution fits a wide range of node degrees very well, much better than a power law or a generalized power law supplemented with an exponential cut-off. The presented model also relates the total number of interactome links to the total number of interacting proteins. The theoretical results were verified for the interactomes of A. thaliana, B. taurus, C. elegans, D. melanogaster, E. coli, H. pylori, H. sapiens, M. musculus, R. norvegicus and S. cerevisiae.
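The phrase "sum of two shifted power-law functions" can be made concrete with a short sketch. The parameterisation below, `weight * (1 + k/shift)**(-exponent)`, is an assumption chosen for illustration (it is a power law in `k + shift`); the abstract does not give the exact functional form or parameter names used in the paper.

```python
import math


def shifted_power_law(k, weight, shift, exponent):
    """weight * (1 + k/shift)**(-exponent), i.e. a power law in (k + shift)."""
    return weight * (1.0 + k / shift) ** (-exponent)


def two_component_distribution(k, comp_a, comp_b):
    """Sum of two shifted power laws; each comp is (weight, shift, exponent)."""
    return shifted_power_law(k, *comp_a) + shifted_power_law(k, *comp_b)


def normalised_pmf(k_max, comp_a, comp_b):
    """Normalise the two-component form over node degrees 1..k_max."""
    raw = [two_component_distribution(k, comp_a, comp_b)
           for k in range(1, k_max + 1)]
    total = sum(raw)
    return [value / total for value in raw]


def exponential_limit_gap(k, kappa, exponent):
    """Gap between exp(-k/kappa) and a shifted power law with shift = exponent * kappa."""
    approx = shifted_power_law(k, 1.0, exponent * kappa, exponent)
    return abs(approx - math.exp(-k / kappa))
```

One way to read "generalization of the previous finding" under this parameterisation: when the exponent grows while `shift = exponent * kappa` is held to a fixed `kappa`, each component tends to `exp(-k/kappa)`, so a sum of two exponentials is recovered as a limiting case. `exponential_limit_gap` lets a reader check this numerically for chosen values.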
Using these approaches, the kinetic parameters could be estimated. Finally, the model revealed the evolutionary kinetics of proteome formation, the phenomenon of protein differentiation and the process of gaining new interactions.