  • Oral presentation
  • Open access

Mining expression-dependent modules in the human interaction network

Motivation

We propose a novel method for automatic module extraction from protein-protein interaction networks. While most previous approaches to module discovery are based on graph partitioning [1], our algorithm efficiently enumerates all densely connected modules in the network. Because currently available interaction data are incomplete, this is a meaningful generalization of clique search techniques [2]. Compared with partitioning methods, the approach has two advantages. First, the user can specify a minimum density for the resulting modules and is guaranteed that all modules satisfying this criterion are discovered. Second, it provides a natural way to detect overlapping modules. Many proteins are not constantly present in the cell but are expressed specifically depending on cell type, environmental conditions, and developmental state. We therefore introduce an additional constraint for modules that accounts for differential expression.

Results

We analysed human interaction data from MINT, IntAct, HPRD, and DIP in the context of the human tissue-specific gene expression data provided by Su et al. [3]. We discretized the expression information into binary states (expressed versus not expressed) and searched for densely connected modules in which all proteins are expressed in at least 3 tissues and all proteins are not expressed in at least 10 tissues. Because protein interaction data contain a high number of false positives, we computed a reliability score for each experimental source. Following Jansen et al. [4], we used a gold standard set of known interactions and a gold standard set of false interactions to calculate the likelihood ratio, which was used to assign edge weights to the interaction graph. The density of a module is defined as the sum of the edge weights inside the module divided by the maximal possible weight sum for a module of that size.
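
The density definition and the expression constraint can be illustrated with a short Python sketch. This is not the authors' implementation. The function names, the edge representation (a dictionary keyed by unordered protein pairs), and the assumption that edge weights are scaled so that no edge exceeds `max_weight` are hypothetical. The sketch also adopts one possible reading of the expression constraint: there are at least 3 tissues in which every protein of the module is expressed and at least 10 tissues in which none of them is expressed.

```python
from itertools import combinations


def module_density(module, weights, max_weight=1.0):
    """Sum of edge weights inside the module divided by the maximal
    possible weight sum for a module of that size.

    module  -- iterable of protein identifiers
    weights -- dict mapping frozenset({u, v}) to an edge weight
               (hypothetical representation; missing pairs count as 0)
    max_weight -- assumed upper bound on a single edge weight
    """
    nodes = sorted(set(module))
    n = len(nodes)
    if n < 2:
        return 0.0
    inside = sum(weights.get(frozenset(pair), 0.0)
                 for pair in combinations(nodes, 2))
    max_total = max_weight * n * (n - 1) / 2  # weight sum of a full clique
    return inside / max_total


def satisfies_expression_constraint(module, expressed, tissues,
                                    min_on=3, min_off=10):
    """Check the binary expression constraint under the reading stated above.

    expressed -- dict mapping each protein to the set of tissues in which
                 it is expressed (binary discretization)
    tissues   -- set of all tissues in the expression data set
    """
    profiles = [expressed[p] & tissues for p in module]
    on_everywhere = set.intersection(*profiles)    # all proteins expressed
    off_everywhere = tissues - set.union(*profiles)  # no protein expressed
    return len(on_everywhere) >= min_on and len(off_everywhere) >= min_off
```

With `max_weight=1.0`, a three-protein module with internal edge weights 1.0 and 0.5 and one missing edge has density 1.5 / 3 = 0.5, so it would pass a 35% threshold but not a 60% one.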

Setting the minimum density threshold to 35% and removing modules that are entirely contained in other modules, we obtained a set of 949 differentially expressed modules. They were ranked in descending order of average weight per node (see [5]), so that larger and denser modules appear first. On the one hand, we discovered known complexes as well as modules that link strongly cooperating complexes such as MCM and ORC. On the other hand, we found extensions of known complexes that confirm hypothetical functional annotations in UniProt, as well as modules that are not contained in the manually curated set of known complexes but share the same functional annotation. Finally, some modules contain proteins with unknown functional relationships and are candidates for further biological investigation.
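
The two post-processing steps described above, removing contained modules and ranking by average weight per node, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names and the edge representation (a dictionary keyed by `frozenset` pairs) are hypothetical, and "average weight per node" is taken here to mean the sum of the edge weights inside a module divided by its number of proteins.

```python
def remove_contained(modules):
    """Keep only modules that are not proper subsets of another module.

    Duplicates are collapsed; the result is a list of frozensets in
    arbitrary order.
    """
    unique = {frozenset(m) for m in modules}
    return [m for m in unique if not any(m < other for other in unique)]


def average_weight_per_node(module, weights):
    """Sum of edge weights inside the module divided by its node count
    (assumed interpretation of the ranking score)."""
    members = frozenset(module)
    inside = sum(w for edge, w in weights.items() if edge <= members)
    return inside / len(members)


def rank_modules(modules, weights):
    """Sort modules in descending order of average weight per node."""
    return sorted(modules,
                  key=lambda m: average_weight_per_node(m, weights),
                  reverse=True)
```

Under this score, a fully connected module of n proteins with unit weights scores (n - 1) / 2, which grows with n. This is consistent with the observation that larger and denser modules appear first.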

Conclusion

We developed a general method for exhaustive dense module extraction from networks. Remarkably, it allows exact P-values to be determined for the predicted modules without relying on any network model, and it can easily integrate information from heterogeneous data sources.

References

  1. Newman MEJ: Modularity and community structure in networks. PNAS 2006, 103(23):8577–8582. 10.1073/pnas.0601602103

  2. Palla G, Derenyi I, Farkas I, Vicsek T: Uncovering the overlapping community structure of complex networks in nature and society. Nature 2005, 435: 814–818. 10.1038/nature03607

  3. Su AI, Wiltshire T, Batalov S, Lapp H, Ching KA, Block D, Zhang J, Soden R, Hayakawa M, Kreiman G, Cooke MP, Walker JR, Hogenesch JB: A gene atlas of the mouse and human protein-encoding transcriptomes. PNAS 2004, 101(16):6062–6067. 10.1073/pnas.0400782101

  4. Jansen R, Yu H, Greenbaum D, Kluger Y, Krogan NJ, Chung S, Emili A, Snyder M, Greenblatt JF, Gerstein M: A Bayesian networks approach for predicting protein-protein interactions from genomic data. Science 2003, 302: 449–453. 10.1126/science.1087361

  5. Bader GD, Hogue CW: An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinformatics 2003, 4: 2. 10.1186/1471-2105-4-2

Acknowledgements

We are grateful to Andreas Rüpp for providing a curated set of known human complexes and to Gunnar Rätsch for his encouragement and support.

Author information

Corresponding author

Correspondence to Elisabeth Georgii.

Rights and permissions

Open Access This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Georgii, E., Dietmann, S., Uno, T. et al. Mining expression-dependent modules in the human interaction network. BMC Bioinformatics 8 (Suppl 8), S4 (2007). https://doi.org/10.1186/1471-2105-8-S8-S4

  • DOI: https://doi.org/10.1186/1471-2105-8-S8-S4