Open Access Highly Accessed Research article

A survey of protein interaction data and multigenic inherited disorders

Antonio Mora1,2, Katerina Michalickova3 and Ian M Donaldson1,2*

Author affiliations

1 Department for Molecular Biosciences, University of Oslo, P.O. Box 1041 Blindern, 0316, Oslo, Norway

2 The Biotechnology Centre of Oslo, University of Oslo, P.O. Box 1125 Blindern, 0317, Oslo, Norway

3 Scientific Computing Group, University of Oslo, P.O. Box 1059 Blindern, Oslo, Norway

Citation and License

BMC Bioinformatics 2013, 14:47  doi:10.1186/1471-2105-14-47

Published: 11 February 2013

Abstract

Background

Multigenic diseases are often associated with protein complexes or with interactions involved in the same pathway. We wanted to estimate the extent to which this holds, given a consolidated protein interaction data set. The study stresses data integration and data representation issues.

Results

We constructed 497 multigenic disease groups from OMIM and tested them for overlaps with interaction and pathway data. A total of 159 disease groups had significant overlaps with protein interaction data consolidated by iRefIndex. A further 68 disease overlaps were found only in the KEGG pathway database. No single database contained all significant overlaps, which stresses the importance of data integration. We also found that disease groups overlapped with all three interaction data types (n-ary, spoke-represented complexes and binary data), which stresses the importance of considering each of these data types separately.
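As an illustration of what testing a disease group for a significant overlap can look like, the sketch below scores the overlap between one disease gene group and one protein complex with a one-sided hypergeometric test. This is an assumption made for illustration only: the abstract does not state which statistical test was used, and the accompanying script is written in R, not Python. The gene names, the universe size of 20000 and the function name `overlap_p_value` are all hypothetical.

```python
from math import comb


def overlap_p_value(universe_size, group_size, complex_size, overlap):
    """Probability of seeing at least `overlap` shared genes by chance.

    Assumes a hypergeometric model: `group_size` genes are drawn at random
    from a universe of `universe_size` genes, of which `complex_size`
    belong to the complex. This is an illustrative choice, not
    necessarily the test used in the paper.
    """
    max_overlap = min(group_size, complex_size)
    total = comb(universe_size, group_size)
    tail = sum(
        comb(complex_size, k) * comb(universe_size - complex_size, group_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical toy data, not taken from OMIM, iRefIndex or KEGG.
disease_group = {"GENE_A", "GENE_B", "GENE_C"}
protein_complex = {"GENE_A", "GENE_B", "GENE_D", "GENE_E"}
universe_size = 20000  # assumed number of genes in the background set

shared = disease_group & protein_complex
p = overlap_p_value(
    universe_size, len(disease_group), len(protein_complex), len(shared)
)
print(sorted(shared), p)
```

In a survey over hundreds of disease groups and many complexes, the resulting p-values would also need a multiple-testing correction before any overlap is called significant.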

Conclusions

Almost half of our multigenic disease groups could potentially be explained by protein complexes and pathways. However, no single database or data type covered all disease groups, which suggests that none has systematically surveyed the disease groups for potentially related complex and pathway data. This survey provides a basis for further curation efforts to confirm and search for overlaps between diseases and interaction data. The accompanying R script can be used to reproduce the work and to track progress in this area as databases change. Disease group overlaps can be further explored using the iRefscape plugin for Cytoscape.