Open Access Methodology article

Novel semantic similarity measure improves an integrative approach to predicting gene functional associations

Fatemeh Vafaee1*, Daniela Rosu3, Fiona Broackes-Carter1 and Igor Jurisica1,2,3,4*

Author affiliations

1 Ontario Cancer Institute and Campbell Family Cancer Research Institute, Princess Margaret Cancer Centre, University Health Network, Toronto, Canada

2 Department of Medical Biophysics, University of Toronto, Toronto, Canada

3 Department of Computer Science, University of Toronto, Toronto, Canada

4 Techna Institute, University Health Network, Toronto, Canada

Citation and License

BMC Systems Biology 2013, 7:22 doi:10.1186/1752-0509-7-22

Published: 14 March 2013

Abstract

Background

Elucidating direct and indirect protein interactions and gene associations is required to fully understand the workings of the cell. This can be achieved through both low- and high-throughput biological experiments and in silico methods. We present GAP (Gene functional Association Predictor), an integrative method for predicting and characterizing gene functional associations. GAP integrates different biological features using a novel taxonomy-based semantic similarity measure to predict and prioritize high-quality putative gene associations. The proposed similarity measure increases the information gain from the available gene annotations. Annotation information is drawn from several public pathway databases, Gene Ontology annotations, and drug and disease associations from the scientific literature.

Results

We evaluated GAP by comparing its prediction performance with that of several well-known functional interaction prediction tools on a comprehensive dataset of known direct and indirect interactions, and observed significantly better prediction performance. We also selected a small set of GAP’s high-scoring novel predicted pairs (i.e., pairs not currently found in any known database or dataset) and manually searched the literature for publicly accessible experimental evidence; this confirmed different categories of predicted functional associations with available evidence of interaction. We also provide additional supporting evidence for a subset of the predicted functionally associated pairs using an expert-curated database of genes associated with autism spectrum disorders.

Conclusions

GAP’s predicted “functional interactome” contains ≈1M high-scoring predicted functional associations, of which about 90% are novel (i.e., not experimentally validated). GAP’s novel predictions connect disconnected components and singletons to the main connected component of the known interactome. GAP can therefore be a valuable resource for biologists, providing corroborating evidence for potential direct or indirect interactions and helping to prioritize them for experimental validation. GAP is freely accessible through a web portal: http://ophid.utoronto.ca/gap.

Keywords:
Gene functional association prediction; Protein interaction prediction; Functional interactome; Gene annotation; Semantic similarity measure; Systems biology