A literature-based similarity metric for biological processes
1 Biocomputing Unit. Centro Nacional de Biotecnologia – CSIC, Madrid, Spain
2 Dpto. Arquitectura de Computadores y Automatica. Universidad Complutense de Madrid, Madrid, Spain
3 Dpto. Microbiologia II. Facultad de Farmacia. Universidad Complutense de Madrid, Madrid, Spain
4 Unidad de Proteomica UCM – Parque Cientifico de Madrid, Madrid, Spain
BMC Bioinformatics 2006, 7:363 doi:10.1186/1471-2105-7-363
Published: 26 July 2006
Recent analyses in systems biology pursue the discovery of functional modules within the cell. Recognition of such modules requires the integrative analysis of genome-wide experimental data together with available functional schemes. This in turn calls for methods that bridge the gap between the abstract definitions of cellular processes in current schemes and the interlinked nature of biological networks.
This work explores the use of the scientific literature to establish potential relationships among cellular processes. To this end we have used a document-based similarity method to compute pairwise similarities of the biological processes described in the Gene Ontology (GO). The method has been applied to the biological processes annotated for the Saccharomyces cerevisiae genome. We compared our results with similarities obtained with two ontology-based metrics, as well as with gene product annotation relationships. We show that the literature-based metric conserves most direct ontological relationships, while revealing biologically sound similarities that are not obtained using ontology-based metrics and/or genome annotation.
The scientific literature is a valuable source of information from which to compute similarities among biological processes. The associations discovered by literature analysis are a valuable complement to those encoded in existing functional schemes and to those that arise from genome annotation. These similarities can be used to conveniently map the interlinked structure of cellular processes in a particular organism.
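
As an illustration of what a document-based similarity between GO biological processes might look like, the following minimal Python sketch computes pairwise cosine similarities between term-frequency vectors. It is not the method used in this work: the abstract does not specify the text representation or weighting scheme, so the cosine measure, the raw term counts, and the toy texts assigned to each process are all assumptions made for illustration only.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical toy corpus (assumption): each GO biological process is
# represented by the concatenated text of documents associated with it.
docs = {
    "translation": "ribosome mrna trna translation elongation ribosome",
    "ribosome biogenesis": "ribosome rrna nucleolus assembly ribosome processing",
    "DNA replication": "dna polymerase replication origin helicase",
}


def tf_vector(text):
    """Return a bag-of-words term-frequency vector for a text."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# One vector per process, then one similarity per unordered pair of processes.
vectors = {name: tf_vector(text) for name, text in docs.items()}
similarities = {
    (a, b): cosine(vectors[a], vectors[b]) for a, b in combinations(vectors, 2)
}

# Print the pairs from most to least similar.
for (a, b), score in sorted(similarities.items(), key=lambda kv: -kv[1]):
    print(f"{score:.3f}  {a}  <->  {b}")
```

In this toy setting the two ribosome-related processes share vocabulary and therefore score higher than either does against DNA replication. A realistic pipeline would need a real document collection per process and, most likely, a weighting scheme that discounts terms common to all documents.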