This article is part of the supplement: Proceedings of the Sixth Annual MCBIOS Conference. Transformational Bioinformatics: Delivering Value from Genomes

Open Access Proceedings

PathBinder – text empirics and automatic extraction of biomolecular interactions

Lifeng Zhang1, Daniel Berleant2*, Jing Ding3, Tuan Cao1 and Eve Syrkin Wurtele1

Author Affiliations

1 Iowa State University, Ames, Iowa, USA

2 University of Arkansas at Little Rock, Little Rock, Arkansas, USA

3 Ohio State University Medical Center, Columbus, Ohio, USA

BMC Bioinformatics 2009, 10(Suppl 11):S18. doi:10.1186/1471-2105-10-S11-S18

Published: 8 October 2009



The growing volume of free online biological text makes automatic extraction of interactions increasingly attractive. Machine learning is one strategy; it works by uncovering and exploiting useful properties that are implicit in the text. However, these properties are usually not reported explicitly in the literature. In this paper we investigate specific properties of biological text passages in order to facilitate an alternative strategy, text empirics, for mining biomedical texts for biomolecular interactions. We report on our application of this approach and present empirical findings about an important class of passages, which may be useful to others who wish to use the empirical properties we describe.


We manually analyzed syntactic and semantic properties of sentences likely to describe interactions between biomolecules. The resulting empirical data were used to design an algorithm for the PathBinder system to extract biomolecular interactions from texts. PathBinder searches PubMed for sentences describing interactions between two given biomolecules. PathBinder then uses probabilistic methods to combine evidence from multiple relevant sentences in PubMed to assess the relative likelihood of interaction between two arbitrary biomolecules. A biomolecular interaction network was constructed based on those likelihoods.
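The evidence-combination step can be illustrated with a minimal sketch. The abstract does not state which probabilistic method PathBinder uses, so the noisy-OR rule below is an assumption chosen for illustration, as are the function names, the threshold, and the placeholder molecule labels and probabilities. It shows one plausible way to turn per-sentence probabilities into a pair-level score and then into a network.

```python
from math import prod


def combine_evidence(sentence_probs):
    """Noisy-OR combination (an assumed rule, not necessarily PathBinder's).

    Each value is the estimated probability that one sentence truly
    describes an interaction. Treating sentences as independent, the
    result is the probability that at least one of them does.
    """
    return 1.0 - prod(1.0 - p for p in sentence_probs)


def build_network(evidence, threshold=0.5):
    """Keep molecule pairs whose combined score reaches the threshold.

    evidence: dict mapping a (molecule, molecule) tuple to a list of
    per-sentence probabilities. The threshold is a hypothetical cutoff.
    Returns a dict mapping the sorted pair to its combined score.
    """
    network = {}
    for (a, b), probs in evidence.items():
        score = combine_evidence(probs)
        if score >= threshold:
            network[tuple(sorted((a, b)))] = score
    return network


# Hypothetical molecules and probabilities, for illustration only.
example_evidence = {
    ("B", "A"): [0.5, 0.5],  # two moderately convincing sentences
    ("A", "C"): [0.1],       # one weak sentence
}
example_network = build_network(example_evidence)
```

Under this assumed rule, several moderately convincing sentences reinforce one another, so a pair mentioned often can outrank a pair supported by a single sentence of similar strength.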


The text empirics approach used here supports automatic extraction of biomolecular interactions from texts that is both computationally friendly and competitive in performance.