Entities, relations, events: representing biomolecular semantics

Pyysalo, Sampo

doi:10.1186/1471-2105-11-S5-O6

Volume 11 Supplement 5

Workshop on Advances in Bio Text Mining

Oral presentation
Open access
Published: 20 September 2010


BMC Bioinformatics volume 11, Article number: O6 (2010)


Biomedical information extraction efforts have until recently primarily focused on the detection of mentions of named entities (NEs) (e.g. genes and proteins) and the recognition of simple associations of these entities, predominantly modeled as pairwise relations. While applicable to many key tasks such as the recognition of protein-protein interactions, the limitations of the relation representation are becoming increasingly apparent in the pursuit of advanced extraction and text mining targets such as Gene Ontology annotations and metabolic and signaling pathways.

A number of recent studies have proposed more expressive alternatives to the relation representation, along with annotated resources such as the BioInfer (http://www.it.utu.fi/BioInfer) and GENIA Event (http://www-tsujii.is.s.u-tokyo.ac.jp/GENIA/) corpora. A major step toward practical systems capable of extracting such representations was taken in the BioNLP 2009 Shared Task on Event Extraction [1]. Providing annotation for gene/protein NEs as a starting point, the task centered on the extraction of an event representation that can capture the associations of arbitrary numbers of participants in specified roles (e.g. Theme and Cause). The representation further connects events to specific statements in text and treats them as primary objects of annotation, allowing events to act as participants in other events and to be specified as being negated or stated speculatively.
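As a rough illustration of the event representation described above, the following Python sketch models events whose participants fill named roles and may themselves be events. The class names, field names, and the placeholder proteins ProtA and ProtB are assumptions made for this example and are not taken from the shared task data format; the event types and the Theme and Cause roles follow the task description.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Union


@dataclass(frozen=True)
class Entity:
    """A named-entity mention, e.g. a gene or protein name in text."""
    id: str
    text: str


@dataclass
class Event:
    """An event tied to the text expression (trigger) that states it."""
    id: str
    type: str      # e.g. "Phosphorylation"
    trigger: str   # the word or phrase in text expressing the event
    # Each argument is a (role, participant) pair. A participant may be an
    # entity or another event, so events can nest to any depth.
    args: list[tuple[str, Union[Entity, Event]]] = field(default_factory=list)
    negated: bool = False
    speculated: bool = False

    def participants(self, role: str) -> list[Union[Entity, Event]]:
        """Return all participants filling the given role."""
        return [p for r, p in self.args if r == role]


def named_entities(event: Event) -> list[Entity]:
    """Collect every entity reachable from an event, following nested events."""
    found: list[Entity] = []
    for _, participant in event.args:
        if isinstance(participant, Event):
            found.extend(named_entities(participant))
        else:
            found.append(participant)
    return found


# Hypothetical statement: "ProtA may induce phosphorylation of ProtB".
prot_a = Entity("T1", "ProtA")
prot_b = Entity("T2", "ProtB")
phos = Event("E1", "Phosphorylation", "phosphorylation", [("Theme", prot_b)])
reg = Event(
    "E2",
    "Positive_regulation",
    "induce",
    [("Theme", phos), ("Cause", prot_a)],
    speculated=True,  # "may" marks the statement as speculative
)
```

Here the regulation event takes the phosphorylation event as its Theme, which a flat pairwise relation between ProtA and ProtB could not express without losing the intermediate event or the speculation marker.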

Mentions of entity names (e.g. p53) serve as the basis for event extraction as they provide a connection to specific real-world entities. However, this choice implies some approximations in representation: statements involving, for example, "complex of c-Rel and p50" are modeled as events with the NEs (c-Rel and p50) as participants. Marking either a non-specific term such as "complex" or the entire phrase as a participant can capture more context, but it also raises a new question for automatic processing: what do events involving such entities imply for the NEs that connect the representation to reality?
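The three annotation choices can be made concrete as character spans over the example phrase. This is only a sketch: the helper `span_of` and the dictionary keys are names invented for the illustration, and offsets are computed relative to the phrase rather than to any real document.

```python
phrase = "complex of c-Rel and p50"


def span_of(text: str, sub: str) -> tuple[int, int]:
    """Return the (start, end) character offsets of the first match of sub."""
    start = text.index(sub)
    return (start, start + len(sub))


# Candidate participant spans for an event stated about this phrase.
candidates = {
    "named entities only": [span_of(phrase, "c-Rel"), span_of(phrase, "p50")],
    "general term": [span_of(phrase, "complex")],
    "entire phrase": [(0, len(phrase))],
}

for choice, spans in candidates.items():
    print(choice, [phrase[start:end] for start, end in spans])
```

Only the first choice points directly at named entities; the other two cover more context but leave the link from the participant to c-Rel and p50 implicit.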

Pairwise relations specifying how NEs are associated with terms in their context provide one possible answer. A small set of basic relation types with well-defined semantics, such as object-component (e.g. for complex-subunit associations) and collection-member (e.g. for family-protein associations), can characterize many NE-term associations and provide specific meaning to general terms [2]. Re-introducing pairwise relations in this role suggests a detailed representation where both NEs and general terms are marked as entities, relations connect the two, and events model statements of change involving the entities, with the specific NEs and terms originally stated as participants (Figure 1).
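A minimal, self-contained sketch of this combined model follows. All class and function names are assumptions made for illustration, the Localization event is a hypothetical example rather than one taken from the source, and resolving a general term to its components is only one possible reading of what an event on that term implies for the NEs.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    id: str
    text: str
    is_named: bool  # True for NEs, False for general terms such as "complex"


@dataclass(frozen=True)
class Relation:
    type: str      # "object-component" or "collection-member"
    whole: Entity  # the object or collection, e.g. the term "complex"
    part: Entity   # the component or member, e.g. the NE "c-Rel"


@dataclass
class Event:
    type: str
    args: list = field(default_factory=list)  # (role, Entity) pairs


def grounded_entities(entity, relations):
    """Return the NEs an entity resolves to: the entity itself if it is
    named, otherwise the NEs reached by following relations from whole
    to part. Visited ids are tracked so cyclic relations cannot loop."""
    queue, seen, named = deque([entity]), set(), []
    while queue:
        current = queue.popleft()
        if current.id in seen:
            continue
        seen.add(current.id)
        if current.is_named:
            named.append(current)
        else:
            queue.extend(r.part for r in relations if r.whole == current)
    return named


# "complex of c-Rel and p50": one general term, two NEs, two relations.
complex_term = Entity("T1", "complex", is_named=False)
c_rel = Entity("T2", "c-Rel", is_named=True)
p50 = Entity("T3", "p50", is_named=True)
relations = [
    Relation("object-component", complex_term, c_rel),
    Relation("object-component", complex_term, p50),
]

# Hypothetical event stated about the complex rather than about the NEs.
event = Event("Localization", [("Theme", complex_term)])
theme = event.args[0][1]
print([e.text for e in grounded_entities(theme, relations)])
```

The event keeps the term as its stated participant, while the relations carry the link back to the named entities, so neither piece of information has to be approximated away.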

Whether the detail afforded by such a model is of sufficient practical value to outweigh the challenges in its automatic extraction remains an interesting question for future study.

References

1. Kim JD, Ohta T, Pyysalo S, Kano Y, Tsujii J: Overview of BioNLP'09 Shared Task on Event Extraction. Proceedings of the BioNLP 2009 Shared Task, Boulder, Colorado 2009, 1–9. [http://www.aclweb.org/anthology/W09-1401]
2. Pyysalo S, Ohta T, Kim JD, Tsujii J: Static Relations: a Piece in the Biomedical Information Extraction Puzzle. In Proceedings of Natural Language Processing in Biomedicine (BioNLP) NAACL 2009 Workshop. Boulder, Colorado: Association for Computational Linguistics; 2009:1–9.


Author information

Authors and Affiliations

Department of Computer Science, University of Tokyo, Japan
Sampo Pyysalo


Corresponding author

Correspondence to Sampo Pyysalo.

Rights and permissions

Open Access This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


About this article

Cite this article

Pyysalo, S. Entities, relations, events: representing biomolecular semantics. BMC Bioinformatics 11 (Suppl 5), O6 (2010). https://doi.org/10.1186/1471-2105-11-S5-O6

