This article is part of the supplement: Seventh International Conference on Bioinformatics (InCoB2008)

Open Access Research

Semi-automatic conversion of BioProp semantic annotation to PASBio annotation

Richard Tzong-Han Tsai [1]*, Hong-Jie Dai [2,3], Chi-Hsin Huang [2] and Wen-Lian Hsu [2,3]*

Author affiliations

1 Department of Computer Science & Engineering, Yuan Ze University, Chung-Li, Taiwan, R.O.C

2 Institute of Information Science, Academia Sinica, Nankang, Taipei, Taiwan, R.O.C

3 Department of Computer Science, National Tsing-Hua University, Hsinchu, Taiwan, R.O.C

Citation and License

BMC Bioinformatics 2008, 9(Suppl 12):S18  doi:10.1186/1471-2105-9-S12-S18

Published: 12 December 2008



Semantic role labeling (SRL) is an important text analysis technique. In SRL, sentences are represented by one or more predicate-argument structures (PAS). Each PAS is composed of a predicate (verb) and several arguments (noun phrases, adverbial phrases, etc.) with different semantic roles, including main arguments (agent or patient) as well as adjunct arguments (time, manner, or location). PropBank is the most widely used PAS corpus and annotation format in the newswire domain. In the biomedical field, however, more detailed and restrictive PAS annotation formats such as PASBio are popular. Unfortunately, because an annotated PASBio corpus is lacking, no publicly available machine-learning (ML) SRL system based on PASBio has been developed. In previous work, we constructed BioProp, a biomedical corpus based on the PropBank standard, and used it to develop an ML-based SRL system, BIOSMILE. In this paper, we aim to build a system that converts BIOSMILE's BioProp annotation output to PASBio annotation. Our system consists of BIOSMILE combined with a BioProp-PASBio rule-based converter and a semi-automatic rule generator.
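
To make the idea of a PAS and of rule-based role conversion concrete, the following is a minimal sketch in Python. It is not the converter described in this paper: the `PAS` class, the `RULES` table, the `convert` function, and the specific role mapping for "express" are all hypothetical and chosen only for illustration. They are not taken from the actual BioProp or PASBio frame definitions.

```python
from dataclasses import dataclass, field


@dataclass
class PAS:
    """One predicate-argument structure: a predicate plus labeled argument spans."""
    predicate: str
    # Maps a semantic role label to the text span that fills it.
    arguments: dict[str, str] = field(default_factory=dict)


# Hypothetical rule table: (predicate, source role) -> target role.
# The mapping is illustrative only and does not reproduce real
# BioProp or PASBio frames.
RULES = {
    ("express", "Arg0"): "Arg1",
    ("express", "Arg1"): "Arg2",
    ("express", "ArgM-LOC"): "Arg3",
}


def convert(pas: PAS, rules: dict[tuple[str, str], str]) -> PAS:
    """Relabel each argument using the rule for (predicate, role).

    Roles without a matching rule keep their original label. This sketch
    does not resolve the case where two source roles map to the same target.
    """
    converted = {}
    for role, span in pas.arguments.items():
        target = rules.get((pas.predicate, role), role)
        converted[target] = span
    return PAS(pas.predicate, converted)


source = PAS("express", {"Arg1": "the gene", "ArgM-LOC": "in liver cells"})
target = convert(source, RULES)
```

Here an adjunct role in the source annotation (a location) becomes a numbered main argument in the target annotation, which is the kind of relabeling a more restrictive format may require.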


Our first experiment evaluated the rule-based converter's performance independently of BIOSMILE's performance. The converter achieved an F-score of 85.29%. The second experiment evaluated the combined system (BIOSMILE + rule-based converter), which achieved an F-score of 69.08% for PASBio's 29 verbs.
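
For readers unfamiliar with the metric, the sketch below shows how an F-score is commonly computed from counts of correct, predicted, and gold-standard arguments. It assumes the balanced F-score (F1), which the abstract does not state explicitly, and the function name `f_score_from_counts` is hypothetical. It does not reproduce the evaluation procedure used in the experiments.

```python
def f_score_from_counts(correct: int, predicted: int, gold: int) -> float:
    """Balanced F-score (F1), the harmonic mean of precision and recall.

    correct:   number of predicted arguments that match the gold standard
    predicted: total number of arguments the system produced
    gold:      total number of arguments in the gold standard
    """
    if predicted == 0 or gold == 0 or correct == 0:
        return 0.0
    precision = correct / predicted
    recall = correct / gold
    return 2 * precision * recall / (precision + recall)
```

Because the harmonic mean is dominated by the smaller of the two values, a system cannot reach a high F-score by maximizing precision or recall alone.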


Our approach allows PAS conversion between BioProp and PASBio annotation using BIOSMILE alongside our newly developed semi-automatic rule generator and rule-based converter. Our system can match the performance of other state-of-the-art domain-specific ML-based SRL systems and can be easily customized for PASBio application development.