This article is part of the supplement: Seventh International Conference on Bioinformatics (InCoB2008)
Semi-automatic conversion of BioProp semantic annotation to PASBio annotation
1 Department of Computer Science & Engineering, Yuan Ze University, Chung-Li, Taiwan, R.O.C
2 Institute of Information Science, Academia Sinica, Nankang, Taipei, Taiwan, R.O.C
3 Department of Computer Science, National Tsing-Hua University, Hsinchu, Taiwan, R.O.C
BMC Bioinformatics 2008, 9(Suppl 12):S18 doi:10.1186/1471-2105-9-S12-S18Published: 12 December 2008
Semantic role labeling (SRL) is an important text analysis technique. In SRL, sentences are represented by one or more predicate-argument structures (PAS). Each PAS is composed of a predicate (verb) and several arguments (noun phrases, adverbial phrases, etc.) with different semantic roles, including main arguments (agent or patient) as well as adjunct arguments (time, manner, or location). PropBank is the most widely used PAS corpus and annotation format in the newswire domain. In the biomedical field, however, more detailed and restrictive PAS annotation formats such as PASBio are popular. Unfortunately, due to the lack of an annotated PASBio corpus, no publicly available machine-learning (ML) based SRL systems based on PASBio have been developed. In previous work, we constructed a biomedical corpus based on the PropBank standard called BioProp, on which we developed an ML-based SRL system, BIOSMILE. In this paper, we aim to build a system to convert BIOSMILE's BioProp annotation output to PASBio annotation. Our system consists of BIOSMILE in combination with a BioProp-PASBio rule-based converter, and an additional semi-automatic rule generator.
Our first experiment evaluated our rule-based converter's performance independently from BIOSMILE performance. The converter achieved an F-score of 85.29%. The second experiment evaluated combined system (BIOSMILE + rule-based converter). The system achieved an F-score of 69.08% for PASBio's 29 verbs.
Our approach allows PAS conversion between BioProp and PASBio annotation using BIOSMILE alongside our newly developed semi-automatic rule generator and rule-based converter. Our system can match the performance of other state-of-the-art domain-specific ML-based SRL systems and can be easily customized for PASBio application development.