This article is part of the supplement: Fourth International Workshop on Data and Text Mining in Biomedical Informatics (DTMBio) 2010

Open Access Proceedings

Automatic classification of sentences to support Evidence Based Medicine

Su Nam Kim (1,2), David Martinez (1,2)*, Lawrence Cavedon (1,2,3) and Lars Yencken (1,2)

Author Affiliations

1 NICTA VRL, The University of Melbourne, 3010, Australia

2 Department of Computer Science and Software Engineering, The University of Melbourne, 3010, Australia

3 School of Computer Science and IT, RMIT University, Melbourne 3000, Australia

BMC Bioinformatics 2011, 12(Suppl 2):S5  doi:10.1186/1471-2105-12-S2-S5

Published: 29 March 2011

Abstract

Aim

Given a set of pre-defined medical categories used in Evidence Based Medicine, we aim to automatically annotate sentences in medical abstracts with these labels.

Method

We constructed a corpus of 1,000 medical abstracts, hand-annotated with the specified medical categories (e.g. Intervention, Outcome). We explored a range of features based on lexical, semantic, structural, and sequential information in the data, using Conditional Random Fields (CRFs) for classification.
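The abstract does not include implementation details; purely as an illustration, the sketch below shows CRF-based sentence classification of the kind described, using the third-party sklearn-crfsuite library (an assumption, not the authors' toolkit). Each abstract is treated as one sequence of sentences, each sentence contributes a feature dictionary, and the feature names (word=, heading=), the toy data, and the helper functions are all hypothetical.

```python
# Minimal sketch, assuming the sklearn-crfsuite library; this is not the
# authors' implementation, only an illustration of CRF sequence labeling
# where each abstract is one sequence of sentences.
import sklearn_crfsuite


def sentence_features(sentence, heading):
    """Hypothetical feature extractor: lexical unigrams plus the
    structural section heading the sentence appears under."""
    feats = {"heading=" + (heading or "NONE"): 1.0}
    for token in sentence.lower().split():
        feats["word=" + token] = 1.0
    return feats


def abstract_to_sequence(abstract):
    """Map one abstract (a list of (sentence, heading, label) triples)
    to a CRF training sequence; the CRF itself models the sequential
    dependencies between adjacent sentence labels."""
    X = [sentence_features(s, h) for s, h, _ in abstract]
    y = [label for _, _, label in abstract]
    return X, y


# Toy data standing in for the hand-annotated abstracts (hypothetical).
abstracts = [
    [("Patients received drug A.", "METHODS", "Intervention"),
     ("Mortality was reduced.", "RESULTS", "Outcome")],
]

X_train, y_train = zip(*(abstract_to_sequence(a) for a in abstracts))

crf = sklearn_crfsuite.CRF(algorithm="lbfgs", max_iterations=100)
crf.fit(list(X_train), list(y_train))
print(crf.predict(list(X_train)))
```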

Results

For the classification tasks over all labels, our systems achieved micro-averaged f-scores of 80.9% and 66.9% on datasets of structured and unstructured abstracts, respectively, using sequential features. When labeling only the key sentences, our systems produced f-scores of 89.3% and 74.0% on structured and unstructured abstracts, respectively, using the same sequential features. Results on an external dataset were lower (f-scores of 63.1% for all labels, and 83.8% for key sentences).

Conclusions

Of the features we used, the best for classifying any given sentence in an abstract were based on unigrams, section headings, and sequential information from preceding sentences. These features resulted in improved performance over a simple bag-of-words approach, and outperformed feature sets used in previous work.
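As an illustrative note on the sequential information mentioned above, one common way to expose preceding sentences to the classifier, beyond the CRF's label transitions, is to copy prefixed versions of the previous sentence's features into the current sentence's feature set. The sketch below assumes that approach and hypothetical feature names; it is not a description of the authors' exact feature set.

```python
def add_sequential_features(sequence_feats):
    """Hypothetical sketch: augment each sentence's feature dict with
    prefixed copies of the previous sentence's features, so the model
    can condition on the preceding sentence as well as on CRF label
    transitions."""
    augmented = []
    for i, feats in enumerate(sequence_feats):
        current = dict(feats)
        if i == 0:
            current["BOS"] = 1.0  # first sentence of the abstract
        else:
            for name, value in sequence_feats[i - 1].items():
                current["prev:" + name] = value
        augmented.append(current)
    return augmented
```

In a setup like the earlier sketch, this function would be applied to each abstract's feature sequence before training, complementing the unigram and section-heading features with context from the preceding sentence.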