Open Access Research article

Identifying work related injuries: comparison of methods for interrogating text fields

Kirsten McKenzie (1)*, Margaret A Campbell (1), Deborah A Scott (1,2), Tim R Driscoll (3), James E Harrison (4) and Roderick J McClure (5)

Author Affiliations

1 National Centre for Health Information Research and Training, Queensland University of Technology, Victoria Park Road, Kelvin Grove, Queensland, 4059, Australia

2 Queensland Injury Surveillance Unit (QISU), Mater Hospital, Stanley Street, Brisbane, Queensland, 4101, Australia

3 School of Public Health, University of Sydney, Fisher Road, Camperdown, New South Wales, 2050, Australia

4 Research Centre for Injury Studies, Flinders University, Laffer Drive, Bedford Park, South Australia, 5042, Australia

5 Monash University Accident Research Centre, Monash University Clayton Campus, Melbourne, Victoria, 3800, Australia

BMC Medical Informatics and Decision Making 2010, 10:19  doi:10.1186/1472-6947-10-19

Published: 7 April 2010

Abstract

Background

Work-related injuries in Australia are estimated to cost around $57.5 billion annually; however, there are currently insufficient surveillance data available to support an evidence-based public health response. Emergency departments (EDs) in Australia are a potential source of information on work-related injuries, though most EDs do not have an 'Activity Code' to identify work-related cases, and information about the presenting problem is recorded in a short free text field. This study compared methods for interrogating text fields to identify work-related injuries presenting at emergency departments, to inform approaches to surveillance of work-related injury.

Methods

Three approaches were used to interrogate an injury description text field to classify cases as work-related: keyword search, index search, and content analytic text mining. Sensitivity and specificity were examined by comparing cases flagged by each approach to cases coded with an Activity code during triage. Methods to improve the sensitivity and/or specificity of each approach were explored by adjusting the classification techniques within each broad approach.
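
To make the evaluation concrete, the following is a minimal Python sketch of a basic keyword search and of the sensitivity and specificity calculation against a reference flag. It is an illustration under stated assumptions, not the study's implementation: the keyword list, the function names and the example records are all hypothetical, and the reference flag stands in for the Activity code assigned during triage.

```python
import re

# Hypothetical keyword stems; the study's actual search terms are not listed here.
WORK_KEYWORDS = ("work", "job", "employ")


def flag_work_related(text, keywords=WORK_KEYWORDS):
    """Return True if any keyword begins a word in the free-text description."""
    pattern = r"\b(?:" + "|".join(re.escape(k) for k in keywords) + r")"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def sensitivity_specificity(flags, reference):
    """Compare text-search flags with a reference flag (e.g. an Activity code)."""
    pairs = list(zip(flags, reference))
    tp = sum(1 for f, r in pairs if f and r)          # flagged, truly work-related
    fn = sum(1 for f, r in pairs if not f and r)      # missed work-related case
    tn = sum(1 for f, r in pairs if not f and not r)  # correctly not flagged
    fp = sum(1 for f, r in pairs if f and not r)      # flagged, not work-related
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Invented records for illustration only: (description, reference work-related flag).
records = [
    ("cut hand at work with knife", True),
    ("fell off ladder on job site", True),
    ("back strain lifting boxes", True),    # no keyword, so the search misses it
    ("homework desk fell on foot", False),  # "work" is mid-word, so not flagged
    ("fell off bike in park", False),
    ("workout injury at gym", False),       # "workout" triggers a false positive
]

flags = [flag_work_related(text) for text, _ in records]
reference = [is_work for _, is_work in records]
sens, spec = sensitivity_specificity(flags, reference)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

The invented records show the two kinds of error such a search can make: a missed case lowers sensitivity, while an incidental keyword match lowers specificity.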

Results

The basic keyword search detected 58% of cases (specificity 0.99), the index search detected 62% of cases (specificity 0.87), and the content analytic text mining approach (using adjusted probabilities) detected 77% of cases (specificity 0.95).

Conclusions

The findings of this study provide strong support for continued development of text searching methods to obtain information from routine emergency department data, to improve the capacity for comprehensive injury surveillance.