Identifying work related injuries: comparison of methods for interrogating text fields
1 National Centre for Health Information Research and Training, Queensland University of Technology, Victoria Park Road, Kelvin Grove, Queensland, 4059, Australia
2 Queensland Injury Surveillance Unit (QISU), Mater Hospital, Stanley Street, Brisbane, Queensland, 4101, Australia
3 School of Public Health, University of Sydney, Fisher Road, Camperdown, New South Wales, 2050, Australia
4 Research Centre for Injury Studies, Flinders University, Laffer Drive, Bedford Park, South Australia, 5042, Australia
5 Monash University Accident Research Centre, Monash University Clayton Campus, Melbourne, Victoria, 3800, Australia
BMC Medical Informatics and Decision Making 2010, 10:19 doi:10.1186/1472-6947-10-19
Published: 7 April 2010
Work-related injuries in Australia are estimated to cost around $57.5 billion annually; however, there are currently insufficient surveillance data available to support an evidence-based public health response. Emergency departments (EDs) in Australia are a potential source of information on work-related injuries, though most EDs do not have an 'Activity Code' to identify work-related cases; instead, information about the presenting problem is recorded in a short free text field. This study compared methods for interrogating text fields to identify work-related injuries presenting at emergency departments, in order to inform approaches to surveillance of work-related injury.
Three approaches were used to interrogate an injury description text field to classify cases as work-related: keyword search, index search, and content analytic text mining. Sensitivity and specificity were examined by comparing the cases flagged by each approach with the cases coded with an Activity Code during triage. Methods to improve the sensitivity and/or specificity of each approach were explored by adjusting the classification techniques within each broad approach.
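As a rough illustration of the simplest of these approaches, the sketch below flags free-text records with a keyword search and then scores the flags against a reference code. It is a minimal sketch under stated assumptions, not the study's implementation: the keyword list, the function names and the toy records are all hypothetical and are not drawn from the study data.

```python
import re

# Hypothetical keyword stems; the study's actual search terms are not given here.
WORK_KEYWORDS = ("work", "employ", "job", "occupation")


def keyword_flag(text, keywords=WORK_KEYWORDS):
    """Return True if any keyword stem occurs at the start of a word in the text."""
    pattern = r"\b(?:" + "|".join(re.escape(k) for k in keywords) + ")"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def sensitivity_specificity(flags, reference):
    """Compare search flags with reference codes (True = work-related)."""
    pairs = list(zip(flags, reference))
    tp = sum(1 for f, r in pairs if f and r)          # flagged, truly work-related
    fn = sum(1 for f, r in pairs if not f and r)      # missed work-related case
    fp = sum(1 for f, r in pairs if f and not r)      # flagged, not work-related
    tn = sum(1 for f, r in pairs if not f and not r)  # correctly left unflagged
    # Assumes the reference contains at least one positive and one negative case.
    return tp / (tp + fn), tn / (tn + fp)


# Invented toy records: (free-text description, reference Activity Code says work-related)
records = [
    ("cut finger at work on saw", True),
    ("fell from ladder while employed as painter", True),
    ("hand caught in machine", True),             # no keyword, so it is missed
    ("fell off bike in park", False),
    ("twisted ankle playing football", False),
    ("burnt hand cooking at home", False),
    ("sore back after workout at gym", False),    # "workout" triggers a false positive
]

flags = [keyword_flag(text) for text, _ in records]
reference = [is_work for _, is_work in records]
sens, spec = sensitivity_specificity(flags, reference)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

The two deliberately awkward records show the trade-off that adjusting a classification technique has to manage: a description with no work-related term lowers sensitivity, while a stem such as "work" matching "workout" lowers specificity.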
The basic keyword search detected 58% of cases (specificity 0.99), the index search detected 62% of cases (specificity 0.87), and the content analytic text mining approach, using adjusted probabilities, detected 77% of cases (specificity 0.95).
The findings of this study provide strong support for continued development of text searching methods to obtain information from routine emergency department data, to improve the capacity for comprehensive injury surveillance.