Mining geriatric assessment data for in-patient fall prediction models and high-risk subgroups
1 Peter L. Reichertz Institute for Medical Informatics, University of Braunschweig - Institute of Technology and Hanover Medical School, Carl-Neuberg-Str. 1, 30625 Hanover, Germany
2 Peter L. Reichertz Institute for Medical Informatics, University of Braunschweig - Institute of Technology and Hanover Medical School, Mühlenpfordtstr. 23, 38106 Braunschweig, Germany
3 Geriatrics Research Group, Department of Geriatric Medicine, Charité University Medicine, Reinickendorfer Str. 61, 13347 Berlin, Germany
BMC Medical Informatics and Decision Making 2012, 12:19 doi:10.1186/1472-6947-12-19
Published: 14 March 2012
Hospital in-patient falls constitute a prominent problem in terms of costs and consequences. Geriatric institutions are most often affected, and common screening tools cannot predict in-patient falls consistently. Our objectives are to derive comprehensible fall risk classification models from a large data set of geriatric in-patients' assessment data and to evaluate their predictive performance (aim#1), and to identify high-risk subgroups from the data (aim#2).
A data set of n = 5,176 single in-patient episodes covering 1.5 years of admissions to a geriatric hospital was extracted from the hospital's database and matched with fall incident reports (n = 493). A classification tree model was induced using the C4.5 algorithm, a logistic regression model was fitted, and the predictive performance of both models was evaluated. Furthermore, high-risk subgroups were identified from extracted classification rules with a support of more than 100 instances.
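The rule-filtering step can be illustrated with a minimal sketch. It assumes that a rule's support is the number of episodes it covers and that a subgroup counts as high-risk when its fall rate exceeds the overall fall rate; only the support threshold of more than 100 instances comes from the text above. The `Rule` class, the field names (`age`, `barthel`, `fell`), the cut-off values and the records themselves are hypothetical and synthetic, not taken from the study.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Record = Dict[str, float]


@dataclass
class Rule:
    """A classification rule: a name plus a predicate over one episode."""
    name: str
    condition: Callable[[Record], bool]


def evaluate_rule(rule: Rule, records: List[Record]) -> Tuple[int, float]:
    """Return (support, fall rate) of a rule on the given episodes."""
    covered = [r for r in records if rule.condition(r)]
    support = len(covered)
    fallers = sum(1 for r in covered if r["fell"])
    rate = fallers / support if support else 0.0
    return support, rate


def high_risk_rules(rules: List[Rule], records: List[Record],
                    min_support: int = 100) -> List[Tuple[str, int, float]]:
    """Keep rules with support > min_support and a fall rate above baseline.

    ASSUMPTION: "high-risk" means a fall rate above the overall rate.
    """
    baseline = sum(1 for r in records if r["fell"]) / len(records)
    selected = []
    for rule in rules:
        support, rate = evaluate_rule(rule, records)
        if support > min_support and rate > baseline:
            selected.append((rule.name, support, rate))
    return selected


def make_block(n: int, n_fell: int, age: float, barthel: float) -> List[Record]:
    """Build n synthetic episodes, the first n_fell of which include a fall."""
    return [{"age": age, "barthel": barthel, "fell": float(i < n_fell)}
            for i in range(n)]


# Synthetic, purely illustrative data (NOT the study data).
records = (make_block(150, 45, age=90, barthel=20)
           + make_block(50, 20, age=90, barthel=80)
           + make_block(800, 40, age=70, barthel=80))

# Hypothetical rules with made-up cut-offs.
rules = [
    Rule("old_low_barthel", lambda r: r["age"] >= 85 and r["barthel"] < 35),
    Rule("old_high_barthel", lambda r: r["age"] >= 85 and r["barthel"] >= 35),
    Rule("younger", lambda r: r["age"] < 85),
]

selected = high_risk_rules(rules, records)
for name, support, rate in selected:
    print(f"{name}: support={support}, fall rate={rate:.2f}")
```

In this toy data the second rule has the highest fall rate but covers only 50 episodes, so the support threshold discards it. This shows why such a threshold keeps small, unstable subgroups out of the result.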
The classification tree model showed an overall classification accuracy of 66%, with a sensitivity of 55.4%, a specificity of 67.1%, and positive and negative predictive values of 15% and 93.5%, respectively. Five high-risk groups were identified, defined by high age, low Barthel index, cognitive impairment, multi-medication and co-morbidity.
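The reported predictive values can be cross-checked against the sensitivity, specificity and fall prevalence using Bayes' rule. The sketch below assumes that each of the 493 fall incident reports corresponds to a distinct episode among the 5,176, so that the prevalence is 493/5,176; this is an assumption made for illustration, not something the text states. The function name `predictive_values` is likewise illustrative.

```python
def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> tuple:
    """Return (PPV, NPV, accuracy) from sensitivity, specificity, prevalence."""
    # Expected fractions of the whole population in each confusion-matrix cell.
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)
    ppv = tp / (tp + fp)   # share of predicted fallers who actually fell
    npv = tn / (tn + fn)   # share of predicted non-fallers who did not fall
    accuracy = tp + tn     # share of all episodes classified correctly
    return ppv, npv, accuracy


# ASSUMPTION: one fall report per affected episode, i.e. 493 of 5,176 episodes.
prevalence = 493 / 5176
ppv, npv, accuracy = predictive_values(0.554, 0.671, prevalence)
print(f"prevalence={prevalence:.3f} PPV={ppv:.3f} "
      f"NPV={npv:.3f} accuracy={accuracy:.3f}")
```

Under this assumption the computed values agree with the reported 15%, 93.5% and 66% to within rounding. The calculation also shows why the positive predictive value is low: with a prevalence of roughly 9.5%, false positives among the many non-fallers outnumber the true positives.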
Our results show that a little more than half of the fallers may be identified correctly by our model, but the positive predictive value is too low for the model to be applicable in practice. Non-fallers, on the other hand, are identified quite well by the model. The high-risk subgroups and the risk factors identified (age, low ADL score, cognitive impairment, institutionalization, polypharmacy and co-morbidity) reflect domain knowledge and may be used to screen certain subgroups of patients with a high risk of falling.
Classification models derived from a large data set using data mining methods can compete with current dedicated fall risk screening tools, yet lack diagnostic precision. High-risk subgroups may be identified automatically from existing geriatric assessment data, especially when combined with domain knowledge in a hybrid classification model. Further work is necessary to validate our approach in a controlled prospective setting.