Accurate, timely and automated identification of patients at high risk for severe clinical deterioration using readily available clinical information in the electronic medical record (EMR) could inform health systems to target scarce resources and save lives.
We identified 7,466 patients admitted to a large, public, urban academic hospital between May 2009 and March 2010. An automated clinical prediction model for out of intensive care unit (ICU) cardiopulmonary arrest and unexpected death was created in the derivation sample (50% randomly selected from total cohort) using multivariable logistic regression. The automated model was then validated in the remaining 50% from the total cohort (validation sample). The primary outcome was a composite of resuscitation events, and death (RED). RED included cardiopulmonary arrest, acute respiratory compromise and unexpected death. Predictors were measured using data from the previous 24 hours. Candidate variables included vital signs, laboratory data, physician orders, medications, floor assignment, and the Modified Early Warning Score (MEWS), among other treatment variables.
RED rates were 1.2% of patient-days for the total cohort. Fourteen variables were independent predictors of RED and included age, oxygenation, diastolic blood pressure, arterial blood gas and laboratory values, emergent orders, and assignment to a high risk floor. The automated model had excellent discrimination (c-statistic=0.85) and calibration and was more sensitive (51.6% vs. 42.2%) and specific (94.3% vs. 91.3%) than the MEWS alone. The automated model predicted RED events 15.9 hours before they occurred and earlier than Rapid Response Team (RRT) activation (5.7 hours prior to an event, p=0.003).
An automated model harnessing EMR data offers great potential for identifying RED and was superior to both a prior risk model and the human judgment-driven RRT.
Keywords: Cardiopulmonary arrest; Forecasting; Medical informatics; Models, Statistical; Medicine; Intensive care units
Out of intensive care unit (ICU) cardiac arrests and unexpected deaths are common despite evidence that patients often show signs of clinical deterioration hours in advance [1-4]. This has prompted national organizations to recommend the implementation of rapid response teams (RRTs) as a strategy to prevent hospital deaths. Such recommendations were made despite conflicting evidence regarding the benefits of RRTs [3,6-10]. Some have speculated that the indeterminate benefit of RRTs is due to insufficiently predictive activation criteria and poor response time by clinical staff. Early warning systems have been developed to identify deteriorating patients using readily available clinical information. However, these early warning systems may not be adequate because they 1) require monitoring and activation by often overburdened clinical staff, 2) fail to systematically monitor all patients, and 3) demonstrate only modest accuracy in identifying which patients are at risk of out of ICU cardiopulmonary arrest and death. Early warning systems that are timely, accurate, automated, and comprehensive in their surveillance are needed.
The increasing use of electronic medical records (EMR) in health care makes the use of computerized prediction models possible. These models could represent powerful avenues to identify patients at high risk of adverse events [13,14]. Though a few studies have examined the accuracy of clinical automation to identify patients at risk of clinical deterioration, they retain limited utility since they do not fully harness the EMR, produce no actionable alerts, define primary outcomes differently, and do not allow for monitoring patients in real time [15,16].
This study sought to 1) derive and validate an automated prediction model based on near real-time EMR data to identify patients at high risk of out of ICU resuscitation events and death (RED), 2) compare the test operating characteristics of the new automated model to the previously published Modified Early Warning Score (MEWS) and human judgment-activated institutional RRT, and 3) determine if the automated model detected RED events sooner than the human judgment-activated RRT.
Setting and patient population
The automated prediction model was constructed using data from adult patients admitted to Parkland Hospital, a large urban academic hospital in Dallas, TX, between May 18, 2009 and March 31, 2010. Patients were included in the study if they were admitted to the internal medicine ward from either the emergency department (ED) or outpatient clinics. Additionally, patients were included if they were admitted to the ICU from the ED. Patients were excluded if they were directly admitted to the surgical floor or obstetrics or had a do not resuscitate (DNR) order at admission. However, any hospital patient-days prior to a patient consenting to a DNR order were included. To determine if early collection of data was predictive of events, all variables included in the automated model were obtained from the previous calendar day, defined as the time period between 12:00 AM and 11:59 PM. Therefore, events that occurred on the first day of each hospitalization were excluded. We also excluded any data within one hour of an event to ensure that the model did not include factors that were early signs of resuscitation care. Patient-days that occurred after an event were excluded. The research protocol was approved by The University of Texas Southwestern Institutional Review Board (IRB), which concluded that the research presented no more than minimal risk of harm to subjects. Therefore, the IRB waived the need for informed consent.
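The windowing rule described above can be sketched in a few lines. This is a minimal illustration rather than the study's code: the function name `predictor_window` and its arguments are hypothetical, and it assumes that the one-hour exclusion is applied by truncating the previous-day window whenever an event falls shortly after midnight.

```python
from datetime import date, datetime, time, timedelta
from typing import Optional, Tuple


def predictor_window(risk_day: date,
                     event_time: Optional[datetime] = None
                     ) -> Tuple[datetime, datetime]:
    """Return (start, end) of the data window used to score `risk_day`.

    Predictors come from the previous calendar day (12:00 AM to 11:59 PM).
    Assumption: when an event occurs, data from the hour before it are
    dropped by moving the end of the window earlier.
    """
    previous_day = risk_day - timedelta(days=1)
    start = datetime.combine(previous_day, time(0, 0))
    end = datetime.combine(previous_day, time(23, 59))
    if event_time is not None:
        # Exclude anything recorded within one hour of the event.
        cutoff = event_time - timedelta(hours=1)
        end = min(end, cutoff)
    return start, end
```

Under this reading, an event in the afternoon leaves the previous-day window untouched, while an event at 12:30 AM trims the final 30 minutes of the previous day.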
The primary outcome variable was defined as resuscitation events or death (RED). Resuscitation events were defined as out of ICU hospital codes and unplanned transfers to the ICU. Hospital codes included cardiopulmonary arrests (CPA) and acute respiratory compromise (ARC) events, regardless of location, except those that occurred in the ICU during ICU stays of >24 hours. CPA was defined as an event in which chest compressions and/or defibrillation are delivered, and an ARC event was defined as an event requiring emergency assisted ventilation. These events were identified electronically through the hospital’s internal registry, which is structured on the American Heart Association’s Get With The Guidelines – Resuscitation national registry, formerly known as the National Registry of Cardiopulmonary Resuscitation. This registry collects data on in-hospital resuscitation events from hospitals across the United States to provide feedback on an institution’s resuscitation practices and patient outcomes. Unplanned ICU transfers included any transfers from the internal medicine ward or ED to a medical or cardiac ICU requiring an ICU length of stay >24 hours. We used unplanned ICU transfer in the definition of a RED event because these patients were in critical condition and would have a high likelihood of CPA or death had the transfer not occurred. There are no elective admissions to the ICU at this institution. Unexpected death was defined as: 1) an in-hospital death that occurred on the medical ward; or 2) death that occurred in patients transferred to a medical or cardiac ICU team with an ICU length of stay <24 hours. Patient death and transfers to the medical or cardiac ICU were identified electronically in the hospital’s EMR. The date and time of bedside RRT activation were extracted from the hospital’s systematic log of all RRT calls. Data used to predict the primary outcome were extracted from the previous calendar day.
We developed a conceptual model of RED events based on a comprehensive review of the literature and expert clinical opinion. Candidate predictor variables for the automated model were those extractable from the hospital EMR (EPIC Systems Corporation, Verona, WI). Data from the previous 24-hour calendar day were used to determine the daily risk. Potential predictor variables included the most abnormal laboratory value or vital sign in the 24-hour period between 12:00 AM and 11:59 PM on each hospital day. We also examined other possible indicators of impending RED events such as STAT physician orders and medications. Medications of interest were those thought to increase risk of serious adverse events according to the Institute for Safe Medication Practices (ISMP). The MEWS is a previously published risk score based on the number and degree of vital sign and level of consciousness (LOC) abnormalities (Additional file 1: Appendix A). We determined LOC using a text-processing algorithm to read the free text in nursing notes. Finally, we postulated that patients who were more ill or unstable in subtle, hard-to-measure ways could be preferentially admitted to certain non-ICU medical floors, so we classified medicine wards accounting for the top 15% of RED as “high risk floors.”
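The daily "most abnormal value" step can be illustrated as follows. This is a sketch under a stated assumption: "most abnormal" is taken here to mean the reading farthest from a single reference value, which the paper does not specify. The function name `most_abnormal_per_day` and the `reference` parameter are hypothetical.

```python
from datetime import date, datetime
from typing import Dict, Iterable, Tuple


def most_abnormal_per_day(readings: Iterable[Tuple[datetime, float]],
                          reference: float) -> Dict[date, float]:
    """Keep, for each calendar day, the reading farthest from `reference`.

    Assumption: abnormality is measured as absolute distance from one
    reference value; a real implementation might use separate high and
    low limits for each variable.
    """
    worst: Dict[date, float] = {}
    for timestamp, value in readings:
        day = timestamp.date()  # calendar day, 12:00 AM to 11:59 PM
        if day not in worst or abs(value - reference) > abs(worst[day] - reference):
            worst[day] = value
    return worst
```

Each patient-day then contributes one summary value per variable, which matches the patient-day unit of analysis used in the model.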
Derivation and validation of the automated prediction model
The automated model was constructed in stages. First, the total cohort was randomly split into derivation (50%) and validation (50%) subsamples. We constructed the final model using the derivation cohort. Second, recursive partitioning was used to identify significant cut-points in continuous candidate variables that were associated with an increased rate of RED events. Third, candidate predictors of RED events were identified using univariate logistic regression. Continuous variables were examined for nonlinear effects by testing the contributions of spline functions and variable transformations. Fourth, candidate variables significant at p ≤ 0.20 were entered into a multivariate logistic regression model. Final model variables were selected on the basis of conceptual and statistical significance (p ≤ 0.05). The unit of analysis was the patient-day.
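The first stage, the random 50/50 split, might look like the sketch below. It assumes the split is made at the patient level, so that all patient-days from one patient fall in the same subsample; the paper does not state this explicitly. The function name `split_cohort` and the `seed` parameter are hypothetical.

```python
import random
from typing import Iterable, Set, Tuple


def split_cohort(patient_ids: Iterable[str],
                 seed: int = 0) -> Tuple[Set[str], Set[str]]:
    """Randomly split patients into derivation and validation halves.

    Assumption: the split is by patient, not by patient-day, so one
    patient never contributes data to both subsamples.
    """
    ids = sorted(set(patient_ids))   # sort first so the seed is reproducible
    rng = random.Random(seed)
    rng.shuffle(ids)
    half = len(ids) // 2
    return set(ids[:half]), set(ids[half:])
```

Fixing the seed makes the split reproducible, which matters when cut-points and coefficients derived in one half are later applied to the other.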
The model based on the derivation dataset was validated by assessing its performance in the validation sample. Model discrimination was assessed with the c-statistic and calibration with the Hosmer-Lemeshow goodness-of-fit test. Using cut-points determined by the derivation subsample, five risk categories were created based on quintiles of predicted risk and graphically assessed in the validation sample. To account for within-patient correlation, we used robust variance–covariance matrix estimators for computing standard errors for model coefficients.
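For readers less familiar with the c-statistic, it is the probability that a randomly chosen patient-day with an event receives a higher predicted risk than a randomly chosen patient-day without one, with ties counted as one half. A minimal sketch of that definition follows; the function name `c_statistic` is illustrative, and the pairwise loop is chosen for clarity rather than speed.

```python
from typing import Sequence


def c_statistic(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Concordance probability (area under the ROC curve).

    `probs` are predicted risks and `outcomes` are 1 (event) or 0 (no event).
    """
    events = [p for p, y in zip(probs, outcomes) if y == 1]
    non_events = [p for p, y in zip(probs, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for p_event in events:
        for p_non in non_events:
            if p_event > p_non:
                concordant += 1.0   # event ranked above non-event
            elif p_event == p_non:
                concordant += 0.5   # ties count as half
    return concordant / (len(events) * len(non_events))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of events from non-events.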
Prior to model development, we assessed all variables for missing values. For categorical and continuous variables with less than 2% missing data, a missing category was created, and the event rate was compared with and pooled into the most appropriate reference group. For categorical and continuous variables that had greater than 2% missing data and were not measured from one day to the next, a “never measured” category was created and risk was compared to the other categories or cut-points and pooled into the appropriate reference group. Documentation by exception is a common approach in the predictive model literature [13,14,19,20].
We determined relative contribution of each predictor to RED events by examining the marginal increase in the model chi-square accounted for by each predictor as it was added and removed from the final automated model [21,22].
Comparing performance of the automated model to the MEWS
Patients were classified to be at risk of RED events at a probability threshold of 4% as determined by the automated model. Since the baseline risk for RED events was assumed to be 1%, we considered a four times greater than average risk an important threshold for concern. Variables used to calculate the MEWS were obtained in the previous calendar day between 12:00 AM and 11:59 PM. If a patient experienced a RED event, data from the previous calendar day and those up to one hour prior to the event were used to calculate the MEWS. A MEWS of ≥5 was considered the critical threshold based on the literature. Sensitivity, specificity, positive predictive value, and negative predictive value were determined for both the automated model and the MEWS. The test operating characteristics of the automated model and the MEWS were compared using the c-statistic. Confidence intervals were constructed for the c-statistics at the 95% level.
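The four operating characteristics follow directly from a 2×2 table of flagged versus observed patient-days. The sketch below is illustrative only: the function name `operating_characteristics` is hypothetical, and it assumes a patient-day is flagged when its predicted probability is at or above the threshold (the text does not say whether the comparison is strict).

```python
from typing import Dict, Sequence


def operating_characteristics(probs: Sequence[float],
                              outcomes: Sequence[int],
                              threshold: float = 0.04) -> Dict[str, float]:
    """Sensitivity, specificity, PPV and NPV at a probability threshold."""
    tp = fp = tn = fn = 0
    for p, y in zip(probs, outcomes):
        flagged = p >= threshold   # assumption: >= rather than >
        if flagged and y == 1:
            tp += 1
        elif flagged:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1

    def ratio(num: int, den: int) -> float:
        # Undefined ratios are returned as NaN instead of raising.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }
```

The same function can score the MEWS by passing the score in place of the probability and setting `threshold=5`.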
Comparing performance of the automated model to the institutional RRT
The institutional RRT is deployed when one or more of the following is present in a patient: 1) heart rate <40 or >130 beats/min, 2) systolic blood pressure <90 mmHg, 3) respiratory rate <8 or >30 breaths/min, 4) oxygen saturation <88% on room air, 5) oxygen requirement >50%, or 6) acute change in mental status. We calculated the sensitivity, specificity, positive predictive value and negative predictive value, along with 95% confidence intervals, for both the automated model and the institutional RRT. Moreover, we evaluated a subgroup of patients who experienced an event, for whom the institutional RRT was activated, and whose predicted probability of a RED event met the 4% threshold of the automated model (model activation). In this subgroup, we aimed to determine the difference in time between model activation and RRT deployment. We also evaluated the time difference between the automated trigger of a RED event (patient’s predicted probability of a RED event exceeds 4%) and RRT deployment, regardless of an event. Our hypothesis was that the automated model would detect a patient who had a RED event well in advance of the institutional RRT. We compared this time difference using a paired Student’s t-test. Analyses were conducted using STATA statistical software (version 10.0; STATA Corp, College Station, TX) and RTREE.
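The six activation criteria listed above amount to a simple rule set, sketched below for illustration. The function and parameter names are hypothetical. Two readings are assumptions: criterion 4 is treated as an oxygen saturation because it is expressed as a percentage, and criterion 5 is treated as a percentage oxygen requirement passed in directly. A value of `None` means the measurement is unavailable and cannot trigger the rule.

```python
from typing import Optional


def rrt_criteria_met(heart_rate: Optional[float] = None,
                     systolic_bp: Optional[float] = None,
                     resp_rate: Optional[float] = None,
                     spo2_room_air: Optional[float] = None,
                     oxygen_requirement: Optional[float] = None,
                     acute_mental_status_change: bool = False) -> bool:
    """Return True if one or more institutional RRT criteria are present."""
    checks = [
        heart_rate is not None and (heart_rate < 40 or heart_rate > 130),
        systolic_bp is not None and systolic_bp < 90,
        resp_rate is not None and (resp_rate < 8 or resp_rate > 30),
        spo2_room_air is not None and spo2_room_air < 88,        # percent
        oxygen_requirement is not None and oxygen_requirement > 50,  # percent
        acute_mental_status_change,
    ]
    return any(checks)
```

In practice these criteria are applied by bedside staff, which is the human-judgment step the automated model is compared against.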
A total of 7,466 hospitalized patients accounted for 46,974 patient-days. The derivation and validation cohorts were evenly matched across demographic, clinical, provider orders, administered medications and summary variables (Table 1). Mean age was 51.2 in the derivation cohort and 51.4 in the validation cohort, and 56.1% and 54% were male, respectively.
Table 1. Cohort characteristics (N=46,974 patient-days)
Primary outcomes and predictors of RED events
Major clinical deterioration occurred in approximately 1 in 100 admissions (1.3% and 1.2% of hospitalizations in the derivation and validation cohorts, respectively). The univariate predictors of RED events are shown in Table 2 and included: older age (>54 years), abnormal vital signs (temperature >99.5, respiratory rate >24 breaths/min, DBP >125 mmHg), abnormal laboratory values (e.g., potassium >5.1 mEq/L, glucose >600 mg/dL, sodium <128 mEq/L), abnormal arterial blood gas (ABG) results (pCO2 ≤22 mmHg or pCO2 >70 mmHg), STAT physician orders (CBC order, electrocardiogram order, ABG order), high risk floor assignment, high alert medication orders (ISMP high alert medications, antidote medications, IV fluid bolus), level of consciousness, and the MEWS score.
Table 2. Univariate predictors of RED events (N= 23,127 patient days)
In the multivariable analysis, 14 variables were independent predictors of RED events including: age >54 years, abnormal vital signs (DBP >120 mmHg, SpO2 ≤86%), abnormal laboratory values (AST >250 U/L, white blood cell count >11 × 10³ cells/mm³, platelets <100 × 10³ cells/mm³, potassium >5.1 mEq/L), abnormal ABG results (pCO2 ≤22 mmHg or pCO2 >70 mmHg), physician orders for an ABG, electrocardiogram, STAT orders for head computed tomography (CT) or magnetic resonance imaging, chest CT, abdominal ultrasound, and chest x-rays, high risk floor assignment and summary MEWS score (Table 3). The strongest individual indicators of RED events were abnormal ABG results and high risk floor assignment.
Table 3. Multivariate predictors of RED events (derivation cohort, N= 23,127)
Performance of the automated model
The final automated model had good discrimination in both the derivation and validation datasets, with c-statistics of 0.87 (95% CI 0.85-0.89) and 0.85 (95% CI 0.82-0.87), respectively, and was well-calibrated (Hosmer-Lemeshow test p=0.12). It also stratified patients across a wide spectrum of risk from 0.14% in the lowest quintile to 4.3% in the highest one (Figure 1). The principal influencing variables in the automated model as assessed by the uniquely attributable chi-square were high risk floor assignment (37.9%) followed by the MEWS (25.5%), demographics, laboratory and vital signs (18.2%), and physician orders (18.4%).
Figure 1. Observed rates of RED events stratified by quintiles of risk in the automated model. Legend: Group 1 is the lowest quintile of risk and group 5 is the highest quintile of risk. The Figure shows comparable performance in the derivation (white bars) and validation (black bars) samples.
Comparing the performance of the automated model and the MEWS
The automated model was both more sensitive (51.6% vs. 42.2%) and more specific (94.3% vs. 91.3%) than the MEWS. The positive predictive value (PPV) of the automated model was superior to that of the MEWS (10% vs. 5.6%). The negative predictive values (NPV) were similar (99.4% vs. 99.2%). The automated model performed significantly better than the MEWS with a c-statistic of 0.85 (95% CI 0.82-0.87) compared to a c-statistic of 0.75 (95% CI 0.71-0.78) (Figure 2).
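The reported predictive values can be checked against the reported sensitivity and specificity using Bayes' rule. The sketch below assumes an event prevalence of 1.2% of patient-days, the rate reported for the total cohort; the prevalence in the validation sample may differ slightly. The function name `predictive_values` is illustrative.

```python
from typing import Tuple


def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> Tuple[float, float]:
    """Return (PPV, NPV) implied by sensitivity, specificity and prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Assumed prevalence: 1.2% of patient-days (total cohort rate).
model_ppv, model_npv = predictive_values(0.516, 0.943, 0.012)
mews_ppv, mews_npv = predictive_values(0.422, 0.913, 0.012)
```

Under that assumption the calculation gives a PPV of about 9.9% and an NPV of about 99.4% for the automated model, and about 5.6% and 99.2% for the MEWS, in line with the values reported above. The low PPVs reflect the rarity of RED events rather than poor specificity.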
Figure 2. Receiver operating characteristic curves for the final automated model versus the MEWS.
Comparing the performance of the automated model to the RRT calls
The RRT was activated for 357 eligible study patients as part of usual care during the study period. The automated model was more sensitive than the RRT (51.6% vs. 25.8%). However, it was slightly less specific than the RRT (94.3% vs. 98.8%). The RRT had a better PPV than the automated model (21% vs. 10%) and a similar NPV (99.1% vs. 99.4%). The median number of times the automated model flagged patients at risk per day during the study period was 9 and the median number of RRT calls per day was 2.
A total of 17 patients were flagged as at risk of RED events by the automated model, had the institutional RRT deployed, and experienced a RED event. In these patients, the automated model predicted an event 15.9 (±7.7) hours before the actual event occurred, compared to the RRT, which was called a mean of 8.4 (±8.5) hours prior to the actual event (p=0.003). Overall, the automated model also determined a patient to be at risk 5.7 hours (95% CI 3.1-8.3) earlier than the RRT was called for all types of RED events.
We developed and validated a novel, automated model using the EMR for predicting RED events in patients admitted to the hospital. From a statistical perspective, the automated model had excellent discrimination, was well-calibrated, and had outstanding specificity (94.3%) and good sensitivity (51.6%). The automated model also had better discrimination, sensitivity and specificity than the previously published MEWS. From a practical standpoint, the model identified patients destined to have a RED event on average 16 hours (or more than one nursing shift) before they actually experienced a major clinical event. Further, the automated model was able to accurately predict RED events using information obtained from the previous 24 hours. Together with its ability to screen all patients systematically and automatically, low false positive rate, and advance notice, the automated model appears to provide both accurate and actionable intelligence.
Since the growing standard of care is to use RRTs to meet this goal, we were particularly interested in the more practical comparison of the new model to the human or manually activated RRT approach used in our hospital. Overall, the automated model had twice the sensitivity of the RRT (51.6% vs. 25.8%), demonstrating that computerized surveillance is likely to identify more patients at risk for major adverse events compared to providers’ clinical judgment. The automated model achieved this much higher sensitivity with only a small trade-off in specificity (94.3% vs. 98.8%). Perhaps of greatest importance from a patient safety viewpoint, the automated model flagged patients 5.7 hours sooner than the RRT. Accurately identifying patients earlier in the course of physiological deterioration should be expected to yield greater opportunity for rescue.
The superior performance of the new model likely came from the richer source of information available in the EMR which is unavailable to simpler vital sign based models. In addition, monitoring physician orders for ECG, ABG or other STAT orders appears to be an important predictive measure, perhaps reflecting a physician’s escalating concern about a patient’s stability. Novel variables, such as high risk floor assignment, may be a proxy for nurse staffing ratios, physician team composition, or other unknown system or process-related factors that are associated with increased acuity or risk.
We were somewhat surprised that none of the medication variables were included in the final model, despite looking at many candidate predictors. This result may be due to the administration of antidote medicines that occur late in the process of clinical deterioration. The risk of causing RED events due to use of high risk medicines may be mediated through their effect on vital sign and laboratory abnormalities and partly depend on a patient’s underlying hepatic and renal physiological reserve. There is a need to explore more complex drug interactions and their association with adverse events.
The 1.3% prevalence in this study is similar to that seen in other studies [3,6]. The performance of the MEWS in this study was also consistent with prior reports (c-statistic=0.75), confirming its moderate predictive capabilities [12,15]. Our institution had an RRT call rate similar to those observed elsewhere.
Several limitations are worth noting. First, we used retrospective data from a single urban health system to derive and validate our model. While the rate of RED events and RRT calls in this sample is similar to other studies, the generalizability of this model to other patient populations and health systems is unknown and merits further investigation. Second, the derivation and validation of the novel model was done retrospectively, so the next step would be prospective validation, ideally in more than one setting. Third, and even more importantly, the ultimate value of the automated model will depend on whether it can realistically be used in real time and whether flagging patients at high risk will change clinical management, improve patient outcomes and/or reduce human surveillance burden. While we hypothesize that earlier warning and proper identification of patients at risk will decrease RED events, this has yet to be shown. Fourth, although the automated model achieves a c-statistic of 0.85, there is a moderate false positive rate. However, given the severity of RED events, we accept the false positive rate in exchange for greater model sensitivity. More work is necessary to avoid activating overburdened clinical staff for false alerts. Fifth, there may be some difficulty generalizing “high risk floors”, although institutions can determine the rate of RED for each floor and establish which areas comprise the top 15% of events. Finally, our model uses data derived from a comprehensive EMR, so it may only be useful in such settings. However, the deployment of integrated EMRs in hospitals has been accelerating greatly due to recent federal investments in health information technology and is expected to continue over the next 5 to 10 years [25-27]. While our model has robust predictive capabilities, we believe employing additional technologies such as natural language processing may further improve prediction.
Another area of promise involves more sophisticated adverse drug event detection software to further classify risk and improve prediction of poor hospital outcomes.
One in 100 hospitalized medical patients experienced RED events, among the most serious of all adverse patient safety outcomes. The novel, EMR-based model we developed was better at predicting these serious adverse events compared to prior risk models and the human judgment based RRT approach. While formal prospective implementation and evaluation of such a computerized RED event risk detection strategy is needed in the form of a controlled trial, this automated prediction model could be a powerful tool in the effort to reduce out of ICU CPA, unplanned transfers to the ICU, and death. Models such as ours may foreshadow higher level “meaningful use” of EMRs to improve inpatient outcomes.
ABG: Arterial blood gas; ARC: Acute respiratory compromise; CBC: Complete blood count; CPA: Cardiopulmonary arrest; CT: Computed tomography; DNR: Do not resuscitate; ECG: Electrocardiogram; ED: Emergency department; EMR: Electronic medical record; ICU: Intensive care unit; ISMP: Institute for Safe Medication Practices; IV: Intravenous; LOC: Level of consciousness; MEWS: Modified early warning score; NPV: Negative predictive value; PPV: Positive predictive value; RED: Resuscitation events and death; RRT: Rapid response team
The authors of this study declare no competing interests with regard to this publication.
CAC and RA participated in the study concept and design, and acquisition of data. CAA, CAC, EAH, JJS, CEG, SZ and RA participated in the analysis and interpretation of the data. CAA, CAC, LC and RA participated in drafting the manuscript. CAA, CAC, SZ, EAH, JJS, CEG, LC and RA provided critical revisions of the manuscript. CAA, CAC, SZ and RA participated in the statistical analysis of the data. JJS and RA provided administrative, technical and material support for the research. RA provided supervision. All authors read and approved the final manuscript.
This work was funded by the Parkland Health & Hospital System. RA and CC were additionally supported by the Commonwealth Fund Grant Number 20100323, titled “Harnessing EMR Data to Reduce Readmissions: Developing and Validating a Real Time Predictive Model.” CAA was supported in part by UT-STAR, NIH/NCATS Grant Number KL2 RR024983. The content is solely the responsibility of the authors and does not necessarily represent the official views of UT-STAR, UT Southwestern Medical Center and its affiliated academic and health care centers, the National Center for Advancing Translational Sciences, or the National Institutes of Health. SZ was supported in part by National Institutes of Health Grant Number UL1 RR024982, titled, “North and Central Texas Clinical and Translational Science Initiative”.
We would also like to thank Adeola Jaiyeola, MD, MHSc and Brett Moran, MD for their participation in protocol development and manuscript revision. We also want to thank Jan Ross, M.S. for copyediting this manuscript.
Buist MD, Jarmolowski E, Burton PR, Bernard SA, Waxman BP, Anderson J: Recognising clinical instability in hospital patients before cardiac arrest or unplanned admission to intensive care. A pilot study in a tertiary-care hospital. Med J Aust 1999, 171(1):22-25.
Amarasingham R, Moore BJ, Tabak YP, Drazner MH, Clark CA, Zhang S, Reed WG, Swanson TS, Ma Y, Halm EA: An automated model to identify heart failure patients at risk for 30-day readmission or death using electronic medical record data.
Kho A, Rotz D, Alrahi K, Cardenas W, Ramsey K, Liebovitz D, Noskin G, Watts C: Utility of commonly captured data from an EHR to identify hospitalized patients at risk for clinical deterioration. AMIA. Chicago, Illinois: Annual Symposium Proceedings/AMIA Symposium; 2007:404-408.
Jt Comm J Qual Patient Saf 2007, 33(9):569-574.
Get With The Guidelines-Resuscitation.
CMS EHR Incentive Program.