Surgeries and other procedures can influence the risk of death in hospital. All published scales that predict post-operative death risk require clinical data and cannot be measured using administrative data alone. This study derived and internally validated an index that can be calculated using administrative data to quantify the independent risk of hospital death after a procedure.
For all patients admitted to a single academic centre between 2004 and 2009, we estimated the risk of all-cause death using the Kaiser Permanente Inpatient Risk Adjustment Methodology (KP-IRAM). We determined whether each patient underwent one of 503 commonly performed therapeutic procedures using Canadian Classification of Interventions codes and whether each procedure was emergent or elective. Multivariate logistic regression modeling was used to measure the association of each procedure-urgency combination with death in hospital independent of the KP-IRAM risk of death. The final model was modified into a scoring system to quantify the independent influence each procedure had on the risk of death in hospital.
A total of 275 460 hospitalizations were included (137 730 derivation, 137 730 validation). In the derivation group, the median expected risk of death was 0.1% (IQR 0.01%-1.4%) and 4013 (2.9%) hospitalizations ended in death. Fifty-six distinct procedure-urgency combinations entered our final model, resulting in Procedural Index for Mortality Risk (PIMR) score values ranging from -7 to +11. In the validation group, the PIMR score significantly predicted the risk of death by itself (c-statistic 67.3%, 95% CI 66.6-68.0%) and when added to the KP-IRAM model (c-statistic improved significantly from 0.929 to 0.938).
We derived and internally validated an index that uses administrative data to quantify the independent association of a broad range of therapeutic procedures with risk of death in hospital. This scale will improve risk adjustment when administrative data are used for analyses.
Surgeries and procedures are major functions of hospitals that importantly influence patient outcomes and hospital performance. Procedural outcomes are often used to compare surgeons, clinical divisions, hospitals, and health jurisdictions. Many different types of surgeries and procedures exist in different specialties, involving very different patient populations. As a result, the influence of different types of procedures on hospital outcomes can vary greatly.
Quantifying the independent influence of a broad range of different types of procedures on outcomes would allow analysts, administrators, and researchers to measure, compare, and adjust for the importance of each procedure. Six indexes have been developed to quantify the risk of post-operative death after a range of surgeries (Table 1) [1-6]. Each of these indexes, however, requires clinical information that is usually unavailable in routinely collected administrative data.
Table 1. Summary of previous indexes predicting risk of death following surgery
In this study, we derived and internally validated an index to measure the influence of a broad range of surgeries on in-hospital mortality. Our goal was to quantify the independent association of all procedures with the risk of death in hospital. To do this, we first grouped procedures based on administrative codes and the procedure's urgency status and then determined which of these procedure-urgency groups were associated with risk of death in hospital after adjusting for factors that are highly predictive of this outcome. We then created a scoring system to quantify the independent association of significant procedures with risk of death in hospital. This index can be calculated using administrative data and estimates the risk of death in hospital from these procedures that is independent of other factors associated with this outcome. It can be used to help risk-adjust analyses using administrative data that have death in hospital as an outcome. Such analyses could be done to identify factors independently associated with death in hospital and, in some situations, compare quality of care between institutions.
This study took place at The Ottawa Hospital (TOH), a tertiary-care teaching facility with three sites that averaged 20 000 admissions annually during the study period. TOH functions within a publicly funded health care system. TOH is the sole regional provider of trauma care, thoracic surgery, and neurosurgical interventions, and provides most of the region's oncological care.
We included all admissions to the hospital (including same-day surgeries) between 1 April 2004 and 1 April 2009. "Same-day surgeries" included patients who had their surgery on the same day on which they were admitted to hospital. These patients were typically discharged home the same day but may have been kept in hospital if complications occurred or if additional monitoring was required. We started patient recruitment in April 2004 to ensure that our hospital had at least two years of experience coding procedures with the Canadian Classification of Interventions (CCI) coding system (which was introduced in April 2002). Patient recruitment ended in April 2009 (the last complete year of data available when the analyses were conducted). To apply the Kaiser Permanente In-patient Risk Adjustment Model (KP-IRAM) - the method used to adjust for other risk factors associated with death in hospital - we excluded all patients with age ≤ 15 years at admission, all delivery-related obstetrical admissions, and those who were transferred to or from TOH. Throughout this study, the unit of analysis was the hospitalization.
We used multiple binomial logistic regression to derive our index. We chose death in hospital as the model outcome because it is accurately recorded and is important to all potential users of the index. There were a total of 4013 hospital deaths (2.9% of all admissions) in the derivation cohort. Our logistic model could therefore test a maximum of 400 procedures or surgeries (i.e. 10 deaths per exposure) to safely avoid problems with over-fitting and model instability.
We identified candidate procedures using their Canadian Classification of Interventions (CCI) code. The CCI system contains more than 18,000 unique codes. We therefore grouped procedures using the first five alpha-numerics of each code (which identifies the anatomical area and the intervention type) and limited our study to therapeutic procedures (i.e. CCI section 1). We used the admission status of the hospitalization (i.e. elective vs. non-elective admission) to classify the procedure urgency since urgency is an important and independent predictor of post-procedural outcomes [9-14]. Procedures that could not be performed electively (such as cardiac resuscitation, implantation of an internal device in the thoracic descending aorta, and control of bleeding in the thoracic cavity) were classified as "non-elective" regardless of the admission status of the hospitalization.
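The grouping rule described above can be sketched in a few lines. This is only an illustration, not the study's code (the analyses were run in SAS): the function names, the example code string, and the contents of ALWAYS_NON_ELECTIVE are hypothetical placeholders, and the sketch assumes that punctuation in a stored CCI code can simply be ignored when taking the first five alphanumeric characters.

```python
import re

# Hypothetical placeholder: 5-character groups that can never be elective.
# The study's actual list is not reproduced here.
ALWAYS_NON_ELECTIVE = {"1ZZ99"}


def procedure_group(cci_code):
    """Return the first five alphanumeric characters of a CCI code."""
    alnum = re.sub(r"[^0-9A-Za-z]", "", cci_code).upper()
    if len(alnum) < 5:
        raise ValueError(f"code too short to group: {cci_code!r}")
    return alnum[:5]


def procedure_urgency_key(cci_code, elective_admission):
    """Return (group, urgency), or None for non-therapeutic codes.

    Urgency is taken from the admission status, except for groups that
    are flagged as never elective.
    """
    group = procedure_group(cci_code)
    if not group.startswith("1"):  # CCI section 1 = therapeutic procedures
        return None
    if elective_admission and group not in ALWAYS_NON_ELECTIVE:
        return (group, "elective")
    return (group, "non-elective")
```

Each hospitalization's coded procedures would then map to a set of such keys, which become the candidate binary covariates.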
There were 3984 unique procedure-urgency combinations during the study period. Since this exceeded the maximum number of variables allowed in our model without overfitting (n = 400), we used three filters to exclude procedures. First, we only included procedures that were conducted on the day of the principal procedure (defined as the procedure considered by the health records analyst to be most significant during the patient's hospital stay). In 5% of hospitalizations, coded procedures occurred on more than one day. In such cases, only procedures that occurred on the day of the principal procedure were considered. Second, procedures had to be conducted at least once per month at our hospital during the study period (independent of its urgency status). Finally, the p-value for the association of the procedure with death in hospital (after adjusting for risk of death in-hospital measured with KP-IRAM) had to be less than 0.5.
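The variable budget and two of the screening filters can be illustrated as follows. This is a rough sketch under stated assumptions: "at least once per month" is read here as an average of one or more per month over the 60-month study period (the text could also mean every calendar month), the p-values are assumed to come from separately fitted KP-IRAM-adjusted models, and all function names are hypothetical.

```python
from collections import Counter

STUDY_MONTHS = 60  # 1 April 2004 to 1 April 2009


def max_candidate_variables(n_events, events_per_variable=10):
    """Rule-of-thumb ceiling on covariates: 10 outcome events per variable."""
    return n_events // events_per_variable


def frequent_procedures(procedure_groups, months=STUDY_MONTHS):
    """Keep groups performed, on average, at least once per month.

    Assumption: frequency is averaged over the whole study period.
    """
    counts = Counter(procedure_groups)
    return {group for group, n in counts.items() if n / months >= 1}


def screen_by_p_value(p_values, threshold=0.5):
    """Keep procedure-urgency keys whose adjusted p-value is below threshold."""
    return {key for key, p in p_values.items() if p < threshold}
```

With 4013 deaths in the derivation cohort, max_candidate_variables(4013) gives 401, in line with the ceiling of roughly 400 used in the study.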
Adjusting for Risk of Death in Hospital
To adjust for risk of death in hospital due to patient and hospitalization factors, we used the Kaiser Permanente In-patient Risk Adjustment Model (KP-IRAM). This model was derived and internally validated on almost 260,000 hospitalizations at 17 hospitals belonging to the Kaiser Permanente Health Plan and was subsequently validated at our hospital. The KP-IRAM includes six covariates: patient age; patient sex; admission urgency (i.e. elective or emergent) and service (i.e. medical or surgical); admission diagnosis; severity of acute illness as measured by the Laboratory-based Acute Physiology Score (LAPS); and chronic comorbidities measured by the Comorbidity Point Score (COPS). Using the admission diagnosis, hospitalizations were grouped into "Primary Conditions," and a separate logistic regression model was created for each group. Interaction terms between age, LAPS, and comorbidity score were included. The model had excellent discrimination (c-statistic = 0.88) and calibration (p-value of Hosmer-Lemeshow statistic for the entire cohort was 0.66) for all-cause death in hospital.
We made three minor modifications to the KP-IRAM for this study. First, Canada switched from the International Classification of Diseases (ICD) 9-CM system (used in the KP-IRAM) to the ICD-10-CA system in 2002. We therefore used tables (provided by the Canadian Institute for Health Information) to translate ICD-9-CM admission diagnoses to ICD-10-CA codes. Second, we measured chronic comorbidities using the Elixhauser Index instead of the COPS because the KP-IRAM performed equally well using either comorbidity index. Finally, the KP-IRAM was calculated on the day of the procedure (rather than at admission) for people who had one of the procedures included in the model. This model was used to estimate each patient's risk of death in hospital at the time of the procedure (expressed as a number that ranged between 0 and 1).
Creation of the Procedural Index for Mortality Risk (PIMR) Score
We randomly separated patients into equally sized derivation and validation groups. Using the derivation group, we ran a binary logistic regression model with death in hospital as the outcome and the KP-IRAM estimated risk as the adjusting covariate. The index day for patients undergoing one of the procedures considered for the model was the day of the procedure. For all other patients, the index day was the day of admission. Values of all covariates for the KP-IRAM model were those on the index day. We used stepwise variable selection to identify which candidate procedure-urgency combinations were independently associated with death in hospital. Surgeries with a 2-sided p-value less than 0.05 were retained in the model.
We then used the methods described by Sullivan et al. to modify the parameter estimates of this regression model into an index. The number of points assigned to each procedure equaled its regression coefficient divided by the coefficient in the model with the smallest absolute value. We rounded this quotient to the nearest whole number. This number translated the parameter estimates into units relative to the procedure with the smallest, independently significant association with death in hospital. Therefore, the association of a procedure assigned two points was twice as important for predicting risk of death in hospital as a procedure with one point. Each person's total Procedural Index for Mortality Risk (PIMR) score was then calculated by summing the points of all significant procedural groups for which they had been coded.
When calculating the PIMR score, we tallied up only those procedures that were performed on the index day (i.e. the day on which the principal procedure was conducted). Procedures done on other days did not influence the PIMR score. The PIMR score also did not capture whether or not the procedure was the first procedure conducted during the hospitalization.
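The point assignment and score tally described above can be sketched as follows. The coefficients and procedure keys are invented for illustration and are not the study's estimates. Two details are assumptions: ties at exactly .5 are rounded away from zero, and a procedure group coded more than once on the index day contributes its points only once.

```python
import math


def round_half_away(x):
    """Round to the nearest integer, with ties going away from zero (assumed)."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


def assign_points(coefficients):
    """Convert logistic regression coefficients to integer points.

    Each coefficient is divided by the smallest absolute coefficient in
    the model and the quotient is rounded to the nearest whole number.
    """
    base = min(abs(b) for b in coefficients.values())
    if base == 0:
        raise ValueError("smallest absolute coefficient must be non-zero")
    return {key: round_half_away(b / base) for key, b in coefficients.items()}


def total_pimr(index_day_procedures, points):
    """Sum the points of scored procedure-urgency groups on the index day."""
    return sum(points.get(key, 0) for key in set(index_day_procedures))


# Hypothetical coefficients for three procedure-urgency combinations.
example_coefficients = {
    ("1AA00", "non-elective"): 0.30,
    ("1BB00", "non-elective"): 0.62,
    ("1CC00", "elective"): -0.95,
}
example_points = assign_points(example_coefficients)
```

A hospitalization with no scored procedure on the index day receives a total of 0, which matches the observation that most admissions have a PIMR score of 0.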
Assessment of the PIMR score
In the validation group, we described the distribution of the PIMR score and used logistic regression to measure the association of the PIMR score alone with risk of death in hospital.
We then measured the influence of the PIMR score on risk of death in hospital independent of other factors associated with this outcome. "Discrimination" measures a model's ability to distinguish between patients who did and did not die in hospital and was measured using the c-statistic. "Calibration" measures the accuracy of a model's predicted risk of death and was measured by dividing the study cohort into deciles and strata based on the estimated risk of death. Within each decile and stratum, observed and expected death rates were deemed similar if the 95% confidence interval around the former (calculated using exact methods) included the latter. Overall calibration was summarized using the Hosmer-Lemeshow statistic. Table cells containing fewer than five observations were censored to maintain patient confidentiality.
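As an illustration of these two measures, the sketch below computes a c-statistic by direct pairwise comparison and checks calibration within one risk group using an exact binomial interval. It assumes the Clopper-Pearson interval, one common "exact" method; the study does not state here which exact method was used. The function names are hypothetical, and the simple binomial sums are only suitable for modest group sizes.

```python
from math import comb


def c_statistic(risks, died):
    """Proportion of (death, survivor) pairs in which the death had the
    higher predicted risk; ties count as one half."""
    events = [r for r, d in zip(risks, died) if d]
    nonevents = [r for r, d in zip(risks, died) if not d]
    if not events or not nonevents:
        raise ValueError("need at least one event and one non-event")
    score = 0.0
    for e in events:
        for n in nonevents:
            score += 1.0 if e > n else 0.5 if e == n else 0.0
    return score / (len(events) * len(nonevents))


def _binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(func, target, increasing):
    """Solve func(p) = target for p in [0, 1], with func monotone in p."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if (func(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(deaths, n, alpha=0.05):
    """Clopper-Pearson interval for an observed death rate of deaths / n."""
    lower = 0.0 if deaths == 0 else _bisect(
        lambda p: 1 - _binom_cdf(deaths - 1, n, p), alpha / 2, True)
    upper = 1.0 if deaths == n else _bisect(
        lambda p: _binom_cdf(deaths, n, p), alpha / 2, False)
    return lower, upper


def calibrated(deaths, n, expected_rate):
    """True if the exact 95% CI around the observed rate covers the expected rate."""
    lower, upper = exact_ci(deaths, n)
    return lower <= expected_rate <= upper
```

Applying calibrated to each decile or stratum of predicted risk reproduces the "deemed similar" rule described above.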
In the validation group, we then compared the predictive performance of models containing the KP-IRAM with and without the PIMR score. To do this, we used two statistical measures: the Integrated Discrimination Improvement (IDI) and the Net Reclassification Improvement (NRI). The IDI is the discrimination slope (the mean predicted risk in patients with the event minus that of patients without the event) of a model with the KP-IRAM and PIMR as independent predictors minus the discrimination slope of a model with the KP-IRAM alone as the independent predictor. An IDI above zero indicates improved discrimination (i.e. a larger separation in mean predicted risk between events and nonevents) with the addition of the PIMR. The NRI represents the net proportion of correct reclassification (with correct reclassification defined as the predicted risk moving upwards for events and downwards for non-events) among events and non-events (calculated separately and then summed) when the predicted risk from the model with KP-IRAM and PIMR is compared to that from the model with KP-IRAM alone. We also calculated the net number of correct reclassifications when the PIMR was added to the KP-IRAM.
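The definitions above translate directly into code. The sketch below is a minimal illustration that assumes a category-free NRI (any upward or downward movement in predicted risk counts as a reclassification); the function names and the toy data are hypothetical and are not taken from the study.

```python
def _mean(values):
    return sum(values) / len(values)


def discrimination_slope(risks, died):
    """Mean predicted risk among events minus that among non-events."""
    events = [r for r, d in zip(risks, died) if d]
    nonevents = [r for r, d in zip(risks, died) if not d]
    return _mean(events) - _mean(nonevents)


def idi(old_risks, new_risks, died):
    """Integrated Discrimination Improvement of the new model over the old."""
    return (discrimination_slope(new_risks, died)
            - discrimination_slope(old_risks, died))


def nri(old_risks, new_risks, died):
    """Return (net proportion, net number) of correct reclassifications.

    Correct means risk moving up for events and down for non-events.
    The proportion is computed separately in each group and then summed.
    """
    n_events = n_nonevents = 0
    net_events = net_nonevents = 0
    for old, new, d in zip(old_risks, new_risks, died):
        if d:
            n_events += 1
            net_events += (new > old) - (new < old)
        else:
            n_nonevents += 1
            net_nonevents += (new < old) - (new > old)
    proportion = net_events / n_events + net_nonevents / n_nonevents
    return proportion, net_events + net_nonevents


# Toy data: 2 events (both moved the wrong way) and 8 non-events
# (6 moved down correctly, 2 moved up incorrectly).
died = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
old = [0.2] * 10
new = [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.3, 0.3]
net_proportion, net_count = nri(old, new, died)
```

In this toy example the net proportion is -0.5 while the net count is +2. Because the two groups are weighted equally in the proportion but non-events are far more numerous, the two summaries can have opposite signs.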
SAS 9.2 (Cary, NC) was used for all analyses. The study was approved by The Ottawa Hospital Research Ethics Board.
There were 369 588 admissions to The Ottawa Hospital between 1 April 2004 and 1 April 2009. Of these, 93 971 hospitalizations were excluded from this study because patients were less than 15 years of age (n = 36 820), patients were transferred from or to another hospital (n = 12 931), or admissions were obstetrical and delivery-related (n = 44 220). We excluded another 157 admissions because they were missing a primary condition group (required to calculate the KP-IRAM). This left a total of 275 460 hospital admissions (137 730 in each of the derivation and validation groups) involving 172 396 unique individuals. A description of patients in the derivation cohort is provided in Table 2. The validation group did not differ significantly from the derivation group (see additional file 1).
Table 2. Description of study hospitalizations in derivation cohort
In the entire cohort, a total of 1939 therapeutic procedures were coded during the study period. Of these, 1436 procedures were excluded because fewer than one procedure per month was performed during the study period. The remaining 503 procedures included a total of 938 procedure-urgency combinations. After adjusting for the Kaiser Permanente In-patient Risk Adjustment Model (KP-IRAM) death risk estimate, 726 of these procedure-urgency combinations had a p-value for their association with death in hospital that exceeded 0.5 in the derivation cohort and were therefore excluded. This left a total of 212 procedure-urgency combinations (including 168 individual surgeries), expressed as binary (i.e. 1/0) variables, that were offered to the logistic model (see additional file 2).
Additional file 2. List of the 212 unique procedure-urgency combinations offered to the multivariate logistic model. Additional file 2 contains the frequency (in the derivation set), description, and 5-digit CCI code of the 212 procedure-urgency combinations that were offered to the multivariate logistic model. The p-value for the association of each of these 212 procedure-urgency combinations with death in hospital was < 0.5 (after adjusting for the risk of death in-hospital, as measured with KP-IRAM).
After adjusting for important patient and admission factors, 56 procedure-urgency combinations (comprising 52 individual procedures) were independently associated with death in hospital (Table 3). 37 emergent and eight elective procedures were independently associated with an increased risk of death in hospital, while four emergent and seven elective procedures were protective. In the validation set, there were 22 664 (16.4%) admissions where the patient underwent at least one PIMR procedure, with 83% of these procedures occurring within the first three days of the hospitalization. Procedures having the strongest association with death in hospital included cardiac resuscitation, ventriculectomy, pericardial drainage, and pelvic irradiation. A full description of each procedure that was independently associated with death in hospital is given in Additional File 3.
Table 3. Procedures independently associated with death in hospital
Additional file 3. Full description of all procedure-urgency combinations independently associated with in-hospital death. Additional file 3 contains the frequency and full CCI code and description of all procedure-urgency combinations independently associated with in-hospital death (i.e. included in the PIMR index), as observed in the derivation set.
Four procedures were independently associated with risk of death in hospital regardless of whether the procedure was done emergently or electively (Table 4). In two cases, the elective version of the procedure was assigned more points (indicating a higher risk of death in hospital) than the emergent version of the procedure.
Table 4. Procedures independently associated with risk of death in-hospital regardless of procedure urgency
Parameter estimates for procedures in the final logistic model were modified into the Procedural Index for Mortality Risk (PIMR) score (Table 3). The PIMR score for individual procedures ranged from -7 to +11. Since 84% of admissions had none of the included procedures, most hospitalizations had a total PIMR score of 0 (Figure 1, left axis). The risk of death in hospital was significantly associated with the PIMR score (Figure 1, right axis). By itself, the PIMR score was moderately discriminative for death in hospital (c-statistic 67.3%, 95% CI 66.6%-68.0%).
Figure 1. Frequency distribution of the total Procedural Index for Mortality Risk (PIMR) score among validation admissions. The horizontal axis presents the total PIMR score. The bars and left vertical axis present the percent of hospitalizations with each total PIMR score. Individual PIMR scores were grouped to contain at least 0.5% of all admissions. The line and right vertical axis present the observed number of deaths (with 95% confidence intervals) at each PIMR score.
The total PIMR score significantly changed the expected risk of death in hospital beyond that estimated by the KP-IRAM (Figure 2). The total PIMR score also significantly improved the ability to predict risk of in-hospital death beyond that generated by the KP-IRAM. Model discrimination improved, as indicated by the c-statistic (increased from 0.929 [95% CI 0.926-0.932] to 0.938 [0.935-0.941]) and the Integrated Discrimination Improvement (IDI) (0.04327, 95% CI 0.0384-0.0482; p < .0001). Model calibration (Figure 3) did not change (Hosmer-Lemeshow fit statistic decreased from 37.56 to 36.51). The Net Reclassification Improvement (NRI) analysis showed that although the overall net proportion of correct reclassification was negative (-18.4%), the overall net number of correct reclassifications was positive (+17 923 or 13% of the entire cohort, Table 5).
Figure 2. Effect of adding the Procedural Index for Mortality Risk (PIMR) score to the Kaiser Permanente Inpatient Risk Adjustment Methodology (KP-IRAM) on predicted risk of death in hospital. This graph presents the expected probability of death in hospital (vertical axis) for varying PIMR scores (horizontal axis). Risks are presented for people whose expected risk of death in hospital (based on the KP-IRAM) was at the 25th percentile (0.01%, solid line), 50th percentile (0.11%, long-dashed line), and 75th percentile (1.35%, short-dashed line).
Figure 3. Calibration of KP-IRAM and PIMR to predict death in hospital. These figures compare observed and expected death rates when the validation group was divided into expected risk deciles (top) and strata (bottom). The decile plot presents observed mortality rates with 95% confidence intervals with those in red significantly differing from expected.
Table 5. Results of the Net Reclassification Improvement (NRI) analysis
We derived and internally validated an index that uses administrative data to quantify the relative contribution of a broad range of therapeutic procedures to the risk of death in hospital. We identified 52 procedures that (after adjusting for a robust and validated hospital mortality model) were significantly associated with the risk of death in hospital. We modified this model into an index that reflects the independent contribution of each procedure to the risk of death in hospital. By itself, and when added to an accurate model to predict hospital mortality, the total Procedural Index for Mortality Risk (PIMR) score significantly predicted risk of death in hospital.
The importance of surgical interventions on hospital outcomes is reflected by the large number of indexes that use patient and hospitalization factors to predict the risk of post-procedural death (Table 1) [1-6]. The clinical variables in these indexes, along with their simplicity, increase their face validity to practicing clinicians. However, these clinical variables prohibit calculation of these indexes using administrative data. To develop our index, we started with a validated, highly accurate model to predict hospital mortality risk in all hospital patients. We then determined the risk of death after a broad range of procedures independent of that predicted from the KP-IRAM. Both by itself and when added to the KP-IRAM model, the PIMR was significantly associated with the risk of death in hospital.
The PIMR would primarily be used in analyses involving administrative data. Expressing this risk as a simple score facilitates our understanding of the relative importance of various interventions on death in hospital. When combined with the KP-IRAM, the PIMR had excellent discrimination and calibration for predicting risk of death in hospital. It is notable that the discrimination achieved with the KP-IRAM and PIMR was similar to that achieved using clinical based models (Table 1). The PIMR will allow researchers and administrators to gauge patient and procedural complexity of individual surgeons, services, or hospitals for descriptive or comparative purposes and will also let analysts adjust for the influence of a large range of therapeutic procedures on risk of hospital mortality.
The independent association between many of the PIMR procedures and risk of hospital death may reflect unresolved confounding of patient or hospitalization factors. The significance of several procedures (e.g. cardiac resuscitation) is likely due to important clinical events (e.g. cardiac arrest) that are identified by the procedure code and are not captured by the KP-IRAM. Further work is required to determine how much mortality risk is due to the procedure and how much is due to other underlying patient factors.
The addition of the PIMR to the KP-IRAM model significantly improved the ability to predict hospital mortality. The absolute increase of the model's c-statistic was small (0.009 or 0.9%). Several studies have shown that the overall, sequential improvement of model performance decreases as more and more variables are added [23,24]. However, the c-statistic of the KP-IRAM was already very high without the PIMR score (92.9%). With the PIMR, the c-statistic increased by more than 10% of the distance between the KP-IRAM and perfect discrimination. Together with the results presented in Figure 1, this indicates the strength of the PIMR for predicting risk of death with or without other covariates associated with death risk in hospital.
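The "more than 10% of the distance" figure follows from the reported c-statistics, taking perfect discrimination as a c-statistic of 1 (the subscripted symbols are notation introduced here for illustration):

```latex
\[
\frac{c_{\text{KP-IRAM+PIMR}} - c_{\text{KP-IRAM}}}{1 - c_{\text{KP-IRAM}}}
  = \frac{0.938 - 0.929}{1 - 0.929}
  = \frac{0.009}{0.071}
  \approx 0.127
\]
```

That is, adding the PIMR closed roughly 13% of the gap between the KP-IRAM alone and perfect discrimination.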
We believe that two steps could greatly improve the PIMR. First, the PIMR relies on procedure codes whose accuracy has not been validated. Our study's objective was to derive and validate an index that determines the independent influence of various procedures on hospital mortality. Strictly speaking, however, the PIMR measures the independent influence of codes for various procedures - rather than the procedures themselves - on hospital mortality. Without knowing the accuracy of each code for its respective procedure, we are uncertain how strong a surrogate each code is for the actual procedure. Before one uses the PIMR for individual patient risk prediction, the accuracy of the procedure codes contained in the PIMR should be validated.
The second major limitation of the PIMR is its imputation of procedural urgency using admission urgency status. For most hospitalizations, admission and procedural urgency will be identical but situations could arise in which they would differ. For example, consider a patient admitted electively for a hip replacement who has an acute myocardial infarction requiring an emergent angioplasty. In this case, the angioplasty urgency would be misclassified as an elective procedure. We believe that this bias explains why the number of points assigned to two elective procedures exceeded that for their emergent counterpart (Table 4). The PIMR could be improved by using an accurate classification of procedural urgency.
There are other limitations to the PIMR. First, the PIMR requires that surgical procedures are coded using the Canadian Classification of Interventions (CCI). Without validated translation tables to other procedural coding systems, this limits its use to Canadian institutions. Second, the PIMR was derived and validated in a single hospital. While objective and universal criteria are used to code procedures, it is possible that local coding practices could change the PIMR's validity in other patient populations. Third, most procedures are not included in the PIMR because they were not independently associated with risk of death in hospital. As a result, the PIMR should be used as an adjunct to other factors associated with risk of death in hospital - such as those in the KP-IRAM - to compare outcomes after various surgeries. Researchers should exercise some caution if this index is used when inferring institutional quality of care issues using hospital mortality. Some of the components in the PIMR (such as heart resuscitation) could result from poor quality of care, the adjustment of which could hide such problems.
Finally, our analyses did not include surgeries that were infrequently conducted at our hospital.
We have derived and internally validated an index to express the independent association of a broad range of procedures with risk of death in hospital. When this is added to a validated hospital death risk index, we can accurately predict post-procedural risk of death.
The authors declare that they have no competing interests.
CvW conceived of the study; directed the study design and statistical analysis; and drafted the manuscript. JW participated in the study design; performed the statistical analysis; created the tables, additional files, and figures; and edited the manuscript. CB performed the literature search and extracted the information in Table 1. AF provided the study data, participated in the study design, and reviewed the manuscript for important clinical and intellectual content. All authors have read and approved the final manuscript.
This study was supported by the Department of Medicine, University of Ottawa.
Tran Ba LP, du Montcel ST, Duron JJ, Levard H, Suc B, Descottes B, Desrousseaux B, Hay JM: Elderly POSSUM, a dedicated score for prediction of mortality and morbidity after major colorectal surgery in older patients.
Alves A, Panis Y, Mantion G, Slim K, Kwiatkowski F, Vicaut E: The AFC score: validation of a 4-item predicting score of postoperative mortality after colorectal resection for cancer or diverticulitis: results of a prospective multicenter study in 1049 patients.
Prytherch DR, Whiteley MS, Higgins B, Weaver PC, Prout WG, Powell SJ: POSSUM and Portsmouth POSSUM for predicting mortality. Physiological and Operative Severity Score for the enUmeration of Mortality and morbidity.
JAMA: The Journal of the American Medical Association 2007, 297:71-6.
Alves A, Panis Y, Mathieu P, Mantion G, Kwiatkowski F, Slim K, Association Française de Chirurgie: Postoperative mortality and morbidity in French patients undergoing colorectal surgery: results of a prospective multicenter study.