Open Access Research article

Comparison of Rx-defined morbidity groups and diagnosis-based risk adjusters for predicting healthcare costs in Taiwan

Raymond NC Kuo1,3 and Mei-Shu Lai2,3*

Author Affiliations

1 Institute of Health Care Organization Administration, College of Public Health, National Taiwan University, Taipei, Taiwan

2 Institute of Preventive Medicine, College of Public Health, National Taiwan University, Taipei, Taiwan

3 Center for Health Insurance Research, College of Public Health, National Taiwan University, Taipei, Taiwan

BMC Health Services Research 2010, 10:126 doi:10.1186/1472-6963-10-126


The electronic version of this article is the complete one and can be found online at: http://www.biomedcentral.com/1472-6963/10/126


Received: 11 January 2010
Accepted: 17 May 2010
Published: 17 May 2010

© 2010 Kuo and Lai; licensee BioMed Central Ltd.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Background

Medication claims are commonly used to calculate the risk adjustment for measuring healthcare cost. The Rx-defined Morbidity Groups (Rx-MG) which combine the use of medication to indicate morbidity have been incorporated into the Adjusted Clinical Groups (ACG) Case Mix System, developed by the Johns Hopkins University. This study aims to verify that the Rx-MG can be used for adjusting risk and for explaining the variations in the healthcare cost in Taiwan.

Methods

The Longitudinal Health Insurance Database 2005 (LHID2005) was used in this study. The year 2006 was chosen as the baseline to predict healthcare cost (medication and total cost) in 2007. After excluding cases with discontinued enrolment, the final sample comprised 793 239 enrolees (81%). Two kinds of models were built to predict cost: a concurrent model and a prospective model. The predictors included age, gender, Aggregated Diagnosis Groups (ADGs, diagnosis-defined morbidity groups), and Rx-defined Morbidity Groups. Multivariate OLS regression was used for cost prediction modelling.

Results

For total cost, the concurrent model adjusted for Rx-MGs and controlled for age and gender had a higher predictive R-square (R2 = 0.618) than the model adjusted for ADGs (R2 = 0.411). The model combining Rx-MGs and ADGs performed best for concurrently predicting total cost (R2 = 0.650). For prospectively predicting total cost, the model combining Rx-MGs and ADGs (R2 = 0.382) performed better than the models adjusted by Rx-MGs only (R2 = 0.360) or ADGs only (R2 = 0.252). Similarly, for pharmacy cost, the concurrent model adjusted for Rx-MGs performed better (R2 = 0.615) than the model adjusted for ADGs (R2 = 0.431). The model combining Rx-MGs and ADGs performed best in both concurrently and prospectively predicting pharmacy cost (R2 = 0.638 and 0.505, respectively). The prospective models improved markedly when adjusted by prior cost.

Conclusions

The medication-based Rx-defined Morbidity Groups were useful in predicting pharmacy cost as well as total cost in Taiwan. Combining medication and diagnosis information as adjusters could arguably be the best method for explaining variations in healthcare cost.

Background

Diagnosis information is commonly used for defining morbidities and for estimating the risk of healthcare utilization. Diagnosis-based comorbidity scales and risk adjustment tools, such as the Charlson Comorbidity Index [1], the Elixhauser index [2], the Johns Hopkins Adjusted Clinical Group (ACG) case-mix system [3], and the Diagnostic Cost Group Hierarchical Condition Category (DCG/HCC) model [4,5], have been verified as effective for adjusting healthcare cost risks [6-11]. Although administrative data seem to be comprehensive, efficient, and low cost, and are likely to avoid several common biases associated with primary data, the accuracy and quality of the diagnosis coding remain suspect [12-15]. Previous studies found that diagnoses identified from administrative data were highly specific but varied greatly in sensitivity, and therefore recommended that all available sources of data (e.g. prescription claims databases) be included in order to overcome the potential limitations of a single data source [15]. Pine et al. also argued that risk adjustment based entirely on administrative data is imperfect because these data do not discriminate between comorbidities and complications, and because the limited number of secondary diagnoses within the data may not properly reflect the sickest patients [16].

Prescription claim data have several additional strengths for capturing morbidity conditions compared to diagnosis data. Healthcare purchasers (insurers) that provide a drug benefit package claim that prescription data are often more reliable, timely, complete, and less of a gamble than diagnostic data [12,13,17]. In addition, for persons with a stable, well-managed chronic disease, a medication-based risk instrument may capture their health risk even without the diagnosis information reported by the providers [17]. Several medication-based morbidity measures have been developed. The Chronic Disease Score (CDS), developed by a team of physicians, pharmacists, and health services researchers at the Center for Health Studies, Group Health Cooperative of Puget Sound (GHC), is an early model for measuring morbidity conditions based on prescription data [18]. Clark et al. then demonstrated an approach for assigning empirically derived weights to the CDS [19]. Afterwards, the CDS was revised to incorporate more drugs used for treating diseases and conditions, in order to measure health status and the risk of healthcare utilization among different types of populations [12,17,20,21]. Although these medication-based risk adjustment tools have been tested and found to be valid in predicting future healthcare utilization, most of them incorporate a coding algorithm specific to the U.S. (i.e. the medication data must contain U.S. National Drug Codes (NDC) or American Hospital Formulary Service (AHFS) drug codes), which makes studies conducted outside the U.S. operationally cumbersome.

The Johns Hopkins Adjusted Clinical Groups (ACG) system was developed to predict healthcare utilization and costs based on groupings of diagnoses [22-24]. Earlier versions of the ACG system provided the Aggregated Diagnosis Groups (ADGs; 32 diagnosis clusters) and ACGs (mutually exclusive health status categories defined by morbidity, age, and sex) of a given population based on diagnosis data. Version 7.1 of the ACG system incorporated Rx-defined Morbidity Groups (Rx-MGs) into predictive models. Unlike earlier medication-based risk adjustment tools, which use medication therapeutic classes to identify a limited set of chronic diseases or conditions, the Rx-MG algorithm first reduces nearly 90 000 U.S. NDCs to approximately 2700 units, then assigns each medication use to one of the 60 Rx-MGs based on criteria consisting of primary anatomico-physiological system, morbidity differentiation, expected duration, and severity [24,25]. For medication data collected outside the U.S., an international mapping algorithm within the ACG system also performs the Rx-MG assignment based on the WHO Anatomical Therapeutic Chemical (ATC) classification [26]. This feature makes the ACG system stand out from the other medication-based risk adjustment tools in that it can be applied in countries where the medication data contain neither NDC nor AHFS codes.

This study aimed to verify whether the Rx-MGs of the Johns Hopkins ACG system could be used for adjusting risk and for explaining the variations in healthcare cost in Taiwan. Previous research has shown diagnosis-based ADGs to be a valid morbidity measure as well as a risk adjustment instrument for the NHI claim data in Taiwan [27,28], but the application of Rx-MGs in empirical research remains absent. Although recent studies tested the Rx-MGs and found them to be valid risk adjusters within predictive models (PMs), those studies were based on limited age ranges or on populations with selected health conditions [24,29,30]. In the present study we compared the performance of Rx-MGs to ADGs and other diagnosis-based risk adjusters for predicting the (concurrent and prospective) total cost and medication cost under the NHI. The performance of the Rx-MGs models was tested with a sample that can represent the entire population. The fit of these models was also tested by age group to ensure generalizability.

Methods

Risk Adjustment Instruments

Two types of risk adjusters within the Johns Hopkins ACG system were chosen for the present study: the diagnosis-based ADGs and the medication-based Rx-MGs [24]. Studies have found Elixhauser's comorbidity index to be statistically slightly superior to the Charlson system at adjusting for comorbidity [31,32]. Therefore, both Deyo's Charlson Comorbidity Index (CCI) [33] and Elixhauser's Index [2] were adopted as competitors to the Rx-MGs. All of the morbidity groups or prescription groups measured by those instruments were treated as dichotomous variables in the predictive models. We used the ICD codes cited by Quan et al. to determine whether each diagnosis belonged to any category of Deyo's CCI or Elixhauser's Index [34]. Instead of the original coding algorithms, the enhanced ICD-9-CM coding algorithms for the Charlson and Elixhauser indices were adopted to resolve: (1) discrepancies among coding algorithms for some conditions; (2) inconsistent definitions of the 6 comorbidities shared by Deyo's and Elixhauser's original ICD-9-CM coding algorithms.

Study Populations

Taiwan launched a universal National Health Insurance (NHI) Program on March 1, 1995. As of 2007, 22.60 million of Taiwan's 22.96 million population (98.4%) were enrolled in the NHI program [35]. As of December 2008, 18 829 hospitals and healthcare providers (92% of all healthcare facilities in Taiwan) and 4180 pharmacies were contracted by the Bureau of National Health Insurance [36]. The NHI program features universal access to healthcare, healthcare of acceptable quality, and comprehensive benefits (inpatient and ambulatory care, dental services, traditional Chinese medicine therapy, surgery, examinations, laboratory tests, prescription medications, nursing care, hospital rooms, preventive services, and certain OTC drugs). These features make the NHI claim data an appropriate source for comparing the performance of diagnosis-based as well as medication-based risk adjustment instruments.

The Longitudinal Health Insurance Database 2005 (LHID2005), which consists of one million out of 25.68 million National Health Insurance enrollees in 2005, was used in this study. The LHID2005 database was derived by the Bureau of National Health Insurance (BNHI), Department of Health, and is maintained by the National Health Research Institutes (NHRI) so as to make it accessible to scientists in Taiwan for research purposes. The use of the data in this study was reviewed and approved by the NHRI. The data used in this study contain no unique patient identifier nor any information that could violate the privacy protection policy. All case IDs required for data linkage were encrypted before being released. There is no significant difference in gender distribution, age distribution, or average insured payroll-related amount between the patients in the LHID2005 and the original population [35]. This study chose 2006 as the baseline year to predict healthcare cost (medication and total cost) in 2007. The final sample size was 793 239 (81%), which excludes cases with discontinued enrolment in 2006. Because cases that were not fully enrolled in the NHI program in 2006 had less opportunity to access healthcare covered by the NHI, the costs of that group might be under-estimated. To test for model fit, the sample was randomly divided into an estimation (training) sample (476 558; 60%) and a validation (testing) sample (316 681; 40%).
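
The random division into estimation and validation samples can be illustrated with a minimal sketch. The function name `split_sample`, the fixed seed, and the toy case IDs are assumptions made for illustration; the article does not describe the software or the randomization procedure actually used.

```python
import random


def split_sample(case_ids, train_fraction=0.6, seed=0):
    """Randomly partition case IDs into estimation and validation samples.

    Hypothetical helper: the 60/40 proportion follows the article, but the
    shuffling procedure and the seed are assumptions.
    """
    shuffled = list(case_ids)
    random.Random(seed).shuffle(shuffled)  # reproducible shuffle
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Toy example with 1000 synthetic case IDs (not real LHID2005 data).
estimation, validation = split_sample(range(1000))
print(len(estimation), len(validation))
```

Every case falls into exactly one of the two samples, so coefficients estimated on one sample can be evaluated on cases the model has not seen.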

Data Analysis

Prescription information in LHID2005 covers outpatient/clinic, inpatient, and contracted (community) pharmacy claims. Diagnosis data combined the diagnosis codes derived from inpatient and outpatient/clinic claims. Studies show that truncating healthcare expenditures in predictive models provides more stable and more robust estimates than using raw dollars [24,37]. However, the cut-offs used for defining outliers in those studies generally ranged from 0.5% to 20% [38-43], or were set at a fixed amount by the researchers [17,24]. In the present study, we capped pharmacy cost and total cost at the top 1% of cases, giving maximums of USD 1846 (pharmacy) and USD 7538 (total) in 2006, and USD 2062 and USD 9446 in 2007, respectively.
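
As a rough illustration of capping costs at the top 1% of cases, the sketch below replaces every value above the 99th percentile with the percentile value itself. The nearest-rank percentile definition and the name `cap_at_percentile` are assumptions; the article does not state how the cut-off was computed.

```python
import math


def cap_at_percentile(costs, pct=99.0):
    """Cap each cost at the given percentile of the cost distribution.

    Uses the nearest-rank percentile (an assumption, not the article's
    stated method). Returns the capped costs and the threshold used.
    """
    ordered = sorted(costs)
    rank = max(1, math.ceil(pct * len(ordered) / 100.0))
    threshold = ordered[rank - 1]
    return [min(c, threshold) for c in costs], threshold


# Synthetic costs 1..200: the top 1% (199 and 200) are pulled down to 198.
capped, threshold = cap_at_percentile(list(range(1, 201)))
print(threshold, max(capped))
```

Capping keeps the outlying cases in the sample while limiting their influence on the regression coefficients.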

The diagnoses derived from the National Health Insurance claim data were entered into the Johns Hopkins ACG system for ADG assignment. The prescription codes within the claim data were first mapped to the WHO ATC codes, then entered into the Johns Hopkins ACG system for Rx-MG assignment. For measuring the Charlson Index and Elixhauser's index, the diagnoses for all cases were first screened by a pre-defined algorithm to improve the specificity of these codes: outpatient diagnoses for the same disease/condition were excluded if they had been reported fewer than 3 times within the year, or if they all appeared in the same month. The exclusion criteria were not applied to the data entered into the ACG system because the precise algorithm for assigning each single ICD code to an ADG was not disclosed by the Johns Hopkins ACG team. Another concern was that the ADG categories include acute diseases/conditions that are not included in the Charlson Index and Elixhauser's index. Therefore, excluding those ICD codes that were reported fewer than 3 times may underestimate the existing acute diseases/conditions.
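
The outpatient screening rule described above can be sketched as follows. The input layout (tuples of case ID, condition label, and claim month) and the function name are hypothetical; the article does not publish the actual pre-defined algorithm, so this is only one reading of the stated rule.

```python
from collections import defaultdict


def screen_outpatient_conditions(claims, min_count=3):
    """Return the (case_id, condition) pairs that pass the screen.

    claims: iterable of (case_id, condition, month) outpatient records.
    A condition is kept only if it was reported at least `min_count`
    times in the year and the reports span more than one month.
    """
    months = defaultdict(list)
    for case_id, condition, month in claims:
        months[(case_id, condition)].append(month)
    return {
        key
        for key, reported in months.items()
        if len(reported) >= min_count and len(set(reported)) > 1
    }


# Synthetic records for illustration only.
claims = [
    ("A", "diabetes", 1), ("A", "diabetes", 4), ("A", "diabetes", 9),  # kept
    ("A", "asthma", 3), ("A", "asthma", 3), ("A", "asthma", 3),  # same month
    ("B", "diabetes", 2), ("B", "diabetes", 6),  # fewer than 3 reports
]
print(screen_outpatient_conditions(claims))
```

Only outpatient records are screened here, following the article's wording; how inpatient diagnoses enter the indices is not specified beyond that.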

Multivariate OLS regression was used in the cost prediction modelling. The risk adjusters used in the predictive models included age, gender, Deyo's CCI, Elixhauser's Index, ADGs, and Rx-MGs. Because previous studies found that prior cost is a comparatively accurate predictor of true costs [44], it was also included for prospective prediction in this study. Because the relationship between prior- and current-year costs may not be strictly linear [45], we also examined a functional form that included a squared term of costs in 2006. Five alternative models for concurrent prediction and seven models for prospective prediction were fitted in this study. For concurrent prediction, the first model controlled for age and gender only, and was followed by models including Deyo's CCI, Elixhauser's index, ADGs, and Rx-MGs. The fifth model combined both ADGs and Rx-MGs for comparison with the models that included only one of these indices. For prospective prediction, the alternative models included the five used for concurrent prediction, plus models that were additionally adjusted by prior cost and by the square term of prior cost. The coefficients of each morbidity group within the selected indices were estimated from the estimation sample. Then the coefficients, excluding those which were statistically non-significant in each alternative model (see appendix), were applied to the validation sample. The performance of each alternative model was compared by its predictive R-square and mean absolute prediction error (MAPE), both estimated on the validation sample. A further indicator, the MAPE divided by the mean cost, was also provided so that MAPEs could be compared across models with different mean costs. The fit of the selected models was also tested by age group (< 18, 18-64, ≥ 65) as a sensitivity analysis. The pharmacy cost and total cost of each group were capped at the top 1% of the cases.
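
The three performance indicators named above can be computed from actual and predicted costs in a validation sample as sketched below. The article does not give formulas, so the definitions here are assumptions: predictive R-square is taken as one minus the ratio of squared prediction error to total squared deviation from the mean, and the MAPE rate as MAPE divided by mean actual cost. All function names and the toy numbers are illustrative.

```python
def predictive_r_square(actual, predicted):
    """1 - SSE/SST on the validation sample (assumed definition)."""
    mean_actual = sum(actual) / len(actual)
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    sst = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - sse / sst


def mape(actual, predicted):
    """Mean absolute prediction error, in cost units."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape_rate(actual, predicted):
    """MAPE divided by the mean actual cost, comparable across models."""
    return mape(actual, predicted) / (sum(actual) / len(actual))


# Synthetic validation costs and model predictions (not study data).
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
print(predictive_r_square(actual, predicted))
print(mape(actual, predicted), mape_rate(actual, predicted))
```

Dividing the MAPE by the mean cost expresses the error on a relative scale, which is what allows models for pharmacy cost and total cost to be compared despite their different cost levels.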

Results

Patient characteristics

As shown in Table 1, the estimation and validation samples have similar distributions of age, gender, number of Rx-MGs, and healthcare utilization. In both samples, 11% of cases had zero Rx-MGs. The average numbers of Rx-MGs in the two samples were 7.19 and 7.20, and 29% of cases had more than 10 Rx-MGs. Compared to the year 2006, the mean of the total cost increased by 12% and the mean of the total cost increased by about 10% in 2007.

Table 1. Characteristics of estimation and validation samples

The distribution of each Rx-MG was similar in both samples (see Table 2). A few Rx-MGs were present in fewer than 1% of cases, and the numbers of cases for 'Immune disorders' (ALLx040) and 'Cystic fibrosis' (RESx030) were below 100. The prevalence of several acute diseases/conditions identified by Rx-MGs was above 50% in both samples: 'Allergy/immunology, acute minor', 'Gastrointestinal/hepatic, acute minor', 'Pain and inflammation', 'Infectious, acute minor', and 'Respiratory, acute minor'. The prevalence of the Rx-MGs did not differ significantly between the two samples, except for 'Endocrine, Bone disorders'.

Table 2. Frequency of Rx-MGs in 2006, by study sample

Performance comparisons among predictive models

The predictive R-squares of the five models predicting total cost concurrently ranged from 0.089 to 0.650 (see Table 3). Among the models adjusted by diagnosis-based morbidity measures, the ADGs model performed better than the others. The Rx-MGs model had a predictive R-square of 0.618, explaining about 21 percentage points more variance than the ADGs model. The model that combined ADGs and Rx-MGs had the highest predictive R-square (0.650) as well as the lowest MAPE rate (54.6%) among all models. The prospective prediction models had lower predictive R-squares than the concurrent prediction models. All seven models explained less than 50% of the variation in total cost for 2007. As with the concurrent prediction models, the prospective prediction model that combined ADGs and Rx-MGs had a predictive R-square (0.382) higher than those using either ADGs or Rx-MGs alone. Its MAPE rate was the lowest (75.9%) among all models except those that included prior cost. Including prior cost increased the R-square by 0.08. Adding the square term of prior cost brought no considerable improvement in predictive R-square.

Table 3. Predictive models for total cost

As shown in Table 4, the Rx-MGs models also performed better than the diagnosis-based models for predicting medication cost concurrently and prospectively. However, unlike the results of the total cost prediction models, the ADGs models had lower predictive R-squares and higher MAPE rates than the models adjusted by Elixhauser's index for predicting medication cost. The models that combined ADGs and Rx-MGs improved slightly over the models adjusted by Rx-MGs only. The combined ADGs and Rx-MGs model showed a remarkable improvement in predictive R-square after prior medication cost was added as a predictor. Adding the square term of prior medication cost yielded only a negligible further improvement in predictive R-square.

Table 4. Predictive models for medication cost

Comparing model performance across age groups

The performance of three alternative models was compared across three age groups: < 18, 18-64, ≥ 65. After costs were capped at the 99th percentile for each age group, the results showed that the models applied to all ages had higher predictive R-squares than those for any of the sub-samples (see Table 5). The 18-64 age group had the highest predictive R-squares for all alternative models compared to the other two age groups. For all three sub-samples, the relative performance of the predictive models was similar to that in the whole sample: the models adjusted for prior cost performed best. The R-squares for the 'under 18' age group were the lowest among the three sub-samples, implying that the predictive models do not explain variations in cost well within that sub-sample.

Table 5. Total cost predictive models for specific age groups

Discussion

This study has demonstrated that the Rx-Defined Morbidity Groups are applicable for predicting the total cost and the medication cost in a universal health insurance system. Although a few articles attempted to predict or explain variations of medication use by applying the Johns Hopkins ACG case-mix system, these analytical models are mainly based on diagnosis-based risk adjusters (i.e. the EDCs, ADGs, or ACGs) within the ACG system [10,11,46]. Two recent articles reported studies that had applied the Johns Hopkins ACG system for identifying high-risk patients and predicting healthcare utilization. However, the authors chose predictive models embedded within the ACG system (i.e. the Dx-PM, Rx-PM, and DxRx-PM) instead of adjusting risks by original morbidity groups (i.e. the ADGs or Rx-MGs) [24,47]. Therefore, we believe that the present article is the first one to describe an empirical study using Rx-MGs for healthcare cost prediction as well as comparing the model performance with other diagnosis-based predictive models.

In this study, the model adjusted by Rx-MGs could explain over 60% of the variation in total cost and medication cost in the concurrent year. Clark et al. used two versions of the Chronic Disease Score to explain variations in total cost; the R-squares for concurrent prediction were 0.09 and 0.19 [19]. Fishman et al. used the Rx-Risk model to predict healthcare cost, and the validation R-square of that model was 0.0874. They also conducted sensitivity analyses for patients younger than 18 and for patients older than 18. The R-squares for these two sub-samples were 0.083 and 0.077, respectively [17]. Sales et al. used Rx-Risk-V, a modification of Rx-Risk for the veteran population, to predict cost. The R-square of the concurrent prediction was 0.202 [48]. Compared to earlier research using medication-based morbidity measures to predict cost, the Rx-MGs model performed relatively well. This study also found that the Rx-MGs model is applicable to all the different age groups, although its performance varied among these groups. The Rx-MGs model also performed better than the diagnosis-based alternative models in this study. This finding is consistent with other studies which found that prescription data are superior for predicting pharmacy cost [6,24]. However, our study also found that the Rx-MGs model is superior for predicting total cost. One possible explanation for the superior performance of the Rx-MGs model compared to other medication-based morbidity measures reported by previous studies is that the NHI pays for almost all prescription drugs, except for those that are very new on the market, expensive, and not yet approved by the Department of Health. Furthermore, this study aggregated prescriptions from outpatient/clinic, inpatient, and community pharmacy claims. These comprehensive data were intended to help capture all prescription-related morbidities for each case, something that was not done in similar studies.
In addition, the Rx-MGs cover not only chronic diseases or conditions but also several acute diseases or syndromes. This feature makes the Rx-MGs stand out from chronic-disease-focused instruments (e.g. the Chronic Disease Score) by capturing more of the possible risks for healthcare utilization. Although the ADGs also capture the diagnoses of acute diseases or syndromes, the number of ADG categories is smaller than that of the Rx-MGs, which might explain why the performance of the Rx-MGs models is superior to that of the ADGs models. Another possible explanation is that the annual medication cost accounts for about one fourth of the annual total healthcare cost in the NHI. Therefore, a model that explains more of the variation in medication cost is expected to perform better in predicting total cost. However, the real cause of the performance gap between the ADGs and Rx-MGs models needs further investigation.

The predictive R-squares of the ADGs models in this study are larger than those reported by two other similar studies which also used Taiwan NHI data [27,38]. First, these two earlier studies did not enforce the 'full enrolment' criterion applied in our study. Therefore, the disease burden of the cases selected in these two earlier studies may not have been equally assessed. Second, we capped the cost at the 99th percentile, which might be the most critical factor in explaining the improvement in model performance. We conducted another analysis using the original cost (without capping) for the prediction models. The result of that analysis showed that age and sex explained 4% to 5% of the variance, which is quite similar to Lee and Huang's findings [28]. Chang and Weiner also found that after truncating the cost at the top 0.5%, the performance of the models improved significantly [38]. After adjusting for prior healthcare utilization, our proposed model combining Rx-MGs and ADGs out-performed the other models for predicting future medication cost, explaining over 68% of the variation in future medication cost. The findings of this study are similar to those of Forrest et al., who showed that the Combined Diagnostic/Medication Predictive Models (DxRx-PMs) had the highest R-squares for explaining variations in pharmacy charges and total healthcare charges [24]. Other studies have shown that adding diagnosis-based morbidity measures to medication-based models could improve the prediction of total healthcare utilization [6,19,49,50]. However, those findings supported combining those two types of measure to improve cost prediction. On the other hand, Schneeweiss et al. compared the performance of four diagnosis-based and two medication-based comorbidity scores to predict mortality.
They found that while diagnosis-based scores performed better than medication-based scores in predicting future mortality, combining diagnosis-based and medication-based scores showed an improvement in predicting mortality [49]. The strength of employing all available diagnosis and prescription data is that some potential risk factors may not be captured by a single morbidity measurement, and each morbidity measurement captures different risks. Therefore, combining different morbidity measures in a given predictive model can be more informative than employing just one. Although using more than one morbidity measurement in a single model may raise the concern of multicollinearity, an empirical study showed that there is only a low correlation between different measures [51].

Previous studies have shown that combining prior costs and morbidity measures is important in identifying future high-cost patients [24,30,41]. Hsu et al. found that incorporating information on the previous year's drug use or cost into the risk adjustment approach would greatly improve the accuracy of the prediction. They pointed out that drug costs tend to be stable from year to year and are more predictable than other types of medical costs. Therefore, ignoring past costs may result in preventable misallocation of resources and create strong incentives for reverse patient selection [45]. The data of our study also support the finding that predictive models combining Rx-MGs, ADGs, and prior cost performed best in predicting future cost. However, investigators have argued that this could provide incentives to increase utilization or to favor a specific style of practicing medicine in addition to medical needs. Thus, payment models that include utilization measures among the predictor variables must proceed with caution [41,52].

Compared to other diagnosis-based predictive models, this study has demonstrated that the Rx-MGs model out-performs all other diagnosis-based models in explaining or predicting healthcare utilization. In future applications, the Rx-MGs could be applied for describing and comparing disease patterns among populations. The models which use Rx-MGs alone or combined with ADGs could also be applied for helping local health authorities or case managers to identify high risk populations for disease management programs [24,29,53]. A comprehensive and integrated care delivery system could be provided to those who have a high utilization of healthcare but have a low severity of illness, instead of delivering fragmented acute care to them. The Rx-MGs or other predictive models within the ACG system could also be tested for their efficiency and appropriateness in allocating healthcare resources or setting payment rates by future researchers or policy makers.

There are several limitations to this study. First, we used ADGs and Rx-MGs as risk adjusters for comparing them with two other commonly used morbidity measures. However, the Johns Hopkins ACG system provides prediction models (PMs) which include disease or frailty markers other than ADGs or Rx-MGs, and they have a better performance than the ADGs or other diagnosis-based measures. The PMs were not included as competing models in this study because the 'risk scores' provided by the Dx-PM or Rx-PM as the summary measures of disease burden were provided by the ACG system [24]. Although the more efficient risk adjusters included in the prediction models could be expected to perform better in predicting cost, the performance of those models is somewhat hard to compare with other models that are wholly based on morbidity measures (e.g. the Charlson Comorbidity Index). Second, we excluded cases with discontinued enrolment in 2006 to ensure equal access to healthcare covered by the NHI. However, the reasons for discontinued enrolment in the NHI might be very diverse. The cases excluded from our study might be high-risk users (e.g. cancer patients in their end-of-life year) or healthy users (e.g. young students studying abroad). Hence the analytical strategy used in this study could limit its generalizability. Another limitation is the approach used to treat outliers in this study. Although we capped costs at the top 1%, the cases with capped costs generally accounted for approximately 25% of the healthcare expenditure. That implies that the predictive models applied to real data cannot perform as well as in this study. Another analysis also found that when the predictive models were applied to those high-risk users with actual cost data, the performance of the models declined significantly.
This finding seems to suggest that in order to address this issue it might be best to identify and manage those cases by using the risk adjustment instruments, instead of "predicting" their future healthcare utilization [24,29]. The fourth potential limitation in this study is that we failed to incorporate socio-economic status indicators into the predictive models. However, in a recent article the authors argued that adding socioeconomic patient characteristics improves the predictive model only slightly [54]. The information on socio-economic status is quite limited in the NHI database. We carried out another analysis to incorporate household income into the predictive models. The results showed that as a proxy of the socio-economic status it did not have a statistically significant impact on costs.

Conclusions

This study demonstrated that compared to other diagnosis-based predictive models, the Rx-MGs model out-performs all other models in explaining variations of cost and predicting future healthcare utilization. For countries or regions that routinely collect prescription claim data, the Rx-MGs within the Johns Hopkins ACG case-mix system could be applied to predict future healthcare utilization as well as allocate resources for healthcare.

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

KR contributed to the study design, statistical analysis, interpretation, and writing of the manuscript. LM contributed to directing and coordinating the study, led the panel that developed the algorithms mapping NHI drug codes to WHO ATC codes, and contributed to the interpretation and writing of the manuscript. Both authors have read and approved the final manuscript.

Acknowledgements

This study was supported by grants from the Department of Health, Taiwan (96Z4149; DOH098-TD-D-113-098016) and from the 'Aiming for the top, university and elite research center development plan' (MoEATU, 99RH0021). The authors would like to thank Karen Kinder Siemens and Chad Abrams for their technical support. The authors also thank Roger Haesevoets for proofreading the manuscript for English.

References

  1. Charlson ME, Pompei P, Ales KL, Mackenzie CR: A New Method of Classifying Prognostic Co-Morbidity in Longitudinal-Studies - Development and Validation.

    Journal of Chronic Diseases 1987, 40(5):373-383.

  2. Elixhauser A, Steiner C, Harris DR, Coffey RM: Comorbidity measures for use with administrative data.

    Medical Care 1998, 36(1):8-27.

  3. Weiner JP, Dobson A, Maxwell SL, Coleman K, Starfield BH, Anderson GF: Risk-adjusted Medicare capitation rates using ambulatory and inpatient diagnoses.

    Health Care Finan Rev 1996, 17(3):77-99.

  4. Ash AS, Ellis RP, Pope GC, Ayanian JZ, Bates DW, Burstin H, Iezzoni LI, MacKay E, Yu W: Using diagnoses to describe populations and predict costs.

    Health Care Finan Rev 2000, 21(3):7-28.

  5. Pope GC, Kautter J, Ellis RP, Ash AS, Ayanian JZ, Iezzoni LI, Ingber MJ, Levy JM, Robst J: Risk adjustment of Medicare capitation payments using the CMS-HCC model.

    Health Care Finan Rev 2004, 25(4):119-141.

  6. Zhao Y, Ash AS, Ellis RP, Ayanian JZ, Pope GC, Bowen B, Weyuker L: Predicting pharmacy costs and other medical costs using diagnoses and drug claims.

    Medical Care 2005, 43(1):34-43.

  7. Charlson ME, Charlson RE, Peterson JC, Marinopoulos SS, Briggs WM, Hollenberg JP: The Charlson comorbidity index is adapted to predict costs of chronic disease in primary care patients.

    Journal of Clinical Epidemiology 2008, 61(12):1234-1240.

  8. Perkins AJ, Kroenke K, Unutzer J, Katon W, Williams JW, Hope C, Callahan CM: Common comorbidity scales were similar in their ability to predict health care costs and mortality.

    Journal of Clinical Epidemiology 2004, 57(10):1040-1048.

  9. Krop JS, Saudek CD, Weller WE, Powe NR, Shaffer T, Anderson GF: Predicting expenditures for Medicare beneficiaries with diabetes - A prospective cohort study from 1994 to 1996.

    Diabetes Care 1999, 22(10):1660-1666.

  10. Orueta JF, Urraca J, Berraondo I, Darpon J, Aurrekoetxea JJ: Adjusted Clinical Groups (ACGs) explain the utilization of primary care in Spain based on information registered in the medical records: A cross-sectional study.

    Health Policy 2006, 76(1):38-48.

  11. Aguado A, Guino E, Mukherjee B, Sicras A, Serrat J, Acedo M, Ferro JJ, Moreno V: Variability in prescription drug expenditures explained by adjusted clinical groups (ACG) case-mix: A cross-sectional study of patient electronic records in primary care.

    BMC Health Services Research 2008, 8.

  12. Gilmer T, Kronick R, Fishman P, Ganiats TG: The Medicaid Rx model - Pharmacy-based risk adjustment for public programs.

    Medical Care 2001, 39(11):1188-1202.

  13. Malone DC, Billups SJ, Valuck RJ, Carter BL: Development of a chronic disease indicator score using a veterans affairs medical center medication database.

    Journal of Clinical Epidemiology 1999, 52(6):551-557.

  14. Ghali WA, Quan H, Brant R: Risk adjustment using administrative data - Impact of a diagnosis-type indicator.

    J Gen Intern Med 2001, 16(8):519-524.

  15. Wilchesky M, Tamblyn RM, Huang A: Validation of diagnostic codes within medical services claims.

    Journal of Clinical Epidemiology 2004, 57(2):131-141.

  16. Pine M, Jordan HS, Elixhauser A, Fry DE, Hoaglin DC, Jones B, Meimban R, Warner D, Gonzales J: Modifying ICD-9-CM Coding of Secondary Diagnoses to Improve Risk-Adjustment of Inpatient Mortality Rates.

    Medical Decision Making 2009, 29(1):69-81.

  17. Fishman PA, Goodman MJ, Hornbrook MC, Meenan RT, Bachman DJ, Rosetti MCO: Risk adjustment using automated ambulatory pharmacy data - The RxRisk model.

    Medical Care 2003, 41(1):84-99.

  18. Von Korff M, Wagner EH, Saunders K: A Chronic Disease Score from Automated Pharmacy Data.

    Journal of Clinical Epidemiology 1992, 45(2):197-203.

  19. Clark DO, Von Korff M, Saunders K, Baluch WM, Simon GE: A Chronic Disease Score with Empirically Derived Weights.

    Medical Care 1995, 33(8):783-795.

  20. Fishman PA, Shay DK: Development and estimation of a pediatric chronic disease score using automated pharmacy data.

    Medical Care 1999, 37(9):874-883.

  21. Sloan KL, Sales AE, Liu CF, Fishman P, Nichol P, Suzuki NT, Sharp ND: Construction and characteristics of the RxRisk-V - A VA-adapted pharmacy-based case-mix instrument.

    Medical Care 2003, 41(6):761-774.

  22. Starfield B, Weiner J, Mumford L, Steinwachs D: Ambulatory Care Groups - a Categorization of Diagnoses for Research and Management.

    Health Serv Res 1991, 26(1):53-74.

  23. Tucker A, Weiner J, Abrams C: Health-Based Risk Adjustment: Application to Premium Development and Profiling. In Financial strategy for managed care organizations: rate setting, risk adjustment, and competitive advantage. Edited by Wrightson CW. Chicago, Ill: Health Administration Press; 2002:165-225.

  24. Forrest CB, Lemke KW, Bodycombe DP, Weiner JP: Medication, Diagnostic, and Cost Information as Predictors of High-Risk Patients in Need of Care Management.

    Am J Manag Care 2009, 15(1):41-48.

  25. NEW! ACG RX Predictive Model [http://www.acg.jhsph.edu/ACGDocuments/ACG%20Rx-PM%20Product%20Sheet.pdf]

  26. About the ATC/DDD system [http://www.whocc.no/atcddd/]

  27. Lee WC, Huang TP: Explanatory ability of the ACG system regarding the utilization and expenditure of the National Health Insurance population in Taiwan - A 5-year analysis.

    J Chin Med Assoc 2008, 71(4):191-199.

  28. Lee WC: Quantifying morbidities by Adjusted Clinical Group system for a Taiwan population: A nationwide analysis.

    BMC Health Services Research 2008, 8.

  29. Sylvia ML, Shadmi E, Hsiao CJ, Boyd CM, Schuster AB, Boult C: Clinical features of high-risk older persons identified by predictive modeling.

    Dis Manage 2006, 9(1):56-62.

  30. Sylvia ML, Griswold M, Dunbar L, Boyd CM, Park M, Boult C: Guided care: cost and utilization outcomes in a pilot study.

    Dis Manage 2008, 11(1):29-36.

  31. Southern DA, Quan H, Ghali WA: Comparison of the Elixhauser and Charlson/Deyo methods of comorbidity measurement in administrative data.

    Med Care 2004, 42(4):355-360.

  32. Stukenborg GJ, Wagner DP, Connors AF: Comparison of the performance of two comorbidity measures, with and without information from prior hospitalizations.

    Med Care 2001, 39(7):727-739.

  33. Deyo RA, Cherkin DC, Ciol MA: Adapting a Clinical Comorbidity Index for Use with ICD-9-CM Administrative Databases.

    Journal of Clinical Epidemiology 1992, 45(6):613-619.

  34. Quan HD, Sundararajan V, Halfon P, Fong A, Burnand B, Luthi JC, Saunders LD, Beck CA, Feasby TE, Ghali WA: Coding algorithms for defining comorbidities in ICD-9-CM and ICD-10 administrative data.

    Med Care 2005, 43(11):1130-1139.

  35. Introduction to the National Health Insurance Research Database (NHIRD), Taiwan [http://w3.nhri.org.tw/nhird/date_01.html]

  36. Universal Coverage under NHI in Taiwan [http://www.nhi.gov.tw/english/webdata.asp?menu=11&menu_id=290&webdata_id=2965]

  37. Iezzoni LI: Risk adjustment for measuring health care outcomes. 3rd edition. Chicago: Health Administration Press; 2003.

  38. Chang H-Y, Weiner J: An in-depth assessment of a diagnosis-based risk adjustment model based on national health insurance claims: the application of the Johns Hopkins Adjusted Clinical Group case-mix system in Taiwan.

    BMC Medicine 2010, 8(1):7.

  39. Meenan RT, Goodman MJ, Fishman PA, Hornbrook MC, O'Keeffe-Rosetti MC, Bachman DJ: Using risk-adjustment models to identify high-cost risks.

    Medical Care 2003, 41(11):1301-1312.

  40. Meenan RT, O'Keeffe-Rosetti MC, Hornbrook MC, Bachman DJ, Goodman MJ, Fishman PA, Hurtado AV: The sensitivity and specificity of forecasting high-cost users of medical care.

    Medical Care 1999, 37(8):815-823.

  41. Ash AS, Zhao Y, Ellis RP, Schlein M: Finding future high-cost cases: comparing prior cost versus diagnosis-based methods.

    Health Services Research 2001, 36(6 Pt 2):194-206.

  42. LeBlanc M, Moon J, Kooperberg C: Extreme regression.

    Biostatistics 2006, 7(1):71-84.

  43. Gregori D, Petrinco M, Barbati G, Bo S, Desideri A, Zanetti R, Merletti F, Pagano E: Extreme regression models for characterizing high-cost patients.

    J Eval Clin Pract 2009, 15(1):164-171.

  44. Bertsimas D, Bjarnadottir MV, Kane MA, Kryder JC, Pandey R, Vempala S, Wang G: Algorithmic Prediction of Health-Care Costs.

    Operations Research 2008, 56(6):1382-1392.

  45. Hsu J, Huang J, Fung V, Price M, Brand R, Hui R, Fireman B, Dow W, Bertko J, Newhouse JP: Distributing $800 Billion: An Early Assessment Of Medicare Part D Risk Adjustment.

    Health Aff 2009, 28(1):215-225.

  46. Sicras-Mainar A, Navarro-Artieda R, Ruano-Ruano I, Velasco-Velasco S, Frias-Garrido X, Llopart J, Llausi-Selles R: Efficiency in drug prescription measured by the application of adjusted clinical groups in five Spanish primary care centres.

    Value Health 2007, 10(6):A364-A364.

  47. Calderon-Larranaga A, Abrams C, Poblador-Plou B, Weiner JP, Prados-Torres A: Applying diagnosis and pharmacy-based risk models to predict pharmacy use in Aragon, Spain: The impact of a local calibration.

    BMC Health Services Research 2010, 10.

  48. Sales AE, Liu CF, Sloan KL, Malkin J, Fishman PA, Rosen AK, Loveland S, Nichol WP, Suzuki NT, Perrin E, et al.: Predicting costs of care using a pharmacy-based measure risk adjustment in a veteran population.

    Medical Care 2003, 41(6):753-760.

  49. Schneeweiss S, Seeger JD, Maclure M, Wang PS, Avorn J, Glynn RJ: Performance of comorbidity scores to control for confounding in epidemiologic studies using claims data.

    American Journal of Epidemiology 2001, 154(9):854-864.

  50. Zhao Y, Ellis RP, Ash AS, Calabrese D, Ayanian JZ, Slaughter JP, Weyuker L, Bowen B: Measuring population health risks using inpatient diagnoses and outpatient pharmacy data.

    Health Services Research 2001, 36(6 Pt 2):180-193.

  51. Baser O, Palmer L, Stephenson J: The estimation power of alternative comorbidity indices.

    Value Health 2008, 11(5):946-955.

  52. Robst J, Levy JM, Ingber MJ: Diagnosis-based risk adjustment for Medicare prescription drug plan payments.

    Health Care Finan Rev 2007, 28(4):15-30.

  53. Rosen AK, Wang F, Montez ME, Rakovski CC, Berlowitz DR, Lucove JC: Identifying future high-healthcare users - Exploring the value of diagnostic and prior utilization information.

    Disease Management & Health Outcomes 2005, 13(2):117-127.

  54. Hvenegaard A, Street A, Sorensen TH, Gyrd-Hansen D: Comparing hospital costs: What is gained by accounting for more than a case-mix index?

    Social Science & Medicine 2009, 69(4):640-647.

Pre-publication history

The pre-publication history for this paper can be accessed here:

http://www.biomedcentral.com/1472-6963/10/126/prepub