There is analytical potential for multiple cause of death data collected from death certificates. This study examines relationships of multiple causes of death as a function of factors available on the death certificate (demographics of decedent, place of death, type of certifier, disposal method, whether an autopsy was performed, and year of death).
Data from 326,332 Minnesota death certificates from 1990–1998 are examined. Underlying and non-underlying causes of death are examined (based on record axis codes) as well as demographic and death-related covariates. Associations between covariates and prevalence of multiple causes of death and conditional probability of underlying compared to non-underlying causes of death are examined. The occurrence of ischemic heart disease or diabetes as underlying causes are specifically examined.
Both the probability of multiple causes of death and the proportion of underlying cause compared to non-underlying cause of death are associated with demographic characteristics of the deceased and with other non-medical conditions related to filing the death certificate, such as place of death.
Multiple cause of death data provide a potentially useful way of looking for inaccuracies in reporting of causes of death. Differences across demographics in the proportion of time a cause is selected as underlying compared to non-underlying exist and can potentially provide useful information about the overall impact of causes of death in different populations.
In their 1986 paper, Israel, Rosenberg, and Curtin issued a rallying call for researchers to consider the analytical potential of multiple cause of death data collected by the United States National Center for Health Statistics (NCHS). Beginning with the implementation of the Eighth Revision of the ICD in 1968, the NCHS developed and employed several computer systems to automatically select the underlying cause for each death certificate and to produce multiple cause of death data. The resulting multiple cause of death datasets by year are made publicly available through the NCHS website.
Acknowledgment of the potential for multiple cause of death data analysis is increasing in other countries as well [3,4]. For example, the Australian Bureau of Statistics points out that using multiple cause of death data allows researchers to understand and track more comprehensively deaths due to chronic diseases which do not often appear as the underlying cause of death (e.g. Alzheimer's, diabetes, pneumonia), to provide better documentation of multi-morbid associations and the strength of associations between conditions which led to death (for example by examining the frequency of associations between diseases such as diabetes and ischaemic heart disease), and to assist in identifying problems with the process of recording and coding cause of death information.
Multiple cause of death data have been used to look at trends in certain diseases, e.g. HIV [5,6] and lung disease, but despite their availability, surprisingly few studies have looked at them broadly. Indeed, NCHS publishes no standard annual summary tabulation of the multiple cause of death data. This may be due in part to the overwhelming amount of information that arises when combinations of causes of death are considered. There is an enormous number of complex combinations which could be summarized, and it is perhaps not clear which tables would be of general interest.
The purpose of this article is to examine some straightforward relationships of multiple causes of death as a function of factors available on the death certificate (demographics of decedent, place of death, type of certifier, disposal method, whether an autopsy was performed, and year of death). Using all death certificates issued by the state of Minnesota between 1990 and 1998 (326,332 deaths), the current study documents the relationship between these factors and the frequency of reporting of multiple causes of death, as well as the frequency with which a cause is considered underlying (after data processing) given that it is mentioned on the death certificate. The implication is that any differences found are due to actual differences in causes of death in these groups, to systematic biases in the reporting of causes of death, or to a combination of both. The study cannot discern which of these is the cause, but hopes to contribute at the very least by providing an example of the potential relationships which can be examined with the rich multiple cause of death data.
The data used are from the Minnesota Department of Health Mortality Database and include entries from 326,332 individual death certificates, which represent all deaths occurring in Minnesota during the period of 1990–1998. Record axis codes (those codes which have been completely data processed) are used for all analyses in this paper rather than entity axis codes.
A brief description of the entity and record axis coding is given here. The translation of causes of death listed on the death certificate (see Figures 1 and 2 for the actual certificate) to the codes used for statistical analysis goes through many steps. As seen in Figures 1 and 2, the medical information, which focuses on the sequence of medical conditions that resulted in death, is provided in a two-part format. Part I is for the conditions which directly lead to death, and Part II is for other conditions which contribute to death but are not directly related to the immediate cause of death. The underlying cause of death is defined as "(a) the disease or injury which initiated the train of events leading directly to death, or (b) the circumstances of the accident or violence which produced the fatal injury". The entity axis codes represent what is actually written on the death certificate by the certifier, expressed in terms of ICD codes, including an indicator of which line the code came from and which position on the line it came from (if more than one code was listed per line). While the conditions listed in Part I should form a causal sequence initiated by the underlying cause listed on the lowest line, errors in properly completing the form occur regularly, and nationally the underlying cause of death is reselected 30–40% of the time. The decision to reselect an underlying cause other than that listed on the lowest used line in Part I is governed by a set of rules developed by WHO as part of the periodic revision of the International Classification of Diseases; these rules are incorporated, along with a complex set of decision tables, into the Automated Classification of Medical Entities (ACME) software. The record axis codes represent a further processing of the entity axis codes to be consistent with the underlying cause data and more amenable to statistical tabulation and analysis.
The record axis codes distinguish the ICD code selected as the underlying cause of death and list all additional causes of death mentioned, but do not distinguish them in terms of their ordering or original location on the death certificate. For more detail on entity and record axis codes, see .
Figure 1. US standard certificate of death. Line 27 Part I and Part II are where the causes of death are listed.
Figure 2. This figure displays the backside of the certificate of death. Details are given for filling out specific lines.
Using the record axis codes, we have for each death record: one underlying cause of death and up to 14 non-underlying causes of death (with no distinction of importance given amongst the non-underlying). When we refer to a cause of death and do not want to distinguish if it is underlying or non-underlying we will refer to it as a "mentioned" cause of death.
In addition to listing one underlying and up to 14 non-underlying causes of death, each death certificate also contains information about the demographics of the deceased, including age, gender, race, marital status, and educational attainment. Other conditions related to the death are also recorded: place and time of death, who completed the death certificate, whether an autopsy was performed, and the type of body disposal. Minnesota laws and guidelines govern who certifies a death, and how, under different circumstances. For example, when an unattended death occurs (e.g. at a person's residence), a medical examiner's investigator must arrive at the scene. The medical examiner will contact the last attending physician to ask about the past medical history of the decedent and the most likely cause of death. When an attending physician has seen the decedent within 90 days and the death is natural, jurisdiction is usually given to the physician to certify the death. Sudden or unexpected deaths due in part to any factor other than natural disease must be referred to the medical examiner's office. Autopsies are performed at the discretion of the medical examiner but can also be performed for any death at the request of the immediate family.
The underlying and non-underlying causes of death derived from the death certificate are, in this study, coded according to the 9th revision of the International Classification of Diseases (ICD). The specific ICD-9 codes are grouped into standard cause of death reporting categories, resulting in a total of 107 different causes of death. In this study, individuals are dichotomized as having multiple causes of death (i.e. at least one non-underlying cause) or not having multiple causes of death (i.e. only having an underlying cause). In addition, because heart disease is the leading cause of death and diabetes is a good example of a disease which often shows up as a non-underlying cause of death, this research investigates two sub-populations: 1) individuals that have ischemic heart disease (ICD-9: 410–414) mentioned as a cause of death (n = 79,833), and 2) individuals that have diabetes mellitus (ICD-9: 250) mentioned as a cause of death (n = 27,181). For both sub-populations, a dichotomous variable is created to indicate whether the mentioned disease is coded as the underlying cause of death or not.
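The two dichotomous outcomes described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the processing actually used for the study: the `DeathRecord` layout, the function names, and the convention that each ICD-9 code is a string whose first three characters give the category (e.g. "4109") are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class DeathRecord:
    """Hypothetical record-axis layout: one underlying code, up to 14 others."""
    underlying: str
    non_underlying: List[str] = field(default_factory=list)


def has_multiple_causes(record: DeathRecord) -> bool:
    """True if at least one non-underlying cause is recorded."""
    return len(record.non_underlying) > 0


def is_ihd(code: str) -> bool:
    """Ischemic heart disease, ICD-9 categories 410-414 (assumed string format)."""
    category = code[:3]
    return category.isdigit() and 410 <= int(category) <= 414


def is_diabetes(code: str) -> bool:
    """Diabetes mellitus, ICD-9 category 250 (assumed string format)."""
    return code[:3] == "250"


def underlying_given_mentioned(
    record: DeathRecord, matches: Callable[[str], bool]
) -> Optional[bool]:
    """None if the cause is not mentioned at all; otherwise whether the
    mentioned cause is the one coded as underlying."""
    codes = [record.underlying, *record.non_underlying]
    if not any(matches(code) for code in codes):
        return None
    return matches(record.underlying)
```

Records for which `underlying_given_mentioned` returns `None` fall outside the corresponding sub-population; the remaining `True`/`False` values form the dichotomous variable.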
Descriptive statistics including total numbers and proportion of all deaths (n = 326,332) in each of the covariate categories are reported as well as proportions of people within each covariate category who have multiple causes of death. In order to examine the association between each covariate and the dichotomous outcome of multiple causes of death, logistic regression is used to mutually adjust each factor for the others. 95% confidence intervals of odds ratios are reported. Trends in multiple cause of death reporting across time are investigated graphically.
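For readers less familiar with the method, the mutually adjusted model can be written in its standard form as below. The notation is ours rather than the paper's: \(Y_i\) indicates that death record \(i\) has multiple causes, and \(x_{i1},\dots,x_{ip}\) are indicator variables for the covariate categories. The Wald form of the confidence interval is an assumption, since the text does not state how the intervals were computed.

```latex
\[
\operatorname{logit}\Pr(Y_i = 1 \mid x_{i1},\dots,x_{ip})
  = \log\frac{\Pr(Y_i = 1 \mid x_{i1},\dots,x_{ip})}{1 - \Pr(Y_i = 1 \mid x_{i1},\dots,x_{ip})}
  = \beta_0 + \sum_{j=1}^{p} \beta_j x_{ij}
\]
\[
\mathrm{OR}_j = e^{\beta_j},
\qquad
95\%\ \text{CI for } \mathrm{OR}_j:\;
\exp\!\left(\hat\beta_j \pm 1.96\,\widehat{\mathrm{SE}}(\hat\beta_j)\right)
\]
```

Each \(\mathrm{OR}_j\) compares a covariate category with its baseline category while holding the other covariates fixed.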
Similarly, descriptive statistics including total numbers and proportions will be presented for the two sub-populations with ischemic heart disease (n = 79,833) or diabetes mellitus (n = 27,181) mentioned either as underlying or non-underlying cause of death. Logistic regression is used and 95% confidence intervals are reported to examine factors that are associated with each of these diseases being reported as underlying cause of death rather than non-underlying.
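The conditional percentages for the two sub-populations amount to a grouped proportion: among records that mention the disease, the share in each covariate category for which it is coded as underlying. The sketch below computes that proportion together with a crude (unadjusted) odds ratio and Wald 95% interval from a 2x2 table. It is a simplified stand-in for the adjusted logistic regression estimates, and the function names and input layout are hypothetical.

```python
import math
from typing import Dict, Iterable, Tuple


def conditional_proportions(
    rows: Iterable[Tuple[str, bool]]
) -> Dict[str, Tuple[int, int, float]]:
    """rows holds (category, is_underlying) for records mentioning the cause.
    Returns {category: (n_underlying, n_mentioned, proportion)}."""
    counts: Dict[str, Tuple[int, int]] = {}
    for category, is_underlying in rows:
        n_under, n_total = counts.get(category, (0, 0))
        counts[category] = (n_under + int(is_underlying), n_total + 1)
    return {c: (u, n, u / n) for c, (u, n) in counts.items()}


def crude_odds_ratio(
    u1: int, n1: int, u0: int, n0: int, z: float = 1.96
) -> Tuple[float, float, float]:
    """Unadjusted odds ratio (group 1 vs. baseline group 0) with a Wald
    interval. All four cells of the 2x2 table must be non-zero."""
    a, b = u1, n1 - u1  # group 1: underlying, non-underlying
    c, d = u0, n0 - u0  # baseline: underlying, non-underlying
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

Because the crude odds ratio ignores the other covariates, it will generally differ from the mutually adjusted values reported in the tables.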
Overall, 68.9% of the 326,332 deaths from 1990–1998 had at least one non-underlying cause of death in addition to the underlying cause (i.e. had multiple causes). There was a noticeable decreasing trend in the reporting of multiple causes of death over the 9-year period from 1990 to 1998, from 74.0% in 1990, dropping consistently to 64.8% in 1996 and remaining around 66% until 1998. Table 1 presents the marginal percentage of individuals in each demographic and death-related category, as well as the proportions and adjusted odds ratios of having multiple causes of death by each of the covariates. The youngest (<25) and oldest (85+) age groups had the lowest and highest percentages of multiple causes of death (61.7% and 71.8%, respectively). Interestingly, the 45–64 age group did not have higher odds of having multiple causes than the youngest (<25) group. The percentage of men with multiple causes of death reported was slightly higher (1%) than that of women. Individuals over 25 years old with less education had a higher percentage of multiple causes of death (71.3%) compared to those with higher education (66.6%). The most pronounced demographic difference was found across race categories, with Native Americans having the highest percentage of multiple causes of death (74.5%), compared to 68.9% of Whites.
Table 1. Percent of all deaths (n = 326,332) by each covariate. Probability of reporting multiple causes of death given covariate, marginal percent by category, and adjusted odds ratios of reporting multiple causes of death given covariates.
For places of death, hospital in-patient (73.4%) and nursing home (70.6%) had the highest probability of reporting multiple causes of death, and residence had the lowest (58.7%) (Table 1). Among the body disposal methods, "removal", which refers to moving the body outside of the US, had the lowest percentage (47.1%) of reporting multiple causes of death. For deaths where an autopsy was performed, the odds that multiple causes of death would be reported were higher (OR = 1.57). In terms of types of certifier, the marginal difference between physicians and coroners, the two most frequently seen types, was less than 1% (69.2% vs. 70.1%), but examining this difference across time (Figure 3) revealed an interesting interaction effect in the trend. The physicians showed a decrease in multiple cause of death reporting, while the coroners stayed constant or slightly increased over the decade. Table 2 provides a reference for the 25 leading underlying causes of death and leading mentioned causes of death in this dataset. It also lists the leading causes of death on death certificates that report only an underlying cause of death with no non-underlying causes. The top four causes on these single-cause certificates are the same as the overall top four causes. It is interesting, however, that the fifth leading cause in this category is "Symptoms and ill-defined conditions", which is typically assigned as the underlying cause only if it is the sole cause listed.
Figure 3. Interaction between certifier and year.
Table 2. Based on Minnesota death records (n = 326,332) from 1990–1998. Top 25 causes of death ranked by underlying and any mention cause of death. Top 25 causes of death for only those deaths where only one cause was listed (i.e. n = 101,423 deaths).
The results presented so far explored how covariates may be correlated with multiple causes of death being reported. The following results pertain to the conditional probability that a particular cause of death (ischemic heart disease or diabetes) is selected as underlying given that it is mentioned. Results for the subpopulations with ischemic heart disease or diabetes mentioned are shown in Table 3 and Table 4, respectively.
Table 3. Population with Ischemic Heart Disease mentioned on death certificate (N = 79833). Marginal percent by category, conditional percent with ischemic heart disease as underlying given that it is mentioned by category and odds ratios of ischemic heart disease being reported as underlying cause when it is mentioned given covariates.
Table 4. Population with Diabetes mentioned on death certificate (N = 27181). Marginal percent by category, conditional percent with diabetes as underlying given that it is mentioned and odds ratios of diabetes being reported as underlying cause when it is mentioned given covariates.
Table 3 gives the odds ratios of ischemic heart disease being selected as the underlying cause of death when it was mentioned as a cause, given the main effects of the covariates. Overall, 77.1% of the time that heart disease was mentioned as a cause, it was selected as the underlying cause of death. The 45–65 year age group had the highest probability of heart disease being coded as underlying when it was mentioned (81.8%). Males had a slightly lower probability than females of having heart disease as the underlying cause of death when it was mentioned on the death certificate. Furthermore, Blacks and Native Americans were less likely than Whites to have heart disease coded as the underlying cause of death when it was present on the certificate. Individuals that had an autopsy performed were less likely (OR = 0.91) to have ischemic heart disease selected as underlying when it was mentioned. When a physician was the death certifier, the probability of heart disease being selected as the underlying cause of death was lower than for coroners, osteopaths, and other or unknown certifiers. Among body disposal methods, the probability of heart disease being reported as the underlying cause was lowest if bodies were donated (OR = 0.7 with "burial" as the baseline category), and highest if bodies were removed (OR = 1.7). Finally, among individuals who had heart disease mentioned on their death certificate, those who died at a residence (not a nursing home) were most likely to have ischemic heart disease selected as the underlying cause of death (90.5%, or OR = 1.8 compared to hospital in-patient).
Unlike ischemic heart disease, diabetes was selected as the underlying cause of death in only 29.3% of deaths where it was mentioned on the certificate. While only 2.3% of deaths with diabetes mentioned occurred in the youngest age group (0–44 years) (Table 4), this group had a much larger probability of having diabetes as the underlying rather than a non-underlying cause (51.7% reported as underlying). Men were less likely than women to have diabetes selected as underlying when it was mentioned on the certificate. Blacks and Native Americans both had significantly higher odds (OR = 1.2 and 1.4, respectively) than Whites of diabetes being the underlying cause given that it was mentioned. When an autopsy was performed, diabetes was less likely to be reported as underlying (OR = 0.7) than when one was not. Moreover, if a coroner was the death certifier, diabetes was less likely to be reported as underlying. An increase in the reporting of diabetes as underlying was found for deaths where the body was removed. Finally, deaths occurring outside of the hospital in-patient setting all showed increased odds of diabetes being selected as the underlying cause of death when it was mentioned.
We also examined which other leading causes of death showed up as underlying when ischemic heart disease or diabetes was mentioned on the certificate. As mentioned above, 77.1% of the individuals with ischemic heart disease mentioned on their death certificate had it reported as the underlying cause. The second most common underlying cause of death when ischemic heart disease was mentioned was, in fact, diabetes (3.4% of the time underlying), followed by cerebrovascular disease (2.7%), then pneumonia (1.5%). In the population that has diabetes mentioned on the death certificate, diabetes is, as mentioned above, selected as the underlying cause 29.3% of the time; the second most common underlying cause selected is ischemic heart disease at 25.4%, followed by cerebrovascular disease at 7.75% and other diseases of the heart at 2.9%.
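A tabulation of this kind, the distribution of underlying causes among records that mention a given cause, can be sketched as follows. The input layout (one underlying category label plus a list of non-underlying category labels per record) and the function name are hypothetical, chosen only for illustration.

```python
from collections import Counter
from typing import Iterable, List, Sequence, Tuple


def underlying_distribution(
    records: Iterable[Tuple[str, Sequence[str]]], target: str
) -> List[Tuple[str, float]]:
    """Among records that mention `target` (as underlying or non-underlying),
    return each underlying cause with its share, most common first."""
    counts = Counter(
        underlying
        for underlying, others in records
        if underlying == target or target in others
    )
    total = sum(counts.values())
    return [(cause, n / total) for cause, n in counts.most_common()]
```

The first entry of the result corresponds to the most frequent underlying cause among records mentioning the target; an empty list is returned when no record mentions it.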
Distinct differences in the frequency of multiple causes of death were found across time, age, race, disposal method and place of death. Definitive explanations for the differences cannot be given based on this study, but it is of interest to consider plausible explanations which may motivate further investigation. The increased reporting of non-underlying causes of death as the age of the decedent increases is likely due to actual increases in co-morbidity with age, hence would be explained by actual differences in the causes of death.
The differences found in reporting of multiple causes of death for the other factors may be partly due to systematic reporting biases. According to the NCHS All Mortality Atlas [, p. 3], the quality of cause of death determination in the US is affected by the accuracy and completeness of information, from medical diagnosis to final coding and processing of the underlying cause of death. Although the automated selection of the underlying cause of death has helped to reduce coding and processing errors since 1968, the completeness and accuracy of the information supplied on the certificate and the decedent's medical diagnosis remain potential sources of error. If the certifier enters only one cause and no other causes, then that cause will have to be selected as the underlying cause and there will not be multiple causes of death for that record. It is interesting to note that "Symptoms and ill-defined conditions" is the 5th most commonly reported cause of death among certificates listing only one cause. Being reported as the only cause of death pushes it up to be the 12th leading cause of death overall. If almost any other cause were listed simultaneously on the death certificate, this code would not end up as underlying.
The decreasing trend in reporting multiple causes over the decade may be reflective of a gradual change in the procedures of death certification. It would be of interest to consider this trend across different states and longer periods of time including shifts from one ICD coding system to the next.
Previous literature offers various plausible explanations of what contributes to inaccuracy in the reporting of causes of death. The cause of death reported on the death certificate depends on the person's disease history leading up to death. If a person dies after a long, well-characterized illness, the cause of death on the certificate is likely to be more accurate than for a sudden or unobserved death. Also, when adequate information on the decedent's disease history is lacking, the more narrowly the cause of death is characterized on the certificate, the more likely it is to be in error. If we assume that reporting multiple causes on the death certificate can be considered a proxy for the death certifier's level of familiarity with the patient, we would expect that a death which occurs in a hospital or nursing home would be more likely to have multiple causes reported, possibly due to better documentation of disease history. On the other hand, deaths in the ER and in particular at the person's residence, which are conceivably often sudden, should show a much lower percentage of multiple cause of death reporting. The results of the current study match such speculations, supporting the argument that a good understanding of disease history is crucial. Another result that supports this conclusion is that performing an autopsy, which provides a better understanding of the disease condition, increased the probability of reporting multiple causes of death.
Gender and race can also play a role in the accuracy of reporting. Lloyd  showed that positive predictive value of the death certificate tended to be lower in women than in men. Although no large differences were seen between men and women with respect to frequency of multiple causes, there was a higher percentage of multiple causes reported for Native Americans. It is conceivable that the high percentage of multiple cause of death observed for Native Americans might be associated with the geographic factors of concentrated residence and the unique practices of local clinics. Moreover, results (not shown) indicate differences exist across counties of Minnesota in the reporting of multiple causes of death ranging from 50% to 80%. These results support suggestions for better standardized training for physicians and coroners.
Similar to the case of reporting multiple cause of death, the selection of ischemic heart disease and diabetes as underlying compared to non-underlying differs across the several factors considered. The implications of these differences across demographics are that mortality rates would be differentially affected when underlying cause of death is used compared to any mention cause of death. For example, for diabetes we might conclude that diabetes is being under-reported in Whites compared to Blacks, Native Americans and Hispanics if only underlying cause of death were considered since the proportion of diabetes as underlying to mentioned is substantially lower in Whites. This is not to say there is any inaccuracy in the way it is being coded but it points out where multiple cause of death reporting will provide a different perspective than underlying.
Nevertheless, studies have shown the sensitivity and positive predictive value of the death certificate are particularly poor with regard to stroke and diabetes . Furthermore, Lloyd  concluded that physicians may use coronary heart disease as a "default" cause when facing some unknown cause of death cases. The fact that individuals with autopsy performed have lower probability of having heart disease selected as underlying when it is mentioned might suggest that heart disease is often over-assigned as the default disease when no further medical details are available. This is further demonstrated by the very high ratio of ischemic heart disease being coded as the underlying compared to non-underlying cause of death when the death occurred at the person's residence.
As mentioned in the introduction, one limitation of this research is the fact that there is no outside panel of experts who decide independently what the true causes of death are for each decedent, thus whether the associations we found are due to actual differences or reporting bias cannot be discerned. Therefore, this study cannot provide sensitivity or specificity per se, but it aims to identify factors that are associated with variability in reporting multiple cause of death and that perhaps contribute to inaccuracy in reporting underlying cause.
There is much to be learned from multiple cause of death data. It provides ways of looking at mortality data that go well beyond the typical examination of underlying cause of death. Future research is needed to understand further what the greatest concerns are about the accuracy of reporting causes of death. Multiple cause of death data have the potential to help point out potential concerns in the accuracy as well as provide a more complete picture of mortality for causes which are frequently not recorded as the underlying cause of death.
The author(s) declare that they have no competing interests.
MW conceived of the study and wrote most of the manuscript. JH performed the statistical analysis and wrote part of the manuscript. JO provided access to the data and collaboration regarding processes underlying data collection. DD provided details about cause of death reporting and collaboration regarding processes underlying data collection. All authors read and approved the final manuscript.
American Journal of Epidemiology 1986, 124(2):161-179.
Population Index 1995, 61(4):527-539.
Bah S: Multiple cause-of-death statistics in South Africa: Their utility and changing profile over the period 1997 to 2001. [http://www.ssc.uwo.ca/sociology/popstudies/dp/dp03-02.pdf]
Gordon C: Australian Bureau of Statistics, Multiple cause of death analysis. Publication 3319.0.55.001. [http://www.abs.gov.au/Ausstats/abs@.nsf/Lookup/FDB92CC903BC3DC8CA256D6B0005A769]
Am J Hemotol 2001, 66(4):229-240.
Am J Hemotol 2001, 66(3):159-166.
World Health Organization: Manual of the International Statistical Classification of Diseases, Injuries, and Causes of Death, based on the recommendations of the Ninth Revision Conference,. Geneva: World Health Organization; 1975.