Education level is one indicator of socioeconomic position which, in several countries including South Korea, is provided through death certificate data. Its validity determines the usefulness of death certificate data for exploring the association between socioeconomic position and mortality. This study aimed to compare education recorded on the death certificate with that reported before death in a nationally representative cohort of participants in the National Health and Nutrition Examination Survey (NHANES).
The 1998/2001 NHANES data contained unique 13-digit personal identification numbers that were individually linked to death certificate data from the Korean National Statistical Office. Duration of mortality follow-up was 7.1 years. The data from 513 deaths were used to determine the sensitivity and specificity of education recorded on death certificates and to estimate agreement rates of education level between NHANES data and death certificate data. Odds ratios for agreement in education were also estimated. Covariates considered in the analyses were gender, age, duration between NHANES and death, and cause of death.
The proportion of deaths without recorded education on the death certificate was very low (0.2%). A total of 29.4% discordant pairs were found. Sensitivity and specificity for college or higher education were 0.84 (95% confidence interval 0.71–0.97) and 0.99 (0.98–1.00). However, sensitivity was poor for middle school education. The overall agreement rate was 70.7% (66.8%–74.6%) when education was categorized into five groups and increased to 88.9% (86.2%–91.6%) when three education categories were used. The magnitude of validity and reliability for education did not generally vary with age, duration between health survey and death, or cause of death. However, a significantly smaller likelihood of agreement was found for middle and elementary school education after adjusting for covariates.
The low percentage of missing information on education in South Korean death certificate data offers great potential for monitoring mortality inequalities. A more collapsed categorization of education is recommended when a more definitive conclusion on educational mortality inequality is required.
Education level is one indicator of socioeconomic position which, in several countries including South Korea, is provided through death certificate data. Its validity determines the usefulness of death certificate data for exploring the association between socioeconomic position and mortality. Although education is likely to be more accurately reported than measures such as income, some evidence from Western studies suggests inconsistencies can occur between education reported in health surveys and that subsequently recorded on death certificates [2-4]. In a study by Lerchen and colleagues, a considerable gap was found between education reported in health interview data and education reported in subsequent follow-up interviews from surviving spouses after death. Contradictory results have also been shown in other studies, with some demonstrating a tendency for education to be reported at a higher level on death certificates than at the baseline health survey [2,3], while other studies showed this was not true.
In addition, the reliability of reported education between interview survey and death certificate data may vary with decedent's gender, age, and educational level. Education can be reported less accurately for female decedents since they are less likely than males to have surviving spouses and are more likely to die at an older age. The overstatement of education was more common among those decedents who, according to health survey data, had a lower education level. However, in a later study, Rosamond and colleagues reported that death certificate education level had substantial validity and that misclassification was similar in both genders. Reports have shown that overstatement bias was more pronounced in older than younger decedents. Apart from these studies, little has been published on the validity and reliability of education between health survey data and death certificate data.
Education is known as a socioeconomic position indicator that is less likely to change over time than occupation and income. As a result, misclassification of education may not vary with the length of time between acquisition of health survey and death certificate data. However, no studies have confirmed this. In addition, little is known about the misclassification of education by cause of death.
Recently, several studies have been conducted to examine educational inequalities in mortality with use of census and death certificate data in South Korea [6-8]. For example, in prior studies we presented age-, gender-, and cause-specific mortality inequalities by education and examined trends in mortality inequalities by education. However, the possibility of numerator-denominator bias could not be excluded in these studies since unlinked data were used [6,7]. If education levels in the census were more inflated than those in death certificate data, educational mortality differentials could be exaggerated. Son claimed that the reliability of education level between survey and death certificate data can be poor. A prior Korean study examined the reliability of reported education level between survey and death certificate data. However, due to the small number of deaths, a more detailed analysis by covariates such as cause of death and duration between health survey and death was not possible, and the validity (sensitivity and specificity) of each educational category was not examined. Information on the gender-, age-, and cause-specific level of validity is useful for assessing any bias in the gender-, age-, and cause-specific magnitude of socioeconomic mortality inequalities measured with census and death certificate data. Therefore, the purpose of this study was to compare education level recorded on death certificates with that reported before death in a nationally representative cohort of participants in the National Health and Nutrition Examination Survey (NHANES) of South Korea. Specifically, we examined the difference in the magnitude of validity and reliability for educational attainment according to gender, age group, duration between health survey and death, educational category, and cause of death.
The Institutional Review Board of the Asan Medical Center, Seoul, approved this study. The health survey data used in this study were the 1998 and 2001 NHANES conducted by the Korea Institute for Health and Social Affairs. Information was collected from a stratified multistage probability sample of South Korean households representing the civilian, non-institutionalized population. Participation rates for the 1998 and 2001 NHANES were 89.8% and 77.3%, respectively. At a baseline visit, information on education level was obtained from an adult household member. For each family member, the highest education completed was reported. These data contained unique 13-digit personal identification numbers that were individually linked to mortality data from the Korean National Statistical Office. By law, all Korean deaths must be reported to the Korea National Statistical Office. We identified deaths that occurred in the years 1999–2005 for those in the 1998 NHANES and 2002–2005 for those in the 2001 NHANES. Duration of mortality follow-up was 7.1 years. As in NHANES, levels of education reported on death certificates were determined by the highest level of education completed. The information on education was based on reports from the decedent's kin.
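The individual linkage described above can be pictured as a join on the personal identification number. The following is a minimal sketch, not the procedure actually used for this study: the record layout, the field names (`pid`, `education`, `cause_icd10`), and all identifiers and values are made up for illustration.

```python
# Illustrative sketch only: field names and all records below are hypothetical,
# not study data and not the linkage software used by the authors.

def link_deaths(survey_records, death_records):
    """Return one merged record for each survey participant found in the death file."""
    # Index the death file by personal identification number for direct lookup.
    deaths_by_id = {d["pid"]: d for d in death_records}
    linked = []
    for s in survey_records:
        d = deaths_by_id.get(s["pid"])
        if d is not None:
            linked.append({
                "pid": s["pid"],
                "survey_education": s["education"],
                "certificate_education": d["education"],
                "cause_icd10": d["cause_icd10"],
            })
    return linked


# Made-up 13-digit identifiers.
survey = [
    {"pid": "0000000000001", "education": "middle school"},
    {"pid": "0000000000002", "education": "no education"},
    {"pid": "0000000000003", "education": "high school"},
]
deaths = [
    {"pid": "0000000000002", "education": "elementary school", "cause_icd10": "I63"},
    {"pid": "0000000000009", "education": "high school", "cause_icd10": "C16"},
]

linked = link_deaths(survey, deaths)
print(linked)
```

Only participants whose identifier appears in both files produce a linked pair, which is what allows education from the two sources to be compared person by person.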
Covariates considered in our analyses were gender, age, duration between survey and death, and cause of death. Duration between health survey and death was grouped into categories of 3.4 years or less and 3.5–7.1 years to allocate samples equally into groups. Causes of death were coded according to the 10th revision of the International Classification of Disease (ICD-10) and grouped as cancer (C00–C97), cardiovascular disease (I00–I99), external causes (V01–Y98), and other.
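The cause-of-death grouping above can be expressed as a small classification rule. The sketch below is an assumption about how such a rule might be coded, not the authors' SAS program; the function name `cause_group` is hypothetical, and the code assumes well-formed ICD-10 codes consisting of a letter followed by at least two digits.

```python
def cause_group(icd10_code):
    """Map an ICD-10 code to the four broad groups used in the text.

    Assumes a well-formed code such as "C34" or "I21.9" (letter + two digits).
    """
    code = icd10_code.strip().upper()
    letter, number = code[0], int(code[1:3])
    if letter == "C" and number <= 97:          # C00-C97
        return "cancer"
    if letter == "I":                           # I00-I99
        return "cardiovascular disease"
    # V01-Y98: tuple comparison orders by letter first, then by number.
    if ("V", 1) <= (letter, number) <= ("Y", 98):
        return "external causes"
    return "other"


for example in ["C34", "I21.9", "X70", "J18"]:
    print(example, "->", cause_group(example))
```

Comparing `(letter, number)` tuples keeps the range check for external causes in one expression even though the range spans four letters.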
For our analysis, we first compared education level between NHANES data and death certificate data and calculated the proportion of concordant and discordant pairs. Then we determined the sensitivity and specificity of education recorded on death certificates using the standard method, with education reported in NHANES considered the gold standard. Sensitivity and specificity by cause of death were not presented due to the small size of each education group. We also estimated the agreement rates of education level between NHANES data and death certificate data. Overstatement and understatement rates were calculated. Overstatement refers to a higher education level recorded on the death certificate than in NHANES, while understatement means the reverse. Confidence intervals (CI) for sensitivity, specificity, agreement rates, overstatement rates and understatement rates were calculated by the asymptotic method. For these analyses, education data from death certificates and NHANES were categorized into five levels (college or higher, high school, middle school, elementary school and no education) and three levels (college or higher, high and middle school, elementary school or less). This was consistent with the categorizations used in two prior studies [6,7]. Finally, after adjusting for all covariates, we estimated odds ratios with 95% CI for agreement in educational attainment when five and three educational groups were used. All statistical analyses were performed with SAS statistical software.
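As a worked illustration of these quantities, the sketch below computes a proportion with a normal-approximation (Wald) interval and applies it to sensitivity and specificity. It assumes that the "asymptotic method" refers to this Wald interval, which the text does not state explicitly; the function names are hypothetical. The only study figures used are the 25 of 54 decedents with middle school education in NHANES who were also recorded as middle school on the death certificate.

```python
import math


def wald_ci(successes, total, z=1.96):
    """Proportion with a normal-approximation (Wald) 95% confidence interval.

    Assumption: this is what the 'asymptotic method' in the text refers to.
    """
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


def sensitivity(true_pos, false_neg):
    """Share of gold-standard positives that the certificate also records as positive."""
    return wald_ci(true_pos, true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Share of gold-standard negatives that the certificate also records as negative."""
    return wald_ci(true_neg, true_neg + false_pos)


# Middle school: 25 of 54 NHANES-reported cases matched on the death certificate.
p, low, high = sensitivity(true_pos=25, false_neg=54 - 25)
print(f"sensitivity = {p:.3f} (95% CI {low:.2f}-{high:.2f})")
```

Here the NHANES report plays the role of the gold standard, so a "true positive" is a decedent placed in the same category by both sources.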
Unique personal identification numbers of 12,801 subjects aged 20+ (6,900 for the 1998 NHANES and 5,901 for the 2001 NHANES) were linked to mortality information. A total of 513 deaths were identified: 376 deaths for the 1998 NHANES and 137 deaths for the 2001 NHANES. Of these, 511 (99.6%) had education levels available from both the death certificate and NHANES. One decedent had no education information in the NHANES and another decedent's education was not recorded on the death certificate. Of the 511 participants included in our analyses, 57.9% (n = 296) were men and 61.1% (n = 312) were aged 65+. The mean age of subjects was 66.4 years (SD 14.4). Decedent age varied with education from 52.7 years (SD 16.3) for those with college education or higher to 75.8 years (SD 9.2) for those with no education. A marked gender difference in education was also found, with 24.7% (n = 73) of men and 67.9% (n = 146) of women having no education recorded in the NHANES. While the proportion of those having college or higher education only reached 2.3% (n = 5) for women, the proportion for men was 8.8% (n = 26).
Table 1 shows the cross-classification of education level by death certificate and the adult household member's report at baseline visit in the NHANES of South Korea. For 511 deaths, 150 (29.4%) discordant pairs were found. Inconsistencies were most common when education was reported as middle school in NHANES. Of 54 participants whose education was reported as middle school in NHANES, only 25 (46.3%) were recorded as having middle school education in death certificate data.
Table 1. Comparison of education level between Korea National Health and Nutrition Examination Survey (NHANES) data and death certificate data among 513 decedents
Table 2 presents the sensitivity and specificity of education in death certificate when education level derived from the health survey was considered as the gold standard. The sensitivity and specificity of the death certificate in classifying a decedent as having college or higher education were 0.84 (95% CI: 0.71–0.97) and 0.99 (95% CI: 0.98–1.00), respectively. However, for middle school education, the sensitivity was poor. While the level of sensitivity for high and middle school was greater in men than women, this pattern could not be generalized to other educational groups. A very low sensitivity (0.20) for middle school in women was based on the small sample number (n = 5). With the exception of high school, the sensitivity of education for ages 65+ was similar to that for ages 20–64. In addition, the magnitude of sensitivity and specificity did not vary with the length of time between acquisition of health survey and death certificate data.
Table 2. Sensitivity, specificity and their 95% confidence intervals (CI)* of education level recorded in death certificates using education reported in National Health and Nutrition Examination Survey (NHANES) of South Korea as the gold standard (N = 511)
Table 3 shows the overall agreement rate, overstatement rate and understatement rate with 95% CI. When education level was categorized into five groups (college or higher, high school, middle school, elementary school, and no education), the overall agreement rate was 70.7% (95% CI: 66.8%–74.6%). The magnitude of the agreement rate for education did not vary with gender, age, or duration between health survey and death. This was generally true for cause-specific analyses. The overstatement rate tended to be greater than the understatement rate. The education level of 87 (17.0%) subjects was recorded as higher on the death certificate than in NHANES data, while 63 (12.3%) subjects were recorded with lower education on the death certificate than in NHANES data. This was true across gender, age, and causes of death, except for external causes. Meanwhile, the understatement rate for external causes tended to be greater than the rates for the other three broad causes. When educational level was grouped into five categories, the understatement rate for external causes was 20% while understatement rates for the other broad causes were about 10%. Based on logistic analysis when the educational level was grouped into three categories, the odds ratio of understatement for external causes (reference group = other broad causes) was 2.85 (1.15–7.05).
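The overall rates quoted above follow directly from the counts of overstated and understated pairs. The short sketch below reproduces that arithmetic from the reported figures (511 pairs, 87 overstated, 63 understated); the function name `reliability_rates` is hypothetical.

```python
def reliability_rates(n_total, n_over, n_under):
    """Percentages of overstated, understated, discordant and concordant pairs."""
    n_discordant = n_over + n_under
    return {
        "overstatement": 100 * n_over / n_total,
        "understatement": 100 * n_under / n_total,
        "discordant": 100 * n_discordant / n_total,
        "agreement": 100 * (n_total - n_discordant) / n_total,
    }


# Counts reported in the text for the five-category comparison.
rates = reliability_rates(n_total=511, n_over=87, n_under=63)
for name, value in rates.items():
    print(f"{name}: {value:.2f}%")
```

The overstatement, understatement and discordance percentages round to the 17.0%, 12.3% and 29.4% given in the text; the complement of the discordant share is within rounding of the reported 70.7% agreement rate.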
Table 3. Agreement rate (%), overstatement* rate (%), understatement* rate (%), and their 95% confidence intervals (CI)* for education level by gender, age, and cause of death: Reliability between Korea National Health and Nutrition Survey (NHANES) data and death certificate data
As presented in Table 3, the magnitude of reliability increased as the number of educational groups decreased. When the education level was categorized into three groups (college or higher, high and middle school, elementary school or less), the overall agreement rate was 88.9% (95% CI: 86.2%–91.6%). For this categorization of education, overstatement and understatement rates diminished and there was no marked evidence of education inflation in death certificate data compared with the health survey.
Table 4 presents odds ratios (95% CI) of agreement when five and three educational groups were used to assess agreement. When five educational groups were used, the middle and elementary school education groups showed a significantly smaller likelihood of agreement after adjusting for gender, age (10-year age), duration between health survey and death, and cause of death. However, this was not true when assessing agreement with three educational groups. No significant difference in odds ratio was found for gender, age, duration between health survey and death, or cause of death, likely in part due to the small sample size.
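Why agreement rises under the three-group scheme can be seen by recoding both sources before comparing them: disagreements between adjacent levels that fall in the same collapsed group disappear. The sketch below illustrates this with the category mapping described in the methods; the four example pairs are made up for illustration and are not study data, and the function name is hypothetical.

```python
# Mapping from the five-level to the three-level scheme described in the text.
FIVE_TO_THREE = {
    "college or higher": "college or higher",
    "high school": "high and middle school",
    "middle school": "high and middle school",
    "elementary school": "elementary school or less",
    "no education": "elementary school or less",
}


def agreement_rate(pairs, mapping=None):
    """Share of (survey, certificate) pairs that fall in the same category."""
    if mapping is not None:
        pairs = [(mapping[s], mapping[c]) for s, c in pairs]
    return sum(s == c for s, c in pairs) / len(pairs)


# Hypothetical (survey, certificate) pairs, for illustration only.
example_pairs = [
    ("middle school", "high school"),
    ("no education", "elementary school"),
    ("high school", "high school"),
    ("elementary school", "middle school"),
]

print(agreement_rate(example_pairs))                 # five categories
print(agreement_rate(example_pairs, FIVE_TO_THREE))  # three categories
```

In this toy set, two of the three five-level disagreements sit inside a single collapsed group and so count as agreement after recoding, while the elementary versus middle school pair still crosses a group boundary.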
Table 4. Odds ratio and 95% confidence interval (CI) of agreement in education level (N = 511)
Results of this study showed that the proportion of deaths without recorded education was very low, only one of 513 (0.2%) death certificates, far lower than in prior US studies, in which information on education was not available on the death certificate for about 15%–27% of all deaths [1,2]. According to death certificate data covering all South Koreans, the rate of failure to report the decedent's educational attainment was 0.3% for men and 0.4% for women aged 35–64. This may be due to the Korean death certification system, in which any vague or missing item on the death certificate is further clarified by the Korea National Statistical Office via telephone. This low percentage of missing education information in South Korean death certificate data offers great potential for monitoring socioeconomic inequalities in mortality.
Direct comparison of the validity and reliability level found in this study with prior reports in other countries may be problematic because of differences in age distribution and the educational categories used. Factors associated with participation in health surveys may also affect the overall validity and reliability. Nevertheless, we may conclude that this study showed a similar or higher level of validity and reliability in education information than in Western studies. For example, the agreement rate for three educational groups (less than high school graduate, high school graduate, and greater than high school education) in a prior study was 84%. Results of this study showed an agreement rate of 88.9% for the same categories. A prior study presented a sensitivity of 0.85 for college or higher education, which is similar to what was found in this study (0.84). It should be noted that in prior studies, a substantial number of subjects with missing education information (15%–27%) were found but were eliminated from the analyses that reported the agreement rate and sensitivity [1,2].
Despite the relatively high level of overall validity and reliability, low sensitivity and agreement rates were found in this study for participants with middle school education. After adjusting for covariates, middle and elementary school education groups were less concordantly recorded in death certificate data. One possible explanation for this may be related to historical changes in the education system in South Korea. During Japanese colonial occupation (1910–1945) and the Korean War (1950–1953), South Korea could not establish a nationwide educational curriculum. An education system divided into elementary school (6 years), middle school (3 years), and high school (3 years) began in 1954, when the Ministry of Education of Korea first introduced an official nationwide curriculum. Various types of educational attainment, which are difficult to group into the categories used in this study, would be inevitable among decedents whose childhood and adolescent education was achieved before the modern South Korean educational system was established. Difficulties in determining one's educational level have also remained in vital statistics records. For example, middle and high school levels were grouped as one category in early 1990s death certificate data. However, when middle school and elementary school education were collapsed into high school and no education respectively, statistically significant differences in agreement of education did not appear.
It was hypothesized that male decedents' education would be more accurately reported by surviving female spouses than female decedents' education, due to the gender gap in life expectancy. However, this study showed that the agreement rate of education for women was generally greater than that for men. Statistically significant gender differences in odds of agreement were found for both educational categorizations (five and three groups) when only the gender variable was entered into the logistic model (data not shown). In South Korea a woman's death is more frequently reported by her children, who may know more about her education than other informants. The widow's cognitive or communication problems may lower the agreement rate of male decedents' education. However, the comparison of accuracy between widows' reporting (for male decedents' education) and children's reporting (for female decedents' education) needs to be further explored. If a woman's informants were uncertain as to which education level she achieved, it might be difficult to accurately deduce and report the actual education level. However, in this study reports of women's education in the NHANES were concentrated around the "no education" level (67.9%). The simplicity of a woman's education might help informants for death certification to report education information accurately. Based on the logistic model, the significantly greater odds ratios of agreement for women compared with men became insignificant after adjusting for education (data not shown).
Although the magnitude of the overall agreement rate for education did not vary with cause of death, the understatement rate for external causes tended to be greater than the rates for other broad causes. The higher understatement rate for external causes was mainly due to stigmatized causes of death rather than other external causes such as transport accidents. The understatement rate for suicide (n = 17) was 35.3% (n = 6) while the rate for transport accidents (n = 24) was 8.3% (n = 2). Logistic analysis presented a greater odds ratio of understatement for suicide (data not shown). This suggests that considerable understatement of education may occur in stigmatized causes of death.
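The contrast between suicide and transport accidents can be checked from the counts given above. The sketch below recomputes the two understatement rates and, purely as an illustration, a crude odds ratio comparing suicide with transport accidents. This crude figure is an assumption of the sketch: it is unadjusted, uses a different reference group from the logistic analysis mentioned in the text, and rests on very small cells, so it is not a result of the study.

```python
def percent(count, total):
    """Simple percentage."""
    return 100 * count / total


def crude_odds_ratio(events_a, total_a, events_b, total_b):
    """Unadjusted odds ratio of group A versus group B from a 2x2 table."""
    odds_a = events_a / (total_a - events_a)
    odds_b = events_b / (total_b - events_b)
    return odds_a / odds_b


# Counts reported in the text: understated education among suicide and
# transport accident deaths.
suicide_rate = percent(6, 17)
transport_rate = percent(2, 24)
crude_or = crude_odds_ratio(6, 17, 2, 24)

print(f"suicide: {suicide_rate:.1f}%, transport accident: {transport_rate:.1f}%")
print(f"crude odds ratio (illustrative, unadjusted): {crude_or:.1f}")
```

With only 6 and 2 understated cases, any interval around such a ratio would be very wide, which is consistent with the cautious wording in the text.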
Overstatement and understatement of education by descendants may vary across regions with different cultural backgrounds. Results of this study showed a tendency for inflation of education when grouped into five categories but did not reveal any evidence of inflation when education was collapsed into three categories. These findings agree with prior US studies where inflation was found in six educational groups but did not appear in collapsed educational groups. Thus, when using unlinked data, a more collapsed categorization of education would produce a more valid estimate of mortality inequality. However, as the sensitivity of college or higher education was generally greater than that of no education and the pattern did not vary with covariates, it can be suggested that educational mortality differentials between no education and college or higher education in the previous study were underestimated. Indeed, several longitudinal mortality follow-up studies presented a substantial difference in mortality by education in South Korea [15,16]. Given that the level of agreement did not significantly vary with cause of death, cause-specific analysis may not produce biased results. However, a more collapsed categorization of education is recommended, especially when a more definitive conclusion regarding educational mortality inequality is required.
The low percentage of missing education information in South Korean death certificate data offers great potential for monitoring socioeconomic inequalities in mortality. Despite the relatively high level of overall validity and reliability, a more collapsed categorization of education would produce a more valid estimate of mortality inequality when using unlinked data.
List of abbreviations
CI, confidence intervals; NHANES, National Health and Nutrition Examination Survey of South Korea; SD, standard deviation.
The author(s) declare that they have no competing interests.
YHK conceived and designed the study with advice from HRK and JWL. YHK analyzed and interpreted the data and drafted the paper. HRK and JWL interpreted the data and critically revised the draft of the paper. All authors read and approved the final manuscript.
J Prev Med Public Health 2005, 38:443-448.