A core function of local health departments is to conduct health assessments. The analysis of death certificates provides information on diseases, conditions, and injuries that are likely to cause death, an important outcome indicator of population health. The expected years of life lost (YLL) measure is a valid, stand-alone measure for identifying and ranking the underlying causes of premature death. The purpose of this study was to rank the leading causes of premature death among San Francisco residents, and to share detailed methods so that these analyses can be used in other local health jurisdictions.
Using death registry data and population estimates for San Francisco deaths in 2003–2004, we calculated the number of deaths, YLL, and age-standardized YLL rates (ASYRs). The results were stratified by sex, ethnicity, and underlying cause of death. The YLL values were used to rank the leading causes of premature death for men and women, and by ethnicity.
In the years 2003–2004, 6312 men died (73,627 years of life lost), and 5726 women died (51,194 years of life lost). The ASYR for men was 65% higher than the ASYR for women (8971.1 vs. 5438.6 per 100,000 persons per year). The leading causes of premature death are those with the largest average YLLs and are largely preventable. Among men, these were HIV/AIDS, suicide, drug overdose, homicide, and alcohol use disorder; and among women, these were lung cancer, breast cancer, hypertensive heart disease, colon cancer, and diabetes mellitus. A large health disparity exists between African Americans and other ethnic groups: African American age-adjusted overall and cause-specific YLL rates were higher, especially for homicide among men. Except for homicide among Latino men, Latinos and Asians had comparable or lower YLL rates among the leading causes of death compared to whites.
Local death registry data can be used to measure, rank, and monitor the leading causes of premature death, and to measure and monitor ethnic health disparities.
A core function of local health departments is to conduct public health surveillance, including population health assessments [1,2]. Public health surveillance is the ongoing, systematic collection, analysis, interpretation, and dissemination of data regarding a health-related event for use in public health action to reduce morbidity and mortality and to improve health. For a local health jurisdiction, primary data collection, such as representative population-based surveys, can be expensive and unsustainable. Therefore, local health departments must analyze existing health data, preferably those that are population-based, comprehensive, readily available, and locally relevant.
Death records are an important data source for assessing population health and health disparities because they cover the whole population and include information on key characteristics of decedents, including age, sex, ethnicity, place of residence and of death, and underlying and contributing causes of death. However, cause-specific mortality is typically reported using traditional epidemiologic measures, especially counts and rates (including age-adjusted rates), that are heavily influenced by deaths among older residents. For most causes, these measures are not very sensitive to deaths occurring at younger ages, which are more likely to be premature, preventable deaths.
To identify and prioritize causes of premature death, the standard expected years of life lost (YLL) measure, as developed by the Global Burden of Disease Study, provides a valuable analytic tool that can be applied to local geographic areas. The YLL is based on comparing the age of death to an external standard life expectancy curve, and can incorporate time discounting and age weighting. The YLL, combined with the years lived with disability (YLD) measure, makes up the disability-adjusted life year (DALY). Unfortunately, directly measuring YLDs (and therefore DALYs) is cost prohibitive and not practical for most local health jurisdictions. In contrast, YLLs are measurable for a comprehensive set of conditions. YLLs, as opposed to more traditional mortality measures (counts, rates, etc.), highlight premature deaths. These deaths are particularly important from a public health and public policy perspective because they represent preventable loss of life. Where population estimates are available, age-adjusted YLL rates allow comparisons across groups or over time.
Although the YLL is a valid, stand-alone measure for identifying and ranking the causes of premature death for a region [6-8], this measure has not been widely adopted for local area mortality analyses. There are several reasons for this. First, detailed methods for calculating YLL are not available in standard epidemiology textbooks or scientific journal articles. In contrast, the years of potential life lost (YPLL) is commonly used: it is easily calculated by subtracting the age of death from a chosen cut-off (e.g., 65, 75, or 85 years) [9,10]; however, the YPLL does not count deaths after the cut-off age, and it does not incorporate the time-based discounting used in cost-effectiveness analysis. The YLL measures every death and can incorporate discounting. Second, with few exceptions, sufficient local area YLL analyses have not been published to demonstrate their value in assessing population health. And third, readily available software solutions to make analyses more efficient have not been developed.
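The contrast between the two measures can be illustrated with a minimal Python sketch. This is not the authors' code (their implementation is in R; see Additional file 3). The function names, the cut-off of 75 years, the ages, and the remaining life expectancies are all hypothetical values chosen for illustration and are not taken from Table 2.

```python
def ypll(ages_at_death, cutoff=75):
    """Years of potential life lost: cut-off minus age at death.

    Deaths at or after the cut-off contribute nothing.
    """
    return sum(max(cutoff - age, 0) for age in ages_at_death)


def crude_yll(ages_at_death, life_expectancy_at):
    """Undiscounted YLL: standard remaining life expectancy summed over deaths."""
    return sum(life_expectancy_at[age] for age in ages_at_death)


# Hypothetical remaining life expectancies (years) at three ages of death.
hypothetical_ex = {30: 50.0, 70: 14.0, 85: 5.0}
ages = [30, 70, 85]

print(ypll(ages))                        # the death at age 85 adds 0 years
print(crude_yll(ages, hypothetical_ex))  # every death adds some years
```

In this toy example the death at age 85 is invisible to the YPLL but still contributes to the YLL, which is the behavior described above.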
The purpose of this paper is to provide detailed methods for calculating YLL for a local geographic area (San Francisco, California, United States), and to demonstrate its value as a population health measure to impact local public health priorities. We illustrate how to use YLL to rank causes of death, and how to use average YLLs to identify the leading causes of premature death for major ethnic groups. Analysis and interpretation of death registry data using YLLs provide objective evidence for public health policymakers, partners, and stakeholders to inform and guide the setting of local public health priorities. This is especially important because of the geographic and demographic variation in health outcomes, major risk factors, and health disparities [5,11].
Summarized in Table 1 are the notation and definitions used in this article. For the years 2003–2004, registered deaths for San Francisco were obtained from the State of California, Department of Health Services, Center for Health Statistics. The data file contained the underlying causes of death of San Francisco residents (whether or not they died in San Francisco), and the underlying causes of death of non-residents who died in San Francisco. For this study, only San Francisco resident underlying causes of death were used. Population estimates were obtained from the State of California, Department of Finance, Demographic Research Unit. Standard life expectancies for men and women are from the Coale-Demeny Model Life Tables West: Level 25 for men, and Level 26 for women (Table 2). For calculating age-standardized rates we used the Year 2000 United States standard million population (Table 3).
Table 1. Summary of notation and definitions
Table 2. Standard life expectancies based on Model Life Table West, Level 25 and 26
Table 3. Year 2000 United States standard million population
The deaths and population estimates were aggregated into 19 age intervals (columns 1 and 2 of Table 4). For age standardization, 11 age intervals were used (Tables 3 and 5). The 19-level age intervals were used for calculating expected years of life lost (YLL) for men and women (Table 4), stratified by sex and ethnicity (Table 6), stratified by cause of death and sex (Table 7), and stratified by cause of death, sex, and ethnicity (Additional file 1: Tables A-1 to A-4).
Additional file 1. Leading causes of premature death by ethnicity and sex, San Francisco, 2003–2004. This file contains tables for the leading causes of premature death by ethnicity and sex. We illustrate how these methods can be repeated for population subgroups to inform and guide public health priorities.
Table 4. Calculating expected years of life lost for San Francisco men and women, 2003–2004
Table 5. Calculating direct age-standardized expected years of life lost rate for San Francisco men and women, 2003–2004
Table 6. Expected years of life lost (YLL) and age-standardized YLL rates, By ethnicity, San Francisco, 2003–2004
Table 7. Leading causes of premature death for San Francisco, By sex, 2003–2004
Cause of death categories
Using the International Classification of Diseases, 10th Revision (ICD-10), the cause of death categories were adapted from the World Health Organization Global Burden of Disease Study and the Centers for Disease Control and Prevention External Cause of Injury Mortality Matrix. The cause of death category definitions were comprehensive, mutually exclusive, and sufficiently specific to support public health interventions [see Additional file 2]. For example, we used specific cancer diagnoses (e.g., "lung cancer") instead of broad categories (e.g., "all cancers").
Additional file 2. Cause of death classification using International Classification of Diseases, 10th Revision (ICD-10) codes. This file contains the ICD-10 codes used for our cause of death categories, which were adapted from the World Health Organization Global Burden of Disease Study and the Centers for Disease Control and Prevention External Cause of Injury Mortality Matrix.
Interpolating standard model life table
For a single death at age x, the YLL for that individual is simply the expected years of life remaining at the age of death (i.e., life expectancy at age x: ex) based on the model life table West, Level 25 and 26 (Table 2). However, the table does not contain life expectancies for deaths within age intervals. For ages that fall within an interval, the life expectancy must be interpolated from the table.
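For a group of deaths that occurred at ages within the age interval x to x + n (where n is the interval length), the expected years of life remaining for those deaths (ne*x) is estimated by linear interpolation (Equation 1):

$$ {}_{n}e^{*}_{x} = e_{x} + \frac{e_{x+n} - e_{x}}{n}\left({}_{n}a_{x} - x\right) \tag{1} $$

where ex and ex+n are the standard life expectancies at the start and end of the interval (Table 2), and nax is the average age of death within the interval.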
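The interpolation step can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' R function from Additional file 3; the function name, argument names, and life-table values are hypothetical and are not taken from Table 2.

```python
def interpolate_life_expectancy(x, n, e_x, e_x_plus_n, mean_age_at_death):
    """Linearly interpolate standard life expectancy at the mean age of death.

    x, n              : start and length of the age interval
    e_x, e_x_plus_n   : life-table life expectancy at ages x and x + n
    mean_age_at_death : average age of death within the interval
    """
    if not x <= mean_age_at_death <= x + n:
        raise ValueError("mean age at death must lie within the age interval")
    slope = (e_x_plus_n - e_x) / n
    return e_x + slope * (mean_age_at_death - x)


# Hypothetical life-table values for the interval 40 to 45 years.
e_star = interpolate_life_expectancy(
    x=40, n=5, e_x=40.0, e_x_plus_n=35.0, mean_age_at_death=42.0
)
print(e_star)  # 38.0
```

The range check is a design choice of this sketch: an average age of death outside the interval usually signals a data-entry or aggregation error, so it is rejected rather than extrapolated.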
Calculating expected years of life lost (YLL)
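For a group of deaths that occurred at ages within age interval x to x + n, the crude expected years of life lost is

$$ {}_{n}Y_{x} = {}_{n}D_{x} \times {}_{n}e^{*}_{x} \tag{2} $$

where nDx is the number of deaths between age x and age x + n, and ne*x is the interpolated standard life expectancy for the interval (Equation 1).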
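To incorporate discounting and age weighting, one would use Equation 3:

$$ {}_{n}Y_{x} = {}_{n}D_{x}\left\{ \frac{KCe^{ra}}{(r+\beta)^{2}}\left[ e^{-(r+\beta)(L+a)}\left(-(r+\beta)(L+a)-1\right) - e^{-(r+\beta)a}\left(-(r+\beta)a-1\right)\right] + \frac{1-K}{r}\left(1-e^{-rL}\right)\right\} \tag{3} $$

where a is the average age of death in the interval (nax) and L is the interpolated standard life expectancy at that age (Equation 1). For this equation, r is the discount rate, and β, C, and K are age weighting constants (see Table 1 for complete definitions). To include age weighting, K (the modulation constant) can be set to 1. For this study, age weighting was not used (K = 0) and r = 0.03, in which case Equation 3 reduces to nDx × (1 − e^(−rL)) / r.
When the discount rate (r) is 0, Equation 3 simplifies to Equation 4:

$$ {}_{n}Y_{x} = {}_{n}D_{x}\left\{ \frac{KCe^{-\beta a}}{\beta^{2}}\left[ e^{-\beta L}\left(-\beta(L+a)-1\right) - \left(-\beta a-1\right)\right] + (1-K)L \right\} \tag{4} $$

With both r = 0 and K = 0, Equation 4 reduces to the crude measure, nDx × L.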
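Equations 3 and 4 can be combined into a single routine, sketched below in Python. This is an independent illustration, not the authors' R function from Additional file 3. The function names are hypothetical, and the default constants β = 0.04 and C = 0.1658 are the standard Global Burden of Disease age-weighting values, assumed here because Table 1 is not reproduced in the text.

```python
import math


def yll_per_death(a, L, r=0.03, K=0.0, beta=0.04, C=0.1658):
    """YLL for one death at age a with standard life expectancy L.

    r is the discount rate; K switches age weighting on (1) or off (0).
    Assumes r + beta != 0 whenever K != 0.
    """
    # Age-weighted component; it vanishes when K = 0.
    if K == 0:
        weighted = 0.0
    else:
        rb = r + beta
        weighted = (K * C * math.exp(r * a) / rb**2) * (
            math.exp(-rb * (L + a)) * (-rb * (L + a) - 1)
            - math.exp(-rb * a) * (-rb * a - 1)
        )
    # Unweighted component; its limit as r -> 0 is (1 - K) * L (Equation 4).
    if r == 0:
        unweighted = (1 - K) * L
    else:
        unweighted = (1 - K) / r * (1 - math.exp(-r * L))
    return weighted + unweighted


def interval_yll(deaths, a, L, **kwargs):
    """YLL for all deaths in one age interval (deaths times per-death YLL)."""
    return deaths * yll_per_death(a, L, **kwargs)


# Hypothetical inputs: 10 deaths at mean age 30 with 50 years of remaining life.
print(interval_yll(10, 30, 50, r=0, K=0))  # crude YLL: 500.0
print(interval_yll(10, 30, 50))            # discounted at 3%, no age weighting
```

Handling r = 0 as a separate branch avoids the division by zero in the (1 − K)/r term and keeps the discounted and undiscounted cases in one function.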
First, we calculated the expected years of life lost, comparing men to women, by summing nYx over all age intervals (Table 4):

$$ \mathrm{YLL} = \sum_{x} {}_{n}Y_{x} \tag{5} $$

Using this approach, we calculated YLLs for 117 specific causes of death stratified by sex, and stratified by sex and ethnicity.
Calculating age-standardized expected years of life lost rates
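Using the direct method, we calculated age-standardized YLL rates (ASYR). First, we calculated age-specific rates of years of life lost (nyx), that is, the years of life lost in each age interval divided by the corresponding population estimate (nyx = nYx / nNx). Then, these rates were reweighted using the Year 2000 United States standard million population (nwx in Table 3). The reweighted rates (nwx × nyx) were summed to get an ASYR (Equation 6).

$$ \mathrm{ASYR} = \sum_{x} {}_{n}w_{x} \times {}_{n}y_{x} \tag{6} $$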
See Table 5 for use of Equation 6 in spreadsheet calculations.
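As a complement to the spreadsheet layout, direct standardization can be sketched in Python. This is illustrative only and is not the authors' code; the function name and all input numbers are hypothetical. The sketch divides by the sum of the weights so that it gives the same result whether the standard population is supplied as counts per million or as proportions, and it assumes the population inputs are person-years (for example, annual estimates summed over the study years) so that the result is a rate per year.

```python
def age_standardized_yll_rate(yll_by_age, person_years_by_age,
                              standard_weights, per=100_000):
    """Directly age-standardized YLL rate per `per` persons per year."""
    if not (len(yll_by_age) == len(person_years_by_age) == len(standard_weights)):
        raise ValueError("all inputs must cover the same age intervals")
    total_weight = sum(standard_weights)
    weighted_sum = sum(
        w * (yll / py)  # standard weight times age-specific YLL rate
        for yll, py, w in zip(yll_by_age, person_years_by_age, standard_weights)
    )
    return per * weighted_sum / total_weight


# Hypothetical data for three age intervals.
yll = [100.0, 200.0, 300.0]
person_years = [1000.0, 2000.0, 1000.0]
standard_million = [500_000, 300_000, 200_000]

asyr = age_standardized_yll_rate(yll, person_years, standard_million)
print(round(asyr, 1))  # approximately 14000 per 100,000 persons per year
```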
Ranking leading causes of premature death
Determining the leading causes of premature death required two steps. First, the leading 15 causes of death were ranked by YLL values for San Francisco (stratified by sex), and the leading 10 causes of death were ranked for each ethnic group (stratified by sex). Second, to highlight conditions of highly premature death, these were further subranked by their average YLL values. Age-standardized YLL rates were included to allow comparisons of ethnic groups within sex strata.
All analyses and graphics were produced in R, a widely available, open source programming language for statistical computing and graphics. To facilitate the YLL calculation for readers, we provide and demonstrate a numerical function for R [see Additional file 3].
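The two-step ranking can be expressed compactly, as in the following Python sketch. It is an illustration under stated assumptions, not the authors' R code; the function name, record layout, cause labels, and numbers are hypothetical.

```python
def rank_premature_causes(records, top_n=15):
    """Two-step ranking of causes of death.

    records: iterable of (cause, deaths, yll) tuples.
    Step 1: keep the top_n causes by total YLL.
    Step 2: subrank those causes by average YLL (YLL divided by deaths).
    Returns a list of (cause, average_yll), largest average first.
    """
    top_by_yll = sorted(records, key=lambda rec: rec[2], reverse=True)[:top_n]
    averages = [(cause, yll / deaths) for cause, deaths, yll in top_by_yll]
    return sorted(averages, key=lambda pair: pair[1], reverse=True)


# Hypothetical causes: (label, number of deaths, total YLL).
hypothetical_records = [
    ("cause A", 100, 900.0),
    ("cause B", 20, 500.0),
    ("cause C", 5, 120.0),
    ("cause D", 50, 600.0),
]

ranking = rank_premature_causes(hypothetical_records, top_n=3)
for cause, average_yll in ranking:
    print(cause, average_yll)
```

Restricting step 2 to the causes that survive step 1 keeps rare causes with a high average YLL but a small total burden (here, "cause C") out of the final list.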
Additional file 3. Open source programming code for calculating expected years of life lost. This file contains R programming code to calculate expected years of life lost. R is a comprehensive, open source software package for statistical computing and graphics.
Displayed in Table 4 is the spreadsheet format for calculating expected years of life lost (YLL) for San Francisco men and women. The age-interval specific number of deaths (nDx), average age of death (nax), interpolated standard life expectancy (ne*x), and years of life lost (nYx) are shown. In the years 2003–2004, 6312 men died with 73,627 years of life lost, and 5726 women died with 51,194 years of life lost.
Displayed in Table 5 is the spreadsheet format for calculating direct age-standardized YLL rates (ASYR) for San Francisco men and women, combining years 2003–2004. The sex and age-interval specific population estimates (nNx), expected years of life lost (nYx), expected YLL rate (nyx), and weighted expected YLL rate (nwx × nyx) are displayed in each column. The ASYR for men was 65% higher than the ASYR for women (8971.1 vs. 5438.6 per 100,000 persons per year). Displayed in Table 6 are the YLL, number of deaths, average YLL, age-standardized YLL rates, and ASYR ratios stratified by sex and ethnicity. While whites and Asians account for the largest number of deaths (as expected based on population estimates), African American men and women have the highest age-standardized YLL rates (Figure 1). For all causes of death, the ASYR for African American men is 2.44 times that of white men, and the ASYR for African American women is 2.31 times that of white women.
Figure 1. Comparison of age-standardized expected years of life lost rates (ASYRs), by sex and ethnicity, San Francisco, 2003–2004.
The leading causes of premature death for San Francisco residents, ranked first by YLLs and then subranked by average YLLs, are displayed in Table 7. Among the top fifteen causes, those with the largest average YLLs reflect the most premature deaths. Therefore, among men, the leading causes of premature death were HIV/AIDS (average YLL = 20.3 years), suicide (19.9 years), drug overdose (21.7 years), homicide (25.0 years), and alcohol use disorder (17.4 years). The leading causes of death with the smallest average YLLs were ischemic heart disease (8.9 years), lung cancer (10.7 years), stroke (8.2 years), hypertensive heart disease (11.8 years), and chronic obstructive pulmonary disease (8.3 years).
Among women, the leading causes of premature death (those with the largest average YLLs) were lung cancer (10.4 years), breast cancer (13.4 years), hypertensive heart disease (8.2 years), colon cancer (9.2 years), and diabetes mellitus (8.6 years). The leading causes of death with the smallest average YLLs were ischemic heart disease (6.6 years), stroke (6.9 years), chronic obstructive pulmonary disease (7.8 years), pneumonia (5.6 years), and dementias (4.6 years).
Similarly, the leading causes of premature death were ranked by ethnicity and sex [Additional file 1] for African Americans (Table A-1), Asians/Pacific Islanders (Table A-2), Latinos/Hispanics (Table A-3), and whites (Table A-4). For example, among African American men, the leading causes of premature death (largest average YLLs) were homicide (25.9 years), HIV/AIDS (19.7 years), hypertensive heart disease (14.7 years), drug overdose (19.8 years), and alcohol use disorder (15.6 years). The leading causes of death with the smallest average YLLs were ischemic heart disease (11.8 years), lung cancer (13.1 years), stroke (10.7 years), chronic obstructive pulmonary disease (11.8 years), and diabetes mellitus (12.6 years).
Age-standardized YLL rates (ASYRs) allow comparisons of the burden of premature mortality by ethnic group and specific cause of death (Figures 1, 2, 3). For example, for almost every leading cause of premature death in men and women, African Americans had the highest ASYRs compared to other ethnic groups. Among African American men, the disparity in ASYRs was most notable for violent assault (homicide), followed by HIV/AIDS, vascular diseases (ischemic and hypertensive heart, and cerebrovascular disease), accidental drug overdose, and lung cancer. Among African American women, the disparity in ASYRs was most notable for vascular diseases (ischemic and hypertensive heart, and cerebrovascular diseases), breast cancer, HIV/AIDS, and accidental drug overdose.
Figure 2. Leading causes of premature death among men (ranked by YLLs), comparing age-standardized YLL rates (ASYR) by cause of death and ethnicity, San Francisco, 2003–2004. Symbols: African American (○), Latino/Hispanic (△), Asian/Pacific Islander (×), White (+).
Figure 3. Leading causes of premature death among women (ranked by YLLs), comparing age-standardized YLL rates (ASYR) by cause of death and ethnicity, San Francisco, 2003–2004. Symbols: African American (○), Latino/Hispanic (△), Asian/Pacific Islander (×), White (+).
The key findings of this study are that (1) the leading causes of premature mortality were largely preventable: among men, these were HIV/AIDS, suicide, drug overdose, homicide, and alcohol use disorder; and among women, these were lung cancer, breast cancer, hypertensive heart disease, colon cancer, and diabetes mellitus; (2) leading causes of premature death differed remarkably between ethnic groups (Tables A-1 to A-4); (3) a large health disparity was measured between African Americans and other ethnic groups: African American age-adjusted overall and cause-specific YLL rates were notably higher, especially for homicide among men (Figures 1, 2, and 3); and (4) except for homicide among Latino men, Latinos and Asians had comparable or lower YLL rates among the leading causes of premature death compared to whites (Figures 2 and 3). These results illustrate how death registry data can be used to measure, rank, and monitor the leading causes of premature mortality for a local geographic region. Such studies can be used to monitor the local mortality burden of disease and injury over time. For example, our results were compared to our previous San Francisco YLL study for the period 1990–1995. While the burden of HIV/AIDS deaths decreased remarkably, the ethnic health disparities remained, with African Americans continuing to suffer the largest burden. This was especially striking for homicides among African American men. The generally better health status of Asians and Latinos has persisted.
Several of these findings mirror those from national studies. For example, the U.S. Burden of Disease and Injury Study found many of the same preventable causes of premature death among the leading causes, and that the YLL ranking for each ethnic group was unique. As in our study, there were large disparities, measured as DALYs, between African Americans and other ethnic groups, and the authors reported better health outcomes among Asians than whites. The Eight Americas Study [22,23] also found large disparities, measured as life expectancy, between Asian Americans and African Americans. A recent examination of the U.S. black-white disparity in life expectancy during the period 1983–2003 found, like our study, that cardiovascular disease (both males and females), homicide (males), and HIV/AIDS (males) were leading contributors to the gap in recent years.
Three measures were used in this study: YLLs, average YLLs, and ASYRs. The YLL is a stand-alone measure of mortality burden not requiring population estimates. It was used to rank the 15 leading causes of death for men and women (Table 7). However, these 15 leading causes were influenced by the larger number of deaths among older residents. To highlight premature, preventable causes of death, we then ranked these top 15 causes by their average YLLs. Notably, many of the leading causes of death have strong social determinants. Alternatively, the ASYR could have been used to rank the leading causes of death; however, this was not our first choice because it requires population estimates, and the rankings would still be influenced by older deaths. Because population estimates were available to us, ASYRs were used to make comparisons among ethnic groups (Table 6 and Additional file 1). However, only the YLLs (including average YLLs) were necessary to rank the leading causes of premature death. Similar analyses were conducted for each ethnic group [Additional file 1].
This study has several strengths. First, we used a simple measure of premature mortality, expected years of life lost, that can be calculated from death registry data that are readily available, population-based, and complete for the whole population. Second, YLL estimates can be calculated for a comprehensive list of causes of death. Third, YLL calculations do not require population estimates, allowing leading causes of death to be ranked for parts of the population (such as specific ethnicities or geographic areas) for which population estimates are not available. Fourth, subranking by average YLLs identifies leading causes of premature death, bringing attention to preventable deaths that contribute most to the mortality burden. Fifth, these analyses can be repeated periodically to monitor changes, to guide and inform policy makers, and to direct and evaluate interventions.
Sixth, except for motor vehicle accidents, we used the Global Burden of Disease ICD-10 cause of death categories, making our methods similar to national and international studies [15,21]. Seventh, our study included Latinos/Hispanics, an important segment of the population that was not included in a similar national study. Eighth, with the availability of ethnic-specific population estimates, we were able to age-standardize the YLLs to measure, compare, and monitor the ethnic health disparities in the burden of premature deaths. And ninth, our study findings are directly relevant and can be adapted to the diverse and unique needs of our communities, and to our local government and policymakers.
This study also has several limitations. First, the accuracy of data recorded on death certificates (e.g., underlying cause of death and ethnicity) varies by region and underlying cause. Additionally, analyses using underlying cause of death categories may underestimate the mortality burden for selected contributing causes of death listed on the death certificates (e.g., diabetes mellitus). Second, the YLL metric does not adequately capture conditions that cause significant disease and disability but are difficult to measure (e.g., mental illness) or do not result in death (e.g., osteoarthritis). Third, on average, there may be a delay of 10 months or longer between the end of a calendar year and the availability of ICD-10-coded death registry data.
Fourth, the ranking of a specific cause of death depends on its individual YLL magnitude as well as its relative contribution compared to other causes; changes in ranking for a cause over time may be due either to changes in the occurrence of that cause, or to changes in the occurrence of other causes ranked above or below it. Fifth, the average YLL could be large for a specific cause of death but only involve a small number of deaths (small burden). To avoid this problem, we only evaluated the average YLL for the highest ranked causes of death based on YLLs. Sixth, the YLL measure is not age-standardized and cannot be used to compare specific causes of death between groups with different age compositions. (With population estimates, YLL can be age-standardized as described in Methods.) And seventh, because of the uncertainty of population estimates, age-standardized rates must also be interpreted with caution. In spite of these limitations, using YLLs to rank the leading causes of premature death provides community residents, community-based organizations, policy makers, public health authorities, and researchers with local, representative, objective, and informative data to guide and inform public health priorities, and to direct and evaluate public health interventions.
This study has the following key implications: First, we provide the methodological details for calculating YLL to measure the burden of premature mortality for any geographic area that has death registry data. We provide both the ICD-10 cause of death classifications used for this study [Additional file 2] and the computational program code for calculating age-interval-specific expected years of life lost that can incorporate discounting (used in this study) and age weighting (not used in this study) [Additional file 3]. This code can be executed in a freely available, open source program for statistical computing and graphics. And second, we demonstrate how these results can be used to rank the leading causes of premature death for major ethnic groups. The rankings can be used to guide, inform, and monitor public health priorities and programs for each group. These analyses can become part of routine public health surveillance for local health jurisdictions, as we have done in San Francisco.
Population health measures based on YLLs are readily calculated and useful for measuring, ranking, and monitoring the leading causes of premature death for a local geographic area, and for measuring and monitoring the impact of local efforts to reduce premature mortality in ethnic groups for which there are health disparities.
The authors declare that they have no competing interests.
TJA conceived and designed the study, conducted the analyses, and prepared the initial manuscript. DYL reviewed the literature on Global Burden of Disease Study methods and applied the findings to our study. BSK reviewed the literature on Global Burden of Disease Study methods and local area research studies, and applied findings to our cause of death classifications. RR assisted in statistical programming, quality control, and review of quantitative methods. MHK reviewed the study for clinical accuracy, epidemiologic methods, and public health impact. All authors contributed substantially to the interpretation of findings and manuscript revisions. All authors read and approved the final manuscript.
Special thanks to Roma Guy (Health Education Department, San Francisco State University; Health Commissioner, City and County of San Francisco) and Virginia Smyly (Deputy Director, Community Programs and Prevention, San Francisco Department of Public Health) for reviewing multiple drafts and providing valuable feedback.
Sources of funding: This study was supported by the San Francisco Department of Public Health (all authors).
Murray CJL, Lopez AD (Eds): Global Burden of Disease: A comprehensive assessment of mortality and disability from diseases, injuries, and risk factors in 1990 and projected to 2020 (The Global Burden of Disease and Injury). Harvard School of Public Health; 1996.
Katzenellenbogen J, S SPS: Western Australian Burden of Disease Study: Mortality 2000. [http://www.health.wa.gov.au/publications/documents/WA_Burden_of%20Disease.pdf]
MMWR Morb Mortal Wkly Rep 1993, 42(13):251-253.
Aragon T, Reiter R, Katcher B: San Francisco Burden of Disease and Injury: Mortality Analysis, 1990–1995. [http://www.sfdph.org/dph/files/reports/StudiesData/DiseaseInjury/bdi9095b1.pdf]
State of California, Department of Finance: San Francisco Race/Ethnic Projections with Age and Sex Detail, 2000–2050. [http://www.dof.ca.gov/HTML/DEMOGRAP/Data/RaceEthnic/Population-00-50/documents/SanFrancisco.txt]
World Health Organization: International Classification of Diseases (ICD), 10th Revision. [http://www.who.int/classifications/apps/icd/icd10online/]
van Dommelen L: Linear interpolation. [http://www.eng.fsu.edu/~dommelen/courses/eml3100/aids/intpol/index.html] [Accessed May 23, 2007].
Natl Vital Stat Rep 1998, 47(3):1-16.
Murray CJL, Kulkarni SC, Michaud C, Tomijima N, Bulzacchelli MT, Iandiorio TJ, Ezzati M: Eight Americas: investigating mortality disparities across races, counties, and race-counties in the United States.