Early prediction of median survival among a large AIDS surveillance cohort

Enanoria, Wayne TA; Hubbard, Alan E; van der Laan, Mark J; Chen, Mi; Ruiz, Juan; Colford, John M

doi:10.1186/1471-2458-7-127

Research article
Open access
Published: 27 June 2007

BMC Public Health volume 7, Article number: 127 (2007)

Abstract

Background

For individuals with AIDS, data exist relatively soon after diagnosis to allow estimation of "early" survival quantiles (e.g., the 0.10, 0.15, 0.20 and 0.30 quantiles). Many years of additional observation must elapse before median survival, a summary measure of survival, can be estimated accurately. In this study, a new approach to predict AIDS median survival is presented and its accuracy tested using AIDS surveillance data.

Methods

The data consisted of 96,373 individuals who were reported to the HIV/AIDS Reporting System of the California Department of Health Services Office of AIDS as of December 31, 1996. We defined cohorts based on quarter year of diagnosis (e.g., the "931" cohort consists of individuals diagnosed with AIDS in the first quarter of 1993). We used early quantiles (estimated using the Inverse Probability of Censoring Weighted estimator) of the survival distribution to estimate median survival by assuming a linear relationship between the earlier quantiles and median survival. From this model, median survival was predicted for cohorts for which a median could not be estimated empirically from the available data. This prediction was compared with the actual medians observed when using updated survival data reported at least five years later.

Results

Using the 0.15 quantile as the predictor and the data available as of December 31, 1996, we were able to predict the median survival of four cohorts (933, 934, 941, and 942) to be 34, 34, 31, and 29 months, respectively. Without this approach, there were insufficient data with which to make any estimate of median survival. The actual median survival of these four cohorts (using data as of December 31, 2001) was found to be 32, 40, 46, and 80 months, respectively, suggesting that at least three years must elapse from diagnosis before an accurate prediction can be made with this approach.

Conclusion

The results of this study suggest that early and accurate prediction of median survival time after AIDS diagnosis may be possible using early quantiles of the survival distribution. The methodology did not seem to work well during a period of significant change in survival as observed with highly active antiretroviral treatment, but results suggest that it may work well in a time of more gradual improvement in survival.

Background

Since the beginning of the AIDS epidemic, the prediction of trends in survival after an AIDS diagnosis has been important for planning health care services and for monitoring the impact of the epidemic. Temporal associations between improved survival and the introduction of expanded treatment options provide population-based evidence that there may be beneficial treatment effects long before these hypotheses can be tested formally. In a time when health care resources are limited and health priorities must be established, it is crucial to project the short-term mortality after AIDS for future planning of health care resources [1].

Temporal trends and improvements in survival with AIDS were reported early in the epidemic even before the introduction of advances in therapy [2]. Shortly after the introduction of zidovudine therapy, temporal trends in survival were (eventually) noted using surveillance data [3]. Other registry-based studies investigated the relationship between survival following an AIDS diagnosis and calendar date of diagnosis [4, 5]. These studies consistently showed marked improvements in AIDS survival after the introduction of zidovudine therapy and Pneumocystis carinii pneumonia prophylaxis. More recently, the introduction of highly active antiretroviral therapy (HAART) has renewed the idea of examining trends in survival after an AIDS diagnosis in order to study both the short- and long-term effects of these new drugs on HIV-related morbidity and mortality.

In this study, a new approach was implemented to predict AIDS survival and test its accuracy using AIDS surveillance data. The purpose of this study was: (1) to determine the earliest quantile (such as 0.10, 0.15, or 0.20) of the survival distribution that can be used to predict accurately a cohort's subsequently observed median survival, and (2) to estimate the survival quantiles using the Inverse Probability of Censoring Weighted (IPCW) estimator [6] in order to improve the prediction methodology in the common situation with registry data in which death is subject to delays in reporting [7].

For cohorts of individuals who have recently been diagnosed with AIDS, data exist for "early" survival quantiles (such as the 0.10, 0.15, 0.20 quantiles, etc.) but many years of additional observation must elapse before later quantiles, such as the median (0.50 quantile) can be estimated with accuracy. Assuming a linear relationship between the early survival quantile and the median survival, an early prediction of the median value for a cohort's eventual survival distribution is compared to the actual or true median value for the cohort. If the predicted median is accurate, then early estimation of AIDS survival is possible and will be of great benefit to health care planners developing strategies and financing for the health care needs of these patients. Additionally, such an accurate, early prediction methodology could be extended to other large, population-based surveillance systems where survival prediction is a major goal.

Methods

California AIDS surveillance data

The California Department of Health Services, Office of AIDS (OA), in cooperation with the Centers for Disease Control and Prevention (CDC), maintains a registry of all reports of AIDS cases in California. This registry, the HIV/AIDS Reporting System (HARS), contains demographic, risk factor and limited clinical information on each reported case. A HARS data set as of December 31, 2001 was used to obtain four variables: the dates of AIDS diagnosis (month and year only; the day was assumed to be 15 in order to calculate a date), the dates of death (if reported), the dates the deaths were reported to the CDC, and the date each case was entered into the registry. Dates of death are updated periodically by local city and county health departments and by OA using the California Death Registry and the National Death Index.

The State of California Health and Human Services Agency Committee for the Protection of Human Subjects and the Committee for the Protection of Human Subjects at the University of California at Berkeley approved the use of these data for this purpose.

Identification of cohorts of AIDS patients as of December 31, 1996

The date of AIDS diagnosis is the date of the first condition that would allow a person to be classified as having AIDS under the 1993 change in the AIDS case definition [8]. This definition was retroactively applied to cases diagnosed prior to 1993. Cases were grouped into cohorts defined by the calendar quarter of their AIDS diagnosis. For example, a person diagnosed in November of 1992 (i.e., the fourth quarter of 1992) was classified into the "924" cohort. All AIDS cases diagnosed according to the 1993 change in the AIDS case definition and entered into the HARS Registry between January 1, 1983 and December 31, 1996 were eligible to be included in the study.
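The cohort labelling rule described above can be illustrated with a short Python sketch. The function name `cohort_label` is hypothetical and is not part of HARS; it only encodes the two-digit year followed by the calendar quarter, as in the "924" example.

```python
def cohort_label(year: int, month: int) -> str:
    """Return the quarter-year cohort label, e.g. (1992, 11) -> "924"."""
    if not 1 <= month <= 12:
        raise ValueError("month must be in 1..12")
    quarter = (month - 1) // 3 + 1  # Jan-Mar -> 1, ..., Oct-Dec -> 4
    return f"{year % 100:02d}{quarter}"


print(cohort_label(1992, 11))  # November 1992 falls in the "924" cohort
print(cohort_label(1993, 2))   # first quarter of 1993, the "931" cohort
```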

Determination of survival from AIDS diagnosis to death

CDC receives information from California's HIV/AIDS Registry on a monthly basis. For all newly-reported deaths, the date on which the death was first reported to the CDC is recorded. In order to re-create the death information that would have been available to any investigator as of December 31, 1996, death dates were included only if they were reported on or before this date. Survival time was defined as the time elapsed from the date of the AIDS diagnosis until death from all causes, or until December 31, 1996, the date of analysis for the study.
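As a minimal sketch of this step, the Python function below reconstructs what an analyst would have seen on the analysis date. The function and argument names are hypothetical, and follow-up is returned in days purely for illustration (the study reports survival in months). A death is counted only if its report had arrived on or before the analysis date; otherwise the case is treated as alive at that date.

```python
from datetime import date
from typing import Optional, Tuple

ANALYSIS_DATE = date(1996, 12, 31)


def survival_as_of(
    diagnosis: date,
    death: Optional[date],
    death_reported: Optional[date],
    analysis: date = ANALYSIS_DATE,
) -> Tuple[int, bool]:
    """Return (follow-up in days, death_known) as seen on the analysis date."""
    known = (
        death is not None
        and death_reported is not None
        and death_reported <= analysis
    )
    # If the death report had not arrived yet, follow-up ends at the analysis date.
    end = death if known else analysis
    return (end - diagnosis).days, known


# A death in June 1996 that was reported only in February 1997 is invisible
# to an analysis dated December 31, 1996.
print(survival_as_of(date(1995, 1, 15), date(1996, 6, 15), date(1997, 2, 1)))
```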

The Inverse Probability of Censoring Weighted estimator

For many sources of registry-based data, there is a delay between the recording of vital status and its availability for analysis. In such situations, the analyst may assume mistakenly that those who are not yet known to have died are still alive when, in fact, some of these individuals may have died but the deaths have not yet been reported to the registry. The use of the Kaplan-Meier (K-M) estimator to estimate survival in this situation has been shown to be inconsistent and to yield biased results [9]. Following the approach of Robins and Rotnitzky [10], van der Laan and Hubbard [6] and Hubbard et al. [11] proposed a simple inverse probability of censoring weighted estimator to account for this delay in vital status information and this estimator was applied in this study.
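The general idea of inverse probability of censoring weighting can be sketched in Python as follows. This is a generic illustration under stated assumptions, not necessarily the exact estimator of [6] or [11]: all names are hypothetical, the censoring distribution `censor_surv(u) = P(C >= u)` is assumed known and independent of survival, and a death is assumed to become visible only once its report time (death time plus reporting delay) is no later than the subject's censoring time. Each visible death is up-weighted by the inverse probability of remaining uncensored until its report time.

```python
from typing import Callable, Optional, Sequence


def ipcw_cdf(
    t: float,
    death_times: Sequence[Optional[float]],
    report_times: Sequence[Optional[float]],
    censor_times: Sequence[float],
    censor_surv: Callable[[float], float],
) -> float:
    """Weighted estimate of P(T <= t); all times in months since diagnosis."""
    total = 0.0
    for d, r, c in zip(death_times, report_times, censor_times):
        if d is None or r is None or r > c:
            continue  # death not (yet) visible to the analyst
        if d <= t:
            total += 1.0 / censor_surv(r)
    return total / len(censor_times)


def ipcw_quantile(p: float, grid, *args) -> Optional[float]:
    """Smallest grid time whose estimated CDF reaches p, or None if never reached."""
    for t in grid:
        if ipcw_cdf(t, *args) >= p:
            return t
    return None


deaths = [5.0, 10.0, 20.0, None]
reports = [6.0, 12.0, 21.0, None]
censor = [30.0, 30.0, 30.0, 30.0]
no_censoring = lambda u: 1.0  # degenerate case: every weight equals 1
print(ipcw_cdf(10.0, deaths, reports, censor, no_censoring))  # 0.5
```

Returning `None` when the target level is never reached mirrors the situation in which a quantile cannot yet be estimated from the available follow-up.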

The study sample consists of 56 cohorts of individuals with AIDS defined by the quarter year of diagnosis. Since the censoring date (the date of analysis) is December 31, 1996, individuals diagnosed with AIDS in the 951 cohort who survived the entire period can only have censoring times equal to 23 months (for those diagnosed in January), 22 months (for those diagnosed in February), or 21 months (for those diagnosed in March). One possible concern with using the IPCW estimator to estimate the survival distribution is that the estimator may perform poorly if the censoring distribution has all of its weight on a small set of times, as observed with these data. If there are subjects for whom the reporting time is greater than the support of possible censoring times, the IPCW estimator may be biased [11]. In order to account for this, artificial censoring was used to augment the estimator.

Each case was assigned a new, uniformly distributed censoring time from 0 months to the maximum censoring time according to the cohort to which each case belonged. For example, the individuals diagnosed in the first quarter of 1995, the 951 cohort, were each assigned randomly a censoring time from a uniform distribution ranging from 0 months to 23 months, the maximum censoring time for this cohort. Similarly, the members of the 941 cohort were each assigned a censoring time from a uniform distribution from 0 months to 35 months, the members of the 931 cohort from 0 months to 47 months, and so forth. The censoring time for each individual was taken to be the minimum of this new censoring time or the original censoring time defined as the time elapsed from their date of diagnosis to December 31, 1996. By doing so, an artificial censoring distribution is created with more uniform mass over the possible times of death for each of the cohorts.
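The artificial censoring step can be sketched in Python as below. The function names are hypothetical, and the rule for the maximum censoring time (whole months from the first month of the cohort's quarter to December 1996) is an assumption chosen because it reproduces the 23, 35 and 47 months quoted above.

```python
import random


def max_censoring_months(cohort: str, end_year: int = 1996) -> int:
    """Months from the first month of the cohort's quarter to December of end_year."""
    year = 1900 + int(cohort[:2])  # cohort labels in this study span 1983-1996
    first_month = 3 * (int(cohort[2]) - 1) + 1
    return (end_year - year) * 12 + (12 - first_month)


def artificial_censoring(cohort: str, original: float, rng: random.Random) -> float:
    """Minimum of a Uniform(0, max) draw and the original censoring time."""
    draw = rng.uniform(0.0, max_censoring_months(cohort))
    return min(draw, original)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
for cohort in ("951", "941", "931"):
    print(cohort, max_censoring_months(cohort))  # 23, 35 and 47 months
print(artificial_censoring("951", 22.0, rng) <= 22.0)  # True by construction
```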

The reason to artificially censor the data arises from the type of censoring distribution encountered in these data. Specifically, subjects are enrolled within a narrow window of time (three months) for each cohort and all subjects are censored at the same chronological time. Thus, the censoring distribution has all of its mass over a three-month period. The consequence of this is the potentially high variability in the IPCW estimator for quantiles within the support of censoring. By artificially censoring the data, censoring is "spread" over a larger interval, which reduces the variability of estimates of survival at later quantiles. The cost is that the variability of survival estimates of earlier quantiles is increased by censoring originally uncensored observations.

Prediction of median estimates of survival

Since our goal was to use early survival experiences to predict later survival, we examined the relation between the early quantiles (i.e., 0.10, 0.15, 0.20, and 0.30 quantiles) of the survival distribution and the 0.50 quantile. Assuming a linear relationship, predicted median estimates were calculated by fitting the linear model and entering the observed early quantile into the fitted model. Assuming a linear model implies that our method works only so long as there is the same systematic shift in the survival distribution over time. That is, if the early quantile increases over time for a particular cohort, our method works only if the later quantile increases as well.
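A minimal Python sketch of this prediction step is given below. The quantile values are hypothetical and are not taken from the study; the helper name `fit_line` is also an assumption. The model is fitted by ordinary least squares on cohorts whose median is already observable and then applied to the early quantile of a more recent cohort.

```python
from typing import Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical quantiles (months) for older cohorts with an observable median.
q15 = [3.0, 4.0, 5.0, 6.0]
q50 = [11.0, 14.0, 17.0, 20.0]

intercept, slope = fit_line(q15, q50)
new_q15 = 7.0  # early quantile of a recent cohort with no observable median
predicted_median = intercept + slope * new_q15
print(predicted_median)
```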

The "true" quantiles of the survival distribution

The "true" quantiles (i.e., the best possible estimate of the quantiles) of the survival distribution were assumed to be the quantiles of survival estimated empirically from the data using the IPCW estimator as of December 31, 2001. This provided at least an additional five years of observation after the date of analysis upon which the early predictions were made. In order to assess the performance of the prediction method, the predicted median estimates using our method were compared to these "true" medians (i.e., observed median estimates using data as of December 31, 2001) for the study sample.

Results

Deaths in HARS

The justification for using five years of follow-up as providing the "true" survival estimates (i.e., the length of follow-up necessary for a cohort until the quantiles are "stable" and the "true" quantiles are achieved for a particular cohort diagnosed with AIDS) is based upon empirical data. Using the raw data as of December 31, 2001 (with no artificial censoring imposed), the cumulative numbers of deaths over ten years of follow-up for four cohorts were determined (Table 1). Among the deaths that were known to occur after ten years of follow-up for the cohort of individuals diagnosed in 854 (n = 309 deaths reported as of December 31, 1995 among n = 317 individuals identified as part of this cohort), 83.6% of the cohort were known to have died within four years and 88.3% were reported within five years. On average, 80% or more of cohorts were known to have died within four years of the identification of the cohort. These results give empiric evidence that the "true" quantiles of survival are those which are observed five years after identification of the cohort and were the basis for our decision to derive the "true" estimates using data from December 2001.

Table 1 Cumulative number of deaths over a ten-year follow-up period for the 854, 874, 894, and 914 cohorts.

Study sample using data as of December 31, 1996

There were 96,754 AIDS cases diagnosed from 1978 through 1996 and entered into the HARS database on or before December 31, 1996. Of those, 84 cases (0.1%) were excluded because they were reported as having negative survival times (n = 1) or negative reporting times (n = 83). After excluding 297 cases (0.31%) diagnosed prior to January 1, 1983 due to small sample sizes for each of these cohorts, 96,373 (99.6%) of all AIDS cases diagnosed and entered by December 31, 1996 were included in the analysis.
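The sample accounting above can be checked with a few lines of Python; the variable names are illustrative only.

```python
total_cases = 96_754
excluded_negative_times = 1 + 83  # negative survival or negative reporting times
excluded_pre_1983 = 297

included = total_cases - excluded_negative_times - excluded_pre_1983
print(included)                                         # 96373
print(round(100 * included / total_cases, 1))           # 99.6
print(round(100 * excluded_pre_1983 / total_cases, 2))  # 0.31
```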

Survival quantiles according to the Inverse Probability of Censoring Weighted estimator

Figure 1 shows the 0.10, 0.15, 0.20, and the 0.50 quantiles for cohorts estimated using the database as it would have existed on December 31, 1996 (plotted on a log scale). The median estimate appears to be increasing beginning with the 864 cohort and again with the 904 cohort. There are 11 cohorts (933 through 954 and 962) for which the 0.15 quantile could be estimated and the 0.50 quantile could not be estimated using the IPCW estimator.

Predicted median estimates based on the relation of the observed median estimates and the observed 0.15 quantiles

Table 2 shows the observed 0.15 quantile (column A) and the observed 0.50 quantile (column B) according to the IPCW estimator as of December 31, 1996. From the results of the linear regression analysis, a median could be predicted for four cohorts (cohorts 933 through 942) for which no median estimates had been observed as of December 31, 1996 (column C) and compared to the true median estimates five years later (column D). For the 933 cohort, the predicted median estimate was 34 months based on the observed 0.15 quantile (the true median based on the 2001 data was 32 months). For the 934 cohort, the predicted median estimate was 34 months (the true median based on the 2001 data was 40 months). The differences between the predicted median and the "true" median increase as the cohorts get closer to the censoring date, i.e., for cohorts 934, 941, and 942. A scatterplot of the observed 0.15 and 0.50 quantiles is given in Figure 2.
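For reference, the gap between predicted and "true" medians for the four cohorts can be tabulated with a short Python sketch. The values are those reported in the Abstract; the variable names are illustrative.

```python
cohorts = ["933", "934", "941", "942"]
predicted = [34, 34, 31, 29]  # months, from data as of December 31, 1996
observed = [32, 40, 46, 80]   # months, from data as of December 31, 2001

errors = {c: abs(p - o) for c, p, o in zip(cohorts, predicted, observed)}
print(errors)  # {'933': 2, '934': 6, '941': 15, '942': 51}

within_6 = [c for c, e in errors.items() if e <= 6]
print(within_6)  # ['933', '934']
```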

Table 2 Predicted versus True 0.50 Quantiles using the 0.15 Quantile as a Predictor Variable.

A comparison of the predicted medians and the true medians using other early quantiles of the survival distribution as predictors in separate linear regression models is shown graphically in Figure 3 (using the 0.10 and 0.15 quantiles as predictors) and Figure 4 (using the 0.20 and 0.30 quantiles as predictors).

A closer look at the predicted median survival estimates according to the IPCW estimator (Table 2) shows that this technique overestimated the median for earlier cohorts (suggesting a steeper linear slope) and underestimated the median for later cohorts (suggesting a less steep slope). This would suggest that the relation between the 0.15 quantile and the 0.50 quantile, while assumed to be linear, is changing over time. Thus, predicted median survival estimates based on a model with an interaction between the 0.15 quantile and calendar time were evaluated (Table 3). The inclusion of an interaction term with time yielded a predicted median that was closer to the truth (in comparison to the model without an interaction term) for three cohorts for which a median could not be observed at the time the prediction was made (933, 941, and 942).
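One way such an interaction model could be fitted is sketched below in Python. The exact specification used in the study is not stated here, so the design (intercept, early quantile, a cohort time index, and their product) is an assumption, as are all function names and the synthetic data. The coefficients are obtained by solving the least squares normal equations.

```python
from typing import List, Sequence


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def design_row(q: float, t: float) -> List[float]:
    """Intercept, early quantile, time index, and their product."""
    return [1.0, q, t, q * t]


def fit_interaction(q, t, y) -> List[float]:
    """Least squares coefficients from the normal equations X'X b = X'y."""
    rows = [design_row(qi, ti) for qi, ti in zip(q, t)]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def predict(coef: Sequence[float], q: float, t: float) -> float:
    """Predicted median for early quantile q at time index t."""
    return sum(c * v for c, v in zip(coef, design_row(q, t)))


# Synthetic data generated from known coefficients, for illustration only.
q15 = [3.0, 4.0, 5.0, 6.0, 4.0, 5.0]
time = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
q50 = [1 + 2 * q + 0.5 * t + 0.25 * q * t for q, t in zip(q15, time)]

coef = fit_interaction(q15, time, q50)
print(predict(coef, 7.0, 7.0))
```

With the product term included, the slope relating the early quantile to the median is allowed to drift across cohorts, which is the behaviour the paragraph above describes.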

Table 3 Predicted versus True 0.50 Quantiles using the 0.15 Quantile as a Predictor Variable based on a linear model with an interaction term.

Median survival according to the Kaplan-Meier estimate of survival for four cohorts of AIDS cases

Our methodology enabled us to predict the median survival for four cohorts for which a median had not yet been observed: the 933, 934, 941, and 942 cohorts. Using our methodology, we estimated as of December 31, 1996 that the median survival for the 933 cohort would be 34 months (Tables 2 and 3). Using traditional techniques (i.e., using the K-M estimator of survival), we would not have been able to observe a median of 34 months for this cohort until September 30, 2000 (almost four years later) (Additional File 1). Thus, our prediction method enabled us to make a prediction for this cohort almost four years earlier than it would be observed using traditional techniques. For the 934 cohort, we estimated the median survival would be 34 months. As of December 31, 2001 (the closing date for our dataset), we still had not observed a median of 34 months according to the K-M estimator. For the 934 cohort, any median estimate would not be observed using traditional methods until May 31, 1998, 17 months after the prediction was made (December 31, 1996) using our methodology (Additional File 2).

Predicted median estimates based on other dates of analyses

We also examined the performance of our prediction method using two dates of analysis other than December 31, 1996. Predicted median estimates for data as of December 31, 1992 and December 31, 1994 are presented in Additional Files 3 and 4, respectively.

Discussion

For cohorts of individuals who have been diagnosed recently with AIDS, data exist relatively soon after diagnosis for estimating "early" survival quantiles (such as the 0.10, 0.15, 0.20 quantiles, etc.) but many years of additional observation must elapse before later quantiles can be estimated accurately. The purpose of this study was to determine if median survival could be predicted accurately using earlier quantiles of survival distributions provided by AIDS surveillance data. Our approach for predicting median survival consisted of two components: (1) the estimation of quantiles of the survival distribution using the IPCW estimator, and (2) the use of a linear model to reflect the relationship between the early quantile and the later quantile. Such an approach would allow early information to predict later unobserved survival patterns in order to accurately identify changes in population-based survival years before such changes are observed. If accurate estimation could be achieved, this approach could offer a method for researchers to estimate the expected survival distribution after AIDS diagnosis (or after many conditions for which surveillance databases are maintained such as cancer). This approach enables accurate predictions of changes in survival among HIV-infected individuals like that observed in 1987 [3–5, 12] and more notably with the advent of the use of highly active antiretroviral therapies [13–16].

The class of IPCW estimators has been developed in order to improve more traditional techniques in situations where these techniques may lead to biased estimates of survival. IPCW estimators have been applied to many types of survival problems such as correcting for non-compliance and dependent censoring in the examination of a beneficial treatment effect on survival [17] and non-parametric survival estimation when death is reported with delay [11], as in this study.

In this study, the 0.15 quantile of the survival distribution predicted accurately the median survival for cohorts diagnosed before the third quarter of 1993. In addition, the 0.15 quantile of the survival distributions predicted accurately the median survival in the short-term for two cohorts (933 and 934) for which a median estimate could not be estimated at the time of analysis (December 31, 1996). By using traditional methods (i.e., the K-M estimator) and without the use of our methodology, at least six months of additional follow-up would be required to observe any median survival estimate for the 933 cohort and at least 45 months until the predicted median of 34 months (as predicted by our methodology) would be observed for this cohort. This demonstrated that our methodology not only provides an accurate estimate of median survival, but an estimate of median survival long before traditional approaches would allow.

The results of this study suggest that our methodology yields an accurate prediction of median survival for cohorts diagnosed at least three years earlier than the date when the prediction is made. For example, the difference between the predicted and the true median survival was ≤ 6 months for the cohorts 933 and 934 but greater than 6 months for the cohorts 941 and 942 using data as of December 31, 1996. The "true" median estimate (using data as of December 31, 2001) according to the IPCW estimator for the 942 cohort was estimated to be 80 months. This may indeed be an early estimate of the median survival for this cohort and, as more data for this cohort become available, this median estimate may decrease over time like that observed with the K-M estimator (Additional File 1). This early estimate would greatly affect our assessment of the accuracy of our predictions since we are using this estimate as the "true" median survival. As a result, if one applied this methodology now, in the second quarter of 2006, one could only expect to be able to predict with accuracy the median survival for cohorts that were diagnosed in the second quarter of 2003 or earlier. This methodology would not appear to work for cohorts diagnosed later than the second quarter of 2003.

The ability of the method to accurately predict median survival in the short-term based on historical data is, however, greatly influenced by three factors: (1) the variability of the estimates of the various quantiles of the survival distribution, (2) the assumption that the relation between the early quantile and the later quantile (median) can be represented by a linear model, and (3) the assumption that this relation will remain the same in the short-term.

The estimates of the various quantiles of the survival distribution are affected greatly by two sources of bias: delays in diagnosis and delays in death reports. The fact that patients who were in HARS as of a particular date of analysis were included in the study sample obviously excluded those who were diagnosed before this date of analysis but were reported sometime after. Delays in diagnosis affect the estimates of survival by introducing additional individuals into each of the cohorts and, depending on their individual survival times, may affect the observed quantiles of the survival distribution. It is unclear how such individuals would affect the estimates of survival, but this potential for bias was eliminated by only including those who were reported by a fixed date of analysis (e.g., December 31, 1996) in order to mimic the "real world" situation in which only those patients currently entered into a database are available for analysis. This should not detract from the utility of the estimation procedure since the predicted medians were compared with additional data five years after the prediction was made using only those patients who were selected in the original study sample.

In addition, delays in the reporting of death can bias estimates of survival if one assumes that a case is alive if a death date has not been reported. In this study, the use of the IPCW estimator is an attempt to mitigate the potential for bias in estimating survival after an AIDS diagnosis due to reporting delays of death. This estimator adjusts the estimates of survival for a given cohort to account for delays in death reports, provided that the delay of death report distribution is known. The bias introduced by failing to account for delays in death reporting when estimating survival after an AIDS diagnosis has already been established [11].

For simplicity, the prediction method assumed that the early quantile, such as the 0.15 quantile, a representation of the "early" survival experience, was linearly related to the median of the survival estimate, the "later" survival experience. Higher-dimensional models were explored but did not improve the predictive ability (data not shown). In assuming a linear relationship and extrapolating the observed relationship to the future, an additional assumption made was that this relationship would remain as observed in the past, at least in the short-term. Validation from the more recent cohorts (i.e., the 933 and 934 cohorts) confirms that the linear model accurately predicts median survival in the short-term, but may not perform well for all cohorts (e.g., the 941 and 942 cohorts). Assuming that HAART first became available with the approval of Saquinavir (hard-gel) in December 1995, the 941 and 942 cohorts would have been introduced to HAART earlier after diagnosis (21 months for the 941 cohort and 18 months for the 942 cohort) in comparison to the 933 and 934 cohorts (27 months and 24 months, respectively). When comparing survival across different cohorts diagnosed over time, we would expect the later cohorts to demonstrate a shift in survival, thus violating any observed linear relationships between earlier quantiles and later quantiles observed in the past.
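The intervals between diagnosis and the availability of HAART quoted above can be reproduced with a short Python sketch. The function name is hypothetical, and counting whole months from the last month of each cohort's quarter to December 1995 is an assumption, chosen because it matches the 27, 24, 21 and 18 months given in the text.

```python
def months_until_haart(cohort: str, year: int = 1995, month: int = 12) -> int:
    """Months from the last month of the cohort's quarter to the given month."""
    cohort_year = 1900 + int(cohort[:2])
    last_month = 3 * int(cohort[2])  # quarter 1 ends in March, and so on
    return (year - cohort_year) * 12 + (month - last_month)


for cohort in ("933", "934", "941", "942"):
    print(cohort, months_until_haart(cohort))  # 27, 24, 21 and 18 months
```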

Conclusion

This investigation suggests that this approach to survival estimation accurately predicted subsequent survival experience observed in two of these cohorts (the 933 and 934 cohorts). It is notable that the technique did not perform well during a period of rapid increase in AIDS survival that is not unlike the presently observed increases in survival influenced by current advances in therapy. However, the performance of the methodology before the introduction of HAART suggests that this methodology may work well in a time of more gradual improvement in survival with antiretroviral treatment. This technique may also have application in other areas of research (e.g. cancer surveillance) where population changes in survival have been observed and should be validated using additional data.

References

1. Bacchetti P: Historical assessment of some specific methods for projecting the AIDS epidemic. Am J Epidemiol. 1995, 141 (8): 776-781.
2. Bacchetti P, Osmond D, Chaisson RE, Dritz S, Rutherford GW, Swig L, Moss AR: Survival patterns of the first 500 patients with AIDS in San Francisco. J Infect Dis. 1988, 157 (5): 1044-1047.
3. Lemp GF, Payne SF, Neal D, Temelso T, Rutherford GW: Survival trends for patients with AIDS. JAMA. 1990, 263 (3): 402-406. 10.1001/jama.263.3.402.
4. Harris JE: Improved short-term survival of AIDS patients initially diagnosed with Pneumocystis carinii pneumonia, 1984 through 1987. JAMA. 1990, 263 (3): 397-401. 10.1001/jama.263.3.397.
5. Moore RD, Hidalgo J, Sugland BW, Chaisson RE: Zidovudine and the natural history of the acquired immunodeficiency syndrome. N Engl J Med. 1991, 324 (20): 1412-1416.
6. van der Laan MJ, Hubbard AE: Locally-efficient estimation of the survival distribution with right-censored data and covariates when collection of the data is delayed. Biometrika. 1998, 85: 771-783. 10.1093/biomet/85.4.771.
7. Bacchetti P: Reporting delays of deaths with AIDS in the United States. J Acquir Immune Defic Syndr Hum Retrovirol. 1996, 13 (4): 363-367.
8. Centers for Disease Control and Prevention: 1993 revised classification system for HIV infection and expanded surveillance case definition for AIDS among adolescents and adults. MMWR Recomm Rep. 1992, 41 (RR-17): 1-19. [http://www.cdc.gov/mmwr/preview/mmwrhtml/00018871.htm]
9. Tu XM, Meng XL, Pagano M: The AIDS epidemic: estimating survival after AIDS diagnosis from surveillance data. Journal of the American Statistical Association. 1993, 88: 26-36. 10.2307/2290688.
10. Robins JM, Rotnitzky A: Recovery of information and adjustment for dependent censoring using surrogate markers. AIDS Epidemiology: Methodological Issues. 1992, Boston, Birkhauser, 297-331.
11. Hubbard AE, Van der Laan MJ, Enanoria W, Colford JM: Nonparametric survival estimation when death is reported with delay. Lifetime Data Anal. 2000, 6 (3): 237-250. 10.1023/A:1009689625311.
12. Whitmore-Overton SE, Tillett HE, Evans BG, Allardice GM: Improved survival from diagnosis of AIDS in adult cases in the United Kingdom and bias due to reporting delays. AIDS. 1993, 7 (3): 415-420. 10.1097/00002030-199303000-00017.
13. Lee LM, Karon JM, Selik R, Neal JJ, Fleming PL: Survival after AIDS diagnosis in adolescents and adults during the treatment era, United States, 1984-1997. JAMA. 2001, 285 (10): 1308-1315. 10.1001/jama.285.10.1308.
14. Sabin CA: Assessing the impact of highly active antiretroviral therapy on AIDS and death. AIDS. 1999, 13 (15): 2165-2166. 10.1097/00002030-199910220-00021.
15. Porta D, Rapiti E, Forastiere F, Pezzotti P, Perucci CA: Changes in survival among people with AIDS in Lazio, Italy from 1993 to 1998. Lazio AIDS Surveillance Collaborative Group. AIDS. 1999, 13 (15): 2125-2131. 10.1097/00002030-199910220-00016.
16. Li Y, McDonald AM, Dore GJ, Kaldor JM: Improving survival following AIDS in Australia, 1991-1996. National HIV Surveillance Committee. AIDS. 2000, 14 (15): 2349-2354. 10.1097/00002030-200010200-00016.
17. Robins JM, Finkelstein DM: Correcting for noncompliance and dependent censoring in an AIDS clinical trial with Inverse Probability of Censoring Weighted (IPCW) log-rank tests. Biometrics. 2000, 56: 779-788. 10.1111/j.0006-341X.2000.00779.x.

Pre-publication history

The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2458/7/127/prepub

Acknowledgements

The authors would like to thank the following individuals for their assistance in providing the HARS and death report data: Maya Tholandi, Jim Creeger, Kristin Debnar, Pat Sweeney, and Sam Costa.

Author information

Authors and Affiliations

Division of Epidemiology, School of Public Health, University of California at Berkeley, Berkeley, California, USA
Wayne TA Enanoria & John M Colford Jr
Division of Biostatistics, School of Public Health, University of California at Berkeley, Berkeley, California, USA
Alan E Hubbard & Mark J van der Laan
Centers for Disease Control and Prevention, Atlanta, Georgia, USA
Mi Chen
California Department of Health Services, Office of AIDS, Sacramento, California, USA
Juan Ruiz

Authors

Wayne TA Enanoria
Alan E Hubbard
Mark J van der Laan
Mi Chen
Juan Ruiz
John M Colford Jr

Corresponding author

Correspondence to John M Colford Jr.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

JC and WE conceived of the study, and WE drafted the manuscript. WE conducted the analyses under the guidance of AH and MvdL, who created the novel survival methodology presented in this paper. MC and JR participated in the acquisition of the data and the interpretation of the data analyses. All authors read and approved the final manuscript.

Electronic supplementary material

12889_2006_738_MOESM1_ESM.doc

Additional File 1: Median survival (months) for four cohorts of AIDS cases at different dates of analysis according to the Kaplan-Meier estimator, December 1996 – December 2001. This table gives the estimate of median survival according to the Kaplan-Meier estimator for four cohorts for different dates of analysis in order to illustrate when a median survival estimate would be observed using traditional methods. (DOC 46 KB)

12889_2006_738_MOESM2_ESM.doc

Additional File 2: Comparison of Prediction Methodology with Traditional Methods (Kaplan-Meier Estimator). This table gives the earliest date the true median survival estimate could be observed and estimated using traditional methods (the Kaplan-Meier estimate), as well as the date any median survival estimate could be observed and estimated using the traditional approach. (DOC 38 KB)

12889_2006_738_MOESM3_ESM.doc

Additional File 3: Predicted 0.50 Quantile versus "True" 0.50 Quantile according to the Inverse Probability of Censoring Weighted Estimate of Survival, 1983–1992. This table gives predicted and true 0.50 quantile estimates of survival using the methodology presented in the paper using a dataset as of December 31, 1992 and comparing it with results as of December 31, 1997 (assumed to be the "truth"). (DOC 108 KB)

12889_2006_738_MOESM4_ESM.doc

Additional File 4: Predicted 0.50 Quantile versus "True" 0.50 Quantile according to the Inverse Probability of Censoring Weighted Estimate of Survival, 1983–1994. This table gives predicted and true 0.50 quantile estimates of survival using the methodology presented in the paper using a dataset as of December 31, 1994 and comparing it with results as of December 31, 1999 (assumed to be the "truth"). (DOC 120 KB)

Rights and permissions

Open Access. This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Enanoria, W.T., Hubbard, A.E., van der Laan, M.J. et al. Early prediction of median survival among a large AIDS surveillance cohort. BMC Public Health 7, 127 (2007). https://doi.org/10.1186/1471-2458-7-127

Received: 20 July 2006
Accepted: 27 June 2007
Published: 27 June 2007
DOI: https://doi.org/10.1186/1471-2458-7-127
