Abstract
Background
To preserve patient anonymity, health register data may be provided as binned data only. Here we consider as example, how to estimate mean survival time after a diagnosis of metastatic colorectal cancer from Norwegian register data on time to death or censoring binned into 30 day intervals. All events occurring in the first three months (90 days) after diagnosis were removed to achieve comparability with a clinical trial. The aim of the paper is to develop and implement a simple, and yet flexible method for analyzing such interval censored and truncated data.
Methods
Considering interval censoring a missing data problem, we implement a simple multiple imputation strategy that allows flexible sensitivity analyses with respect to the shape of the censoring distribution. To allow identification of appropriate parametric models, a χ^{2}goodnessoffit testalso imputation basedis derived and supplemented with diagnostic plots. Uncertainty estimates for mean survival times are obtained via a simulation strategy. The validity and statistical efficiency of the proposed method for varying interval lengths is investigated in a simulation study and compared with simpler alternatives.
Results
Mean survival times estimated from the register data ranged from 1.2 (SE = 0.09) to 3.2 (0.31) years depending on period of diagnosis and choice of parametric model. The shape of the censoring distribution within intervals did generally not influence results, whereas the choice of parametric model did, even when different models fit the data equally well. In simulation studies both simple midpoint imputation and multiple imputation yielded nearly unbiased analyses (relative biases of 0.6% to 9.4%) and confidence intervals with nearnominal coverage probabilities (93.4% to 95.7%) for censoring intervals shorter than six months. For 12 month censoring intervals, multiple imputation provided better protection against bias, and coverage probabilities closer to nominal values than simple midpoint imputation.
Conclusion
Binning of event and censoring times should be considered a viable strategy for anonymizing register data on survival times, as they may be readily analyzed with methods based on multiple imputation.
Background
Individualized register data are routinely collected in many countries on a broad variety of diseases, and are becoming an indispensable source of information for health research. However, as informed consent is not obtained from patients, preservation of anonymity is a key concern when allowing researchers access to register data. Often, access is only allowed to binned data in which individuals can no longer be identified. Such data may pose an analytic challenge to researchers since dedicated statistical procedures for this situation are not readily available in standard statistics software, and hence there is a need for general purpose strategies that can be easily implemented in these settings.
The example of binned register data which we will study in this paper arose in the context of colorectal cancer, one of the most frequent malignancies in industrialized countries. Since a substantial proportion of patients either have clinical metastases at the point of diagnosis or develop them in the course of the disease, the prognosis is poor. For Norwegian patients diagnosed in the period 19972001 with metastatic rectal cancer (ICD9 C1921), the survival was limited with a 10.4% 5year survival for males and 7.8% for females [1]. Corresponding figures for colon cancer (ICD9 C18) were 7.4% for males and 8.6% for females, ibid. In a pivotal study by Hurwitz et al [2], patients with metastatic colorectal cancer were randomized to either conventional chemotherapy alone or conventional chemotherapy with addition of bevacizumab. Hurwitz et al found a significant treatment effect for adding bevacizumab (hazard ratio for mortality: 0.66, p < 0.001).
Consequently, the Norwegian Knowledge Centre for Health Care in spring 2007 was commissioned to undertake a health technology assessment of bevacizumab as guidance for the decision on introducing the drug into standard Norwegian treatment practice. As a key part of the assessment, estimates of recent and current mean survival times in this group of patients were requested, since it was believed that the prognosis had changed over the last two decades and was potentially quite different from those observed in the Hurwitz et al study. In 1991, Norwegian physicians' attitude to metastatic colorectal cancer was pessimistic, as only few patients had surgery to remove liver metastases and/or received chemotherapyand if the latter then 5fluorourcil only. In recent years, oncologists search more actively for metastases, which are then likely detected earlier; liver metastases are removed if technically possible and the patient is otherwise in good condition; chemotherapy is offered more frequently, and then usually oxaliplatin or irinotecan based.
To preserve anonymity of individual patients, CRN provided counts of deaths and censoring events without any patient specific information grouped into 30 day intervals (termed months in the following). Further, since the clinical trial reported by Hurwitz et al excluded patients with a prognosis of less than three months of survival (90 days) [2], it was argued that exclusion of all deaths and losstofollowup events occurring within three months of diagnosis would improve comparability with the clinical trial data. Note, however, that in the clinical trial, deaths did occur also within the first three months, and so the truncation in the Norwegian registry not only removed patients ineligible for treatment with bevacizumab, but inadvertently also those who merely happened to have short survival times and who would otherwise have been eligible.
Although the anonymized data from CRN are interval censored, they are neither of the standard type I or type II interval censored data. Type I is known as current status data, where survival times are only known to be smaller or larger than a given point in time. In type II data, event times are observed to belong to intervals where either both limits are finite and observed, or alternatively one is finite and observed and the other is infinitesee [3] for a detailed discussion. Thus, the monthly counts of deaths are type II interval censored, but the monthly counts of losstofollowup are not, as they do not assign a definitive (left) limit to the interval containing the event. Put differently, the CRN data includes interval censoring of censoring times (losstofollowup). From another perspective, the CRN data may be viewed as an example of life table data with left truncation, but life table data are for nonstandard analyses best regarded as interval censored [4]. While it is known that Maximum Likelihood Estimation (MLE) is generally superior to other analytic strategies for interval censored data [57], it is not straightforward to conduct a full MLE analysis for this type of data, unless one is willing to adopt restrictive assumptions on the censoring distribution (see below). Using multiple imputation for analyzing interval censored data has been suggested by several, for example [8,9], but not for data with interval censored censorings.
The objective of the present paper is to investigate if a multiple imputation strategy can be utilized to validly fit parametric models to the CRN data. Secondly, it is to develop diagnostic procedures that can be used to assess the fit of any parametric distribution. Finally, to investigate whether mean survival times comparable to those obtained from the clinical trial data by Tappenden et al [10] can be reliably estimated, i.e. are reasonably robust to choice of parametric distribution.
The paper is organized as follows: First in the Methods section the CRN data are described and the model is introduced. We then describe a simulation study to assess the validity and statistical efficiency of the developed strategy, develop diagnostic procedures, and introduce a simulation strategy for estimating uncertainty of estimated mean survival times. Parameterization of distributions and details of the implemented strategies are presented at the end of the Methods section. The Results section first reports the results of the simulation study before proceeding to the results of analyzing the CRN data. Finally, results and their implications are discussed.
Methods
Material
The Cancer Registry of Norway holds followup data on all patients diagnosed with malignancies in Norway in the period 19912005. For the present study, data was requested on all patients diagnosed with colorectal cancer (ICD9 codes C18, C19, and C20) for the first time, and where patients at the time of diagnosis had distant metastases (stage 4) and were less than 70 years at diagnosis. Patients who either died or were lost to followup earlier than 90 days after diagnosis were excluded, as it was assumed that the remaining patients would be more comparable to those included in the Hurwitz trial [2]. For the included patients, CRN only provided 30 day counts of deaths and losstofollowup, respectively, due to confidentiality concerns. CRN provided data separately for patients diagnosed within each of three time periods (19916, 19972001, and 20025) to allow period specific estimation. All patients were followed until death or Jan 1, 2006, whichever came first. The data provided by CRN are summarized in Table 1, where they have been binned in years for legibility. The full data set is available upon request from the lead author.
Table 1. Deaths and censorings among patients with colorectal cancer, Norway, 19912005
Ordinary model for survival times
Let Y be time to death after diagnosis with survivor distribution function S_{Y }(·; θ), density function f_{Y }(·; θ), and hazard h_{Y }(·; θ), where θ is a parameter vector. Let Z be a censoring event so that the observable time variable, X, is the minimum of Y and Z, accompanied by an indicator variable denoting which type of event was observed, i.e.:
If censoring is assumed independent, the marginal likelihood based on all n individuals for inference on θ, where the censoring contributions have been factored out, looks as follows
with x = (x_{1}, x_{2}, ..., x_{n}) and δ = (δ_{1}, δ_{2}, ..., δ_{n}).
Model for truncated and interval censored data
In the data provided by CRN, we are not observing from the entire distribution of survival timesevents are only observed if they exceed a constant M which here is 90 daysand so the likelihood given above in Equation 1 must be modified accordingly. Handling such truncation, or delayed entry as it is, corresponds to considering the conditional distribution of X given that X ≥ M. Hence the likelihood becomes
Further in the CRN data, all event times, X, are subject to interval censoring induced by a sequence of time points, t_{1}, t_{2}, ..., t_{m}_{+1}. This means that for each subject only the interval, [t_{j}; t_{j}_{+1}), and the type of event, δ, is observed. Let n_{1j }and n_{0j }be the counts of X_{i}'s falling within a given period for the two event types, i.e. n_{1j }is the number of deaths and n_{0j }the number of censorings between t_{j }and t_{j}_{+1}, respectively. If we let g_{j }be the conditional density of censoring events on the interval [t_{j }; t_{j}_{+1}) given it occurs in the interval, the likelihood takes the following form when not taking truncation into account
where t is the j × 2 matrix with interval end points, t_{j }and t_{j}_{+1}, as rows, and n is the corresponding j × 2 matrix with rows of event counts, n_{0j }and n_{1j.}
If events are subject to both truncation and interval censoring the likelihood takes the form
Equations 2 and 3 implies that for censored events the likelihood involves a term for the censoring distribution, even though censoring is considered to be independent or noninformative. If g_{j }is nonconstant over [t_{j }; t_{j}_{+1}), then even under an assumption of independent censoring, its shape will be informative when evaluating these contributions as it cannot be factored out. It is thus natural to consider g_{j }being constant over [t_{j }; t_{j}_{+1}) as a starting point for analysis, since only this model (the simplest possible) will for most parametric distribution allow direct analytical evaluation of the integrals. For example, for the Weibull survivor function direct analytical evaluation is possible only with uniform g_{j }'s and then by use of the incomplete Γfunctions. While the time intervals [t_{j }; t_{j}_{+1}) are all rather shortmaking the assumption realisticthere are drawbacks to this direct approach. Besides being complex to implement numerically, it more importantly suffers from not allowing the distributional shapes of g_{j }to be varied easily. Sensitivity analyses can thus not be readily undertaken using this approach. Instead, we suggest to consider this a missing data problem. From this point of view, a simple multiple imputation strategy lends itself naturally to be applied here: Generate for each censored individual i a censoring time (u_{il}) that is independent and distributed according to g_{j }on the interval wherein censoring is known to have happenedthis creates a completed dataset which we label by l. This is repeated k times to result in k completed datasets. The l'th dataset can be analyzed by maximizing the following marginal likelihood, where the censoring distribution has been factored out due to the assumption of noninformativeness:
Rubin's formula can be applied to obtain a combined estimate with accompanying uncertainty measures [11], and sensitivity analyses can straightforwardly be conducted by generating u_{il}'s from various distributions. As distributions we considered those shown in Figure 1, with the diagonal line representing the uniform distribution. Note that the distributions presented in Figure 1 are the conditional distributions on single subintervals, i.e. they are not representing the global censoring distribution.
Figure 1. Censoring distribution functions for sensitivity analysis. Distribution functions used in the sensitivity analysis on the impact of assuming different shapes of the censoring distribution. G_{j }is the conditional distribution of Z for each of the intervals [t_{j }; t_{j}_{+1}), j = 1, ..., m, given that Z belongs to the j'th interval.
Simplistic analyses
To allow comparison with simpler analytical strategies, we also present analyses based on single, midpoint imputation (replacing intervals with their midpoint) or ignoring truncation. If all censoring events are replaced by the midpoint of the interval they occur in, but event times are maintained as interval censored, we base the analysis on the following likelihood
If also the event times are replaced by the midpoint of the relevant interval, the likelihood becomes
To ignore truncation, the term S_{Y }(M)^{1 }is removed from the relevant likelihood.
Simulation study
To investigate the statistical properties of the multiple imputation strategy developed above, we set up a simulation study in which we compared the multiple imputation strategy with a simple, single imputation of interval midpoints for interval censored censoring events (Equation 5 above). We generated event times as Weibull distributed (S_{Y }(t; λ, γ) = exp(λy^{γ})), where we as true values of the parameters used those estimated for the last period of the Norwegian registry data (log λ = 0.5902 and γ = 0.9425, cf. results below). Censoring was taken to be noninformative and occurring with constant rate corresponding to annual proportions of 3%, 6%, and 9%, respectively. All event times were censored at the end of followup, which occurred at ten years. We varied the width of censoring intervals (1, 2, 6, and 12 months, where each month is 30 days) and the sample size (500, 1,000, and 10,000). In each of the 36 settings we analyzed 2,500 generated datasets. For each setting we report the median relative bias (the difference between the median of estimates and the true value relative to the true value), the empirical coverage probability of 95% confidence intervals, and the inflation factor of standard errors. The latter is defined as the ratio between the median standard error obtained with interval censored data and the median of standard errors obtained from analysis of identical data that have not been subjected to interval censoring, but only to ordinary right censoring. Note, that the performance of the method is not evaluated under misspecification (in practical applications misspecification should be dealt with using the tools developed below) as the focus is here only on studying the ability to handle interval censoring.
Goodnessoffit test
To conduct valid parametric analyses it is essential to have access to diagnostic procedures that allow assessing the fit of the chosen parametric family. As the data under study are already binned, we propose a variant of the χ^{2 }goodnessoffit test that accommodates censoring.
The expected counts can be derived from considering the following density, evaluated at the fitted parameter values:
where is the joint density of times of death and censoring, Y and Z, evaluated at the maximum likelihood estimate . Since Y and Z are independent, this can be written as
While f_{Y }was estimated above, we now additionally need an estimate of the censoring distribution, specifically S_{Z}. Had Y and Z not been interval censored, it would have been straightforward to get an estimate of S_{Z }via KaplanMeier with deaths considered as censoring events and censorings as failure events. We thus propose to impute Y and Z and then estimate S_{Z }by use of the KaplanMeier estimator, before computing expected values for each interval by averaging over imputations. Specifically, we suggest the following algorithm for generating imputed datasets:
1. For each observed event Y_{i }in interval [t_{j }; t_{j}_{+1}), the event time is imputed from the fitted survival function conditional on Y ∈ [t_{j }; t_{j}_{+1}).
2. For each censoring event Z_{i }in interval [t_{j}; t_{j}_{+1}), the time of censoring is imputed from a uniform distribution over this interval.
3. From the imputed dataset, the KaplanMeier estimate of S_{Z }is found by considering losstofollowup as event times and deaths as censoring events.
4. From the fitted distributions for Y and Z, the density in Equation 7 is numerically integrated over each relevant time interval. The expected count for each interval is found by multiplication with the total number of observed events.
Steps 1 to 4 are repeated to create 10 imputed datasets with expected counts. The overall expected count is then found by averaging over the imputed datasets.
Finally, the χ^{2 }goodnessoffit test statistic was computed as follows: If the expected number of events in an interval was smaller than five, the interval was joined with its lower neighbor. This procedure was repeated until all expected values were greater than five. The goodnessoffit test statistic was evaluated as
where and are the index and count of joined intervals, respectively, and is the observed count of events in the 'th interval. The test statistic was evaluated in the χ^{2 }distribution with degrees of freedom, where q is the number of parameters fitted in the parametric model.
Estimation of mean survival time
For all the distributions analyzed in this paper (see Table 2), the mean survival could be computed directly from the estimated parameters, except for the Gompertz distribution. The Gompertz estimates did however correspond to infinite mean survival, and so no alternative, numerical strategy was needed in this case. To reflect the uncertainties of the estimated parameters in computed means, we used a simulation strategy since this avoids the assumption of normality of the estimated mean and the assumption of differentiability, both of which are required by the delta method. For each parameter set, we thus sampled 10,000 times from a multivariate normal distribution defined by the estimate and their estimated covariance matrix, and for each sampled value computed the corresponding mean. As overall estimate of the mean survival time we used the median of the means accompanied by an empirical 90% confidence interval, i.e. the 5% and 95% percentiles in the sampled distribution of means.
Table 2. Parameterization of distributions
Implementation
The parametric distributions considered and their parameterization are shown in Table 2. All analyses were conducted using Stata 9 [12]. More specifically, the likelihoods in Equations 1 and 4, respectively, were both coded and evaluated using the general maximum likelihood command, ml, available in Stata 9, cf. [13]. The multiple imputation used the micombine command of the ICE addon package to obtain joint estimates across imputations [14]. Throughout we have chosen the number of imputations to be 10 [11, p. 114]. Examples of code used for the computations are available upon request to the authors.
Results
Simulation results
Table 3 presents relative bias, coverage probabilities and inflation of standard errors when either using single midpoint imputation or multiple imputation for interval censored censoring events. When interval censoring is induced by intervals with lengths up to six months, both analytic strategies perform well with negligible bias, coverage probabilities of confidence intervals close to the nominal value, and standard errors that are only marginally increased relative to the ordinary situation, i.e. when only ordinary right censoring is present. As a general tendency, however, the multiple imputation has lower relative bias and better coverage, in particular when censoring becomes more dominant in terms of higher censoring rates and wider censoring intervals. Only in extreme cases of 6% and 9% annual censoring proportions, censoring intervals of one year length, and larger sample sizes of 10,000, does coverage probabilities of the multiple imputation strategy decrease unacceptably to levels around 75% to 80%, albeit not as low as when single midpoint imputation is used. This poor performance in extreme cases is to be expected, as the conditional censoring distribution is taken to be uniform in the multiple imputation analysis, while censoring times are actually generated from an exponential modelit is this effect that becomes apparent when datasets become large and intervals very wide, but not before. In all situations, both analytic strategies yield virtually identical statistical efficiency, which depend mostly on the width of censoring intervals, and less on the rate of censorings. The loss in statistical efficiency is twice as high for the shape parameter γ as for the scale parameter log(λ).
Table 3. Simulation results comparing analyses with multiple imputation or single midpoint imputation in terms of bias, coverage probability and relative efficiency
Application results
All the five fitted distributions (Gamma, Gompertz, LogLogistic, LogNormal, and Weibull) fitted the data well in the last period (20022005), cf. Table 4. None of the distributions achieved adequate fit for the other periods, including when all data was pooled, although the Gompertz according to pvalues and diagnostic plots came close for the first period, 19911996 (plots not shown, available upon request). Figure 2 displays diagnostic plots for the Weibull distribution in each of the three periods separately and when pooled, which supports that the fit was adequate only in the last period, 20022005. In the other periods, the number of events is generally overestimated in the beginning, then underestimated from six months to approximately three years, and finally overestimated at the right tail. As the last period has a shorter followup period, the fit of any parametric model is only determined by the events in the earlier part of the followup period. Plots for the other distributions are available upon request.
Table 4. Goodnessoffit tests and estimated mean survival times for five parametric distributions, patients with metastatic colorectal cancer, Norway 19912005
Figure 2. Goodnessoffit plots. Difference between expected and observed counts of events over the time scale. χ^{2 }is the goodnessoffit test statistic value with d.f. degrees of freedom and associated pvalue. Note that the plot for all periods pooled has a differently scaled Yaxis than the period specific plots.
The estimated means varied substantially between the distributions, both when the fit was poor (long followup time, 19911996), and when the fit was good (shorter followup time, 20022005). The means ranged from 1.89 years (Gamma) to 3.14 years for the LogLogistic and even infinity for the Gompertz in the last period, 20022005 (Table 4). The likelihood ratio test for homogeneity of Weibull distributions across periods showed statistical significance with a χ^{2 }value of 122.7 on 4 degrees of freedom yielding p ≪ 0.0001. Note, that while mean survival times varied between periods, the direction of change was not consistent from distribution to distribution: For the Weibull, LogNormal, and the Gamma distribution mean survival time increased with period, the Gompertz found it to be infinite in all periods, whereas the LogLogistic suggested an increase between the first two periods and then a small decline for the last. Both the Gamma and LogLogistic distributions have asymmetric confidence intervals for the mean, something that would not easily have been detected if we had used the delta method for estimating means from the parameter estimates. Based on the trial by Hurwitz et al [2], the mean survival was estimated to 1.98 years in the intervention group and 1.57 years in the control group by Tappenden et al [10].
For the Weibull distribution we estimated parameters using different approaches for handling the interval censoring and truncation, cf. Table 5. As expected from the simulation study above, the use of single midpoint imputation for censoring events in this situation yields virtually identical estimates to those based on multiple imputationa consequence of the short censoring intervals. Results based on imputing midpoints of intervals for both events and censoring differ slightly more, but are for all practical purposes still identical. Ignoring the truncation does however create noticeably bias in the estimated mean survival time.
Table 5. Weibull Parameter estimates and estimated mean survival times, patients with metastatic colorectal cancer, Norway 19912005
Table 6 gives the results of applying the four different assumptions regarding the shape of the censoring distribution within each interval of one month length when using a Weibull distribution for event times. We only give results for the last time period and for all three time periods joined together, but results for the other time periods (not shown) were similar in the sense that changes in estimates were negligible with respect to choice of censoring distribution. This is a result of the censoring intervals being short in this case (one month), but in applications with longer intervals we would suggest mimicking this sensitivity analysis, as the assumed shape of censoring distribution becomes more important with long intervals.
Table 6. Results of sensitivity analysis with respect to the assumed shape of the interval specific censoring distribution
Discussion
The analyses above showed how a multiple imputation strategy combined with a relatively simple maximum likelihood estimation procedure could yield a flexible and valid parametric analysis of interval censored data, and at the same time avoid numerical complexities. Based on the estimated parameters, mean survival times and their uncertainty could be estimated, and finally a goodnessoffit test was implemented by utilizing the inherent binning of the interval censored data together with a multiple imputation strategy. The simulation studies demonstrated that as long as intervals are not too wide, the multiple imputation is a valid analytic strategy with low loss of statistical efficiency relative to analyses of data without interval censoring.
In the simulation study we focused attention on the statistical properties of the imputation strategy when the parametric model is correctly specified. As shown in our analysis of the Norwegian register data, it is however often not simple in actual applications to identify which model is correctly specifiedif such a tractable model exist at alleven with adequate statistical diagnostic procedures. While the problem is not restricted to binned data, it may become attenuated by the binning, as it may make the detection of deviations from the assumed distribution more difficult. If this aspect should have been studied in a simulation study, one might suggest to compare the results of misspecified analyses based on ordinary right censored data and binned data, respectively. We are however confident that only when intervals become wide will the problem of misspecification have the potential to become more pronounced than in the ordinary right censored situation, as for short intervals results are virtually identical between analyses based on right censored and binned data. Our simulation study shows that the imputation method should with wide intervals be used cautiously anyway, as the bias is then large even when the model is correctly specified.
While the CRN data were truncated and interval censored, it might be argued that both features were not prominent for the data: Only the first 90 days of data are discarded, and the binning in 30 day intervals is but a fine grained filtering. Even though three months is short compared to the length of followupin particular for the earlier cohortswe showed that three months is considerable with respect to estimated means, and this aspect of the data can thus not be ignored. The simulation studies revealed that the impact of interval censoring was generally less important, especially for narrow intervals. Even so, the simulation studies revealed that while single midpoint imputation yielded identical results for narrow intervals, the suggested multiple imputation strategy provided better protection against bias and confidence intervals with coverage probabilities closer to nominal values in situations with wider censoring intervals. Although both deaths and censoring events were interval censored in the CRN data, multiple imputation was only done with respect to censoring events in the estimation process. The rationale was that the distribution of deaths was predetermined by a parametric distribution, and so their likelihood contributions were given and straightforward to calculate. The censoring distribution on the other hand is not itself of interest, and so the focus here is on sensitivity of main parameter estimates to different choices of the distribution of censoring. If the interest had been in conducting semiparametric estimation (Cox regression) of the event distribution, then multiple imputation for events (deaths) might be a simple alternative to dedicated methods for interval censored data.
Finally, it should be noted that although costeffectiveness analyses may mandate estimation of mean survival times, this should not be considered a trivial endeavor, see for example [15]. In most studies involving survival times, the right tail of the distribution is unobserved due to right censoring, and yet this tail is highly influential on the meaneven in situations where only a small proportion survives past the endoffollowup and different parametric distributions fit equally well, as documented in our analyses above. It is for this reason that the mean restricted to the observation period is commonly used in costeffectiveness analyses, although it makes comparisons with studies with different lengths of followup impossible. The estimated mean from a parametric model will necessarily depend on the specific shape implicitly assumed for the unobserved part of the distribution, and so the sensitivity of the mean to distributional assumptions should be explored whenever possible, as we have done here.
Conclusion
Provision of binned data to maintain anonymity of patients should be considered a viable procedure, since a multiple imputation strategy can be used to account for the interval censoring created by the binning, as long as intervals are not too wide.
Competing interests
The authors declare that they have no competing interests.
Authors' contributions
HS and ISK jointly developed the original idea for the study. HS developed the statistical analyses and conducted them, and drafted the original and revised manuscripts. ISK provided access to data. The interpretation of data analyses was the product of discussions between the authors. ISK commented on drafts of the paper. Both authors have seen and approved the final version.
References

Cancer Registry of Norway: Cancer in Norway 2006  Cancer incidence, mortality, survival and prevalence in Norway. Oslo: Cancer Registry of Norway; 2007.

Hurwitz H, Fehrenbacher L, Novotny W, Cartwright T, Hainsworth J, Heim W, Berlin J, Baron A, Griffing S, Holmgren E, Ferrara N, Fyfe G, Rogers B, Ross R, Kabbinavar F: Bevacizumab plus irinotecan, fluorouracil, and leucovorin for metastatic colorectal cancer.
The New England journal of medicine 2004, 350:233542.
PMID: 15175435
PubMed Abstract  Publisher Full Text 
Groeneboom P, Wellner JA: Information bounds and nonparametric maximum likelihood. Boston: Birkhäuser; 1992.

Lawless JF: Statistical Models and Methods for Lifetime Data. New Jersey: John Wiley & Sons, Inc; 2003.

Odell PM, Anderson KM, D'Agostino RB: Maximum Likelihood Estimation for IntervalCensored Data Using a WeibullBased Accelerated Failure Time Model. [http://www.jstor.org/stable/2532360] webcite
Biometrics 1992, 48:951959. PubMed Abstract  Publisher Full Text

Hewett P, Ganser GH: A comparison of several methods for analyzing censored data.
The Annals of occupational hygiene 2007, 51:61132.
PMID: 17940277
PubMed Abstract  Publisher Full Text 
Lindsey: A Study of Interval Censoring in Parametric Regression Models. [http://dx.doi.org/10.1023/A:1009681919084] webcite
Lifetime Data Analysis 1998, 4:329354. PubMed Abstract  Publisher Full Text

Pan W: A multiple imputation approach to Cox regression with intervalcensored data.
Biometrics 2000, 56:199203. PubMed Abstract  Publisher Full Text

Hsu CH, Taylor JMG, Murray S, Commenges D: Multiple imputation for interval censored data with auxiliary variables. [http://dx.doi.org/10.1002/sim.2581] webcite
Stat Med 2007, 26(4):769781. PubMed Abstract  Publisher Full Text

Tappenden P, Jones R, Paisley S, Carroll C: The costeffectiveness of bevacizumab in the firstline treatment of metastatic colorectal cancer in England and Wales.
European Journal of Cancer 2007, 43:24872494. PubMed Abstract  Publisher Full Text

Rubin DB: Multiple imputation for nonresponse in surveys. Wiley Series in Probability and Mathematical Statistics: Applied Probability and Statistics, New York: John Wiley & Sons Inc; 1987.

StataCorp: Stata Statistical Software: Release 9. College Station, TX: StataCorp LP; 2005.

Gould W, Sribney W: Maximum Likelihood Estimation with Stata. College Station, Texas: Stata Press; 1999.

Royston P: ICE: Stata module for multiple imputation of missing values. [http://ideas.repec.org/c/boc/bocode/s446602.html] webcite
Statistical Software Components, Boston College Department of Economics 2006.

Nelson CL, Sun JL, Tsiatis AA, Mark DB: Empirical estimation of life expectancy from large clinical trials: Use of lefttruncated, rightcensored survival analysis methodology. [http://dx.doi.org/10.1002/sim.3355] webcite
Statistics in Medicine 2008., 9999
n/a