Biometry Research Group, Division of Cancer Prevention, National Cancer Institute, USA

Office of Disease Prevention, National Institutes of Health, USA

Abstract

Background

There is a common belief among some medical researchers that if a potential surrogate endpoint is highly correlated with a true endpoint, then a positive (or negative) difference in potential surrogate endpoints between randomization groups implies a positive (or negative) difference in unobserved true endpoints between randomization groups. We investigate this belief when the potential surrogate and unobserved true endpoints are perfectly correlated within each randomization group.

Methods

We use a graphical approach. The vertical axis is the unobserved true endpoint and the horizontal axis is the potential surrogate endpoint. Perfect correlation within each randomization group implies that, for each randomization group, potential surrogate and true endpoints are related by a straight line. In this scenario the investigator does not know the slopes or intercepts. We consider a plausible example where the slope of the line is higher for the experimental group than for the control group.

Results

In our example with unknown lines, a

Conclusion

Perfect correlation between potential surrogate and unobserved true outcomes within randomized groups does not guarantee correct inference based on a potential surrogate endpoint. Even in early phase trials, investigators should not base conclusions on potential surrogate endpoints in which the only validation is high correlation with the true endpoint within a group.

Background

A potential surrogate endpoint is an endpoint obtained sooner, at less cost, or less invasively than the true endpoint of interest. When using a potential surrogate endpoint, one would like to make the same inference as if one had observed a true endpoint (i.e. a health outcome). Fleming and DeMets

Fleming and DeMets

Methods

To investigate the validity of a potential surrogate endpoint that is perfectly correlated with true outcomes within randomized groups, we created the graphic in Figure

Graphical depiction of incorrect inference based on surrogate endpoints. The graph shows perfectly correlated results (namely a straight line) for the relationship between surrogate and true outcomes for a control group C and experimental group E. The mean surrogate outcome in the E group

In this simple example the unobserved true endpoint is proportional to the potential surrogate endpoint, so the intercept for both lines is zero. However, qualitatively similar results would hold when the two lines have different intercepts. In our example the only difference between the lines for groups E and C is the slope: the line relating potential surrogate and true endpoints is steeper for group E than for group C. To find graphically the mean value of the true endpoint corresponding to the mean value of the potential surrogate endpoint, one draws a vertical line from the mean value of the potential surrogate endpoint up to the line relating potential surrogate and true endpoints, and then a horizontal line leftward to the axis for the true endpoint.
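The graphical reading described above amounts to evaluating each group's line at that group's mean surrogate value. The following is a minimal sketch of that calculation; the slopes and means are hypothetical values chosen only for illustration and are not taken from the figure.

```python
def true_mean(slope, surrogate_mean, intercept=0.0):
    """Mean true endpoint implied by a straight line through the group's data.

    With perfect correlation the true endpoint is an exact linear function
    of the surrogate, so the group means lie on the same line.
    """
    return intercept + slope * surrogate_mean


# Hypothetical values: group E has the steeper line and the lower mean surrogate.
slope_c, slope_e = 1.0, 2.0
surrogate_c, surrogate_e = 4.0, 3.0

true_c = true_mean(slope_c, surrogate_c)
true_e = true_mean(slope_e, surrogate_e)

surrogate_diff = surrogate_e - surrogate_c  # negative: surrogate is lower in E
true_diff = true_e - true_c                 # positive: true endpoint is higher in E

print(f"surrogate difference (E - C): {surrogate_diff:+.1f}")
print(f"true difference (E - C):      {true_diff:+.1f}")
```

With these assumed numbers the surrogate decreases while the true endpoint increases, which is the kind of reversal the graphic is meant to convey.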

The graphic also applies when the potential surrogate and true outcomes are binary. Of course, with binary endpoints the notion of perfect correlation does not apply. However, with binary endpoints one still obtains straight lines, so the graphic is applicable. For example, suppose the potential surrogate endpoint is the presence or absence of adenoma and the true endpoint is the presence or absence of colorectal cancer. In that case the horizontal axis is the fraction of subjects with adenomas and the vertical axis is the unobserved fraction of subjects who would get colorectal cancer. For each randomization group there is a line relating the fraction of subjects with adenoma to the unobserved fraction with colorectal cancer. (Each line is constructed by connecting the point representing the proportion with the true endpoint when the surrogate endpoint is 0 with the point representing the proportion with the true endpoint when the surrogate endpoint is 1.) The lines in Figure

Suppose that an investigator knows from prior studies only that the potential surrogate and true endpoints are perfectly correlated within each randomization group (and does not know the slopes or intercepts). Or suppose the surrogate and true endpoints are binary, so there are two straight lines (again with unknown slopes and intercepts), one for each randomization group. Will the use of a potential surrogate endpoint to replace the unobserved true endpoint give qualitatively correct results? In other words, will a decrease (increase) in the mean potential surrogate endpoint or the fraction with the surrogate endpoint necessarily imply a decrease (increase) in the mean true endpoint or the fraction with the true endpoint?
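For the binary case, the line for a randomization group is fixed by two conditional proportions: the proportion with the true endpoint among subjects without the surrogate, and the proportion among subjects with it. The sketch below evaluates such a line at an observed fraction with the surrogate. The function name and all proportions are hypothetical and serve only to illustrate the construction.

```python
def fraction_with_true_endpoint(p_true_without_surrogate,
                                p_true_with_surrogate,
                                fraction_with_surrogate):
    """Evaluate a group's line at the observed fraction with the surrogate.

    The line joins the proportion with the true endpoint when the surrogate
    is 0 (left end) to the proportion when the surrogate is 1 (right end).
    """
    s = fraction_with_surrogate
    return p_true_without_surrogate * (1.0 - s) + p_true_with_surrogate * s


# Hypothetical proportions, for illustration only.
cancer_c = fraction_with_true_endpoint(0.00, 0.10, fraction_with_surrogate=0.40)
cancer_e = fraction_with_true_endpoint(0.00, 0.20, fraction_with_surrogate=0.30)

print(f"control:      adenoma 0.40 -> cancer {cancer_c:.3f}")
print(f"experimental: adenoma 0.30 -> cancer {cancer_e:.3f}")
```

Because the investigator observes only the fraction with the surrogate, and not the two conditional proportions, the value returned here is exactly the quantity that remains unknown in practice.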

Results

In the graphic in Figure

With binary data,

One could create a similar graphic that shows that no change in the surrogate endpoint corresponds to either a decrease or increase in the true endpoint, or that an increase in the surrogate endpoint could lead to a decrease in the true endpoint.
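These alternative scenarios can be checked numerically in the same way as the graphic. The sketch below assumes lines through the origin, as in the simple example, and uses hypothetical slopes and surrogate means; it reports only the signs of the differences between groups E and C.

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def effect_signs(slope_c, slope_e, surrogate_c, surrogate_e):
    """Signs of the E-minus-C differences in the surrogate and true means.

    Assumes each group's line passes through the origin.
    """
    surrogate_diff = surrogate_e - surrogate_c
    true_diff = slope_e * surrogate_e - slope_c * surrogate_c
    return sign(surrogate_diff), sign(true_diff)


# Hypothetical (slope_c, slope_e, surrogate_c, surrogate_e) for each scenario.
scenarios = {
    "surrogate unchanged, true increases": (1.0, 2.0, 4.0, 4.0),
    "surrogate unchanged, true decreases": (2.0, 1.0, 4.0, 4.0),
    "surrogate increases, true decreases": (2.0, 1.0, 3.0, 4.0),
}

for label, args in scenarios.items():
    print(label, effect_signs(*args))
```

In every scenario the sign of the surrogate difference alone says nothing reliable about the sign of the true difference, because the unknown slopes can differ between groups.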

Discussion

Plausibility of Figure 1

We showed graphically that perfect correlation does not guarantee correct inference when a potential surrogate endpoint replaces a true endpoint. The underlying reason is that the line predicting the true endpoint from the potential surrogate endpoint has a sufficiently different slope for each randomization group to make a substantial difference in the conclusion. In one possible scenario the intervention reduces the observed value of the surrogate endpoint without affecting the true endpoint, thereby increasing the slope.

With binary outcomes, different slopes can readily arise because of unobserved heterogeneity in the potential surrogate endpoint. As an example consider adenoma (yes or no) as a potential surrogate endpoint for the true outcome of colorectal cancer (yes or no). Two recent randomized trials

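To better understand the role of heterogeneity, we follow Schatzkin and Gail and distinguish "bad" adenomas, which can progress to colorectal cancer, from "innocent" adenomas, which do not. Let π_{z} denote the probability that an adenoma in randomization group z is "bad", let φ_{z} denote the probability of colorectal cancer arising from a "bad" adenoma in randomization group z, and let ω_{z} denote the probability of any adenoma in group z. The line for group z then has slope φ_{z}π_{z}, and the higher slope for group E corresponds to φ_{E}π_{E} > φ_{C}π_{C}. The leftward shift of the vertical line in the figure corresponds to ω_{E} < ω_{C}.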
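To make the heterogeneity argument concrete, the sketch below takes the fraction with colorectal cancer in group z to be the product φ_{z}π_{z}ω_{z}. That product form is an assumption of this sketch: it treats subjects without a "bad" adenoma as having no cancer risk. All numerical values are hypothetical.

```python
def slope(phi, pi):
    """Slope of the group's line: risk of cancer per unit fraction with adenoma."""
    return phi * pi


def cancer_fraction(phi, pi, omega):
    """Fraction with colorectal cancer, assuming only 'bad' adenomas carry risk."""
    return slope(phi, pi) * omega


# Hypothetical parameters: phi = P(cancer | bad adenoma),
# pi = P(adenoma is bad), omega = P(any adenoma).
control = {"phi": 0.20, "pi": 0.25, "omega": 0.40}
experimental = {"phi": 0.20, "pi": 0.50, "omega": 0.30}

slope_c = slope(control["phi"], control["pi"])
slope_e = slope(experimental["phi"], experimental["pi"])
cancer_c = cancer_fraction(**control)
cancer_e = cancer_fraction(**experimental)

print(f"adenoma fraction: C={control['omega']:.2f}, E={experimental['omega']:.2f}")
print(f"slope:            C={slope_c:.3f}, E={slope_e:.3f}")
print(f"cancer fraction:  C={cancer_c:.3f}, E={cancer_e:.3f}")
```

In this illustration the intervention lowers the fraction with any adenoma but raises the share of adenomas that are "bad", so the unobserved cancer fraction rises even though the surrogate falls.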

Such a situation is quite plausible, as illustrated in a randomized trial of finasteride versus placebo, in which the finasteride group had a larger fraction of high-grade prostate cancers among those diagnosed (π_{E} > π_{C}) but a smaller fraction with any prostate cancer (ω_{E} < ω_{C}). Because individuals with high-grade prostate cancer generally have a greater risk of prostate cancer mortality, we have φ_{E} > φ_{C}. If the risk of prostate cancer mortality with other histological grades of prostate cancer is minimal, the situation is mathematically similar to the aforementioned hypothetical example with "bad" and "innocent" adenomas, except that the fraction "bad" is observed. The finasteride group has a greater slope (φ_{E}π_{E} > φ_{C}π_{C}) and a smaller fraction with the surrogate (ω_{E} < ω_{C}), which corresponds to Figure

Graphical Representation of the Prentice Criterion

For valid hypothesis testing based on a surrogate endpoint that

Inference Without the Prentice Criterion

Other approaches to inference with surrogate endpoints involve

If one could accurately predict the slopes and intercepts of lines in Figure

Another approach for predicting true endpoint from a potential surrogate endpoint is the "meta-analytic" approach

A third approach for predicting true endpoints from a potential surrogate endpoint in a randomized trial is a counterfactual approach based on the potential surrogates that would occur, if contrary to fact, individuals were randomized to a different group

In all these approaches there is a fundamental assumption that the relationship between the potential surrogate and true endpoints in previous studies is very similar to the relationship in the new study under investigation. Besides accounting for the variability in this relationship (in addition to sampling variability), one needs to restrict the previous studies to those involving similar interventions although, as discussed below, that is not a guarantee of valid inference.

Additional Caveats with Potential Surrogate Endpoints

The use of surrogate endpoints is particularly attractive for studies of complex chronic disease since occurrence of the true endpoint may take years. However, it is precisely because of the complexity of the diseases that assessment of potential surrogate endpoints is so difficult. There are likely to be multiple causal pathways to the true disease endpoint. Different interventions may exert their biologic effects on different pathways.

This is why it is particularly hazardous to use even an "established" surrogate endpoint (or a potential surrogate endpoint "validated" via multiple previous studies) for one class of drugs to assess another class of drugs. For example, the statin class of drugs lowers serum cholesterol and lowers cardiovascular event rates, including mortality. However, HRT with combined estrogen plus progestin lowers serum cholesterol but

Another cautionary note is important. If an intervention induces harmful side effects, it is risky to draw conclusions from the potential surrogate endpoint based only on inference regarding the true endpoint. There may be harms that occur after the time the potential surrogate endpoint is observed that are not well predicted by the potential surrogate endpoint. Under these circumstances, even if the two lines in Figure

Conclusion

Sometimes potential surrogate endpoints are justified because they are highly correlated with the true endpoint in other studies. As illustrated here, even with known perfect correlation within randomized groups, one cannot rely on the potential surrogate endpoint for valid inference about the true endpoint, as even the direction of the effect could be the opposite with true and potential surrogate endpoints. Thus, even in preliminary trials, investigators should not base conclusions on potential surrogate endpoints in which the only validation is high correlation with the true endpoint.

Authors' Contribution

SGB wrote the initial draft. BSK made substantial improvements to the manuscript.
