Open Access Research article

Matching methods to create paired survival data based on an exposure occurring over time: a simulation study with application to breast cancer

Alexia Savignoni1,2,3*, Caroline Giard1,4, Pascale Tubert-Bitter2,3 and Yann De Rycke5

Author Affiliations

1 Service de Biostatistique, Institut Curie, 26 rue d’Ulm, 75005 Paris, France

2 Inserm, CESP Centre for research in Epidemiology and Population Health, U1018, Biostatistics Team, F-94807 Villejuif, France

3 Univ Paris-Sud, UMRS1018, F-94807 Villejuif, France

4 Institut Curie, Pharmacological Unit, Saint-Cloud, France

5 Institut Curie, Public Health Team, Paris, France


BMC Medical Research Methodology 2014, 14:83 doi:10.1186/1471-2288-14-83

Published: 26 June 2014



Paired survival data are often used in clinical research to assess the prognostic effect of an exposure. Matching generates correlated censored data, with the expectation that the paired subjects differ only in the exposure. Creating pairs when the exposure is an event occurring over time can be difficult. We applied a commonly used method, Method 1, which creates pairs a posteriori, and proposed an alternative method, Method 2, which creates pairs in “real-time”. To estimate the average effect of the exposure, we used two semi-parametric models for correlated censored data: the Holt and Prentice (HP) model and the Lee, Wei and Amato (LWA) model. Unlike the HP model, the LWA model allowed adjustment for the matching covariates (LWAa) and for an interaction (LWAi) between exposure and covariates (treated as prognostic profiles). The aim of our study was to compare the performance of each model under the two matching methods.


Extensive simulations were conducted. We simulated cohort data sets to which we applied the two matching methods and the HP and LWA models. We then used our conclusions to assess the prognostic effect of subsequent pregnancy after treatment for breast cancer in a female cohort treated and followed up in eight French hospitals.


In terms of bias and root mean squared error (RMSE), Method 2 performed better than Method 1 in designing the pairs, and LWAa was the best model in all situations except when there was an interaction between exposure and covariates, for which LWAi was more appropriate. On our real data set, we found effects of pregnancy in opposite directions across the six prognostic profiles, but none was statistically significant. We probably lacked statistical power or reached the limits of our approach. The pair censoring options chosen for the Method 2 - LWA combination still need to be compared with alternatives.
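As a reminder of the two performance criteria named above, the following minimal sketch computes the bias and RMSE of an estimator over simulation replicates, using their conventional definitions. The function name `bias_and_rmse` and the numerical values are hypothetical illustrations and are not taken from the study.

```python
import math


def bias_and_rmse(estimates, true_value):
    """Return (bias, RMSE) of replicate estimates around a known true value.

    bias = mean(estimates) - true_value
    RMSE = sqrt(mean((estimate - true_value)^2))
    """
    n = len(estimates)
    bias = sum(estimates) / n - true_value
    mse = sum((e - true_value) ** 2 for e in estimates) / n
    return bias, math.sqrt(mse)


# Hypothetical exposure-effect estimates from four simulated data sets,
# with an assumed true effect of 0.5 (illustrative values only).
bias, rmse = bias_and_rmse([0.4, 0.6, 0.5, 0.7], true_value=0.5)
```

A smaller absolute bias and a smaller RMSE both indicate estimates that lie closer to the simulated true effect. RMSE also penalizes variability across replicates.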


Designing correlated censored data with Method 2 seemed to be the most pertinent way to create pairs when the criterion characterizing the pair was an exposure occurring over time. In such a setting, the LWA was the most appropriate model.

Matching on time-dependent covariates; Matched time-to-event data; Correlated survival data; Event occurring over time; Stratified Cox model; Marginal Cox model; Pregnancy; Breast cancer; Simulation study