Allowing for mandatory covariates in boosting estimation of sparse high-dimensional survival models
1 Freiburg Center for Data Analysis and Modeling, University of Freiburg, Eckerstr. 1, 79104 Freiburg, Germany
2 Institute of Medical Biometry and Medical Informatics, University Medical Center Freiburg, Stefan-Meier-Str. 26, 79104 Freiburg, Germany
BMC Bioinformatics 2008, 9:14 doi:10.1186/1471-2105-9-14
Published: 10 January 2008
When predictive survival models are built from high-dimensional data, there are often additional covariates, such as clinical scores, that must be included in the final model. While there are several techniques for fitting sparse high-dimensional survival models by penalized parameter estimation, none allows for explicit consideration of such mandatory covariates.
We introduce a new boosting algorithm for censored time-to-event data that shares the favorable properties of existing approaches, i.e., it results in sparse models with good prediction performance, but uses an offset-based update mechanism. The latter allows for tailored penalization of the covariates under consideration. Specifically, unpenalized mandatory covariates can be introduced. Microarray survival data from patients with diffuse large B-cell lymphoma, in combination with the recent, bootstrap-based prediction error curve technique, is used to illustrate the advantages of the new procedure.
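The offset-based update idea can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a Cox partial likelihood, one-step Newton updates, Breslow-style risk sets with naive tie handling, and a simplification in which each mandatory covariate is updated separately rather than jointly. The names `cox_score_info`, `boost_cox`, `penalty` and `n_steps` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def cox_score_info(X, time, event, eta):
    """Per-column score and information of the Cox partial log-likelihood,
    evaluated at coefficient 0 with the current linear predictor eta as offset."""
    w = np.exp(eta - eta.max())                      # stabilised risk weights
    # at_risk[i, j] = 1 if subject j is still at risk at the time of subject i
    at_risk = (time[None, :] >= time[:, None]).astype(float)
    denom = at_risk @ w
    mean_x = (at_risk @ (w[:, None] * X)) / denom[:, None]
    mean_x2 = (at_risk @ (w[:, None] * X ** 2)) / denom[:, None]
    score = (event[:, None] * (X - mean_x)).sum(axis=0)
    info = (event[:, None] * (mean_x2 - mean_x ** 2)).sum(axis=0)
    return score, info

def boost_cox(X, time, event, mandatory, n_steps=50, penalty=10.0):
    """Componentwise boosting sketch: mandatory columns get unpenalized
    updates in every step, optional columns compete for one penalized update."""
    p = X.shape[1]
    beta = np.zeros(p)
    optional = np.setdiff1d(np.arange(p), mandatory)
    for _ in range(n_steps):
        # Assumption: mandatory covariates are updated one at a time, penalty 0.
        for j in mandatory:
            score, info = cox_score_info(X[:, [j]], time, event, X @ beta)
            beta[j] += score[0] / info[0]
        # Penalized one-step candidates; earlier steps enter only via the offset.
        score, info = cox_score_info(X[:, optional], time, event, X @ beta)
        best = np.argmax(score ** 2 / (info + penalty))
        beta[optional[best]] += score[best] / (info[best] + penalty)
    return beta
```

Because only one optional covariate is updated per boosting step, at most `n_steps` optional coefficients can become nonzero, which is where the sparsity in this sketch comes from. The mandatory covariates are never subject to that selection.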
Using an estimation procedure that incorporates mandatory covariates into high-dimensional survival models can be highly beneficial in terms of prediction performance. The new approach also makes it possible to assess whether improved predictions are obtained by including microarray features in addition to classical clinical criteria.