Artificial neural networks versus proportional hazards Cox models to predict 45-year all-cause mortality in the Italian Rural Areas of the Seven Countries Study
1 Laboratory of Biotechnologies Applied to Cardiovascular Medicine, Department of Cardiovascular, Respiratory, Nephrological, Anesthesiological and Geriatrical Sciences, Sapienza, University of Rome, Viale del Policlinico, 155, Rome 00161, Italy
2 Associazione per la Ricerca Cardiologica, Rome, Italy
BMC Medical Research Methodology 2012, 12:100. doi:10.1186/1471-2288-12-100. Published: 23 July 2012
Projection pursuit regression, multilayer feed-forward networks, multivariate adaptive regression splines and trees (including survival trees) have challenged classic multivariable models, such as the multiple logistic function, the proportional hazards life table Cox model (Cox), the Poisson model and the Weibull life table model, for multivariable prediction. However, only artificial neural networks (NN) have become popular in medical applications.
We compared several Cox and NN models in predicting 45-year all-cause mortality (45-ACM) from 18 risk factors selected a priori: age; father life status; mother life status; family history of cardiovascular diseases; job-related physical activity; cigarette smoking; body mass index (linear and quadratic terms); arm circumference; mean blood pressure; heart rate; forced expiratory volume; serum cholesterol; corneal arcus; diagnoses of cardiovascular diseases, cancer and diabetes; minor ECG abnormalities at rest. The study population consisted of the two Italian rural cohorts of the Seven Countries Study, made up of men aged 40 to 59 years who were enrolled and first examined in 1960. Cox models were estimated by: a) forcing all factors; b) a forward-stepwise procedure; and c) a backward-stepwise procedure. Observed numbers of deaths and of survivors were computed in decile classes of estimated risk. Forced and stepwise NN were run and compared with the Cox models by C-statistics (ROC analysis).
Out of 1591 men, 1447 died. Global model accuracies were extremely high with all methods (ROC areas > 0.810), but no model was clearly superior in predicting 45-ACM. The highest ROC areas (> 0.838) were observed with NN. The models varied in the predictive covariates they selected: whereas all models agreed on the role of 10 covariates (mainly cardiovascular risk factors), family history, heart rate and minor ECG abnormalities were not contributors in the Cox models but were in the forced NN. Forced expiratory volume and arm circumference (two protective factors) were not selected by the stepwise NN but were selected by the Cox models.
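The two evaluation steps described above, tabulating observed deaths and survivors by decile class of estimated risk and comparing models by C-statistic, can be sketched as follows. This is a minimal illustration and not the authors' software: the function names `c_statistic` and `decile_table` are hypothetical, the toy data are invented, and the C-statistic is computed for a binary died/survived outcome through the Mann-Whitney rank formula, which equals the area under the ROC curve.

```python
import numpy as np


def c_statistic(risk, died):
    """Area under the ROC curve via the Mann-Whitney rank formula.

    Tied risk estimates receive average ranks, so a tie between a
    death and a survivor counts as half a concordant pair.
    """
    risk = np.asarray(risk, dtype=float)
    died = np.asarray(died, dtype=bool)
    order = np.argsort(risk, kind="mergesort")
    sorted_risk = risk[order]
    ranks = np.empty(len(risk), dtype=float)
    i = 0
    while i < len(risk):
        j = i
        # Extend j over the run of values tied with sorted_risk[i].
        while j + 1 < len(risk) and sorted_risk[j + 1] == sorted_risk[i]:
            j += 1
        ranks[order[i:j + 1]] = (i + j) / 2.0 + 1.0  # average 1-based rank
        i = j + 1
    n_dead = int(died.sum())
    n_alive = len(died) - n_dead
    return (ranks[died].sum() - n_dead * (n_dead + 1) / 2.0) / (n_dead * n_alive)


def decile_table(risk, died, n_classes=10):
    """Observed deaths and survivors in classes of increasing estimated risk.

    Returns a list of (class number, deaths, survivors) tuples.
    """
    risk = np.asarray(risk, dtype=float)
    died = np.asarray(died, dtype=bool)
    order = np.argsort(risk, kind="mergesort")
    rows = []
    for k, idx in enumerate(np.array_split(order, n_classes), start=1):
        deaths = int(died[idx].sum())
        rows.append((k, deaths, len(idx) - deaths))
    return rows


# Toy data (invented for illustration only): estimated risks and outcomes.
toy_risk = [0.10, 0.40, 0.35, 0.80]
toy_died = [0, 0, 1, 1]
print(c_statistic(toy_risk, toy_died))              # 0.75
print(decile_table(toy_risk, toy_died, n_classes=2))
```

Applying the same two functions to the risk estimates of each model (Cox or NN) would give the kind of side-by-side comparison reported here; a C-statistic of 0.5 corresponds to no discrimination and 1.0 to perfect discrimination.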
The global accuracies of NN and Cox models in predicting 45-ACM were similar. NN detected specific predictive covariates sharing a common thread with physical fitness as related to job physical activity, such as arm circumference and forced expiratory volume. Future work should focus on why NN and Cox models detect different predictors.