Email updates

Keep up to date with the latest news and content from BMC Medical Research Methodology and BioMed Central.

Open Access Research article

Adaptive regression modeling of biomarkers of potential harm in a population of U.S. adult cigarette smokers and nonsmokers

John H Warner1, Qiwei Liang2*, Mohamadi Sarkar2, Paul E Mendes2 and Hans J Roethig2

Author Affiliations

1 Pharsight Corporation, a Certara Company, Mountain View, CA 94041-1530, USA

2 Altria Client Services Inc, 601 E Jackson Street, Richmond, VA 23219, USA

For all author emails, please log on.

BMC Medical Research Methodology 2010, 10:19  doi:10.1186/1471-2288-10-19

Published: 16 March 2010



This article describes the data mining analysis of a clinical exposure study of 3585 adult smokers and 1077 nonsmokers. The analysis focused on developing models for four biomarkers of potential harm (BOPH): white blood cell count (WBC), 24 h urine 8-epi-prostaglandin F(EPI8), 24 h urine 11-dehydro-thromboxane B2 (DEH11), and high-density lipoprotein cholesterol (HDL).


Random Forest was used for initial variable selection and Multivariate Adaptive Regression Spline was used for developing the final statistical models


The analysis resulted in the generation of models that predict each of the BOPH as function of selected variables from the smokers and nonsmokers. The statistically significant variables in the models were: platelet count, hemoglobin, C-reactive protein, triglycerides, race and biomarkers of exposure to cigarette smoke for WBC (R-squared = 0.29); creatinine clearance, liver enzymes, weight, vitamin use and biomarkers of exposure for EPI8 (R-squared = 0.41); creatinine clearance, urine creatinine excretion, liver enzymes, use of Non-steroidal antiinflammatory drugs, vitamins and biomarkers of exposure for DEH11 (R-squared = 0.29); and triglycerides, weight, age, sex, alcohol consumption and biomarkers of exposure for HDL (R-squared = 0.39).


Levels of WBC, EPI8, DEH11 and HDL were statistically associated with biomarkers of exposure to cigarette smoking and demographics and life style factors. All of the predictors togather explain 29%-41% of the variability in the BOPH.