Table 3

Random Forest Classifiers for patient clusters and clinical subpopulations.

Variable description         | Subpopulations             | Biomarkers                                           | Classification error (SD) | AUROC (SD)
-----------------------------+----------------------------+------------------------------------------------------+---------------------------+--------------
All 157 hematuria patients   | controls n = 77; UC n = 80 | CRP, EGF, IL-6, IL-1α, MMP9NGAL, osmolarity, CEA     | 0.203 (0.017)             | 0.766 (0.152)
Patient clusters (a)
  blue                       | n = 57 (28)                | TNFα, EGF, NSE, NGAL, MMP9NGAL, TM, FAS              | 0.155 (0.029)             | 0.800 (0.258)
  green                      | n = 49 (18)                | TNFα, EGF, IL-6, IL-1α, MMP9NGAL, TM, CEA            | 0.204 (0.037)             | 0.825 (0.264)
  gold                       | n = 23 (15)                | CRP, sTNFR1, vWF, IL-1α, MMP9NGAL, creatinine, BTA   | 0.245 (0.049)             | 0.700 (0.349)
Clinical subpopulations
Smoking
  smokers                    | n = 101 (60)               | CRP, EGF, MMP9, IL-1α, IL-4, TM, IL-2                | 0.276 (0.027)             | 0.770 (0.117)
  non-smokers                | n = 56 (20)                | TNFα, sTNFR1, IL-6, IL-1α, MMP9NGAL, creatinine, CEA | 0.156 (0.027)             | 0.783 (0.159)
Gender
  males                      | n = 120 (65)               | CRP, EGF, CK18, IL-1β, IL-8, creatinine, IL-2        | 0.272 (0.030)             | 0.753 (0.117)
  females                    | n = 37 (15)                | CRP, EGF, IL-6, dDimer, MMP9NGAL, osmolarity, CEA    | 0.181 (0.054)             | 0.830 (0.146)
Hx stone disease
  yes                        | n = 30 (14)                | CRP, sTNFR1, CK18, IL-1α, IL-8, creatinine, VEGF     | 0.322 (0.062)             | 0.738 (0.194)
  no                         | n = 127 (66)               | CRP, EGF, IL-6, IL-1α, MMP9NGAL, creatinine, CEA     | 0.186 (0.015)             | 0.817 (0.117)
Hx BPE
  yes                        | n = 30 (14)                | CRP, EGF, IL-6, IL-1α, MMP9NGAL, TM, CEA             | 0.192 (0.018)             | 0.826 (0.148)
  no                         | n = 127 (66)               | CRP, EGF, CK18, NGAL, MMP9NGAL, creatinine, BTA      | 0.266 (0.061)             | 0.788 (0.169)
Anti-hypertensive medication
  on medication              | n = 73 (51)                | TNFα, EGF, IL-6, protein, MMP9NGAL, creatinine, CEA  | 0.211 (0.025)             | 0.731 (0.161)
  no medication              | n = 83 (28)                | TNFα, sTNFR1, IL-6, NGAL, IL-8, TM, CEA              | 0.145 (0.028)             | 0.810 (0.132)
Anti-platelet medication
  on medication              | n = 37 (25)                | TNFα, EGF, IL-6, protein, IL-8, osmolarity, CEA      | 0.215 (0.019)             | 0.780 (0.141)
  no medication              | n = 118 (53)               | CRP, EGF, MCP-1, protein, MMP9NGAL, TM, FPSA         | 0.160 (0.046)             | 0.843 (0.153)
Anti-ulcer medication
  on medication              | n = 33 (17)                | CRP, EGF, IL-6, IL-1α, IL-8, TM, CEA                 | 0.220 (0.018)             | 0.827 (0.118)
  no medication              | n = 123 (62)               | CRP, EGF, vWF, IL-1β, MMP9NGAL, TM, HA               | 0.259 (0.072)             | 0.812 (0.168)


Using the clusters of biomarkers as a feature set, we determined the classification error and the area under the receiver operating characteristic curve (AUROC) of urothelial cancer (UC) diagnostic classifiers for all possible biomarker combinations: for all 157 hematuria patients; for three of the five patient clusters; and for 14 subpopulations split on the basis of smoking, gender, history of stone disease, history of benign prostate enlargement (BPE), or anti-hypertensive, anti-platelet or anti-ulcer medication. Therefore, one biomarker from each of the seven clusters illustrated in the biomarker dendrogram (Figure 3) was represented in each classifier. The classification errors in the clinically split populations were very similar to those obtained for the patient clusters. The numbers in brackets in the second column indicate the number of patients with UC.

(a) Only two of the natural patient subpopulations, those shown in blue and green in Figure 1, contained sufficient numbers to train a Random Forest Classifier (RFC). For comparison, we also trained an RFC for the gold cluster. Four of the seven biomarkers were the same in the diagnostic classifiers for the blue and green patient clusters, suggesting biological similarities.

Abbreviations: BTA, bladder tumor antigen; CEA, carcino-embryonic antigen; CK18, cytokeratin 18; CRP, C-reactive protein; EGF, epidermal growth factor; FPSA, free prostate specific antigen; HA, hyaluronidase; IL, interleukin; MMP-9, matrix metalloproteinase 9; NGAL, neutrophil gelatinase-associated lipocalin; NSE, neuron specific enolase; SD, standard deviation; sTNFR1, soluble tumor necrosis factor receptor 1; TM, thrombomodulin; TNFα, tumor necrosis factor α; VEGF, vascular endothelial growth factor; vWF, von Willebrand factor.
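The "one biomarker from each of the seven clusters" rule can be sketched in Python as a Cartesian product over the clusters. The sketch is illustrative only. The `CLUSTERS` grouping is an assumption inferred from the position each biomarker occupies in the table rows, not the actual cluster membership shown in Figure 3, which may include biomarkers that never appear in the table. The function name `candidate_panels` is likewise hypothetical, and no Random Forest training is shown. The sketch also checks the statement that the blue and green classifiers share four of their seven biomarkers.

```python
from itertools import product

# ASSUMPTION: cluster membership inferred from the column position of each
# biomarker in the table rows; the real clusters are defined in Figure 3.
CLUSTERS = [
    ["CRP", "TNFα"],
    ["EGF", "sTNFR1"],
    ["IL-6", "NSE", "vWF", "MMP9", "CK18", "MCP-1"],
    ["IL-1α", "NGAL", "IL-1β", "dDimer", "protein"],
    ["MMP9NGAL", "IL-4", "IL-8"],
    ["osmolarity", "TM", "creatinine"],
    ["CEA", "FAS", "BTA", "IL-2", "VEGF", "FPSA", "HA"],
]


def candidate_panels(clusters):
    """Return every panel that takes exactly one biomarker from each cluster."""
    return list(product(*clusters))


# Panels reported in the table for the blue and green patient clusters.
BLUE = ("TNFα", "EGF", "NSE", "NGAL", "MMP9NGAL", "TM", "FAS")
GREEN = ("TNFα", "EGF", "IL-6", "IL-1α", "MMP9NGAL", "TM", "CEA")

panels = candidate_panels(CLUSTERS)
shared = set(BLUE) & set(GREEN)

print(len(panels))     # 7560 candidate panels under the assumed grouping
print(sorted(shared))  # biomarkers common to the blue and green classifiers
```

In a full analysis each candidate panel would be used as the feature set of a classifier and scored by classification error and AUROC. That step is omitted here because it depends on the patient data.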

Emmert-Streib et al. BMC Medicine 2013 11:12   doi:10.1186/1741-7015-11-12
