Open Access Research article

Classification and regression tree (CART) model to predict pulmonary tuberculosis in hospitalized patients

Fabio S Aguiar1*, Luciana L Almeida2, Antonio Ruffino-Netto3, Afranio Lineu Kritski1, Fernanda CQ Mello1 and Guilherme L Werneck4,5

Author Affiliations

1 Instituto de Doenças do Tórax (IDT)/Clementino Fraga Filho Hospital (CFFH), Federal University of Rio de Janeiro, Rua Professor Rodolpho Paulo Rocco, n° 255 - 6° Andar - Cidade Universitária - Ilha do Fundão, 21941-913, Rio de Janeiro, Brazil

2 Harbor Hospital, 3001 S. Hanover St, Baltimore, MD, 21225, USA

3 Ribeirão Preto Medical School, University of São Paulo, Av. Bandeirantes, 3900, 14049-900, Ribeirão Preto-SP, Brazil

4 Instituto de Estudos em Saúde Coletiva, Federal University of Rio de Janeiro, Praça Jorge Machado Moreira, Ilha do Fundão, Cidade Universitária, 21944-210, Rio de Janeiro, Brazil

5 Instituto de Medicina Social, State University of Rio de Janeiro, Rua São Francisco Xavier, 524, 7° andar, Bloco D. – Maracanã, 20550-900, Rio de Janeiro, Brazil


BMC Pulmonary Medicine 2012, 12:40  doi:10.1186/1471-2466-12-40

Published: 7 August 2012



Tuberculosis (TB) remains a public health issue worldwide. Because TB lacks specific clinical symptoms, deciding whether to admit a patient to respiratory isolation is difficult for the clinician. Isolation of patients without the disease is common and increases health costs. Decision models for the diagnosis of TB in patients attending hospitals can increase the quality of care and decrease costs without increasing the risk of hospital transmission. We present a model for predicting pulmonary TB in hospitalized patients in a high-prevalence area, with the aim of contributing to a more rational use of isolation rooms without increasing the risk of transmission.


This was a cross-sectional study of patients admitted to CFFH from March 2003 to December 2004. A classification and regression tree (CART) model was generated and validated. The area under the ROC curve (AUC), sensitivity, specificity, and positive and negative predictive values were used to evaluate the performance of the model. The model was validated with a different sample of patients admitted to the same hospital from January to December 2005.
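The core step in growing a classification tree can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the software or the splitting criterion, so Gini impurity (a common choice for CART classification) is assumed, and the feature names and the six toy records are hypothetical, invented only to show the mechanics.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_binary_split(rows, labels):
    """Return (feature, weighted_gini) for the yes/no feature whose split
    gives the lowest size-weighted impurity across the two child nodes."""
    best = None
    for feature in rows[0]:
        yes = [y for r, y in zip(rows, labels) if r[feature]]
        no = [y for r, y in zip(rows, labels) if not r[feature]]
        if not yes or not no:
            continue  # a split that leaves one side empty is not useful
        score = (len(yes) * gini(yes) + len(no) * gini(no)) / len(labels)
        if best is None or score < best[1]:
            best = (feature, score)
    return best


# Hypothetical toy records (NOT study data); label 1 = TB, 0 = no TB.
rows = [
    {"typical_cxr": True, "cough_3wk": True},
    {"typical_cxr": True, "cough_3wk": False},
    {"typical_cxr": True, "cough_3wk": True},
    {"typical_cxr": False, "cough_3wk": True},
    {"typical_cxr": False, "cough_3wk": False},
    {"typical_cxr": False, "cough_3wk": True},
]
labels = [1, 1, 1, 0, 0, 1]

feature, score = best_binary_split(rows, labels)
print(feature, round(score, 4))
```

A full CART procedure applies this search recursively to each child node and then prunes the tree; only the single-node split is shown here.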


We studied 290 patients admitted with clinical suspicion of TB. Diagnosis was confirmed in 26.5% of them. Pulmonary TB was present in 83.7% of the patients with TB (62.3% with positive sputum smear) and HIV/AIDS was present in 56.9% of patients. The validated CART model showed sensitivity, specificity, positive predictive value and negative predictive value of 60.00%, 76.16%, 33.33%, and 90.55%, respectively. The AUC was 79.70%.
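The four validation measures reported above all derive from a 2x2 table of predicted versus confirmed diagnoses. The sketch below shows the arithmetic. The abstract does not report the underlying counts, so the values used here (18 true positives, 12 false negatives, 36 false positives, 115 true negatives) are an assumption, chosen only because they reproduce the reported percentages; they should not be read as the study's actual table.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Standard diagnostic accuracy measures from 2x2 confusion counts."""
    return {
        "sensitivity": tp / (tp + fn),  # confirmed TB cases flagged by the model
        "specificity": tn / (tn + fp),  # non-TB patients correctly cleared
        "ppv": tp / (tp + fp),          # flagged patients who truly have TB
        "npv": tn / (tn + fn),          # cleared patients who truly lack TB
    }


# Assumed counts (NOT taken from the paper), consistent with the reported
# 60.00%, 76.16%, 33.33% and 90.55%.
counts = {"tp": 18, "fn": 12, "fp": 36, "tn": 115}

metrics = diagnostic_metrics(**counts)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Under these assumed counts, the low positive predictive value alongside a high negative predictive value reflects a modest proportion of TB cases among the patients evaluated: a negative prediction is far more informative than a positive one.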


The CART model developed for these hospitalized patients with clinical suspicion of TB had fair to good predictive performance for pulmonary TB. The most important variable for prediction of TB diagnosis was chest radiograph results. Prospective validation is still necessary, but our model offers an alternative for deciding whether to isolate patients with clinical suspicion of TB in tertiary health facilities in countries with limited resources.

Sensitivity and specificity; Accuracy; Tuberculosis; Diagnosis; Predictive models; CART