Indirect calibration between clinical observers - application to the New York Heart Association functional classification system
1 Department of Clinical Epidemiology, Predictive Medicine and Public Health, University of Porto Medical School, Porto, Portugal
2 Institute of Public Health of the University of Porto, Porto, Portugal
3 Department of Pure Mathematics, University of Porto Science School, Portugal
4 Heart Failure Clinic, Department of Internal Medicine, Hospital S. João, Porto, Portugal
BMC Research Notes 2011, 4:276. doi:10.1186/1756-0500-4-276. Published: 3 August 2011
Previous studies showed an inter-observer agreement of approximately 55% for the New York Heart Association (NYHA) functional classification. The aim of this study was to calibrate the NYHA classification system between observers in order to increase its reliability.
Among 1136 community-dwellers in Porto, Portugal, aged ≥ 45 years, 265 reporting breathlessness answered a 4-item questionnaire to characterize symptom severity. The questionnaire was administered by 7 physicians, who also classified each subject's functional capacity according to NYHA. Each subject was assessed by one physician. We calibrated NYHA classifications by the concurrent method, using a one-parameter logistic graded response model. Discrepancies between observers were assessed by differences in the ability thresholds between NYHA classes I-II and II-III. The ability estimated by the model was used to predict the NYHA classification for each observer.
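As a minimal sketch of how observer-specific thresholds can turn the same ability into different NYHA classes, the Python below evaluates a graded response model with a common discrimination. It is an illustration under stated assumptions, not the study's fitting code: the function names are hypothetical, the discrimination is fixed at 1.0, the model is restricted to three classes (I, II, III), and the two threshold pairs simply combine the extremes of the ranges reported in this abstract, which need not belong to the same real observer.

```python
import math

def grm_class_probs(theta, thresholds, a=1.0):
    """Return P(class = k), k = 1..len(thresholds)+1, under a graded response model.

    theta      : latent ability (here, symptom severity)
    thresholds : increasing class boundaries for one observer
    a          : common discrimination (assumed 1.0 for this sketch)
    """
    # Cumulative probabilities P(class >= k); P(class >= 1) is 1 by definition,
    # and a trailing 0 closes the highest class.
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cum.append(0.0)
    # Class probabilities are differences of adjacent cumulative probabilities.
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]

def most_likely_class(theta, thresholds, a=1.0):
    """Return the 1-based class with the highest model probability."""
    probs = grm_class_probs(theta, thresholds, a)
    return 1 + max(range(len(probs)), key=lambda k: probs[k])

# Hypothetical observers built from the reported threshold extremes.
observer_a = (-1.92, 1.42)  # lowest reported I-II and II-III thresholds
observer_b = (0.46, 2.30)   # highest reported I-II and II-III thresholds

theta = 0.0  # same illustrative ability for both observers
class_a = most_likely_class(theta, observer_a)  # 2, i.e. NYHA II
class_b = most_likely_class(theta, observer_b)  # 1, i.e. NYHA I
```

With these illustrative values, a subject of identical ability would most likely be rated class II by the first observer and class I by the second, which is the kind of threshold discrepancy that calibration is meant to expose.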
Estimates of the first and second thresholds for each observer ranged from -1.92 to 0.46 and from 1.42 to 2.30, respectively. The agreement between the NYHA class predicted from the estimated ability and the observers' NYHA classification was 88% (kappa = 0.61).
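For readers unfamiliar with the two agreement measures, the following sketch shows how percent agreement and Cohen's kappa are computed from paired classifications. The function name and the ten toy ratings are invented for illustration only; they are not the study data and do not reproduce the reported 88% or 0.61.

```python
from collections import Counter

def agreement_and_kappa(rated, predicted):
    """Return (observed agreement, Cohen's kappa) for two equal-length label lists."""
    n = len(rated)
    # Observed agreement: share of subjects with identical labels.
    p_obs = sum(r == p for r, p in zip(rated, predicted)) / n
    # Chance agreement: sum over classes of the product of marginal proportions.
    rated_counts = Counter(rated)
    pred_counts = Counter(predicted)
    p_exp = sum(rated_counts[c] * pred_counts[c] for c in rated_counts) / (n * n)
    # Kappa rescales observed agreement by what chance alone would give.
    kappa = (p_obs - p_exp) / (1.0 - p_exp)
    return p_obs, kappa

# Toy data (made up): observer's NYHA class vs. class predicted from ability.
toy_rated = [1, 1, 1, 1, 2, 2, 2, 2, 3, 3]
toy_predicted = [1, 1, 1, 2, 2, 2, 2, 2, 3, 3]
toy_agreement, toy_kappa = agreement_and_kappa(toy_rated, toy_predicted)
```

Kappa is lower than raw agreement whenever one class dominates, because much of the raw agreement is then expected by chance.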
The study indicates that the main reason why several studies have reported low inter-observer agreement is the existence of discrepant thresholds between observers in the definition of NYHA classes. The concurrent method can be used to minimize the reliability problem of the NYHA classification.