Open Access Research article

Indirect calibration between clinical observers - application to the New York Heart Association functional classification system

Milton Severo1,2*, Rita Gaio3, Patrícia Lourenço4, Margarida Alvelos4, Paulo Bettencourt4 and Ana Azevedo1,2,4

Author Affiliations

1 Department of Clinical Epidemiology, Predictive Medicine and Public Health, University of Porto Medical School, Porto, Portugal

2 Institute of Public Health of the University of Porto, Porto, Portugal

3 Department of Pure Mathematics, University of Porto Science School, Portugal

4 Heart Failure Clinic, Department of Internal Medicine, Hospital S. João, Porto, Portugal

BMC Research Notes 2011, 4:276  doi:10.1186/1756-0500-4-276

Published: 3 August 2011



Previous studies have reported inter-observer agreement of approximately 55% for the New York Heart Association (NYHA) functional classification. The aim of this study was to calibrate the NYHA classification system between observers and thereby increase its reliability.


Among 1136 community dwellers in Porto, Portugal, aged ≥ 45 years, the 265 who reported breathlessness answered a 4-item questionnaire characterizing symptom severity. The questionnaire was administered by 7 physicians, who also classified each subject's functional capacity according to NYHA. Each subject was assessed by one physician. We calibrated NYHA classifications by the concurrent method, using a 1-parameter logistic graded response model. Discrepancies between observers were assessed by differences in the ability thresholds between NYHA classes I-II and II-III. The ability estimated by the model was used to predict the NYHA classification for each observer.
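To make the role of observer-specific thresholds concrete, the following is a minimal sketch of category probabilities under a graded response model with a single common discrimination parameter. It is not the authors' estimation code: the function names, the fixed discrimination of 1, the choice of three categories (the abstract mentions only the I-II and II-III thresholds), the rule of predicting the most probable class, and the two threshold pairs in the usage note are all assumptions made for illustration.

```python
import numpy as np

def grm_category_probs(theta, thresholds, discrimination=1.0):
    """Class probabilities under a graded response model (illustrative).

    theta          -- latent ability (here: symptom severity) of one subject
    thresholds     -- increasing ability thresholds between adjacent classes
    discrimination -- common slope; fixed to 1.0 here as an assumption
    """
    b = np.asarray(thresholds, dtype=float)
    # P(class above threshold k | theta): one logistic curve per threshold
    p_above = 1.0 / (1.0 + np.exp(-discrimination * (theta - b)))
    # Pad with 1 (at or above the lowest class) and 0 (above the highest),
    # then take differences of adjacent cumulative probabilities
    cumulative = np.concatenate(([1.0], p_above, [0.0]))
    return cumulative[:-1] - cumulative[1:]

def predict_class(theta, thresholds):
    """Index of the most probable class (0 = NYHA I, 1 = II, 2 = III)."""
    return int(np.argmax(grm_category_probs(theta, thresholds)))
```

As a usage example, take two hypothetical observers whose thresholds sit at opposite ends of the reported ranges (pairing them this way is an assumption): `predict_class(0.0, (-1.92, 1.42))` gives class II, while `predict_class(0.0, (0.46, 2.30))` gives class I for a subject with the same ability.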

Estimates of the first and second thresholds for each observer ranged from -1.92 to 0.46 and from 1.42 to 2.30, respectively. The agreement between estimated ability and the observers' NYHA classification was 88% (kappa = 0.61).
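For readers unfamiliar with the two agreement measures quoted above, here is a minimal sketch of how percent agreement and an unweighted Cohen's kappa can be computed from two sets of classifications. The abstract does not state whether the kappa was weighted, so the unweighted form is an assumption, and the function name and example ratings are hypothetical.

```python
from collections import Counter

def agreement_and_kappa(ratings_a, ratings_b):
    """Return (observed agreement, unweighted Cohen's kappa)."""
    n = len(ratings_a)
    if n == 0 or n != len(ratings_b):
        raise ValueError("ratings must be non-empty and of equal length")
    # Proportion of subjects given the same class by both sources
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from the marginal class frequencies
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        # Both sources used a single identical class; kappa is undefined
        return observed, float("nan")
    return observed, (observed - expected) / (1.0 - expected)
```

For example, `agreement_and_kappa([1, 1, 2, 2], [1, 1, 2, 3])` yields an observed agreement of 0.75 and a kappa of 0.6, which shows how kappa discounts the agreement that would be expected by chance alone.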


The study objectively indicates that the main reason why several studies have reported low inter-observer agreement is the existence of discrepant thresholds between observers in the definition of NYHA classes. The concurrent method can be used to minimize the reliability problem of the NYHA classification.

Keywords: dyspnea; physical exertion; questionnaires; New York Heart Association; calibration; reliability; equating