Research article

An administrative data validation study of the accuracy of algorithms for identifying rheumatoid arthritis: the influence of the reference standard on algorithm performance

Jessica Widdifield1*, Claire Bombardier1, Sasha Bernatsky2, J Michael Paterson1,3,4, Diane Green3, Jacqueline Young3, Noah Ivers1,3,5, Debra A Butt1, R Liisa Jaakkimainen1,3,6, J Carter Thorne7 and Karen Tu1,3

Author Affiliations

1 University of Toronto, Toronto, 200 Elizabeth St 13EN-224, Toronto, ON M5G 2C4, Canada

2 McGill University, Montreal, QC, Canada

3 Institute for Clinical Evaluative Sciences, Toronto, ON, Canada

4 McMaster University, Hamilton, ON, Canada

5 Women’s College Hospital, Toronto, ON, Canada

6 Sunnybrook Health Sciences Centre, Toronto, ON, Canada

7 Southlake Regional Health Centre, Newmarket, ON, Canada


BMC Musculoskeletal Disorders 2014, 15:216  doi:10.1186/1471-2474-15-216

Published: 23 June 2014



Abstract

Background

We have previously validated administrative data algorithms to identify patients with rheumatoid arthritis (RA) using rheumatology clinic records as the reference standard. Here we reassessed the accuracy of the algorithms using primary care records as the reference standard.


Methods

We performed a retrospective chart abstraction study using a random sample of 7500 adult patients under the care of 83 family physicians contributing to the Electronic Medical Record Administrative data Linked Database (EMRALD) in Ontario, Canada. Using physician-reported diagnoses as the reference standard, we computed and compared the sensitivity, specificity, and predictive values for over 100 administrative data algorithms for RA case ascertainment.
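The accuracy measures named above all derive from a 2×2 cross-classification of algorithm result against reference standard. The following is a minimal sketch of that calculation, not the authors' code: the function names are hypothetical, the counts in the usage lines are invented for illustration, and the Wald (normal-approximation) interval is an assumption, since the abstract does not state which confidence interval method was used.

```python
from math import sqrt


def proportion_with_ci(successes, total, z=1.96):
    """Return (estimate, lower, upper) for a proportion.

    Uses a Wald (normal-approximation) interval; this choice is an
    assumption made for the sketch, not a method stated in the source.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    half_width = z * sqrt(p * (1 - p) / total)
    # Clip to the valid range for a proportion.
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


def accuracy_measures(tp, fp, fn, tn):
    """Compute sensitivity, specificity, PPV and NPV from 2x2 counts.

    tp/fn: reference-standard cases flagged / missed by the algorithm.
    fp/tn: reference-standard non-cases flagged / not flagged.
    """
    return {
        "sensitivity": proportion_with_ci(tp, tp + fn),
        "specificity": proportion_with_ci(tn, tn + fp),
        "ppv": proportion_with_ci(tp, tp + fp),
        "npv": proportion_with_ci(tn, tn + fn),
    }


# Hypothetical counts for illustration only; they are not study data.
measures = accuracy_measures(tp=80, fp=20, fn=20, tn=880)
for name, (estimate, lower, upper) in measures.items():
    print(f"{name}: {estimate:.1%} (95% CI {lower:.1%} to {upper:.1%})")
```

Because PPV and NPV depend on how common the condition is in the sample, the same algorithm can show different predictive values when the reference standard is drawn from a population with a different prevalence.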


Results

We identified 69 patients with RA for a lifetime RA prevalence of 0.9%. All algorithms had excellent specificity (>97%). However, sensitivity varied (75–90%) among physician billing algorithms. Despite the low prevalence of RA, most algorithms had adequate positive predictive value (PPV; 51–83%). The algorithm of “[1 hospitalization RA diagnosis code] or [3 physician RA diagnosis codes with ≥1 by a specialist over 2 years]” had a sensitivity of 78% (95% CI 69–88), specificity of 100% (95% CI 100–100), PPV of 78% (95% CI 69–88) and NPV of 100% (95% CI 100–100).
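The quoted case definition can be read as a rule applied to each patient's RA-coded records. The sketch below is one possible reading under stated assumptions, not the authors' implementation: the `RAClaim` record and its fields are hypothetical, "2 years" is taken as a 730-day window starting at any physician claim, and which physicians count as specialists is left to a precomputed flag because the abstract does not define it.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class RAClaim:
    """One record carrying an RA diagnosis code (hypothetical structure)."""
    service_date: date
    source: str                  # "hospital" or "physician" (labels assumed)
    by_specialist: bool = False  # specialist definition is not given here


WINDOW = timedelta(days=730)  # assumption: "2 years" read as 730 days


def meets_case_definition(claims):
    """Return True if the claims satisfy:
    [1 hospitalization RA code] OR
    [3 physician RA codes, >=1 by a specialist, within a 2-year window].
    """
    if any(c.source == "hospital" for c in claims):
        return True

    visits = sorted(
        (c for c in claims if c.source == "physician"),
        key=lambda c: c.service_date,
    )
    # Any qualifying set of claims has an earliest member, so it is enough
    # to test the window that opens at each physician claim in turn.
    for i, first in enumerate(visits):
        in_window = [
            c for c in visits[i:]
            if c.service_date - first.service_date <= WINDOW
        ]
        if len(in_window) >= 3 and any(c.by_specialist for c in in_window):
            return True
    return False


# Hypothetical patients for illustration only.
with_specialist = [
    RAClaim(date(2010, 1, 10), "physician"),
    RAClaim(date(2010, 9, 1), "physician", by_specialist=True),
    RAClaim(date(2011, 6, 15), "physician"),
]
no_specialist = [
    RAClaim(c.service_date, "physician") for c in with_specialist
]
print(meets_case_definition(with_specialist))
print(meets_case_definition(no_specialist))
```

Other readings of the definition are possible, for example calendar-year windows or counting only one code per day, and each would shift sensitivity and PPV somewhat.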


Conclusions

Administrative data algorithms for detecting RA patients achieved a high degree of accuracy amongst the general population. However, results varied slightly from our previous report, which can be attributed to differences in the reference standards with respect to disease prevalence, spectrum of disease, and type of comparator group.

Keywords: Rheumatoid arthritis; Health administrative databases; Validation study; Sensitivity and specificity; Predictive values; Diagnostic test