Open Access Research article

Standard setting: Comparison of two methods

Sanju George1*, M Sayeed Haque2 and Femi Oyebode2

Author Affiliations

1 Queen Elizabeth Psychiatric Hospital, Mindelsohn Way, Edgbaston, Birmingham, UK, B15 2QZ

2 Department of Psychiatry, University of Birmingham, Queen Elizabeth Psychiatric Hospital, Birmingham, UK, B15 2QZ

BMC Medical Education 2006, 6:46  doi:10.1186/1472-6920-6-46

Published: 14 September 2006

Abstract

Background

The outcome of assessments is determined by the standard-setting method used. There is a wide range of standard-setting methods; the two used most extensively in undergraduate medical education in the UK are the norm-reference and the criterion-reference methods. The aims of the study were to compare these two standard-setting methods for a multiple-choice question examination and to estimate the test-retest and inter-rater reliability of the modified Angoff method.

Methods

The norm-reference method of standard-setting (mean minus 1 SD) was applied to the 'raw' scores of 78 4th-year medical students on a multiple-choice question (MCQ) examination. Two panels of raters also set the standard using the modified Angoff method for the same MCQ paper on two occasions (6 months apart). We compared the pass/fail rates derived from the norm-reference and the Angoff methods and also assessed the test-retest and inter-rater reliability of the modified Angoff method.
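The two ways of deriving a cut score can be illustrated with a minimal sketch. It is not the authors' procedure: the function names, the 5-item test and all numbers are invented for illustration, the sample standard deviation is an assumption (the abstract does not say which SD was used), and the Angoff cut score is taken as the mean across raters of each rater's summed item estimates, which is one common formulation.

```python
from statistics import mean, stdev


def norm_reference_cut(scores):
    """Cut score = cohort mean minus one SD (sample SD assumed here)."""
    return mean(scores) - stdev(scores)


def angoff_cut(ratings):
    """ratings[r][i] is rater r's estimate of the probability that a
    borderline candidate answers item i correctly.
    Each rater's estimates are summed to give an expected raw score,
    and the cut score is the mean of those sums across raters."""
    return mean(sum(rater) for rater in ratings)


def pass_rate(scores, cut):
    """Proportion of candidates scoring at or above the cut score."""
    return sum(s >= cut for s in scores) / len(scores)


# Hypothetical data: 10 candidates on a 5-item test, 3 raters.
scores = [1, 2, 3, 3, 3, 4, 4, 4, 5, 5]
ratings = [
    [0.5, 0.6, 0.4, 0.7, 0.5],
    [0.6, 0.6, 0.5, 0.7, 0.6],
    [0.4, 0.5, 0.5, 0.6, 0.4],
]

nr_cut = norm_reference_cut(scores)
ang_cut = angoff_cut(ratings)
print(f"norm-reference cut: {nr_cut:.2f}, pass rate: {pass_rate(scores, nr_cut):.0%}")
print(f"Angoff cut: {ang_cut:.2f}, pass rate: {pass_rate(scores, ang_cut):.0%}")
```

The norm-reference cut score depends on how the cohort performed, whereas the Angoff cut score depends only on the raters' judgements of the items, which is why the two can yield different pass rates for the same paper.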

Results

The pass rate with the norm-reference method was 85% (66/78) and that with the Angoff method was 100% (78/78). The percentage agreement between the Angoff method and the norm-reference method was 78% (95% CI 69%–87%). The modified Angoff method had an inter-rater reliability of 0.81–0.82 and a test-retest reliability of 0.59–0.74.
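As an arithmetic check, the reported figures can be recomputed in a few lines. This sketch assumes a normal-approximation (Wald) interval for a proportion with n = 78; the abstract does not state which interval method was used, and the helper name `wald_ci` is hypothetical.

```python
import math


def wald_ci(p, n, z=1.96):
    """Normal-approximation confidence interval for a proportion p
    observed in n cases (assumed method, not stated in the abstract)."""
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


# Norm-reference pass rate from the reported counts.
print(f"{66 / 78:.0%}")

# Interval around the reported 78% agreement with n = 78 candidates.
low, high = wald_ci(0.78, 78)
print(f"{low:.0%} to {high:.0%}")
```

Under that assumption the interval rounds to 69% and 87%, which is consistent with the values given above.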

Conclusion

There were significant differences in the outcomes of these two standard-setting methods, as shown by the difference in the proportion of candidates that passed and failed the assessment. The modified Angoff method was found to have good inter-rater reliability and moderate test-retest reliability.