Rarely selected distractors in high stakes medical multiple-choice examinations and their recognition by item authors: a simulation and survey
Assessment and Evaluation Unit, Institute of Medical Education, Faculty of Medicine, University of Bern, Konsumstrasse 13, CH-3010 Bern, Switzerland
BMC Medical Education 2010, 10:85 doi:10.1186/1472-6920-10-85
Published: 24 November 2010
Many medical exams use 5 options for multiple choice questions (MCQs), although the literature suggests that 3 options are optimal. Previous studies on this topic have often been based on non-medical examinations, so we sought to analyse rarely selected, 'non-functional' distractors (NF-D) in high-stakes medical examinations, their detection by item authors, and the psychometric changes resulting from a reduction in the number of options.
Based on Swiss Federal MCQ examinations from 2005-2007, the frequency of NF-D (selected by <1% or <5% of the candidates) was calculated. The least and second least frequently chosen distractors of each item were identified, and the candidates who had chosen them were allocated to the remaining options under two extreme assumptions about their hypothetical behaviour if the rarely selected distractors were eliminated: candidates would either randomly choose another option or purposively choose the correct answer, from which they had originally been distracted. In a second step, 37 experts were asked to mark the least plausible options. The consequences of a reduction from 4 to 3 or 2 distractors - based on item statistics or on the experts' ratings - with respect to difficulty, discrimination and reliability were modelled.
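The reallocation step described above can be illustrated with a minimal Python sketch for a single item. It is not the authors' implementation: the function names, the tie-breaking rule (the first distractor in option order is removed) and the reading of "randomly choose another option" as a uniform draw over all remaining options, including the correct answer, are assumptions made for illustration only. Difficulty is taken here as the proportion of candidates answering correctly.

```python
import random
from collections import Counter


def option_shares(responses, options):
    """Proportion of candidates who chose each option of one item."""
    counts = Counter(responses)
    n = len(responses)
    return {opt: counts.get(opt, 0) / n for opt in options}


def non_functional(responses, options, correct, threshold):
    """Distractors selected by fewer than `threshold` of candidates (e.g. 0.01 or 0.05)."""
    shares = option_shares(responses, options)
    return [o for o in options if o != correct and shares[o] < threshold]


def drop_least_chosen(responses, options, correct, assumption, rng=None):
    """Remove the least-chosen distractor and reallocate its candidates.

    assumption="correct": they all switch to the correct answer.
    assumption="random":  they pick uniformly among the remaining options
                          (an illustrative reading of the random scenario).
    """
    rng = rng or random.Random(0)
    shares = option_shares(responses, options)
    distractors = [o for o in options if o != correct]
    # Ties are broken by option order (an arbitrary choice for this sketch).
    removed = min(distractors, key=lambda o: shares[o])
    remaining = [o for o in options if o != removed]
    reallocated = []
    for r in responses:
        if r != removed:
            reallocated.append(r)
        elif assumption == "correct":
            reallocated.append(correct)
        elif assumption == "random":
            reallocated.append(rng.choice(remaining))
        else:
            raise ValueError("assumption must be 'correct' or 'random'")
    return removed, reallocated


def difficulty(responses, correct):
    """Item difficulty as the proportion of correct responses."""
    return sum(r == correct for r in responses) / len(responses)
```

For a hypothetical item answered by 200 candidates (140 A, 30 B, 20 C, 9 D, 1 E, with A correct), `non_functional(..., threshold=0.01)` flags only E, while a threshold of 0.05 flags D and E. Dropping E under the "correct" assumption raises the difficulty index from 0.700 to 0.705, which gives a sense of why removing a single rarely chosen distractor changes item statistics so little.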
About 70% of the 5-option items had at least 1 NF-D selected by <1% of the candidates (97% for NF-Ds selected by <5%). Only a reduction to 2 distractors, combined with the assumption that candidates would switch to the correct answer in the absence of a 'non-functional' distractor, led to relevant differences in reliability and difficulty (and, to a lesser degree, discrimination). The experts' ratings resulted in slightly greater changes compared to the statistical approach.
Based on item statistics and/or an expert panel's recommendation, a varying number of 3-4 (or partly 2) plausible distractors could be chosen without marked deterioration in psychometric characteristics.