Open Access Research article

Temporal stability of objective structured clinical exams: a longitudinal study employing item response theory

Lubna A Baig1,2* and Claudio Violato3

Author Affiliations

1 Medical Education, College of Medicine, King Saud bin Abdulaziz University for Health Sciences, Riyadh, Saudi Arabia

2 Medical Education Research Unit, University of Calgary, Calgary, Canada

3 Medical Education Research Unit Faculty of Medicine, University of Calgary, Calgary, Canada


BMC Medical Education 2012, 12:121  doi:10.1186/1472-6920-12-121

Published: 7 December 2012



The objective structured clinical examination (OSCE) has been used since the early 1970s to assess clinical competence. Very few studies have examined the psychometric stability of stations that are used repeatedly with different samples. The purpose of the present study was to assess the stability of OSCEs that employ the same stations over time but with different samples of candidates, standardized patients (SPs), and examiners.


A 10-station OSCE was taken by 191 candidates at Time 1 and by 236 candidates at Time 2, one year later; 6 of the same stations were used in both years. Generalizability analyses (Ep2) were conducted. Employing item response analyses, test characteristic curves (TCCs) were derived for each of the 6 stations under a 2-parameter model. The TCCs were compared across the two years (Time 1 and Time 2).


The Ep2 of the OSCEs exceeded .70. Standardized thetas (θ) and discriminations were equivalent for the same station across the two-year period, indicating equivalent TCCs for a 2-parameter model.
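For readers unfamiliar with the generalizability coefficient, the following is a minimal sketch of how Ep2 can be estimated for the simplest case: a fully crossed candidates-by-stations design with one score per cell. That single-facet design is an assumption made for illustration; the study's actual design (which may include examiner or SP facets) is not specified here, and the function name and scores are hypothetical.

```python
def g_coefficient(scores):
    """Relative generalizability coefficient for a crossed
    persons x stations design (one score per cell).
    scores: list of rows, one row per candidate, one column per station."""
    n_p = len(scores)
    n_i = len(scores[0])
    grand = sum(sum(row) for row in scores) / (n_p * n_i)
    person_means = [sum(row) / n_i for row in scores]
    station_means = [sum(row[j] for row in scores) / n_p for j in range(n_i)]

    # Sums of squares for persons, stations, and the residual.
    ss_p = n_i * sum((m - grand) ** 2 for m in person_means)
    ss_i = n_p * sum((m - grand) ** 2 for m in station_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_res = ss_total - ss_p - ss_i

    ms_p = ss_p / (n_p - 1)
    ms_res = ss_res / ((n_p - 1) * (n_i - 1))

    # Variance components; a negative person estimate is truncated to zero.
    var_p = max((ms_p - ms_res) / n_i, 0.0)
    var_res = ms_res

    denom = var_p + var_res / n_i
    if denom == 0:
        return float("nan")  # no variance at all: coefficient undefined
    return var_p / denom


# Hypothetical scores: 4 candidates x 3 stations (not study data).
demo_scores = [
    [5, 6, 7],
    [6, 6, 8],
    [8, 9, 9],
    [4, 5, 5],
]
print(f"Ep2 estimate: {g_coefficient(demo_scores):.2f}")
```

In this single-facet case the estimate coincides with Cronbach's alpha, which is why internal consistency and generalizability are often reported together.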


The 6 OSCE stations used by the AIMG program over two years have adequate internal consistency reliability, stable generalizability (Ep2), and equivalent test characteristics. The OSCE stations employed to assess IMGs are stable and may be reused several times without compromising their psychometric properties.

With careful security, high-stakes OSCEs may repeatedly reuse stations that have high internal consistency and generalizability, as their psychometric properties are stable over several years with different samples of candidates.

Stability of OSCEs; Latent trait analyses; Ep2; Internal consistency