Open Access Research article

Construct-level predictive validity of educational attainment and intellectual aptitude tests in medical student selection: meta-regression of six UK longitudinal studies

IC McManus1,2*, Chris Dewberry3, Sandra Nicholson4, Jonathan S Dowell5, Katherine Woolf1 and Henry WW Potts1

Author Affiliations

1 UCL Medical School, University College London, Gower Street, London WC1E 6BT, UK

2 Research Department of Clinical, Educational and Health Psychology, Division of Psychology and Language Sciences, University College London, Gower Street, London WC1E 6BT, UK

3 Department of Organizational Psychology, Birkbeck, University of London, Malet Street, Bloomsbury, London WC1E 7HX, UK

4 Institute of Health Science Education, Queen Mary London, Turner Street, London E1 2AD, UK

5 Undergraduate Medical Education, Ninewells Hospital and Medical School, Dundee, Scotland DD1 9SY, UK


BMC Medicine 2013, 11:243  doi:10.1186/1741-7015-11-243

Published: 14 November 2013



Measures used for medical student selection should predict future performance during training. A problem for any selection study is that predictor-outcome correlations are known only in those who have been selected, whereas selectors need to know how measures would predict in the entire pool of applicants. That problem of interpretation can be solved by calculating construct-level predictive validity, an estimate of true predictor-outcome correlation across the range of applicant abilities.


Construct-level predictive validities were calculated in six cohort studies of medical student selection and training (student entry, 1972 to 2009) for a range of predictors, including A-levels, General Certificates of Secondary Education (GCSEs)/O-levels, and aptitude tests (AH5 and UK Clinical Aptitude Test (UKCAT)). Outcomes included undergraduate basic medical science and finals assessments, as well as postgraduate measures of Membership of the Royal Colleges of Physicians of the United Kingdom (MRCP(UK)) performance and entry in the Specialist Register. Construct-level predictive validity was calculated with the method of Hunter, Schmidt and Le (2006), adapted to correct for right-censorship of examination results due to grade inflation.
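The core of the correction can be illustrated with a minimal Python sketch of the indirect range restriction procedure as it is commonly summarized from Hunter, Schmidt and Le (2006). It is not the adapted method used here: the correction for right-censorship is omitted, and the function name, argument names and example values are hypothetical, chosen only for illustration.

```python
import math


def clpv_indirect_range_restriction(r_xy_i, u_x, r_xx_a, r_yy_i):
    """Sketch of a construct-level validity estimate (hypothetical helper).

    r_xy_i : observed predictor-outcome correlation in the selected group
    u_x    : SD of predictor in selected group / SD in applicant pool
    r_xx_a : reliability of the predictor in the applicant pool
    r_yy_i : reliability of the outcome in the selected group
    """
    error_share = 1.0 - r_xx_a  # error variance as a share of applicant variance
    if u_x ** 2 <= error_share:
        raise ValueError("u_x is too small for the stated predictor reliability")

    # Predictor reliability in the selected group (error variance is assumed
    # to be the same in applicants and selected students).
    r_xx_i = 1.0 - error_share / u_x ** 2

    # Range restriction ratio on predictor true scores.
    u_t = math.sqrt((u_x ** 2 - error_share) / r_xx_a)

    # Disattenuate the observed correlation for unreliability in both measures.
    r_tp_i = r_xy_i / math.sqrt(r_xx_i * r_yy_i)

    # Correct the true-score correlation for range restriction.
    big_u = 1.0 / u_t
    return big_u * r_tp_i / math.sqrt(1.0 + (big_u ** 2 - 1.0) * r_tp_i ** 2)


# Illustrative values only; these are not taken from the six cohort studies.
estimate = clpv_indirect_range_restriction(r_xy_i=0.20, u_x=0.60,
                                           r_xx_a=0.90, r_yy_i=0.80)
print(round(estimate, 3))
```

With no range restriction (u_x = 1) and perfectly reliable measures, the function returns the observed correlation unchanged, which is a useful sanity check on the sketch.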


Meta-regression analyzed 57 separate predictor-outcome correlations (POCs) and construct-level predictive validities (CLPVs). Mean CLPVs were substantially higher (.450) than mean POCs (.171). Mean CLPVs for first-year examinations were high for A-levels (.809; CI: .501 to .935), and lower for GCSEs/O-levels (.332; CI: .024 to .583) and UKCAT (.245; CI: .207 to .276). A-levels had higher CLPVs for all undergraduate and postgraduate assessments than did GCSEs/O-levels and intellectual aptitude tests. CLPVs of educational attainment measures declined somewhat during training, but these measures continued to predict postgraduate performance. Intellectual aptitude tests had lower CLPVs than A-levels or GCSEs/O-levels.


Educational attainment has strong CLPVs for undergraduate and postgraduate performance, accounting for perhaps 65% of true variance in first-year performance. Such CLPVs justify the use of educational attainment measures in selection, but also raise a key theoretical question concerning the remaining 35% of variance, which persists even after measurement error, range restriction and right-censorship have been taken into account. Just as in astrophysics ‘dark matter’ and ‘dark energy’ are posited to balance various theoretical equations, so medical student selection must also have its ‘dark variance’, whose nature is not yet properly characterized but which explains about a third of the variation in performance during training. Some of this variance probably relates to factors that are unpredictable at selection, such as illness or other life events, but some is probably also associated with factors such as personality, motivation or study skills.
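The 65% figure is not derived explicitly in this summary. Assuming it is the squared mean CLPV of A-levels for first-year examinations (.809), the arithmetic would be as follows, where the symbol \rho_{\mathrm{CLPV}} is introduced here only for illustration:

```latex
\rho_{\mathrm{CLPV}}^{2} \approx 0.809^{2} \approx 0.654,
\qquad
1 - \rho_{\mathrm{CLPV}}^{2} \approx 0.346
```

On that reading, roughly 65% of true variance is accounted for and roughly 35% remains unexplained.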

Keywords: Medical student selection; Undergraduate performance; Postgraduate performance; Educational attainment; Aptitude tests; Criterion-related construct validity; Range restriction; Right censorship; Grade inflation; Markov Chain Monte Carlo algorithm