Research article

Validating self-reported strokes in a longitudinal UK cohort study (Whitehall II): Extracting information from hospital medical records versus the Hospital Episode Statistics database

Annie Britton1*, Beverly Milne1, Therese Butler1, Adelaida Sanchez-Galvez1, Martin Shipley1, Anthony Rudd2, Charles DA Wolfe2, Ajay Bhalla2 and Eric J Brunner1

Author Affiliations

1 Department of Epidemiology and Public Health, University College London, 1-19 Torrington Place, London, WC1E 6BT, UK

2 Division of Health and Social Care Research, King’s College London, 7th Floor Capital House, 42 Weston Street, London, SE1 3QD, UK

BMC Medical Research Methodology 2012, 12:83  doi:10.1186/1471-2288-12-83

Published: 21 June 2012



Valuable information on the determinants of non-fatal stroke can be obtained from longitudinal observational cohort studies. Such studies often rely on self-reported stroke events, which are best validated with external medical evidence. The aim of this paper is to compare the information on incident non-fatal stroke events arising from different sources.


We validated self-reported stroke events among participants in the Whitehall II Study, a large UK-based cohort study (baseline sample size 10,308 men and women).


A total of 106 stroke events were self-reported in three self-administered questionnaires between 2002 and 2009. Eight (7.5%) of these events were discarded as false positives after medical review, 66 were validated by information from the NHS Hospital Episode Statistics (HES) database in England, 16 by manual searches of hospital records alone, and 12 by letters from general practitioners alone. HES also identified 47 additional events (i.e. not self-reported) coded as stroke in hospitals in England between 2002 and 2009 among the original baseline participants. Of these 47 events, 43 occurred in participants who were no longer active in the study and 4 in participants who had completed questionnaires but had not reported a stroke event.
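The counts reported above can be cross-checked with a few lines of arithmetic. The following is only an illustrative sketch in Python: the variable names are hypothetical and are not taken from the study, and the only inputs are the figures quoted in this abstract. It recomputes the false-positive proportion and tallies the validation sources.

```python
# Counts quoted in the Results paragraph of this abstract.
# All variable names are illustrative assumptions, not study terminology.
self_reported = 106
false_positives = 8
validated_by = {
    "HES": 66,
    "hospital_records_alone": 16,
    "gp_letter_alone": 12,
}
hes_only_events = {
    "no_longer_active": 43,
    "completed_questionnaire_no_report": 4,
}

# Proportion of self-reported events discarded after medical review.
false_positive_pct = round(100 * false_positives / self_reported, 1)

# Self-reported events with a stated validation source.
validated_total = sum(validated_by.values())

# Self-reported events whose validation source is not stated in the abstract.
unattributed = self_reported - false_positives - validated_total

# Additional (not self-reported) events found through HES.
hes_additional_total = sum(hes_only_events.values())

print(false_positive_pct, validated_total, unattributed, hes_additional_total)
```

The 66, 16 and 12 validated events sum to 94, so together with the 8 false positives the abstract accounts for 102 of the 106 self-reported events. It does not state how the remaining 4 were classified.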


Validating self-reported strokes in cohort studies with information from the NHS HES database was efficient and provided information on probable non-fatal stroke events among cohort members no longer in active follow-up. Manual extraction from hospital notes can provide supplementary information beyond that available in the HES discharge summary and was used to sub-type some strokes, but the process was labour-intensive. Multiple sources are needed to capture maximum information on stroke events, but as patients are increasingly hospitalised in the acute phase of stroke, HES has an important role. Further development of HES is required to assure its validity and coverage.

Keywords: Stroke; Cohort studies; Self-report; Validation; Medical records; NHS HES database