Research article

Defining upper gastrointestinal bleeding from linked primary and secondary care data and the effect on occurrence and 28 day mortality

Colin John Crooks1,2*, Timothy Richard Card1,2 and Joe West1,2

Author Affiliations

1 Division of Epidemiology and Public Health, The University of Nottingham, Clinical Sciences Building 2, City Hospital, Nottingham, NG5 1PB, UK

2 Nottingham Digestive Diseases Centre, National Institute for Health Research Biomedical Research Unit, Queen’s Medical Centre, Nottingham University Hospitals National Health Service Trust, Nottingham, NG7 2UH, UK

BMC Health Services Research 2012, 12:392  doi:10.1186/1472-6963-12-392

Published: 13 November 2012



Primary care records from the UK have frequently been used to identify episodes of upper gastrointestinal bleeding in studies of drug toxicity because of their comprehensive population coverage and longitudinal recording of prescriptions and diagnoses. Recent linkage within England of primary and secondary care data has augmented these data, but the timing and coding of concurrent events, and how the definition of events in linked data affects occurrence and 28 day mortality, are not known.


We used the recently linked English Hospital Episode Statistics (HES) and General Practice Research Database (GPRD), 1997–2010, to define events by: a specific upper gastrointestinal bleed code in either dataset, a specific bleed code in both datasets, or a less specific but plausible code from the linked dataset.
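
As an illustration only, the three case definitions could be expressed along the following lines. This is a minimal sketch, not the study's code: the `Event` fields are hypothetical flags rather than real GPRD or HES field names, and the third rule reflects one possible reading of "a less specific but plausible code from the linked dataset", namely a specific code in one dataset supported by at least a plausible code in the other.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Event:
    # Hypothetical flags for one candidate event; not real GPRD/HES fields.
    gprd_specific: bool   # specific upper GI bleed code in primary care
    hes_specific: bool    # specific upper GI bleed code in secondary care
    gprd_plausible: bool  # less specific but plausible code in primary care
    hes_plausible: bool   # less specific but plausible code in secondary care


def case_definitions(e: Event) -> Dict[str, bool]:
    """Return which of the three case definitions the event satisfies."""
    # Broadest: a specific code in either dataset.
    either = e.gprd_specific or e.hes_specific
    # Most restrictive: a specific code in both datasets.
    both = e.gprd_specific and e.hes_specific
    # Assumed reading: a specific code in one dataset, supported by a
    # specific or plausible code in the linked dataset.
    supported = (
        (e.gprd_specific and (e.hes_specific or e.hes_plausible))
        or (e.hes_specific and (e.gprd_specific or e.gprd_plausible))
    )
    return {"either": either, "both": both, "plausible_linked": supported}
```

Under this reading, every event meeting the "both" definition also meets "plausible_linked", which in turn is a subset of "either", so the definitions are nested from most to least restrictive.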


With this approach, 81% of secondary care defined bleeds had a corresponding plausible code within 2 months in primary care. However, only 62% of primary care defined bleeds had a corresponding plausible HES admission within 2 months. The more restrictive and specific case definitions excluded severe events and almost halved the 28 day case fatality when compared to broader and more sensitive definitions.
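
The two summary measures above (the proportion of events with a corresponding code within 2 months, and 28 day case fatality) can be sketched as follows. This is a hedged illustration with hypothetical function names and input shapes. It assumes that "within 2 months" means within 61 days in either direction and that case fatality counts deaths from day 0 to day 28 after the event; the abstract does not specify either detail.

```python
from datetime import date, timedelta

# Assumption: "2 months" approximated as 61 days, before or after the event.
WINDOW = timedelta(days=61)


def has_match_within(index_date, linked_dates, window=WINDOW):
    """True if any linked record falls within the window of the index event."""
    return any(abs(d - index_date) <= window for d in linked_dates)


def proportion_matched(index_events, linked_by_patient, window=WINDOW):
    """index_events: list of (patient_id, event_date).
    linked_by_patient: dict mapping patient_id to a list of record dates
    in the other dataset."""
    if not index_events:
        return 0.0
    matched = sum(
        has_match_within(d, linked_by_patient.get(pid, []), window)
        for pid, d in index_events
    )
    return matched / len(index_events)


def case_fatality_28d(cases):
    """cases: list of (event_date, death_date or None)."""
    if not cases:
        return 0.0
    deaths = sum(
        1
        for event_date, death_date in cases
        if death_date is not None
        and timedelta(0) <= death_date - event_date <= timedelta(days=28)
    )
    return deaths / len(cases)
```

Applying `case_fatality_28d` separately to the cases selected by each case definition would show how the choice of definition changes the estimate.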


Restrictive definitions of gastrointestinal bleeding in linked datasets fail to capture the full heterogeneity of coding that is possible following complex clinical events. Conversely, too broad a definition in primary care introduces events not severe enough to warrant hospital admission. Ignoring these issues may unwittingly introduce selection bias into a study’s results.

Selection bias; Mortality; Data linkage; Upper gastrointestinal bleeding; Case definitions