Evaluation of data completeness in the electronic health record for the purpose of patient recruitment into clinical trials: a retrospective analysis of element presence

Felix Köpcke1*, Benjamin Trinczek2, Raphael W Majeed3, Björn Schreiweis4, Joachim Wenk5, Thomas Leusch6, Thomas Ganslandt7, Christian Ohmann5, Björn Bergh4, Rainer Röhrig3, Martin Dugas2 and Hans-Ulrich Prokosch1

Author Affiliations

1 Lehrstuhl für Medizinische Informatik, Friedrich-Alexander-Universität Erlangen-Nürnberg, Krankenhausstraße 12, Erlangen 91054, Germany

2 Institute of Medical Informatics, University of Münster, Albert-Schweitzer-Campus 1, Gebäude A11, Münster 48149, Germany

3 Anaesthesiologie und operative Intensivmedizin in Gießen, Rudolf-Buchheim-Straße 7, Gießen 35392, Germany

4 Universitätsklinikum Heidelberg, Zentrum für Informations- und Medizintechnik, Sektion Medizinische Informationssysteme, Speyerer Straße 4, Heidelberg D-69115, Germany

5 Koordinierungszentrum für Klinische Studien, Medizinische Fakultät, Heinrich-Heine-Universität, Moorenstr. 5, Düsseldorf 40225, Germany

6 Universitätsklinikum Düsseldorf, Abt. Datenverarbeitung D05.IKT, Anwendungsbetreuung Medico, Moorenstr. 5, Düsseldorf 40225, Germany

7 Universitätsklinikum Erlangen, Medizinisches Zentrum für Informations- und Kommunikationstechnik, Krankenhausstraße 12, Erlangen 91054, Germany

BMC Medical Informatics and Decision Making 2013, 13:37  doi:10.1186/1472-6947-13-37

Published: 21 March 2013



Computerized clinical trial recruitment support is a promising application of routine care data in clinical research. The primary task is to compare the eligibility criteria defined in trial protocols with the patient data contained in the electronic health record (EHR). To avoid implementing different patient definitions in multi-site trials, all participating research sites should use similar patient data from the EHR. Defining a common set of criteria therefore requires knowing which data elements are available in most EHRs. The objective of this research is to determine, for five tertiary care providers, the extent of available data compared with the eligibility criteria of randomly selected clinical trials.


Each participating study site selected three clinical trials at random. All eligibility criteria sentences were broken up into independent patient characteristics, which were then assigned to one of the 27 semantic categories for eligibility criteria developed by Luo et al. We report on the fraction of patient characteristics with corresponding structured data elements in the EHR and on the fraction of patients with available data for these elements. The completeness of EHR data for the purpose of patient recruitment is calculated for each semantic group.
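The per-category calculation described above can be sketched in Python. This is a minimal illustration, not the authors' implementation: the `Characteristic` record, its field names, and the toy values are hypothetical, and it assumes that completeness is the product of the two reported fractions, which the text does not state explicitly.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Characteristic:
    """One patient characteristic extracted from an eligibility criterion (hypothetical record)."""
    category: str            # semantic category, e.g. "age"
    has_element: bool        # True if a structured EHR data element exists
    patient_fraction: float  # share of patients with a value for that element


def completeness_by_category(characteristics):
    """Return element fraction, data fraction and their product per category."""
    groups = defaultdict(list)
    for c in characteristics:
        groups[c.category].append(c)

    result = {}
    for category, items in groups.items():
        with_element = [c for c in items if c.has_element]
        # Fraction of characteristics that can be documented in the EHR.
        element_fraction = len(with_element) / len(items)
        # Mean fraction of patients with data, over documentable characteristics only.
        if with_element:
            data_fraction = sum(c.patient_fraction for c in with_element) / len(with_element)
        else:
            data_fraction = 0.0
        result[category] = {
            "element_fraction": element_fraction,
            "data_fraction": data_fraction,
            # Assumption: completeness is the product of the two fractions.
            "completeness": element_fraction * data_fraction,
        }
    return result


# Toy input, invented for illustration only.
sample = [
    Characteristic("age", True, 1.0),
    Characteristic("age", True, 0.8),
    Characteristic("laboratory", True, 0.5),
    Characteristic("laboratory", False, 0.0),
    Characteristic("consent", False, 0.0),
]
summary = completeness_by_category(sample)
```

Averaging the patient fraction only over characteristics that have a data element keeps the two quantities separate, so a category with no structured elements gets a completeness of zero rather than an undefined value.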


The 351 eligibility criteria from 15 clinical trials contained 706 patient characteristics. On average, 55% of these characteristics could be documented in the EHR. Where corresponding data elements existed, clinical data was available for 64% of all patients. The total completeness of EHR data for recruitment purposes is 35%. The best performing semantic groups were ‘age’ (89%), ‘gender’ (89%), ‘addictive behaviour’ (74%), ‘disease, symptom and sign’ (64%) and ‘organ or tissue status’ (61%). No data was available for 6 semantic groups.
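The reported total is consistent with reading completeness as the product of the two fractions above. This reading is an assumption, since the text gives only the resulting figures; the symbols are introduced here for illustration.

```latex
\[
  C_{\text{total}}
  = f_{\text{element}} \times f_{\text{data}}
  = 0.55 \times 0.64
  = 0.352
  \approx 35\%
\]
```

Here $f_{\text{element}}$ stands for the fraction of patient characteristics with a corresponding data element, and $f_{\text{data}}$ for the fraction of patients with data recorded for those elements.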


There is a significant gap in structure and content between the data documented during patient care and the data required for patient eligibility assessment. Nevertheless, EHR data on the age and gender of the patient, as well as selected information on the patient's disease, can be complete enough to support the manual screening process effectively through an intelligent preselection of patients and patient data.

Patient selection; Research subject recruitment; Clinical trials as topic; Electronic health records; Data quality; Information systems; Database