A method for inferring medical diagnoses from patient similarities
1 Departments of Bioengineering & Genetics, Stanford University, 318 Campus Drive, Stanford 94305, USA
2 Sackler School of Medicine, Tel Aviv University, Klausner St., Tel Aviv 69978, Israel
3 Department of Internal Medicine "B", Beilinson Hospital, Rabin Medical Center, 39 Jabotinski St., Petah-Tikva 49100, Israel
4 Blavatnik School of Computer Science, Tel-Aviv University, Klausner St., Tel Aviv 69978, Israel
BMC Medicine 2013, 11:194. doi:10.1186/1741-7015-11-194. Published: 2 September 2013
Clinical decision support systems assist physicians in interpreting complex patient data. However, they typically operate on a per-patient basis and do not exploit the extensive latent medical knowledge in electronic health records (EHRs). The emergence of large EHR systems offers the opportunity to integrate population information actively into these tools.
Here, we assess the ability of a large corpus of electronic records to predict individual discharge diagnoses. We present a method that exploits similarities between patients along multiple dimensions to predict the eventual discharge diagnoses.
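The general idea of ranking candidate diagnoses by patient similarity can be illustrated with a minimal sketch. It is not the method evaluated here: it assumes a single cosine similarity over a numeric feature vector, whereas the method described combines similarities along multiple dimensions. The function names, the `k` parameter, the feature vectors and the diagnosis labels are all hypothetical.

```python
from collections import defaultdict
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_diagnoses(query, records, k=2):
    """Rank diagnoses for a query patient by similarity-weighted votes.

    records: list of (feature_vector, set_of_diagnoses) for past patients.
    Only the k most similar past patients contribute to the scores.
    """
    scored = [(cosine(query, features), diagnoses) for features, diagnoses in records]
    neighbours = sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]

    scores = defaultdict(float)
    for similarity, diagnoses in neighbours:
        for diagnosis in diagnoses:
            scores[diagnosis] += similarity

    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical toy data: three past patients with made-up features.
records = [
    ([1.0, 0.0, 0.2], {"diabetes"}),
    ([0.9, 0.1, 0.3], {"diabetes", "hypertension"}),
    ([0.0, 1.0, 0.0], {"pneumonia"}),
]
ranked = rank_diagnoses([1.0, 0.0, 0.25], records, k=2)
print(ranked)
```

In this toy example the two nearest past patients both carry the diabetes label, so it receives the largest accumulated score and is ranked first.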
Using demographic data, initial blood and electrocardiography measurements, and the medical history of hospitalized patients from two independent hospitals, we obtained high performance in cross-validation (area under the curve >0.88) and correctly predicted at least one diagnosis among the top ten predictions for more than 84% of the patients tested. Importantly, our method provides accurate predictions (>0.86 precision in cross-validation) for major disease categories, including infectious and parasitic diseases, endocrine and metabolic diseases, and diseases of the circulatory system. This performance holds for both chronic and acute diagnoses.
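The "at least one diagnosis among the top ten predictions" measure can be computed as in the sketch below. It assumes each patient has a ranked list of predicted diagnoses and a set of true discharge diagnoses; the function name and the toy labels are hypothetical, and no figures from the study are reproduced.

```python
def top_k_hit_rate(predictions, truths, k=10):
    """Fraction of patients with at least one true diagnosis in their top k.

    predictions: list of ranked diagnosis lists, one per patient.
    truths: list of sets of true discharge diagnoses, in the same order.
    """
    if not predictions or len(predictions) != len(truths):
        raise ValueError("predictions and truths must be non-empty and aligned")
    hits = sum(
        1 for ranked, true in zip(predictions, truths) if set(ranked[:k]) & true
    )
    return hits / len(predictions)


# Hypothetical toy data for three patients.
predictions = [["a", "b", "c"], ["d", "e", "f"], ["g", "h", "i"]]
truths = [{"b"}, {"z"}, {"i", "x"}]

rate_top2 = top_k_hit_rate(predictions, truths, k=2)
rate_top3 = top_k_hit_rate(predictions, truths, k=3)
print(rate_top2, rate_top3)
```

With `k=2` only the first patient has a true diagnosis in the shortlist. Widening to `k=3` also captures the third patient.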
Our results suggest that one can harness the wealth of population-based information embedded in electronic health records for patient-specific predictive tasks.