An adaptive prediction and detection algorithm for multistream syndromic surveillance
National Security Technology Department, The Johns Hopkins University Applied Physics Laboratory, Laurel, MD 20723-6099, USA
BMC Medical Informatics and Decision Making 2005, 5:33 doi:10.1186/1472-6947-5-33. Published: 12 October 2005
Surveillance of over-the-counter (OTC) pharmaceutical sales as a potential early indicator of developing public health conditions, in particular in cases of interest to biosurveillance, has been suggested in the literature. This paper is a continuation of a previous study in which we formulated the problem of estimating clinical data from OTC sales in terms of optimal LMS linear and Finite Impulse Response (FIR) filters. In this paper we extend our results to predict clinical data multiple steps ahead using OTC sales as well as the clinical data itself.
The OTC data are grouped into a few categories, and we predict the clinical data using a multichannel filter whose inputs are the past values of all OTC categories as well as the past clinical data itself. The prediction is performed using Finite Impulse Response (FIR) filters and the recursive least squares method in order to adapt rapidly to nonstationary behaviour. In addition, we inject simulated events into both the clinical and OTC data streams and evaluate the predictions by computing the Receiver Operating Characteristic (ROC) curves of a threshold detector based on the predicted outputs.
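The multichannel prediction step described above can be illustrated with a minimal sketch of an exponentially weighted recursive least squares (RLS) FIR predictor. This is not the authors' implementation: the function name `rls_multistep_predict`, the tap count, the forgetting factor `lam`, and the initialisation constant `delta` are all assumptions chosen for illustration.

```python
def rls_multistep_predict(channels, target, taps=3, horizon=1, lam=0.98, delta=100.0):
    """Predict target[t + horizon] from the last `taps` samples of every channel.

    channels: list of equal-length sequences (e.g. OTC categories plus the
              clinical series itself); target: the series to be predicted.
    All parameter values are illustrative assumptions, not the paper's settings.
    """
    n = len(target)
    dim = taps * len(channels)
    w = [0.0] * dim  # stacked FIR weights for all channels
    # Inverse correlation matrix estimate, initialised to delta * I.
    P = [[delta if i == j else 0.0 for j in range(dim)] for i in range(dim)]

    def regressor(t):
        # Most recent `taps` samples of every channel, ending at time t.
        return [ch[t - k] for ch in channels for k in range(taps)]

    preds = [None] * n
    for t in range(taps - 1, n):
        s = t - horizon  # time at which the regressor for target[t] was formed
        if s >= taps - 1:
            # Standard RLS update using the newly observed target[t].
            x = regressor(s)
            Px = [sum(P[i][j] * x[j] for j in range(dim)) for i in range(dim)]
            denom = lam + sum(x[i] * Px[i] for i in range(dim))
            gain = [v / denom for v in Px]
            err = target[t] - sum(w[i] * x[i] for i in range(dim))  # a priori error
            w = [w[i] + gain[i] * err for i in range(dim)]
            P = [[(P[i][j] - gain[i] * Px[j]) / lam for j in range(dim)]
                 for i in range(dim)]
        if t + horizon < n:
            # Forecast `horizon` steps ahead from data available at time t.
            x_now = regressor(t)
            preds[t + horizon] = sum(w[i] * x_now[i] for i in range(dim))
    return preds, w
```

Passing the clinical series both as one of the `channels` and as `target` corresponds to predicting clinical data from OTC sales and from the clinical data itself. A forgetting factor below 1 discounts older samples, which is what lets the filter track nonstationary behaviour.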
We present all prediction results showing the effectiveness of the combined filtering operation. In addition, we compute and present the performance of a detector using the prediction output.
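As a rough illustration of how the performance of a threshold detector can be summarised, the sketch below computes ROC points by sweeping a threshold over a detection statistic. The choice of statistic is an assumption (for example, the predicted value or its excess over a baseline); `scores` and `labels` are hypothetical names, with `labels` marking the time steps that contain an injected event.

```python
def roc_points(scores, labels):
    """Return (false_alarm_rate, detection_rate) pairs, one per threshold.

    scores: detection statistic per time step (assumed: larger = more alarming).
    labels: truthy where a simulated event is present, falsy otherwise.
    Both classes must be present in `labels`.
    """
    pos = sum(1 for y in labels if y)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]  # threshold above every score: no alarms at all
    for thr in sorted(set(scores), reverse=True):
        # An alarm is raised whenever the statistic reaches the threshold.
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and not y)
        points.append((fp / neg, tp / pos))
    return points
```

Each point pairs a false alarm rate with the detection rate achieved at the same threshold, so different predictors or prediction horizons can be compared at a fixed false alarm rate.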
Multichannel adaptive FIR least squares filtering provides a viable method of predicting public health conditions, as represented by clinical data, from OTC sales and/or the clinical data. The potential value to a biosurveillance system cannot, however, be determined without studying this approach in the presence of transient events (nonstationary events of relatively short duration and fast rise times). Our simulated events superimposed on actual OTC and clinical data allow us to provide an upper bound on that potential value under some restricted conditions. Based on our ROC curves, we argue that a biosurveillance system can provide early warning of an impending clinical event using ancillary data streams (such as OTC) that have established correlations with the clinical data, together with a prediction method that can react sufficiently quickly to nonstationary events. Whether OTC sales (or other data streams yet to be identified) provide the best source for predicting clinical data is still an open question. We present a framework and an example to show how to measure the effectiveness of predictions, and compute an upper bound on this performance for the Recursive Least Squares method when the following two conditions are met: (1) an event of sufficient strength exists in both data streams, without distortion, and (2) it occurs in the OTC (or other ancillary streams) earlier than in the clinical data.
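The two conditions above can be made concrete with a small sketch of event injection. It assumes a purely additive event with the same shape in both streams, appearing `lead` time steps earlier in the OTC series; the function names, the event shape, and the `scale` parameter are hypothetical and are not taken from the paper.

```python
def inject_event(series, start, shape, scale=1.0):
    """Return a copy of `series` with `scale * shape` added from index `start`."""
    out = list(series)
    for k, v in enumerate(shape):
        if 0 <= start + k < len(out):  # ignore any part falling outside the series
            out[start + k] += scale * v
    return out


def inject_paired_event(otc, clinical, start, shape, lead):
    """Superimpose the same undistorted event on both streams.

    The event begins at `start` in the clinical series and `lead` steps
    earlier in the OTC series (assumption: additive, identical shape).
    """
    return (inject_event(otc, start - lead, shape),
            inject_event(clinical, start, shape))
```

For example, `inject_paired_event(otc, clinical, start=3, shape=[1, 2, 1], lead=2)` adds the event to the OTC series at indices 1 to 3 and to the clinical series at indices 3 to 5.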