Open Access Highly Accessed Correspondence

Distributed data processing for public health surveillance

Ross Lazarus1*, Katherine Yih2 and Richard Platt1,2

Author Affiliations

1 Channing Laboratory, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA

2 Department of Ambulatory Care and Prevention, Harvard Medical School, Harvard Pilgrim Health Care; Harvard Vanguard Medical Associates, Boston, MA, USA

BMC Public Health 2006, 6:235  doi:10.1186/1471-2458-6-235

Published: 19 September 2006

Abstract

Background

Many systems for routine public health surveillance rely on centralized collection of potentially identifiable, individual personal health information (PHI) records. Although individual, identifiable patient records are essential for conditions for which reporting is mandated, such as tuberculosis or sexually transmitted diseases, they are not routinely required for effective syndromic surveillance. Public concern about the routine collection of large quantities of PHI to support non-traditional public health functions may make alternative surveillance methods that do not rely on centralized identifiable PHI databases increasingly desirable.

Methods

The National Bioterrorism Syndromic Surveillance Demonstration Program (NDP) is an example of one alternative model. All PHI in this system is initially processed within the secured infrastructure of the health care provider that collects and holds the data, using uniform software distributed and supported by the NDP. Only highly aggregated count data is transferred to the datacenter for statistical processing and display.
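The local aggregation step described above can be illustrated with a minimal sketch. This is not the NDP software: the `Encounter` record, its field names, and the `aggregate_counts` function are hypothetical, chosen only to show how patient-level records might be reduced to counts inside the provider's infrastructure so that no identifiers leave the site.

```python
from collections import Counter
from typing import Dict, Iterable, List, NamedTuple, Union


class Encounter(NamedTuple):
    """Hypothetical patient-level record held only by the provider."""
    patient_id: str   # identifiable; never leaves the provider
    visit_date: str   # ISO date, e.g. "2006-09-19"
    zip_code: str
    syndrome: str


def aggregate_counts(
    encounters: Iterable[Encounter],
) -> List[Dict[str, Union[str, int]]]:
    """Reduce patient-level records to counts per (date, zip, syndrome).

    The returned rows carry no patient identifiers; in this sketch they
    are the only data that would be sent to the datacenter.
    """
    counts = Counter(
        (e.visit_date, e.zip_code, e.syndrome) for e in encounters
    )
    return [
        {"date": d, "zip": z, "syndrome": s, "count": n}
        for (d, z, s), n in sorted(counts.items())
    ]


# Example run at the provider site (made-up records for illustration).
local_records = [
    Encounter("p001", "2006-09-19", "02115", "respiratory"),
    Encounter("p002", "2006-09-19", "02115", "respiratory"),
    Encounter("p003", "2006-09-19", "02116", "gastrointestinal"),
]
payload = aggregate_counts(local_records)
```

In this sketch the patient-level list stays with the provider, where it remains available for follow-up of any signal seen in the counts; only `payload` would be transmitted.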

Results

Detailed, patient-level information is readily available to the health care provider to elucidate signals observed in the aggregated data, or for ad hoc queries. We briefly describe the benefits and disadvantages associated with this distributed processing model for routine automated syndromic surveillance.

Conclusion

For well-defined surveillance requirements, the model can be successfully deployed with very low risk of inadvertent disclosure of PHI – a feature that may make participation in surveillance systems more feasible for organizations and more appealing to the individuals whose PHI they hold. It is possible to design and implement distributed systems to support non-routine public health needs if required.