Email updates

Keep up to date with the latest news and content from BMC Research Notes and BioMed Central.

Open Access Research article

Input data quality control for NDNQI national comparative statistics and quarterly reports: a contrast of three robust scale estimators for multiple outlier detection

Qingjiang Hou1*, Brandon Crosser2, Jonathan D Mahnken1, Byron J Gajewski12 and Nancy Dunton2

Author Affiliations

1 Department of Biostatistics, University of Kansas Medical Center, 3901 Rainbow Blvd, Kansas City, KS 66160, USA

2 School of Nursing, University of Kansas Medical Center, 3901 Rainbow Blvd., Kansas City, KS 66160, USA

For all author emails, please log on.

BMC Research Notes 2012, 5:456  doi:10.1186/1756-0500-5-456

Published: 25 August 2012

Abstract

Background

To evaluate institutional nursing care performance in the context of national comparative statistics (benchmarks), approximately one in every three major healthcare institutions (over 1,800 hospitals) across the United States, have joined the National Database for Nursing Quality Indicators® (NDNQI®). With over 18,000 hospital units contributing data for nearly 200 quantitative measures at present, a reliable and efficient input data screening for all quantitative measures for data quality control is critical to the integrity, validity, and on-time delivery of NDNQI reports.

Methods

With Monte Carlo simulation and quantitative NDNQI indicator examples, we compared two ad-hoc methods using robust scale estimators, Inter Quartile Range (IQR) and Median Absolute Deviation from the Median (MAD), to the classic, theoretically-based Minimum Covariance Determinant (FAST-MCD) approach, for initial univariate outlier detection.

Results

While the theoretically based FAST-MCD used in one dimension can be sensitive and is better suited for identifying groups of outliers because of its high breakdown point, the ad-hoc IQR and MAD approaches are fast, easy to implement, and could be more robust and efficient, depending on the distributional property of the underlying measure of interest.

Conclusion

With highly skewed distributions for most NDNQI indicators within a short data screen window, the FAST-MCD approach, when used in one dimensional raw data setting, could overestimate the false alarm rates for potential outliers than the IQR and MAD with the same pre-set of critical value, thus, overburden data quality control at both the data entry and administrative ends in our setting.

Keywords:
NDNQI; Interquartile range; Median absolute deviation; FAST-MCD; Outlier; Quality control