Open Access Research article

Optimizing the diagnostic power with gastric emptying scintigraphy at multiple time points

Qingjiang Hou1, Zhiyue Lin2, Reginald Dusing3, Byron J Gajewski1,4, Richard W McCallum5 and Matthew S Mayo1*

Author Affiliations

1 Department of Biostatistics, School of Medicine, University of Kansas Medical Center, Kansas City, Kansas 66160, USA

2 Department of Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA

3 Department of Radiology, School of Medicine, University of Kansas Medical Center, Kansas City, Kansas 66160, USA

4 School of Nursing, University of Kansas Medical Center, Kansas City, Kansas 66160, USA

5 Department of Internal Medicine, Texas Tech University Health Science Center, El Paso, TX 79905, USA


BMC Medical Research Methodology 2011, 11:84  doi:10.1186/1471-2288-11-84

Published: 31 May 2011



Gastric Emptying Scintigraphy (GES) at intervals over 4 hours after a standardized radiolabeled meal is commonly regarded as the gold standard for diagnosing gastroparesis. The objectives of this study were: 1) to investigate the best time point and the best combination of multiple time points for diagnosing gastroparesis with repeated GES measures, and 2) to contrast and cross-validate Fisher's Linear Discriminant Analysis (LDA), a rank-based Distribution Free (DF) approach, and the Classification And Regression Tree (CART) model.


Data from a total of 320 patients with GES measures at 1, 2, 3, and 4 hours (h) after a standard meal, obtained using a standardized method, were retrospectively collected. The area under the Receiver Operating Characteristic (ROC) curve and the rate of false classification through jackknife cross-validation were used for model comparison.
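The two comparison criteria named above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the retention values and labels are invented, the function names (`auc`, `best_cutoff`, `jackknife_error`) are hypothetical, and the classifier is a simple single-cutoff rule standing in for the LDA, DF, and CART models.

```python
def auc(scores, labels):
    """Empirical area under the ROC curve: the fraction of
    (positive, negative) pairs in which the positive case has the
    higher score, with ties counted as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_cutoff(scores, labels):
    """Cutoff with the fewest training errors for the rule
    'score > cutoff means positive' (assumed stand-in classifier)."""
    def errors(c):
        return sum((s > c) != (y == 1) for s, y in zip(scores, labels))
    return min(sorted(set(scores)), key=errors)


def jackknife_error(scores, labels):
    """Leave-one-out (jackknife) rate of false classification:
    refit the cutoff without case i, then classify case i."""
    wrong = 0
    for i in range(len(scores)):
        train_s = scores[:i] + scores[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        cutoff = best_cutoff(train_s, train_y)
        wrong += (scores[i] > cutoff) != (labels[i] == 1)
    return wrong / len(scores)


# Invented 4-h retention values (%) and diagnoses (1 = gastroparesis);
# these are NOT data from the study.
retention = [2, 5, 8, 12, 9, 15, 22, 30, 7, 40]
diagnosis = [0, 0, 0, 0, 1, 1, 1, 1, 0, 1]

example_auc = auc(retention, diagnosis)
example_error = jackknife_error(retention, diagnosis)
print(f"AUC = {example_auc:.2f}, jackknife error = {example_error:.1%}")
```

The cutoff is refit inside the loop so that each held-out patient plays no part in choosing the rule that classifies them, which is what makes the error rate a cross-validated estimate rather than a resubstitution one.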


Because of strong correlation among the measures and an abnormality in the data distribution, the best linear combination found by the LDA approach gave no substantial improvement in diagnostic power, even with data transformation. With the DF method, the linear combination of the 4-h and 3-h measures increased the Area Under the Curve (AUC) and decreased the rate of false classification (0.87; 15.0%) relative to the individual time points (0.83 and 15.6% for 4-h; 0.82 and 25.3% for 3-h) at a higher sensitivity level (sensitivity = 0.9). The CART model using the four hourly GES measurements along with patient age was the most accurate diagnostic tool (AUC = 0.88, false classification = 13.8%). Patients with a 4-h gastric retention value >10% were about 5 times as likely to have gastroparesis (179/207 = 86.5%) as those with ≤10% (18/113 = 15.9%).
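The comparison of the two 4-h retention groups can be checked directly from the counts quoted above. The short Python sketch below only reproduces that arithmetic; the variable names are illustrative, and the ratio of proportions is a plain descriptive quantity, not a model estimate from the study.

```python
# Counts as reported in the text for the 4-h gastric retention split.
cases_high, total_high = 179, 207   # retention > 10%
cases_low, total_low = 18, 113      # retention <= 10%

# Proportion with gastroparesis in each group.
rate_high = cases_high / total_high
rate_low = cases_low / total_low

# Ratio of the two proportions (how many times as likely).
ratio = rate_high / rate_low

# The two groups together should account for all 320 patients.
total_patients = total_high + total_low

print(f"{rate_high:.1%} vs {rate_low:.1%}, ratio = {ratio:.1f}, n = {total_patients}")
```

The two groups sum to the 320 patients in the study, and the ratio of proportions comes out at roughly 5.4, consistent with the "5 times" figure in the text.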


In a mixed group of patients, either referred with suspected gastroparesis or investigated for other reasons, the CART model is more robust than the LDA and DF approaches. It can accommodate covariate effects and can be generalized for cross-institutional applications, but it could be unstable if the sample size is limited.