Open Access Research article

A new approach to analyse longitudinal epidemiological data with an excess of zeros

Alette S Spriensma1,2,3*, Tibor RS Hajos2,4, Michiel R de Boer3,5, Martijn W Heymans1,2,3 and Jos WR Twisk1,2,3

Author Affiliations

1 Department of Epidemiology and Biostatistics, VU University Medical Center, P.O. Box 7057, Amsterdam, 1007 MB, The Netherlands

2 EMGO Institute for Health and Care Research, VU University Medical Center, Van der Boechorststraat 7, Amsterdam, 1081 BT, The Netherlands

3 Department of Methodology and Applied Biostatistics, Faculty of Earth and Life Sciences, Institute of Health Sciences, VU University, de Boelelaan 1085, Amsterdam, 1081 HV, The Netherlands

4 Department of Medical Psychology, VU University Medical Centre, Van der Boechorststraat 7, Amsterdam, 1081 BT, The Netherlands

5 Department of Health Sciences, University of Groningen, Antonius Deusinglaan 1, Groningen, 9713 AV, The Netherlands


BMC Medical Research Methodology 2013, 13:27  doi:10.1186/1471-2288-13-27

Published: 20 February 2013

Abstract

Background

Within longitudinal epidemiological research, ‘count’ outcome variables with an excess of zeros occur frequently. Although these outcomes are often analysed with a linear mixed model or a Poisson mixed model, a two-part mixed model would be better suited to analysing outcome variables with an excess of zeros. Therefore, the objective of this paper was to introduce the relatively ‘new’ method of two-part joint regression modelling in longitudinal data analysis for outcome variables with an excess of zeros, and to compare the performance of this method with that of current approaches.

Methods

Within an observational longitudinal dataset, we compared three techniques: two ‘standard’ approaches (a linear mixed model and a Poisson mixed model) and a two-part joint mixed model (a binomial/Poisson mixed distribution model), including random intercepts and random slopes. Model fit indicators and differences between predicted and observed values were used for comparison. The analyses were performed in Stata using the GLLAMM procedure.
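To make the idea of a two-part count model concrete, the following is a minimal sketch only, not the analysis reported here. It omits the random intercepts and slopes, uses Python rather than Stata/GLLAMM, and assumes a hurdle-style formulation (a Bernoulli part for zero versus non-zero, and a zero-truncated Poisson part for the positive counts), which this abstract does not specify. The toy counts and all function names are hypothetical.

```python
import math

def poisson_logpmf(y, mu):
    """Log-probability of count y under a Poisson distribution with mean mu."""
    return y * math.log(mu) - mu - math.lgamma(y + 1)

def poisson_loglik(counts, mu):
    """Log-likelihood of a single Poisson distribution for all counts."""
    return sum(poisson_logpmf(y, mu) for y in counts)

def two_part_loglik(counts, p_nonzero, mu):
    """Log-likelihood of a hurdle-style two-part model (an assumed form).

    Part 1: Bernoulli(p_nonzero) decides zero vs. non-zero.
    Part 2: zero-truncated Poisson(mu) describes the positive counts.
    """
    log_p_positive = math.log(-math.expm1(-mu))  # log P(Y > 0) under Poisson(mu)
    total = 0.0
    for y in counts:
        if y == 0:
            total += math.log1p(-p_nonzero)
        else:
            total += math.log(p_nonzero) + poisson_logpmf(y, mu) - log_p_positive
    return total

def truncated_poisson_mle(mean_positive, tol=1e-10):
    """Solve mu / (1 - exp(-mu)) = mean_positive by bisection (needs mean > 1)."""
    lo, hi = 1e-9, mean_positive
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mid / -math.expm1(-mid) < mean_positive:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical toy data: 14 zeros out of 20 observations (an "excess of zeros").
counts = [0] * 14 + [1, 2, 2, 3, 4, 6]
positives = [y for y in counts if y > 0]

mu_poisson = sum(counts) / len(counts)          # Poisson maximum-likelihood mean
p_nonzero = len(positives) / len(counts)        # share of non-zero observations
mu_positive = truncated_poisson_mle(sum(positives) / len(positives))

ll_poisson = poisson_loglik(counts, mu_poisson)
ll_two_part = two_part_loglik(counts, p_nonzero, mu_positive)
print(f"Poisson log-likelihood:  {ll_poisson:.3f}")
print(f"Two-part log-likelihood: {ll_two_part:.3f}")
```

On data like these, modelling the zeros separately lets the count part describe only the positive values, which is the intuition behind comparing model fit between the Poisson and two-part approaches.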

Results

Among the random intercept models, the two-part joint mixed model (binomial/Poisson) performed best. Adding random slopes for time to the models changed the sign of the regression coefficient in both the Poisson mixed model and the two-part joint mixed model (binomial/Poisson) and resulted in a much better fit.

Conclusion

This paper showed that a two-part joint mixed model is more appropriate than a linear mixed model or a Poisson mixed model for analysing longitudinal data with an excess of zeros. However, in a model with random slopes for time, a Poisson mixed model also performed remarkably well.

Keywords:
Two-part joint model; Excess of zeros; Count; Mixed modelling; Longitudinal; Statistical methods