Open Access Research article

Rasch fit statistics and sample size considerations for polytomous data

Adam B Smith1,2*, Robert Rush3, Lesley J Fallowfield4, Galina Velikova1 and Michael Sharpe5

Author Affiliations

1 Cancer Research UK – Clinical Centre, St. James's University Hospital, Leeds, UK

2 Centre for Health & Social Care, University of Leeds, Leeds, UK

3 Centre for Integrated Health Research, Queen Margaret University, Edinburgh, UK

4 Psychosocial Oncology Group – Cancer Research UK, University of Sussex, UK

5 School of Molecular & Clinical Medicine, University of Edinburgh, Edinburgh, UK


BMC Medical Research Methodology 2008, 8:33  doi:10.1186/1471-2288-8-33

Published: 29 May 2008



Previous research on educational data has demonstrated that Rasch fit statistics (mean squares and t-statistics) are highly sensitive to sample size for dichotomously scored rating data, although little is known about this relationship for polytomous data. These statistics inform researchers about how well items fit a unidimensional latent trait, and are an important adjunct to modern psychometrics. Given the increasing use of Rasch models in health research, the purpose of this study was to explore the relationship between fit statistics and sample size for polytomous data.


Data were collated from a heterogeneous sample of cancer patients (n = 4072) who had completed both the Patient Health Questionnaire-9 and the Hospital Anxiety and Depression Scale. Ten samples were drawn with replacement for each of eight sample sizes (n = 25 to n = 3200). The Rating Scale and Partial Credit Models were applied, and the mean square and t fit statistics (infit and outfit) were derived for each model.
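The resampling design described above can be sketched in a few lines of Python. This is an illustration only, not the authors' code: the abstract gives the end points (25 and 3200) and the number of sample sizes (eight), so the doubling sequence used here is an assumption that happens to satisfy both, and the function name `draw_resamples` and the fixed seed are hypothetical.

```python
import random

# Assumed doubling sequence: the abstract states only the end points
# (25 and 3200) and that there were eight sizes; doubling satisfies both.
SAMPLE_SIZES = [25 * 2**k for k in range(8)]  # 25, 50, ..., 3200
N_REPLICATES = 10   # ten samples per size, as stated in the abstract
N_PATIENTS = 4072   # size of the full patient data set


def draw_resamples(n_patients, sizes, n_replicates, seed=0):
    """Return {size: [replicate, ...]}, where each replicate is a list of
    row indices drawn with replacement from range(n_patients)."""
    rng = random.Random(seed)  # seeded so the draws are reproducible
    population = range(n_patients)
    return {
        size: [rng.choices(population, k=size) for _ in range(n_replicates)]
        for size in sizes
    }


resamples = draw_resamples(N_PATIENTS, SAMPLE_SIZES, N_REPLICATES)
```

Each list of indices would then select the corresponding patient rows before the two models are fitted to that sample. Drawing with replacement means a patient can appear more than once in the same sample.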


The results demonstrated that t-statistics were highly sensitive to sample size, whereas mean square statistics remained relatively stable for polytomous data.
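One way to see why the two kinds of statistic could behave so differently is to look at how a mean square is commonly standardized. The sketch below uses the Wilson-Hilferty cube-root transformation, which is widely used in Rasch software; whether the study's software applied exactly this form is not stated here. The standard deviation of the mean square, `q`, is approximated as sqrt(2/n) purely for illustration (Rasch software derives it from the model variances of the residuals), and the function name `standardized_fit` and the example mean square of 1.2 are hypothetical.

```python
import math


def standardized_fit(mean_square, n, q=None):
    """Wilson-Hilferty cube-root transformation of a mean square into an
    approximately unit-normal t statistic.

    q is the model standard deviation of the mean square. When q is not
    supplied it is set to sqrt(2 / n), a chi-square approximation used
    here only for illustration (an assumption, not the study's method).
    """
    if q is None:
        q = math.sqrt(2.0 / n)
    return (mean_square ** (1.0 / 3.0) - 1.0) * (3.0 / q) + q / 3.0


# Hold the mean square fixed and change only the sample size.
SIZES = [25 * 2**k for k in range(8)]  # assumed doubling from 25 to 3200
t_values = {n: standardized_fit(1.2, n) for n in SIZES}

for n, t in t_values.items():
    print(f"n = {n:4d}  mean square = 1.20  t = {t:5.2f}")
```

Under these assumptions the mean square is identical in every row while t rises steadily with n, because `q` shrinks as the sample grows. This is consistent with the pattern reported above, though it is a simplified illustration rather than a reproduction of the study's results.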


It was concluded that mean square statistics were relatively independent of sample size for polytomous data and that misfit to the model could be identified using published recommended ranges.