An overview of the statistical methods reported by studies using the Canadian community health survey
Department of Community Health Sciences, University of Calgary, Calgary, Alberta, Canada
BMC Medical Research Methodology 2014, 14:15 doi:10.1186/1471-2288-14-15
Published: 25 January 2014
The Canadian Community Health Survey (CCHS) is a cross-sectional survey that has collected information on health determinants, health status and the utilization of the health system in Canada since 2001. Several hundred articles have been written using the CCHS dataset. Previous analyses of the statistical methods used in the literature have focused on a particular journal or set of journals in order to gauge the statistical literacy required to understand the published research. In this study, we describe the statistical methods referenced in the published literature that uses the CCHS dataset(s).
A descriptive study was undertaken of references published in Medline, Embase, Web of Knowledge and Scopus associated with the CCHS. These references were imported into a Java application built on the searchable Apache Lucene text database and screened against pre-defined inclusion and exclusion criteria. Full-text PDF articles that met the inclusion criteria were then used to identify the descriptive, elementary and regression statistical methods referenced in these articles. The statistical methods were identified through an automated keyword search of the full-text articles using the Java application.
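The automated keyword search described above can be illustrated with a minimal Python sketch. This is not the authors' Java/Lucene application: the keyword lists, the function names and the plain-string article texts are all assumptions made for illustration, and the study's actual search terms are not given in this abstract. The sketch flags each article with the method categories whose keywords appear in its text, then reports the percentage of articles in each category.

```python
import re
from collections import Counter

# Hypothetical keyword lists (assumption): the study's real search terms
# are not stated in the abstract.
METHOD_KEYWORDS = {
    "descriptive": ["mean", "proportion", "median"],
    "elementary": ["t-test", "chi-square", "anova"],
    "regression": ["logistic regression", "linear regression"],
}


def compile_patterns(keywords):
    """Build one case-insensitive regex per category.

    Terms match as whole words, with an optional plural "s", so that
    "means" matches "mean" but "meaningful" does not.
    """
    patterns = {}
    for category, terms in keywords.items():
        alternatives = "|".join(re.escape(term) for term in terms)
        patterns[category] = re.compile(
            r"(?<!\w)(?:" + alternatives + r")s?(?!\w)", re.IGNORECASE
        )
    return patterns


def methods_in_text(text, patterns):
    """Return the set of categories with at least one keyword hit in text."""
    return {cat for cat, pat in patterns.items() if pat.search(text)}


def summarize(articles, patterns):
    """Return the percentage of articles referencing each category."""
    counts = Counter()
    for text in articles.values():
        counts.update(methods_in_text(text, patterns))
    total = len(articles)
    if total == 0:
        return {cat: 0.0 for cat in patterns}
    return {cat: round(100.0 * counts[cat] / total, 1) for cat in patterns}
```

In use, `articles` would map an article identifier to its extracted full text, and `summarize(articles, compile_patterns(METHOD_KEYWORDS))` would return one percentage per category. A keyword hit only shows that a method is mentioned in the article, not that it was applied, which is why the study speaks of methods being "referenced".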
We identified 4811 references from the four bibliographic databases for possible inclusion. After exclusions, 663 references were used for the analysis. Descriptive statistics such as means or proportions were presented in a majority of the articles (97.7%). Elementary-level statistics such as t-tests were referenced less frequently (29.7%) than descriptive statistics. Regression methods were frequently referenced: 79.8% of articles contained a reference to regression in general, and logistic regression was the most frequent type, appearing in 67.1% of the articles.
Our study shows that a diverse set of analysis methods is referenced in the CCHS literature; however, the literature relies heavily on only a subset of all possible statistical tools. This information can be used to identify gaps in the statistical methods that could be applied to future analyses of public health surveys, to inform training and educational programs, and to identify the level of statistical literacy needed to understand the published literature.