Selective non-participation at baseline (due to non-response and non-consent) and loss to follow-up are important concerns for longitudinal observational research. We investigated these matters in the context of baseline recruitment and retention at 18 months of participants for a prospective observational cohort study of knee pain and knee osteoarthritis in the general population.
Participants were recruited to the Knee Clinical Assessment Study – CAS(K) – by a multi-stage process involving response to two postal questionnaires, consent to further contact and medical record review (optional), and attendance at a research clinic. Follow-up at 18 months was by postal questionnaire. The characteristics of responders/consenters were described for each stage in the recruitment process to identify patterns of selective non-participation and loss to follow-up. The external validity of findings from the clinic attenders was tested by comparing the distribution of WOMAC scores and the association between physical function and obesity with the same parameters measured directly in the target population as a whole.
3106 adults aged 50 years and over reporting knee pain in the previous 12 months were identified from the first baseline questionnaire. Of these, 819 consented to further contact, responded to the second questionnaire, and attended the research clinics. 776 were successfully followed up at 18 months. There was evidence of selective non-participation during recruitment (aged 80 years and over, lower socioeconomic group, currently in employment, experiencing anxiety or depression, brief episode of knee pain within the previous year). This did not cause significant bias in either the distribution of WOMAC scores or the association between physical function and obesity.
Despite recruiting a minority of the target population to the research clinics and some evidence of selective non-participation, this appears not to have resulted in significant bias of cross-sectional estimates. The main effect of non-participation in the current cohort is likely to be a loss of precision in stratum-specific estimates e.g. in those aged 80 years and over. The subgroup of individuals who attended the research clinics and who make up the CAS(K) cohort can be used to accurately estimate parameters in the reference population as a whole. The potential for selection bias, however, remains an important consideration in each subsequent analysis.
Longitudinal observational research provides critical information on the course, causes, and outcomes of rheumatological disorders [1,2]. In the field of knee osteoarthritis, a state-of-the-science evaluation identified 15 existing population-based prospective cohort studies from the United States and Europe. In addition to examining multiple hypotheses that are specified a priori, these studies also constitute a resource for subsequent nested case-control and case-cohort analyses. When questions are raised about the value of longitudinal observational research, they are more often directed at its quality than at its value in principle.
Selective non-participation and loss to follow-up are important aspects of the quality of longitudinal research conducted in the general population. The choice of a general population setting (as opposed to patient cohorts in clinical settings), coupled with a sampling approach that aims to derive a representative sample of a defined reference population, is often justified on the grounds of external validity. This applies to study findings concerning both absolute measures of frequency (e.g. the proportion experiencing a particular outcome at a point in time or within a given period of observation) and measures of association (e.g. the relationship between a specific exposure or risk indicator and a particular outcome). Non-participation at baseline (referred to here as the combined effect of non-response and non-consent) and loss to follow-up (attrition) are potential threats to the external validity of study findings. Biasing effects of non-response and attrition will tend to be greater as the proportion of non-respondents and losses to follow-up from the eligible population increases. The extent to which non-responders and those lost to follow-up differ from the population mean on the parameter of interest will also influence the degree of bias. Non-participation and attrition also reduce the sample size, with a consequent loss of precision in parameter estimates. Given the importance of selective non-participation and loss to follow-up it is understandable that these are included as criteria for appraising the "quality" of observational cohort studies [7-9].
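The two drivers of bias described above can be made concrete with a standard identity for a mean: the respondent mean minus the full-population mean equals the non-respondent fraction multiplied by the difference between respondent and non-respondent means. The following is a minimal sketch only; the function name and all numbers are hypothetical and are not taken from CAS(K).

```python
def nonresponse_bias(n_resp, mean_resp, n_nonresp, mean_nonresp):
    """Return (population mean, bias of the respondent mean).

    bias = respondent mean - population mean
         = (non-respondent fraction) * (respondent mean - non-respondent mean)
    """
    n_total = n_resp + n_nonresp
    pop_mean = (n_resp * mean_resp + n_nonresp * mean_nonresp) / n_total
    bias = (n_nonresp / n_total) * (mean_resp - mean_nonresp)
    return pop_mean, bias


# HYPOTHETICAL scores: the respondent/non-respondent gap is held constant
# while the proportion of non-respondents changes.
low = nonresponse_bias(n_resp=900, mean_resp=20.0, n_nonresp=100, mean_nonresp=25.0)
high = nonresponse_bias(n_resp=400, mean_resp=20.0, n_nonresp=600, mean_nonresp=25.0)

print(f"10% non-response: population mean {low[0]:.1f}, bias {low[1]:+.1f}")
print(f"60% non-response: population mean {high[0]:.1f}, bias {high[1]:+.1f}")
```

With the gap between the two groups fixed, the bias grows in proportion to the non-respondent fraction; with the fraction fixed, it grows in proportion to the gap.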
The need to carefully assess these matters in studies of rheumatological conditions has long been recognised. In one of the earliest studies of arthritis, the Pittsburgh Arthritis Study, only 60% of those invited to attend a clinical assessment were successfully examined, prompting efforts to evaluate factors associated with non-participation [10,11]. Although differences between participants and non-participants were discussed, it was recognised that participation may be quite specific to the geographical location, target population, and topic of research. The investigators' main recommendation was that an investigation of non-participation should be included in the plans for every study in which clinical examinations are performed. Unfortunately the quality of many more recent longitudinal studies of hip and knee osteoarthritis has been found to be low. Failing to sufficiently investigate or report non-participation and loss to follow-up may contribute to this.
In this article we describe the results of recruitment and retention at 18 months of participants in a prospective observational cohort study of knee pain and knee osteoarthritis in the general population. This study involved multiple stages of data collection at baseline, providing serial opportunities for self-selection out of the study, but also providing data for tracking differences between respondents and non-respondents. We pay particular attention to the occurrence of selective non-participation and loss to follow-up, and their possible effect on the external validity of absolute measures of frequency and of measures of association within the cohort.
The Clinical Assessment Study (Knee) – CAS(K) – is a population-based prospective observational cohort study in four phases of adults aged 50 years and over, registered with one of three general practices (irrespective of their actual consultation patterns). Ethical approval for the study was obtained from the North Staffordshire Local Research Ethics Committee (Project Reference Numbers: 1430, 03/94). Full details of the study design and methods have been presented elsewhere [13,14]. Briefly, the four phases were:
Phase 1: Baseline two-stage mailed survey
A Health Survey questionnaire was mailed to all eligible patients that included measures of socio-demographic characteristics, general health status, psychological and lifestyle variables, and recent pain. Respondents who provided written consent to further contact and who reported knee pain in the past 12 months were sent a Regional Pains Survey questionnaire which collected more detailed data on their reported knee pain including position, laterality, duration, and pain, stiffness and functional limitation using the Western Ontario and McMaster Universities OA index (WOMAC LK 3.0).
Phase 2: Baseline clinical assessment study of the knee (CAS(K))
Participants with knee pain completing both of the mailed surveys were sent a letter inviting them to attend a research clinic that included a standardised assessment using digital photographs of the lower limbs and hands, clinical interview and examination of the knees and hands, plain radiographs of both knees and both hands and a brief self-complete questionnaire.
Phase 3: 18-month prospective review of general practice medical records
All participants in Phase 1 who gave permission for their GP records to be accessed had their computerised medical records tagged by a member of the Centre's Health Informatics Specialist team. All consultations in the 18-month periods both before and after the baseline clinical assessment were identified. The three practices participating in this research are fully computerised and undergo annual audits completed by the Health Informatics team to assess the quality and completeness of the data entry at the practices.
Phase 4: Follow-up mailed survey at 18 months
A follow-up survey was mailed to all Phase 2 participants approximately 18 months after their baseline clinical assessment. Prior to mailing out this follow-up questionnaire, a member of the Health Informatics team accessed the participating general practice registers in order to gain recent contact details for the participants. Those who were found to have left the practice were traced through the NHS tracing service and their new general practitioners were asked for permission to include them in the follow-up. In addition to the standard mailing process used for the baseline questionnaires (three waves: initial mailing, followed by a postcard reminder to non-respondents, and then by a repeat questionnaire to remaining non-respondents), two further contact stages of minimal data collection (MDC) were used for those who still did not respond, first by post and then by telephone call, to gain just the primary outcome data (WOMAC scores).
The target population for CAS(K) was adults aged 50 years or over, registered with one of the three participating practices, and reporting knee pain in the previous 12 months. To determine the extent of selective non-participation and loss to follow-up we compared the characteristics of respondents to the population from which they were drawn at each selection point of the recruitment and follow-up. At the initial recruitment point, which was response to the mailed Health Survey, this comparison was based simply on age, gender, and practice distribution. At each subsequent selection point more information became available on which to make these comparisons, allowing selective non-response and non-consent to be evaluated in relation to socioeconomic characteristics, general health, beliefs about joint pain and osteoarthritis, knee pain characteristics, and radiographic disease. Given the number of possible comparisons, this was done in the first instance by simple 'eye-balling'. We then summarised and quantified the main selection effects by comparing characteristics in the observed target population (respondents to the Health Survey reporting knee pain within the previous 12 months) with clinic attenders using logistic regression.
The representativeness of the CAS(K) cohort in terms of chronicity and severity of knee pain at baseline was an important consideration. However, the WOMAC was gathered in the second postal questionnaire (Regional Pains Survey). Some selective non-participation may already have occurred prior to this. In addition, therefore, we compared the distributions of these variables in the CAS(K) cohort with the distributions reported in a single-stage postal survey with high response conducted at three separate practices in North Staffordshire in the same age group and using the same case definition and measures of chronicity and severity of knee pain [16,17].
To investigate the effect of selective non-participation on measures of association, we chose to examine the relationship between body mass index (based on self-reported height and weight) and physical function (SF-36 Physical Function scale). A positive association between these two has been reported in previous studies of the general population and of those with knee pain [17,19]. Using logistic regression, we compared the strength and direction of this association in the observed target population and in CAS(K) clinic attenders at baseline, before and after adjusting for age and gender. Furthermore, we stratified both groups on the basis of features related to selective non-participation and compared the strength and direction of the association between BMI and function within each stratum.
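The comparison of association strength between the two groups can be illustrated with a crude odds ratio from a 2x2 table. This is a simpler stand-in for the logistic regression used in the study: for a single binary exposure the crude odds ratios coincide, but age- and gender-adjusted estimates would need a regression model. The function name, the dichotomisation of BMI and function, and all counts below are hypothetical and are not CAS(K) data.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude odds ratio with a Woolf (log-based) 95% confidence interval.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# HYPOTHETICAL counts: obese vs non-obese (exposure) by poor vs good function (outcome).
target_population = odds_ratio_ci(a=300, b=200, c=900, d=1200)
clinic_attenders = odds_ratio_ci(a=90, b=50, c=270, d=330)

for label, (estimate, lower, upper) in [
    ("observed target population", target_population),
    ("clinic attenders", clinic_attenders),
]:
    print(f"{label}: OR = {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Similar point estimates with overlapping intervals in the two groups would be consistent with little selection bias in the association; the smaller clinic sample gives a wider interval, which is the loss of precision that non-participation brings.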
Phase 1: Baseline two-stage mailed survey
Health Survey questionnaires were posted to all adults aged 50 years and over at the three practices (n = 8984) over the time period 1 July 2002 to 13 May 2003. During the three mailing waves of the questionnaires, 221 exclusions were made to the database (98 deaths or departures from the practice, 104 questionnaires were returned as addressee unknown, and 19 people had comprehension or memory problems), leaving an eligible study population of 8763 adults. 6108 completed questionnaires were received from the eligible 8763, giving an adjusted response of 69.7%. The non-responders were made up of 223 people who declined to participate, 83 people who stated ill health as the reason for their not completing the questionnaire, and 2349 people for whom no response was received. 99.5% of the responding population reported their ethnicity as white.
Of these 6108 responders, 3106 (50.9%) reported that they had experienced knee pain in the past 12 months (observed target population), of whom 2226 (71.7%) gave written permission to be further contacted and were mailed a Regional Pains Survey questionnaire. During the three mailing waves of this second questionnaire, three exclusions were made to the database (three deaths or departures from the practice) leaving an eligible study population of 2223 adults. 1949 completed questionnaires were received from the eligible 2223, giving an adjusted response of 87.7%. The non-responders were made up of 39 people who declined to participate, 6 people who stated ill health as the reason for not completing the questionnaire, and 229 people from whom no response was received (Figure 1).
Figure 1. Flow diagram showing recruitment of CAS(K) participants.
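The adjusted response figures for the two baseline mailings follow directly from the counts reported above: exclusions are removed from the denominator before the proportion is taken. The sketch below reproduces that arithmetic; the function name is illustrative and not part of the study's own tooling.

```python
def adjusted_response(mailed, exclusions, completed):
    """Return (eligible population, adjusted response in percent)."""
    eligible = mailed - exclusions
    return eligible, 100.0 * completed / eligible


# Counts as reported in the text for the two baseline questionnaires.
health_eligible, health_pct = adjusted_response(
    mailed=8984, exclusions=98 + 104 + 19, completed=6108
)
pains_eligible, pains_pct = adjusted_response(
    mailed=2226, exclusions=3, completed=1949
)

# The reported reasons for non-response should account for every eligible
# person who did not return a questionnaire.
assert health_eligible - 6108 == 223 + 83 + 2349
assert pains_eligible - 1949 == 39 + 6 + 229

print(f"Health Survey: {health_pct:.1f}% of {health_eligible} eligible")
print(f"Regional Pains Survey: {pains_pct:.1f}% of {pains_eligible} eligible")
```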
Phase 2: Baseline clinical assessment study of the knee (CAS(K))
Of the 1949 participants who completed both surveys in Phase 1, 1943 were sent a letter of invitation to the clinical assessment study and 819 participants (42.2%) attended an appointment. Plain radiographs were completed on 790 participants (Figure 1).
Phase 3: 18-month prospective review of general practice medical records
Of the 3106 Health Survey respondents reporting knee pain, 2423 (78.0%) gave permission on the Health Survey questionnaire for their medical records to be accessed. This proportion was higher amongst the participants attending the clinical assessment (779 of 819; 95.1%).
Phase 4: Follow-up mailed survey at 18 months
Of the 819 participants that attended the research clinic, 14 exclusions were made to the database prior to mailing the 18-month follow-up questionnaire (9 deaths, three general practitioner exclusions, one moved abroad, one refusal), leaving an eligible study population of 805 adults. During the three mailing waves of the questionnaire, three exclusions were made to the database (two deaths and one due to cognitive impairment), leaving an eligible study population of 802 adults. 776 completed questionnaires were received from the eligible 802 (760 – full questionnaire, 16 – MDC), giving an adjusted response of 96.8%. The non-responders were made up of 9 people who declined to participate, 7 people who stated ill health as the reason for not having completed the questionnaire, one person who was away on long-term holiday and 9 people from whom no response was received (Figure 1). Included in this response were 11 baseline participants who had moved practice during the follow-up period. All were successfully found by the NHS tracing service. One was a GP exclusion, two were non-responders and 8 completed the full questionnaire.
Selective non-participation and loss to follow-up
The representation of the three practices participating in the study was similar at each of the selection points across the 18-month study period (Table 1).
Table 1. Age, gender and practice at each selection point.
Age and gender
Basic demographic information was available for all the eligible population at baseline from the practice register (Table 1). Those participating in the Health Survey questionnaire, who also reported knee pain in the past year, had a similar age distribution to that in the baseline eligible population. However, those who additionally gave permission for further contact were less likely to be aged over 80 years. This age group were also less likely to attend the clinical assessment. The gender distribution of the samples at the various selection points was similar to that seen in the baseline eligible population.
Demographic, general health, psychological and lifestyle characteristics
This information was collected on the Health Survey questionnaire and so was only available for the 6108 respondents to this questionnaire (Table 2). Amongst respondents with knee pain, those giving permission for further contact were more likely to be married or co-habiting. In those attending the clinical assessment and completing the 18-month follow-up questionnaire, the percentage of married/co-habiting participants increased again. The proportion of subjects who had attended higher education increased across each of the selection points. Health Survey respondents who were in employment were more likely to consent to further contact but less likely to attend the research clinics. Those in higher managerial jobs were over-represented when compared to the responders to the Health Survey questionnaire. Individuals who were depressed or anxious according to the Hospital Anxiety and Depression Scale were less likely to attend the research clinics; however, this was not reflected in the SF-12 mental component scores which remained stable across all selection points. Participants' views on the seriousness of osteoarthritis and the impact that doctors can have on joint pain were very similar across the respondents at the various selection points.
Table 2. Demographic, general health, psychological and lifestyle characteristics at each selection point.
Knee pain characteristics
Detailed information regarding knee pain characteristics was collected at two selection points: baseline Regional Pains Survey questionnaire, and baseline research clinic attendance (Table 3).
Table 3. Knee pain characteristics at each selection point.
Participants with knee pain of less than seven days' duration were under-represented in those attending the baseline clinical assessment study compared to the levels in those responding to the Regional Pains Survey questionnaire. However, scores on the pain, stiffness and physical functioning sub-scales of the WOMAC were similar at all three selection points and almost identical to normative data for the whole population of knee pain sufferers aged 50 years and over and for each age and gender stratum (Table 4).
Table 4. Baseline WOMAC scores in CAS(K) clinic attenders and population norms.
Table 5 summarises the main selection effects comparing the characteristics of clinic attenders (n = 819) to those in the observed target population who did not attend the research clinic (n = 2287). Female gender, age 80 years and over, not being married/cohabiting, lower educational attainment, manual occupations, and possible or probable anxiety or depression were associated with non-participation.
Table 5. Main factors associated with participation in research clinics compared with observed target population.
Association between self-reported BMI and physical function
A positive association between BMI and physical function was observed in the survey respondents reporting knee pain (Table 6). This was stronger after adjusting for age and gender. The same pattern was observed in the sub-group attending clinic, although the association here was marginally stronger (both crude and adjusted odds ratios). We stratified both groups separately by age (50–59, 60–69, 70–79, 80+ years), occupation (manual, non-manual), anxiety (none/possible, probable), and depression (none, possible/probable). Within each stratum, the association between BMI and function tended to be stronger in the CAS(K) clinic attenders than in the observed target population (although there were insufficient numbers in the 80+ and possible/probable depression groups for meaningful analysis; data not shown). The slightly stronger association between BMI and function observed in the CAS(K) clinic attenders as a whole, therefore, is not simply caused by the non-participation of individuals from a particular stratum in whom that association is weak.
Table 6. Association between body mass index and SF-36 physical function in observed target population and CAS(K) clinic attenders*
In the current study, 819 participants attended the clinical assessment from 1949 invited (42%). However, this is from a potentially eligible population of 3106 with knee pain in the target population (giving a crude response of 26%). Assuming the prevalence of knee pain in the non-responders was the same as that observed in the responders to the Health Survey (50.9%), this would lead to a clinical assessment participation rate of 18.4% among all those with knee pain in the total surveyed population (n = 8763). In such circumstances there is clear potential for poor representativeness of the subgroup attending the research clinic. The inclusion of multiple stages of data collection at baseline provides serial opportunities for self-selection out of the study but does also provide data for tracking differences between respondents and non-respondents. Taking Health Survey respondents with knee pain as the target population (n = 3106), the main selection effects that were apparent were selective non-participation of persons aged 80 years and over, females, those not married/cohabiting, those with lower educational attainment or from lower socioeconomic groups (less likely to consent to further contact and to attend research clinic), those in employment, those experiencing anxiety or depression, and those reporting only a brief episode of knee pain within the previous year (less likely to attend research clinic). Given this pattern, it seems unlikely that any single form of selective non-participation is operating. It is more probable that there is a degree of selective non-participation of individuals from opposite ends of a spectrum: on the one hand, the youngest age band (non-response to Health Survey), currently in employment, and with minor episodes of knee pain, and, on the other, the most elderly, who are more likely to have persistent or severe knee pain and other morbidity.
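The three participation rates quoted in this paragraph differ only in their denominators. The sketch below reproduces them from the reported counts; the variable names are illustrative, and the third rate rests on the assumption stated in the text that knee pain prevalence was the same in Health Survey non-responders as in responders.

```python
attended = 819
invited = 1949                 # invited figure as quoted in this paragraph
knee_pain_responders = 3106    # observed target population
survey_eligible = 8763
survey_responders = 6108

# Prevalence of knee pain among Health Survey responders.
prevalence = knee_pain_responders / survey_responders

# ASSUMPTION (from the text): the same prevalence applies to non-responders,
# so the expected number with knee pain scales to the whole eligible population.
estimated_knee_pain_total = prevalence * survey_eligible

rate_of_invited = 100 * attended / invited
rate_of_observed_target = 100 * attended / knee_pain_responders
rate_of_estimated_target = 100 * attended / estimated_knee_pain_total

print(f"Of those invited:               {rate_of_invited:.0f}%")
print(f"Of observed target population:  {rate_of_observed_target:.0f}%")
print(f"Of estimated target population: {rate_of_estimated_target:.1f}%")
```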
Yet despite the level of non-participation, our main finding is that it may have only a modest effect on the cross-sectional distribution of key variables (e.g. WOMAC scores) and prevalence odds ratios (e.g. between physical function and obesity). The subgroup of individuals who attended the research clinics and who make up the CAS(K) cohort can be used to estimate these parameters accurately in the reference population as a whole. The main effect of non-participation in the current cohort is likely to be a loss of precision in stratum-specific estimates e.g. in those aged 80 years and over. The additive effects of non-response and non-consent are therefore still important, but strategies such as oversampling minority groups or those anticipated to have higher levels of non-participation may remedy this in future studies.
We used a number of strategies that have been shown to increase response to postal surveys. These included the use of pre-pilot and pilot studies to make the questionnaires more interesting and user-friendly, University sponsorship, the omission of sensitive questions (e.g. income, recent life events), reminder postcard and repeat questionnaire mailing to initial non-respondents, request of an explanation for non-participation (voluntary), placement of relevant knee-specific questions at the start and general questions at the end of the questionnaire (18-month follow-up only), and postal pre-contact (18-month follow-up only). Conversely, our postal questionnaires were long (due to both the scope of data collection and font size felt necessary for this age group); we offered no financial or other personal incentives; we used standard delivery with business reply, and all respondents to the baseline Health Survey were offered the ability to opt out of further contact or medical record review. These features may reduce response rates. In the current study, these choices were driven by ethical and cost considerations as well as the ambitious scope of the study. A trade-off may well exist between non-participation at baseline and subsequent loss to follow-up. Attrition at 18 months amongst CAS(K) clinic attenders was very low (3%), similar to a comparable study recently conducted in the United States. The inclusion of the tracing service and minimum data collection at the 18-month follow-up was valuable. Though it directly contributed only 3% to the adjusted response at 18 months (24/802), it halved the level of attrition from 6% (50/802) to 3% (26/802).
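The effect of tracing and minimal data collection on attrition can be checked against the follow-up counts reported in the Results. The sketch below assumes that the 24 responses credited to these measures are the 16 MDC responses plus the 8 traced movers who returned a full questionnaire; that breakdown is inferred from the reported figures and is not stated explicitly in the text.

```python
eligible = 802
full_questionnaires = 760
mdc_responses = 16           # minimal data collection, by post or telephone
traced_full_responses = 8    # movers found via NHS tracing who returned the full questionnaire

responders = full_questionnaires + mdc_responses

# ASSUMPTION: responses gained through the extra measures = MDC + traced movers.
gained = mdc_responses + traced_full_responses

lost_with_measures = eligible - responders
lost_without_measures = eligible - (responders - gained)

contribution_pct = 100 * gained / eligible
attrition_with_pct = 100 * lost_with_measures / eligible
attrition_without_pct = 100 * lost_without_measures / eligible

print(f"Contribution to response: {contribution_pct:.0f}% ({gained}/{eligible})")
print(f"Attrition without measures: {attrition_without_pct:.0f}% ({lost_without_measures}/{eligible})")
print(f"Attrition with measures: {attrition_with_pct:.0f}% ({lost_with_measures}/{eligible})")
```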
We have considered the representativeness of the CAS(K) cohort from the perspective of the sampling frame chosen; that is, adults aged 50 years and over registered with three general practices in North Staffordshire and experiencing knee pain within the previous 12 months. Very few of the target population were from ethnic minorities. We have investigated the effects of selective non-participation and attrition on cross-sectional parameters (descriptive characteristics, prevalence odds ratio). Whilst our findings provide some reassurance on the generalisability of findings from this cohort, we recognise that this cannot be assumed to apply to all subsequent associations and outcomes studied within this cohort. Bias resulting from loss to follow-up, self-selection, and missing data can occur despite the best efforts of investigators and the role of this in each subsequent analysis must be considered on its own merits. In particular, the question of whether CAS(K) participants differ from non-participants in their prognosis remains unanswered. We intend to determine this at 3-year follow-up, where non-participants to the clinical assessment who consented to further contact will be followed up on the same measures as CAS(K) participants.
Demographic, socioeconomic and health-related factors appeared to influence participation. Beliefs about the seriousness of the condition under investigation or the effectiveness of health care did not. In this study we found substantial non-participation but this did not introduce significant bias to cross-sectional population parameters. Findings from the CAS(K) cohort can be generalised to the target population from which they were drawn, although investigating the possible biasing effects of non-participation and attrition remains an important consideration for future analyses of this cohort.
The authors declare that they have no competing interests.
All authors participated in the design of the study and drafting the manuscript. All authors read and approved the final manuscript.
This study is supported financially by a Programme Grant awarded by the Medical Research Council, UK (grant code: G9900220) and Support for Science funding secured by the North Staffordshire Primary Care Research Consortium for NHS service support costs. KD is supported by a grant from the Arthritis Research Campaign.
The authors would like to thank the administrative and health informatics staff at Keele University's Primary Care Sciences Research Centre, staff of the participating general practices and Haywood Hospital, especially Dr Jackie Saklatvala, Carole Jackson and the Radiographers at the Department of Radiology.
J Rheumatol 1999, 26:481-483.
Wolfe F, Lassere M, van der Heijde D, Stucki G, Suarez-Almazor M, Pincus T, Eberhardt K, Kvien TK, Symmons D, Silman A, van Riel P, Tugwell P, Boers M: Preliminary core set of domains and reporting requirements for longitudinal observational studies in rheumatology. J Rheumatol 1999, 26:484-489.
Khan KS, ter Riet G, Popay J, Nixon J, Kleijnen J: Study quality assessment. In Undertaking Systematic Reviews of Research on Effectiveness. CRD's Guidance for those Carrying out or Commissioning Reviews. CRD Report Number 4. 2nd edition. York Publishing Services Ltd.; 2001.
Aus New Zealand J Surg 2003, 73:712-716.
Peat G, Thomas E, Hand J, Wood L, Dziedzic KS, Myers H, Wilkie R, Duncan R, Hay E, Hill J, Croft PR: The Knee Clinical Assessment Study – CAS(K). A prospective study of knee pain and knee osteoarthritis in the general population. BMC Musculoskel Disord 2004, 5:4.
Thomas E, Wilkie R, Peat G, Hill S, Dziedzic KS, Croft PR: The North Staffordshire Osteoarthritis Project – NorStOP: Prospective, 3-year study of the epidemiology and management of clinical osteoarthritis in a general population of older adults. BMC Musculoskel Disord 2004, 5:2.
Med Care 1992, 30:473-483.
Acta Psychiatr Scand 1983, 67:361-370.
Urwin M, Symmons D, Allison T, Brammah T, Busby H, Roxby M, Simmons A, Williams G: Estimating the burden of musculoskeletal disorders in the community: the comparative prevalence of symptoms at different anatomical sites, and the relation to social deprivation.