The use of Patient-reported Outcomes (PROs) as secondary endpoints in the development of new antidepressants has grown in recent years. The objective of this study was to assess the psychometric properties of the 9-item, patient-administered version of the Montgomery-Åsberg Depression Rating Scale (MADRS-S).
Data from a multicentre, double-blind, 8-week, randomised controlled trial of 278 outpatients diagnosed with Major Depressive Disorder were used to evaluate the validity, reliability and sensitivity to change of the MADRS-S using psychometric methods. A Receiver Operating Characteristic (ROC) curve was plotted to identify the most appropriate threshold to define perceived remission.
No missing values were found at the item level, indicating good acceptability of the scale. The construct validity was satisfactory: all items contributed to a common underlying concept, as expected. The correlation between MADRS-S and physicians' MADRS was moderate (r = 0.54, p < 0.001) indicating that MADRS-S is complementary rather than redundant to the MADRS. Cronbach's alpha was 0.84, and the stability over time of the scale, estimated on a sub-sample of patients whose health status did not change during the first week of the study, was good (intraclass correlation coefficient of 0.78). MADRS-S sensitivity to change was shown. Using a threshold value of 5, the definition of "perceived remission" reached a sensitivity of 82% and a specificity of 75%.
The MADRS-S captures patients' perceptions of the severity of their own symptoms, and its psychometric properties support its use for evaluative purposes in the development of new antidepressant drugs.
Depression is a disabling illness associated with considerable co-morbidity, risk of suicide and numerous adverse social and economic consequences [1-3]. The reported lifetime risk for Major Depressive Disorder (MDD) in the general population varies between 10% and 25% for women, and between 5% and 12% for men. The pharmacological treatment of MDD is based on antidepressants, whose efficacy has been demonstrated in a large number of studies.
The use of patient-reported outcomes as secondary endpoints in the development of new antidepressants has been of growing interest in recent years. Among these outcomes, health-related quality of life, medication compliance and subjective effectiveness (patients' perceptions of symptom severity) are the most commonly used [6-8]. One possible explanation for this growing interest is the fact that patients are increasingly becoming key players in the overall disease management process. In the subjective effectiveness questionnaire category, the main instruments are the Beck Depression Inventory (BDI) [9,10], the Carroll Rating Scale for Depression (CRSD) [11-13], the Montgomery-Åsberg Depression Rating Scale – Self report (MADRS-S), the Hamilton Depression Inventory (HDI), and the Quick Inventory of Depressive Symptomatology – Self Report (QIDS-SR).
The BDI is the most widely used self-rating instrument, and has been extensively validated in numerous studies. The CRSD and the HDI are the self-reported versions of the Hamilton Depression Rating Scale (HDRS), while the MADRS-S is the patient version of the Montgomery-Åsberg Depression Rating Scale (MADRS). From the conceptual and psychometric points of view, these questionnaires are quite different. The BDI is more concerned with depressive cognitive attitudes, while the other scales pay more attention to somatic symptoms and functional impairment. The CRSD may discourage patients because completing its 52 questions is time-consuming. This may also make it more difficult to implement in clinical research studies than the shorter 16-item QIDS-SR or 9-item MADRS-S.
These reasons led us to focus on the psychometric properties of the MADRS-S, particularly its sensitivity to change, since this is of major importance for evaluative purposes (e.g. in the comparison of treatment effects).
Patients and Methods
Study Design and Population
Data came from a multicentre, double-blind, randomised clinical trial comparing escitalopram with citalopram in outpatients diagnosed with MDD. Eligible patients were aged 18–65 years, fulfilled DSM-IV criteria for MDD, and had a baseline MADRS total score of at least 30.
Patients meeting DSM-IV criteria for a primary diagnosis of any axis I disorder other than MDD, or those with a history of mania, bipolar disorder, schizophrenia or other psychotic disorder, obsessive-compulsive disorder, or cognitive disorder were not eligible for the study. Patients who met DSM-IV criteria for substance abuse or dependence within the past 12 months, or who used a depot antipsychotic within 6 months before study inclusion, or any antipsychotic, anxiolytic or anticonvulsant medications within 2 weeks before the first administration of study medication were also ineligible for inclusion.
The Regional review and Ethics committee approved the study protocol on September 3rd, 2003. All patients provided their written informed consent.
Study assessments were performed at baseline and at weeks 1, 4 and 8 after start of treatment. Sociodemographics and clinical data were collected at baseline, and the investigators administered the MADRS, the Clinical Global Impression of Severity (CGI-S) and Improvement (CGI-I) scales at each visit. Before these assessments, patients were asked to fill in the MADRS-S.
This scale consists of 9 items assessing patients' mood, feelings of unease, sleep, appetite, ability to concentrate, initiative, emotional involvement, pessimism and zest for life. Each item is scored between 0 and 3, with three intermediate levels (0.5, 1.5, 2.5). The total score is calculated by summing the answers of the nine items, ranging between 0 and 27 (higher scores indicate increased impairment).
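The scoring rule described above can be illustrated with a short sketch. The function name `madrs_s_total` and the example respondent are hypothetical and serve only to show the arithmetic; they are not part of the study's analysis code.

```python
# Allowed item responses: 0 to 3 in half-point steps, as described in the text.
ALLOWED_SCORES = {0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0}
N_ITEMS = 9


def madrs_s_total(item_scores):
    """Sum nine MADRS-S item scores into a total between 0 and 27."""
    if len(item_scores) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} item scores, got {len(item_scores)}")
    for score in item_scores:
        if score not in ALLOWED_SCORES:
            raise ValueError(f"invalid item score: {score!r}")
    return float(sum(item_scores))


# Hypothetical respondent, for illustration only.
example = [2, 1.5, 3, 0.5, 2, 2.5, 1, 1.5, 0]
print(madrs_s_total(example))  # 14.0
```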
Statistical analyses were performed using SAS version 8.2, and all statistical tests were two-sided. The α risk was set to 0.05. Continuous variables were described using mean ± standard deviation (SD), while categorical variables were reported using frequency and percentage.
Item-level analysis consisted of assessing the number of missing values for each item and the item-response distribution. Item-internal consistency was assessed by correlating each item with the MADRS-S total score after correction for overlap. A correlation of at least 0.40 is recommended as the standard for supporting item-internal consistency. We also calculated the percentage of respondents achieving the lowest (floor effect) and highest (ceiling effect) score to determine whether the range of MADRS-S was appropriate.
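As an illustration of these item-level checks, the sketch below computes an item-total correlation corrected for overlap (the item is correlated with the sum of the remaining items) and floor and ceiling percentages. The study itself used SAS; the function names and the three-item toy data here are hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def corrected_item_total(responses, item):
    """Correlate one item with the sum of the other items (overlap removed)."""
    item_scores = [row[item] for row in responses]
    rest_scores = [sum(row) - row[item] for row in responses]
    return pearson(item_scores, rest_scores)


def floor_ceiling(values, lowest, highest):
    """Percentage of respondents at the lowest and at the highest score."""
    n = len(values)
    floor = 100 * sum(v == lowest for v in values) / n
    ceiling = 100 * sum(v == highest for v in values) / n
    return floor, ceiling


# Hypothetical toy data: 5 respondents x 3 items (not study data).
toy = [[0, 1, 1], [1, 1, 2], [2, 2, 2], [3, 2, 3], [3, 3, 3]]
r = corrected_item_total(toy, item=0)
print(r >= 0.40)  # does the item meet the 0.40 standard?
print(floor_ceiling([row[0] for row in toy], lowest=0, highest=3))
```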
Construct validity was examined using several methods.
(1) Factor analysis was conducted to test the underlying dimensionality of the MADRS-S.
(2) The discriminative validity of the MADRS-S was determined by comparing mean scores across patient groups that were known to differ in their clinical features (known-groups methodology). Since the recall period of the MADRS-S is the past three days, we did not expect it to be associated with medical history (i.e. number of episodes of depression, history of psychiatric hospitalisation), and we assumed the MADRS-S total score to be associated with the severity of the current episode.
(3) The Receiver Operating Characteristic (ROC) curve was plotted to define the optimal cut-off value for perceived remission, using the MADRS criteria of 12 or less for remission as the "gold standard".
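A minimal sketch of a ROC-based cut-off search is given below. Two points are assumptions rather than details taken from the study: a patient is classified as in perceived remission when the MADRS-S score is less than or equal to the cut-off, and the "optimal" cut-off is taken to be the one maximising Youden's index (sensitivity + specificity - 1). The criterion actually used is not stated in the text, and the data are hypothetical.

```python
def roc_points(scores, in_remission):
    """Return (cutoff, sensitivity, specificity) for each candidate cut-off.

    in_remission holds the gold-standard status (here: MADRS <= 12).
    Assumption: perceived remission is predicted when score <= cutoff.
    """
    n_pos = sum(in_remission)
    n_neg = len(in_remission) - n_pos
    points = []
    for cutoff in sorted(set(scores)):
        tp = sum(s <= cutoff and r for s, r in zip(scores, in_remission))
        tn = sum(s > cutoff and not r for s, r in zip(scores, in_remission))
        points.append((cutoff, tp / n_pos, tn / n_neg))
    return points


def best_cutoff(points):
    """Pick the cut-off maximising Youden's J (an assumed criterion)."""
    return max(points, key=lambda p: p[1] + p[2] - 1)


# Hypothetical MADRS-S scores and gold-standard remission flags.
scores = [2, 3, 4, 5, 5, 6, 8, 10, 12, 15]
remitted = [True, True, True, True, False, True, False, False, False, False]

for cutoff, sens, spec in roc_points(scores, remitted):
    print(cutoff, round(sens, 2), round(1 - spec, 2))  # ROC plots sens vs 1 - spec
print(best_cutoff(roc_points(scores, remitted)))
```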
Cronbach's alpha coefficient was used to estimate the internal consistency reliability of the MADRS-S score. A reliability of at least 0.70 is recommended to compare groups of patients, while at least 0.90 is required for comparing individuals. Test-retest reliability of the MADRS-S questionnaire was assessed in a sub-sample of 120 patients whose health status severity was declared unchanged between the baseline and week 1 visits using the CGI-I scale. The intraclass correlation coefficient (ICC) was computed between scale scores from both assessments.
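Both reliability coefficients can be sketched in a few lines. Cronbach's alpha follows its standard formula. For the ICC, the text does not state which form was used, so the one-way random-effects ICC(1,1) below is an assumption chosen for simplicity. Function names and data are hypothetical.

```python
from statistics import mean, variance


def cronbach_alpha(responses):
    """Cronbach's alpha; responses has one row per patient, one column per item."""
    k = len(responses[0])
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def icc_oneway(first, second):
    """One-way random-effects ICC(1,1) for two assessments per patient.

    Assumption: the ICC form is not specified in the text.
    """
    n, k = len(first), 2
    subject_means = [(a + b) / 2 for a, b in zip(first, second)]
    grand = mean(subject_means)
    ms_between = k * sum((m - grand) ** 2 for m in subject_means) / (n - 1)
    ms_within = sum(
        (a - m) ** 2 + (b - m) ** 2
        for a, b, m in zip(first, second, subject_means)
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical data (not study data).
items = [[0, 1, 1], [1, 1, 2], [2, 2, 2], [3, 2, 3], [3, 3, 3]]
baseline = [10, 15, 20, 12]
week1 = [11, 14, 21, 12]

alpha = cronbach_alpha(items)
print(round(alpha, 2), alpha >= 0.70)  # 0.70 is the group-comparison standard
print(round(icc_oneway(baseline, week1), 2))
```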
The sensitivity to change of the MADRS-S questionnaire was assessed in a sub-sample of 132 patients whose MADRS total score at week 8 was lower than or equal to 12 (remission state). Baseline and week 8 scores were compared using a paired t-test. Effect sizes were also computed. According to Cohen, an effect size of at least 0.2 is recommended as the standard for supporting sensitivity to change.
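The sketch below shows a paired t statistic together with two common definitions of effect size for change scores: mean change divided by the baseline SD, and mean change divided by the SD of the change (standardised response mean). Which definition was used in this study is not stated, so both are shown as assumptions; the data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def change_scores(baseline, followup):
    """Follow-up minus baseline, so improvement gives negative values."""
    return [f - b for b, f in zip(baseline, followup)]


def paired_t(baseline, followup):
    """Paired t statistic (n - 1 degrees of freedom)."""
    diffs = change_scores(baseline, followup)
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


def effect_size(baseline, followup):
    """Mean change divided by the SD of baseline scores."""
    return mean(change_scores(baseline, followup)) / stdev(baseline)


def standardised_response_mean(baseline, followup):
    """Mean change divided by the SD of the change scores."""
    diffs = change_scores(baseline, followup)
    return mean(diffs) / stdev(diffs)


# Hypothetical MADRS-S totals for five remitters (not study data).
base = [20, 18, 22, 16, 24]
week8 = [6, 5, 9, 4, 8]

print(round(paired_t(base, week8), 2))
print(round(effect_size(base, week8), 2))
print(round(standardised_response_mean(base, week8), 2))
print(abs(effect_size(base, week8)) >= 0.2)  # threshold cited in the text
```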
Finally, the evaluative ability of the MADRS-S to discriminate between treatment groups was tested using an analysis of covariance (ANCOVA) model, predicting the mean MADRS-S change at week 8 from baseline, with investigator specialisation and treatment as factors, and baseline MADRS-S score as covariate. Perceived response rates, defined as a reduction of at least 50% from the baseline MADRS-S score at week 8, and perceived remission rates, defined using the cut-off value revealed in the ROC analysis, were also compared using a logistic model with the same explanatory factors and covariate as those used in the ANCOVA model described above.
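The two binary outcomes defined above can be sketched as follows. Perceived response follows the stated definition (a reduction of at least 50% from baseline). For perceived remission, treating the cut-off as inclusive (score less than or equal to the cut-off) is an assumption, and the cut-off is left as a parameter. The function names and the patient values are hypothetical; the ANCOVA and logistic models are not reproduced here.

```python
def perceived_response(baseline, week8):
    """True when the week 8 score is at most half of the baseline score."""
    return week8 <= baseline / 2


def perceived_remission(week8, cutoff):
    """True when the week 8 score is at or below the cut-off.

    Assumption: the cut-off is inclusive; the text does not say.
    """
    return week8 <= cutoff


def rate(flags):
    """Percentage of patients for whom the flag is True."""
    return 100 * sum(flags) / len(flags)


# Hypothetical (baseline, week 8) MADRS-S totals for four patients.
patients = [(20, 4), (18, 9), (22, 12), (16, 5.5)]

responders = [perceived_response(b, w) for b, w in patients]
remitters = [perceived_remission(w, cutoff=5) for _, w in patients]
print(rate(responders), rate(remitters))  # 75.0 25.0
```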
Among the 280 patients, two (0.7%) refused to fill in the MADRS-S questionnaire and were excluded from the analyses. The mean patient age was 45.2 ± 11.0 years, 186 (66.9%) were females, 188 (67.6%) had a professional activity, and 218 (78.4%) lived in an urban area. One hundred and sixty-seven patients (60.1%) were recruited by psychiatrists; 137 patients (49.3%) were treated with escitalopram and the remaining 141 (50.7%) received citalopram.
Clinical characteristics are presented in Table 1. More than half of the patients were experiencing their first episode of MDD, while 46 (16.5%) had a history of psychiatric hospitalisation. Overall, patients had a mean MADRS of 35.9 and a mean CGI-S of 5.1; 57.6% were rated as severely ill (MADRS ≥ 35).
Table 1. Patients' clinical characteristics at baseline.
No missing values were observed, indicating a high level of patient acceptability of the questionnaire. With the exception of items 4 (appetite), 7 (emotional involvement) and 9 (zest for life), item response distributions showed a higher ceiling effect than floor effect, highlighting the initial severity of the disease (Table 2). Item-scale correlations showed that all but one item (8, pessimism) achieved the standard value of 0.40 for item-internal consistency (Table 2).
Table 2. Descriptive Statistics of the 9 MADRS-S Items and Total Score.
The results of the factor analysis confirmed the unidimensionality of the MADRS-S: each item contributed to the first factor axis with a factor loading of at least 0.50, explaining 45% of the total variance.
The MADRS-S score was moderately correlated with physicians' severity ratings (MADRS-S with MADRS: r = 0.54, p < 0.001; MADRS-S with CGI-S: r = 0.38, p < 0.001). As expected, the MADRS-S total score did not discriminate as to whether a patient was suffering from their first episode of MDD, nor if they had a history of psychiatric hospitalisation, but did discriminate as to whether a patient's baseline severity was ≥ 35 in the current episode (Table 3).
Table 3. Clinical Discriminative Validity of the MADRS-S.
The ROC curve for perceived remission is displayed in Figure 1. Using the cut-off value of 5, the MADRS-S-based definition of perceived remission reached a sensitivity of 81.8%, a specificity of 75.4%, and positive and negative predictive values of 77.1% and 80.3%, respectively.
Figure 1. Receiver Operating Characteristic Curve for Perceived Remission. CO: cut-off value for MADRS-S score.
Internal consistency reliability of the MADRS-S was satisfactory, with a Cronbach's alpha of 0.84, allowing group comparisons. The deletion of any of the 9 items would not increase the internal consistency of the total score (Table 2).
Among the 120 patients whose CGI-I at week 1 was rated "No change" by physicians, the intraclass correlation coefficient was 0.78, indicating the satisfactory test-retest reliability of the MADRS-S.
Sensitivity to change
In the sub-sample of 132 remitter patients (i.e. those whose MADRS total score at week 8 was less than or equal to 12), a statistically significant difference of -12.4 ± 4.2 points was found for the total MADRS-S between baseline and week 8. This difference led to an effect size of 2.8, which supported the sensitivity to change of the self-reported version of the MADRS.
When comparing the antidepressant effects of the two therapeutic strategies of the trial, we found that the mean MADRS-S score changes from baseline were in favour of escitalopram (-9.9 ± 5.1 for escitalopram versus -8.6 ± 5.9 for citalopram), the mean difference of 1.3 (standard error of 0.7) being statistically significant (p = 0.046). As a comparison, a mean MADRS difference of 2.1 was found between escitalopram and citalopram (p < 0.05).
Perceived response, defined as a reduction of at least 50% of the baseline MADRS-S score, and perceived remission, defined using the optimal cut-off value of 5 found in the ROC analysis, were also significantly in favour of escitalopram (Figure 2). Perceived response rates were 66.4% and 53.9% for escitalopram and citalopram, respectively (p = 0.033). Perceived remission rates were 49.6% and 37.6% for escitalopram and citalopram, respectively (p = 0.043). As a comparison, response rates based on investigators' ratings of the MADRS were 76.1% for escitalopram and 61.5% for citalopram (p = 0.009); remission rates were 56.1% and 43.6% for escitalopram and citalopram, respectively (p = 0.040).
Figure 2. Perceived Response and Perceived Remission at Week 8.
The objective of this article was to investigate the psychometric properties of the MADRS-S, the patient-reported version of the MADRS. We demonstrated the validity, acceptability, reliability and sensitivity to change of the MADRS-S.
The lack of missing values illustrates good patient acceptance of the questionnaire, and indicates that it is feasible to ask patients to rate their perception of nine symptoms of depression. Ceiling effects for six items were higher than floor effects, reflecting the initial severity level of the disease, as patients were only included in the study if they had a physician-reported MADRS score of at least 30. Results of the factor analysis supported the single underlying concept assessed by the nine items. The reliability of the MADRS-S was satisfactory, with a test-retest intraclass correlation coefficient of 0.78 and a Cronbach's alpha of 0.84. Most importantly, the effect size of 2.8 in a sub-sample of improved patients after eight weeks of antidepressant treatment confirmed the ability of the scale to be sensitive to change, as is the original MADRS. These results clearly showed the ability of the patient-reported MADRS-S to detect differences between treatment regimens.
However, the association between physician and patient-reported scores was lower in our study (0.54) than those reported by Svanborg and Åsberg (0.70), and by Mundt et al (0.82). By comparison, Carroll et al reported an association between the HDRS and the CRSD of 0.71. The more similar the scales and assessment procedures, the higher the correlation found (Mundt and colleagues compared the traditional clinician-rated MADRS with a telephone-based, interactive voice response technology); paper-and-pencil and interview-based ratings are known to correlate only moderately, and scales with slightly differing content and wording can be expected to show slightly lower correlations.
Patients were asked to fill in the MADRS-S before any clinical assessments in order to provide the most accurate perception of their disease. It is noteworthy that response and remission rates based on the MADRS-S were always lower than those from the clinician-based MADRS, indicating that patients' and clinicians' perceptions of the disease are different, more complementary than redundant, and can provide additional useful information. This is of major interest for MDD management since taking into account patients' feelings may improve medication compliance, decrease time to symptom alleviation, and, as a result, improve patients' quality of life.
The growing focus on patient-reported outcomes as secondary endpoints in randomised clinical trials and the findings of our study lead us to recommend the concomitant use of the MADRS and MADRS-S during the development of new compounds.
The authors declare that they have no competing interests.
NM was head investigator and responsible for the design of the multicentre, double-blind, 8-week, randomised controlled trial of 278 outpatients from which the data for the current study were drawn. BF and NM conceived and participated in the design of the current analysis. BF oversaw execution of statistical analyses and was responsible for their coordination. Both authors read and approved the final manuscript.
H. Lundbeck A/S provided funding for this study through an unrestricted grant. The authors wish to thank Christophe Sapin, biostatistician, for his assistance in the statistical analysis of data and Mondher Toumi PhD, for his valuable support and input regarding the analyses conducted for this article.
Thompson C, Peveler RC, Stephenson D, McKendrick J: Compliance with antidepressant medication in the treatment of major depressive disorder in primary care: a randomized comparison of fluoxetine and a tricyclic antidepressant.
Psychol Assessm 1995, 7(4):472-83.
Rush AJ, Trivedi MH, Ibrahim HM, et al.: The 16-item Quick Inventory of Depression Symptomatology (QIDS), clinician rating (QIDS-C), and self-report (QIDS-SR): a psychometric evaluation in patients with chronic major depression.
Essink-Bot ML, Krabbe PF, Bonsel GJ, Aaronson NK: An empirical comparison of four generic health status measures: the Nottingham Health Profile, the Medical Outcomes Study 36-Short-Form Health Survey, the COOP/WONCA Charts, and the EuroQol Instrument.
Murray E: Measurement issues in the evaluation of psychopharmacological therapy. In The limits of Biological Treatments for Psychological Distress. Edited by Fisher S, Greenberg RP. Hillsdale: Erlbaum; 1989:39-68.
Chassany O, Sagnier P, Marquis P: Patient-reported outcomes. The example of health-related quality of life – a European guidance document for the improved integration of health-related quality of life assessment in the drug regulatory process.