Department of Health Sciences, University of York, Heslington, York, YO10 5DD, United Kingdom

Abstract

Background

This paper demonstrates how structural equation modelling (SEM) can be used as a tool to aid in carrying out power analyses. For many complex multivariate designs that are increasingly being employed, power analyses can be difficult to carry out, because the software available lacks sufficient flexibility.

Satorra and Saris developed a method for estimating the power of the likelihood ratio test for structural equation models. Whilst the Satorra and Saris approach is familiar to researchers who use structural equation modelling, it is less well known amongst other researchers. Because a structural equation model can be specified to be equivalent to other multivariate statistical tests, the Satorra and Saris approach to power analysis can be applied to those tests as well.

Methods

The covariance matrix, along with a vector of means, relating to the alternative hypothesis is generated. This represents the hypothesised population effects. A model (representing the null hypothesis) is then tested in a structural equation model, using the population parameters as input. An analysis based on the chi-square of this model can provide estimates of the sample size required for different levels of power to reject the null hypothesis.

Conclusions

The SEM based power analysis approach may prove useful for researchers designing research in the health and medical spheres.

Background

Structural equation modelling (SEM) was developed from work in econometrics (simultaneous equation models; see for example Wansbeek and Meijer

A necessarily very brief introduction to the logic of structural equation modelling is presented here – for a more thorough introduction to the basics of structural equation modelling the reader is directed towards one of the many good introductory texts, (Steiger has recently reviewed several such texts

The data to be analysed in a structural equation model comprise the observed covariance matrix

This formula gives the number of non-redundant elements in the covariance matrix (

However many models exclude the mean vector, and so the number of non-redundant elements in the covariance matrix is given by:
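For reference, the two standard counts can be written as follows, using p for the number of observed variables (the symbol p is introduced here for illustration and is an assumption about notation):

```latex
\[
  \underbrace{\frac{p(p+1)}{2}}_{\text{variances and covariances}}
  + \underbrace{p}_{\text{means}}
  = \frac{p(p+3)}{2}
  \qquad\text{and, with the means excluded,}\qquad
  \frac{p(p+1)}{2}.
\]
```

With p = 3 observed variables, for example, these give 9 and 6 non-redundant elements respectively.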

The model is a set of

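$$F = \log|\Sigma| + \operatorname{tr}(S\Sigma^{-1}) - \log|S| - p$$

where S is the observed covariance matrix, Σ is the covariance matrix implied by the model, and p is the number of observed variables.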

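The discrepancy function, multiplied by N - 1, follows a χ^{2} distribution, with degrees of freedom (df) equal to the number of non-redundant elements in the data minus the number of estimated parameters. This provides the χ^{2} test of the model.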

In addition to the χ^{2 }test, standard errors (and hence t-values [although these values are referred to as

When ^{2 }will also be equal to zero.

It is possible to use the standard errors of the parameter estimates to test the statistical significance of the values of these parameters. In the next section, we shall see how it is also possible to use the χ^{2 }test to evaluate hypotheses regarding the value of these parameters in a model. This is most easily described using path diagrams as a tool to represent the parameters in a structural equation model.

The commonest representation of a structural equation model is in a path diagram. In a path diagram, a box represents a variable, a straight, single-headed arrow represents a regression path, and a curved arrow represents a correlation or covariance (in addition, an ellipse represents a latent, or unobserved variable; the methods described in this paper do not use latent variables, however the interested reader is directed towards a recent chapter by Bollen

Methods

To estimate a correlation, a covariance is estimated, which can then be standardised to give the correlation. If all variables are standardised, the covariance is equal to the correlation. The model that is estimated when a covariance is calculated can be represented in path diagram format as two variables linked with a curved, double-headed arrow, as shown in Figure 1. Each of the variables also has a double-headed arrow, which represents the variance of the variable. There are three estimated parameters: the magnitude of the covariance, and the variance of each of the two variables.

**Figure 1. Covariance between x and y.**

A regression analysis with a single predictor is represented as Figure 2. This model is very similar to the previous one, however this time the arrow is not bi-directional. There are three parameters: the variance of ^{2}). Again, the estimate of the parameter, along with the confidence intervals, can be used to make inferences to the population and test for statistical significance.

**Figure 2. A regression analysis with one independent variable.**

A multiple regression model is shown in Figure 3. Here there are 4 predictor variables (_{1 }to _{4}) which are used to predict an outcome variable (^{2 }in multiple regression. Second, the values of the individual regression estimates can be tested for statistical significance.

Any analysis of variance model can also be modelled as a regression model

The logic extends to the case of a multivariate ANOVA or regression, which has multiple outcome variables. The multivariate approach allows each of the individual paths to be estimated, and tested for statistical significance, however, it also allows groups of paths to be tested simultaneously, using the χ^{2 }difference test – the multivariate

When a model is restricted (that is, not all paths are free to be estimated), it becomes over-identified. The difference between the data implied by the model and the observed data can then be tested for statistical significance using the χ^{2} test. Each restriction in the model adds 1 df, and this can be used to interpret the difference between the data implied by the model and the observed data.

The simplest example of this is in the case of the correlation (Figure 1). Using data from a recent study (Wolfradt, Hempel and Miles ^{2 }test, with 1 df (the number of restrictions in the model). When this is done, the χ^{2 }associated with the test is equal to 0.65, which, with 1 df, has an associated probability of 0.421. The three approaches lead to equivalent conclusions and identical (within rounding error) probability values. However, the third approach – that of using the χ^{2 }test, gives an additional advantage – that it is possible to fix more than one parameter to a pre-specified value (usually, but not always, 0).

**Figure 3. Multiple regression representation of path diagram.**

In a multivariate regression, a set of outcomes is regressed on a set of predictors. As well as including a test of each parameter, a multivariate test is also carried out, testing the effect of each predictor on each set of outcome variables. Again using data from Wolfradt, et al., I carried out a multivariate regression (using the SPSS GLM procedure). The three predictor variables were mother's warmth, mother's rules, and mother's demands. The two outcomes were active coping and passive coping.

Four models were estimated. In the first, all parameters were free to be estimated. In the second, the two paths from warmth were restricted to zero; in the third, the two paths from rules were restricted to zero; and in the final model, the two paths from demands were restricted to zero. The first model has 0 df, and hence the implied and observed covariance matrices are always equal, and the χ^{2} is equal to zero. Each of the other models has 2 df, because two restrictions were added to the model.

The results of the tests are shown in Table ^{2 }and associated p-value, shown in the second pair of columns in Table

Results of multivariate F test, and χ^{2} difference test, for multivariate regression

| Predictor | GLM results: Multivariate F (df = 2, 263) | GLM results: p | SEM results: χ^{2} (df = 2) | SEM results: p |
|---|---|---|---|---|
| Rules | 0.23 | 0.794 | 0.47 | 0.791 |
| Demands | 5.3 | 0.005 | 10.6 | 0.005 |
| Warmth | 10.7 | <0.001 | 20.9 | <0.001 |
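With 2 df the upper tail of the χ^{2} distribution has the closed form exp(-χ^{2}/2), so the SEM p-values reported in the table can be checked directly. The following is a minimal sketch of that check; the function name `chi2_sf_df2` is hypothetical and is not part of SPSS or of any SEM program.

```python
import math

def chi2_sf_df2(x):
    """Upper-tail probability of a central chi-square with 2 df: exp(-x / 2)."""
    return math.exp(-x / 2.0)

# Chi-square values reported for the SEM tests in the table
sem_chi2 = {"Rules": 0.47, "Demands": 10.6, "Warmth": 20.9}

p_values = {name: chi2_sf_df2(x) for name, x in sem_chi2.items()}
for name, p in p_values.items():
    print(f"{name}: chi2 = {sem_chi2[name]}, p = {p:.3f}")
```

The values for rules and demands round to the 0.791 and 0.005 shown in the table, and the value for warmth falls well below 0.001.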

In addition to relationships between variables being incorporated into structural equation models it is also possible to incorporate means (or, in the case of endogenous variables, intercepts). In the RAM specification, a mean is modelled as a triangle. The path diagram shown in Figure 4 effectively says "estimate the mean of the variable ^{2 }test. A simple example would be to place a restriction on the mean to a particular value – this would be the equivalent of a one sample t-test. A further example, shown in Figure 5 is a paired samples t-test. Here, the means are represented by the parameter a – both paths are given the same label, meaning that the two means are fixed to be equal. Again, this can be estimated with a χ^{2 }test.

A one-way repeated measures ANOVA is also possible, using the same logic. The path diagram is shown in Figure 6. Here, the null hypothesis (that μ_{1} = μ_{2} = μ_{3}) is tested by fixing the parameters a, b and c to be equal to one another. It is also possible to carry out post-hoc tests, to compare each of the individual means.

**Figure 4. Estimating the mean.**

**Figure 5. Representation of paired samples t-test in path diagram format.**

**Figure 6. Path diagram representation of one-way repeated measures ANOVA with three independent variables.**

This model was tested using the first 20 cases from Wolfradt, et al ^{2 }statistic, and associated p-value. Again, the p-values from both of these approaches are very similar, demonstrating the equivalence of the two approaches.

Means and covariances of warmth, demands and rules (variances are shown in the diagonal)

| | Warmth (1) | Demands (2) | Rules (3) |
|---|---|---|---|
| Warmth (1) | 0.37168421 | | |
| Demands (2) | 0.17187970 | 0.24575725 | |
| Rules (3) | -0.21171930 | -0.05423559 | 0.24814035 |
| μ | 1.9700 | 2.6571 | 2.9733 |

Comparison of results from GLM test carried out using SPSS and test using SEM framework, using Mx.

| Null hypothesis | GLM test: F (df) | GLM test: p | Mx test: χ^{2} (df) | Mx test: p |
|---|---|---|---|---|
| 1. μ_{1} = μ_{2} = μ_{3} | 16.530 (2, 18) | 0.000084 | 19.8 (2) | 0.000050 |
| 2. μ_{1} = μ_{2} (rules = demands) | 34.5 (1, 19) | 0.00012 | 19.7 (1) | 0.000009 |
| 3. μ_{1} = μ_{3} (rules = warmth) | 19.29 (1, 19) | 0.00031 | 13.31 (1) | 0.00026 |
| 4. μ_{2} = μ_{3} (demands = warmth) | 3.32 (1, 19) | 0.084 | 3.06 (1) | 0.080 |

Notes: 1. Result from multivariate test. 2. A large number of decimal places has been given to illustrate the similarity of the probability values based on the two methods.
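The pairwise F statistics in the GLM column can be reproduced by hand from the table of means and covariances, because with one numerator degree of freedom the repeated measures F is the squared paired-samples t. The sketch below assumes that the tabled covariances are sample covariances from the N = 20 cases, and it refers to the variables only by the numbers 1 to 3 used in that table; the function name `paired_f` is hypothetical.

```python
# Means and covariances copied from the table of means and covariances,
# indexed by the variable numbers 1 to 3 used there; N = 20 cases.
N = 20
mean = {1: 1.9700, 2: 2.6571, 3: 2.9733}
cov = {
    (1, 1): 0.37168421,
    (2, 1): 0.17187970, (2, 2): 0.24575725,
    (3, 1): -0.21171930, (3, 2): -0.05423559, (3, 3): 0.24814035,
}

def paired_f(i, j):
    """F(1, N - 1) for H0: mu_i = mu_j, i.e. the squared paired-samples t."""
    lo, hi = min(i, j), max(i, j)
    diff = mean[i] - mean[j]
    # Variance of the difference score: var_i + var_j - 2 * cov_ij
    var_diff = cov[(i, i)] + cov[(j, j)] - 2.0 * cov[(hi, lo)]
    return N * diff ** 2 / var_diff

f_values = {pair: paired_f(*pair) for pair in [(1, 2), (1, 3), (2, 3)]}
for (i, j), f in f_values.items():
    print(f"mu_{i} = mu_{j}: F(1, {N - 1}) = {f:.2f}")
```

The three values come out at approximately 34.5, 19.3 and 3.32, matching rows 2 to 4 of the GLM column.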

The SEM analyses considered so far have only considered single groups, however, it is possible to carry out analyses across groups, where the parameters in two (or more) groups can be constrained to be equal. This multiple group approach can be used to analyse data from a mixed design, with a repeated measures factor, and an independent groups factor. The model is shown in Figure 7 (again using the data from Wolfradt, et al.). There are data from two groups, males and females. Each group has measures taken on two variables (x_{1 }and x_{2}). The parameter labelled _{1 }and _{2}, which in this case is the mean of _{2}, the parameter labelled _{1 }and x_{2}. There are three separate hypotheses to test:

1) Main effect of sex.

2) Main effect of type (x_{1} vs x_{2})

3) Interaction effect of type and sex.

Again, using SPSS a mixed ANOVA can be carried out – the results of which are shown in Table

**Figure 7. Structural equation model for a mixed ANOVA.**

Results of multivariate F test, and χ^{2} difference test, for multivariate regression

| Effect | Multivariate F (df = 1, 266) | p | χ^{2} given by | χ^{2} (df = 1) | p |
|---|---|---|---|---|---|
| Sex | 1.4 | 0.310 | Model 2 - Model 1 | 0.73 | 0.393 |
| Type (rules vs demands) | 739.1 | <0.001 | Model 3 - Model 2 | 355.8 | <0.001 |
| Sex x Type | 1.76 | 0.186 | Model 3 - Model 0 | 1.76 | 0.184 |

Model 1:

Model 2:

Model 3: ^{2 }difference test between this model and model 2 provides the probability associated with the null hypothesis that there is no effect of rules vs demands (type).

Model 0: All parameters free. This model has zero df, and hence χ^{2} will equal zero. The χ^{2} difference test between this model and model 3 tests the interaction effect, although this will be equal to the χ^{2} and df of model 3. This allows the difference between x_{1} and x_{2} to vary across gender, thereby testing the null hypothesis of no interaction effect.

These 4 models were estimated using the data from Wolfradt, et al. The type distinction was mother's rules versus mother's warmth. The results of each of these model tests are shown in Table

χ^{2}, df and p for models 0 to 3. Differences between these models are used to test hypotheses of main effects and interactions.

| Model | χ^{2} (df) | p |
|---|---|---|
| 1 | 358.25 (3) | <0.0001 |
| 2 | 357.52 (2) | <0.0001 |
| 3 | 1.764 (1) | 0.184 |
| 0 (no restrictions) | 0 (0) | 1.00 |
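The χ^{2} difference tests for the three effects follow from simple subtraction of the model χ^{2} values. The sketch below is illustrative only: it handles 1-df differences, uses the closed-form upper tail of a 1-df χ^{2}, and the function names are hypothetical.

```python
import math

def chi2_sf_df1(x):
    """Upper-tail probability of a central chi-square with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))

# (chi-square, df) for each model, copied from the table of model results
models = {0: (0.0, 0), 1: (358.25, 3), 2: (357.52, 2), 3: (1.764, 1)}

def difference_test(restricted, freer):
    """Chi-square difference between two nested models that differ by 1 df."""
    chi2_r, df_r = models[restricted]
    chi2_f, df_f = models[freer]
    if df_r - df_f != 1:
        raise ValueError("this sketch only handles 1-df differences")
    delta = chi2_r - chi2_f
    return delta, chi2_sf_df1(delta)

tests = {
    "sex": difference_test(1, 2),
    "type": difference_test(2, 3),
    "sex x type": difference_test(3, 0),
}
for effect, (delta, p) in tests.items():
    print(f"{effect}: delta chi2 = {delta:.2f}, p = {p:.3g}")
```

The differences of 0.73, 355.8 and 1.76 correspond to the χ^{2} (df = 1) column of the mixed ANOVA results, with p-values close to the 0.393 and 0.184 reported for the sex and interaction effects.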

Power Analysis

The power of a statistical test is the probability that the test will find a statistically significant effect in a sample of size N, at a pre-specified level of alpha, given that an effect of a particular size exists in the population. Power of statistical tests is considered increasingly important in medical and social sciences, and most funding bodies insist that power analysis is used to determine the appropriate number of participants to use. It is increasingly recognised that power is not just a statistical or methodological issue, but an ethical issue. In medical trials, patients give their consent to take part in studies which they hope will help others in the future – if the study is underpowered, the probability of finding an effect may be minimal. The CONSORT statement (CONsolidated Standards Of Reporting Trials

[When using statistics for any amount of time, we become familiar with central distributions – these are distributions such as the ^{2}. However, these are the distribution of the statistic when the null hypothesis is true. To calculate the distribution when the null hypothesis is false, we must know the non-centrality parameter – the expected mean value of the distribution, and then examine the probability of finding a result which would be considered significant at our pre-specified level of alpha.]

Whilst it is possible in some statistical packages to calculate values for non-central distributions, it is not straightforward (although it is possible) to use these for power calculations.

There are a range of resources available for power analysis, including commercial books containing tables ^{2 }is relatively straightforward, using tables or books. However, the power to detect significant regression weights for the individual predictors is more difficult. Incorporating interactions into power analysis is also not straightforward.

A multivariate design can also have more power than a univariate design, but the power of the design is affected in complex ways by the correlation between the outcome variables

**Figure 8. Multivariate experimental design, with one independent variable (x) and three dependent variables (y1, y2, y3).**

Power from SEM

An alternative way to approach power is to use a structural equation modelling framework. Satorra and Saris

First, a model is set up which matches the expected effect sizes in the study. From this the expected means and covariances are calculated. These data are treated as population data. Second, a model is set up where the parameters of interest are restricted to zero (or to the values expected under the null hypothesis). This model is then estimated, and the χ^{2} value of the discrepancy function is calculated. This can be used to calculate the non-centrality parameter, which is then used to estimate the probability of detecting a significant effect. It should be noted that the power estimates using the SEM approach are asymptotically equivalent to the GLM approach employed in OLS modelling; at smaller sample sizes, larger discrepancies will occur between the two methods.
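The calculation can be summarised as follows. The symbols are introduced here for illustration: F_0 is the minimised discrepancy of the restricted model when fitted to the population moments, df is the number of restrictions, λ is the non-centrality parameter, and the N - 1 multiplier follows the convention stated earlier for the χ^{2} test.

```latex
\[
  \lambda = (N - 1)\,F_0,
  \qquad
  \text{power} = \Pr\!\left[\chi^2_{df}(\lambda) > \chi^2_{df,\,1-\alpha}\right],
\]
```

where \chi^2_{df}(\lambda) denotes a non-central χ^{2} variable and \chi^2_{df,\,1-\alpha} is the critical value of the central χ^{2} distribution. The required sample size is then the smallest N for which the power reaches the target value.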

Much of this is automated within the structural equation modelling program Mx

Three examples of power analysis are presented. Example 1 shows how to use SEM to power a study to detect a correlation; the second is used for a mixed ANOVA, using a 2 × 2 design; the third shows how to power a study that uses a multivariate ANOVA / regression.

Example 1

In example 1, I estimate the probability of detecting a population correlation of size r = 0.3 ([see

Example 1.mx: Contains Mx syntax to run example 1.


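Sample size required to detect a population correlation r = 0.3 at each level of power, by three programs

| Power | Mx (SEM approach) | GPower | nQuery |
|---|---|---|---|
| .25 | 18 | 19 | 21 |
| .50 | 41 | 41 | 44 |
| .75 | 74 | 73 | 76 |
| .80 | 84 | 82 | 85 |
| .90 | 113 | 109 | 113 |
| .95 | 139 | 134 | 139 |
| .99 | 197 | 188 | 195 |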
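The logic of example 1 can be sketched without Mx. This is not the Mx script and is offered only as an illustration under stated assumptions: both variables are standardised, the null model fixes the covariance to zero while leaving both variances free (so the population discrepancy reduces to -log(1 - r^2)), the non-centrality parameter is (N - 1) times that discrepancy, and alpha is 0.05. All function names are hypothetical.

```python
import math
from statistics import NormalDist

def f0_zero_correlation(r):
    """Population ML discrepancy when a correlation of size r is fixed to zero.

    With both variances left free the implied matrix is diagonal, so the
    discrepancy reduces to -log(1 - r^2).
    """
    return -math.log(1.0 - r * r)

def power_df1(ncp, alpha=0.05):
    """Power of a 1-df chi-square test with non-centrality parameter ncp."""
    z = NormalDist()
    crit = z.inv_cdf(1.0 - alpha / 2.0)  # square root of the chi-square critical value
    root = math.sqrt(ncp)
    return z.cdf(root - crit) + z.cdf(-root - crit)

def n_for_power(f0, target, alpha=0.05):
    """Smallest N whose non-centrality (N - 1) * f0 reaches the target power."""
    n = 2
    while power_df1((n - 1) * f0, alpha) < target:
        n += 1
    return n

f0 = f0_zero_correlation(0.3)
required = {target: n_for_power(f0, target) for target in (0.25, 0.50, 0.80, 0.95)}
print(required)
```

Under these assumptions the sample sizes land within about one case of the Mx column; small differences of this kind are to be expected from rounding and from the choice of N or N - 1 as the multiplier.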

Example 2: Multivariate ANOVA

The second example to be examined is the case of a multivariate ANOVA. It is well known that a multivariate design can be more powerful than a univariate design, though calculating how much more powerful can be difficult.

The simple multivariate design is shown in Figure 8. Here the effects of a single independent variable on three dependent variables are assessed. It is necessary to calculate the population covariance matrix for this example. The covariance matrix of the dependent variables is found by multiplying the vector of regression weights by its transpose, and adding the residual variances and covariances of the dependent variables.

The correlations between the dependent variables are therefore given by:

It is usually more straightforward to enter the values as fixed parameters into the SEM program, and estimate the population covariance matrix in this way.
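The matrix calculation described above can be illustrated in a few lines. The regression weights of 0.5 and the correlations of 0.2 between the dependent variables are the values used in this example; the residual matrix is a hypothetical choice, made here only so that the dependent variables end up with unit variances and correlations of 0.2, and the function name is likewise an assumption.

```python
def implied_dv_covariance(weights, residual, var_x=1.0):
    """Covariance matrix of the DVs implied by y_i = b_i * x + e_i.

    weights  : regression weights b_i
    residual : covariance matrix of the residuals e_i
    """
    k = len(weights)
    return [
        [var_x * weights[i] * weights[j] + residual[i][j] for j in range(k)]
        for i in range(k)
    ]

b = [0.5, 0.5, 0.5]
# Hypothetical residual matrix: variances 0.75, covariances -0.05
residual = [[0.75 if i == j else -0.05 for j in range(3)] for i in range(3)]

sigma_yy = implied_dv_covariance(b, residual)
for row in sigma_yy:
    print([round(value, 3) for value in row])
```

Each diagonal element is 0.25 + 0.75 = 1.0 and each off-diagonal element is 0.25 - 0.05 = 0.2, which is the standardised matrix for the dependent variables used in this example.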

This analysis can proceed via one of two means: three univariate analyses or one multivariate analysis. Calculation of power for the univariate analysis by conventional methods (power analysis table or program) is uncomplicated; calculation of power for the multivariate approach is less so.

Power can be estimated for two different types of effects. First, the power to detect each of the univariate effects can be examined, second the power to detect the multivariate effect of

If all variances are standardised, the implied population covariance matrix is shown in Table

Standardised population covariance matrix for example 2

| | x | y1 | y2 | y3 |
|---|---|---|---|---|
| x | 1.0 | | | |
| y1 | 0.5 | 1.0 | | |
| y2 | 0.5 | 0.2 | 1.0 | |
| y3 | 0.5 | 0.2 | 0.2 | 1.0 |

Power estimates can be derived for four separate tests. Three (univariate) tests of each parameter, and one multivariate test of the three parameters simultaneously. The Mx scripts are available for download in the additional files section example 2 – univariate.mx [see

Example 2 – univariate.mx: Contains Mx syntax to run example 2, univariate.


Example 2 – multivariate.mx: Contains Mx syntax to run example 2, multivariate.


Power for univariate and multivariate tests

| Test | Sample size required for 80% power: Mx | Sample size required for 80% power: nQuery |
|---|---|---|
| Test that | 28 | 26 |
| Multivariate test (df = 3) | 14 | Note 1 |

Notes: 1. Power for a multivariate test cannot be calculated using standard software.

An extension of this analysis is to be able to relatively simply examine the effects on power of varying the correlation between the measures in a multivariate ANOVA. To carry out this analysis, the values of the population correlations between the outcome variables are altered, and the effects on the power noted. Table

Relationship between correlation between DVs and sample size required for 80% power, in multivariate ANOVA.

| Correlation between DVs | Sample size required for 80% power |
|---|---|
| 0.0 | 8 |
| 0.2 | 14 |
| 0.4 | 21 |
| 0.6 | 27 |
| 0.8 | 33 |
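The pattern in this table can be approximated with a short, self-contained calculation. The sketch below is not the Mx script and rests on several assumptions: all variables are standardised, the three paths from x are equal (0.5), the dependent variables share a common correlation, the null model fixes all three paths to zero while leaving the remaining variances and covariances free, and the non-centrality parameter is (N - 1) times the population discrepancy. The non-central χ^{2} tail is computed as a Poisson mixture of central χ^{2} tails; all function names are hypothetical.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a central chi-square with integer df >= 1."""
    if x <= 0.0:
        return 1.0
    half = x / 2.0
    if df % 2:
        q, k = math.erfc(math.sqrt(half)), 1  # closed form for df = 1
    else:
        q, k = math.exp(-half), 2             # closed form for df = 2
    while k < df:
        # Step from k to k + 2 df: add z^a * exp(-z) / Gamma(a + 1), a = k / 2
        a = k / 2.0
        q += math.exp(a * math.log(half) - half - math.lgamma(a + 1.0))
        k += 2
    return q

def chi2_critical(df, alpha=0.05):
    """Critical value found by bisection on the upper-tail probability."""
    lo, hi = 0.0, 200.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_sf(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def power(ncp, df, crit, terms=150):
    """Non-central chi-square power as a Poisson(ncp / 2) mixture of central tails."""
    weight = math.exp(-ncp / 2.0)  # Poisson probability of j = 0
    total = 0.0
    for j in range(terms):
        total += weight * chi2_sf(crit, df + 2 * j)
        weight *= (ncp / 2.0) / (j + 1)
    return total

def f0_multivariate(b, rho, n_dv=3):
    """Discrepancy when all paths from x to the DVs are fixed to zero."""
    explained = n_dv * b * b / (1.0 + (n_dv - 1) * rho)
    return -math.log(1.0 - explained)

def n_for_power(f0, df, target=0.80, alpha=0.05):
    """Smallest N whose non-centrality (N - 1) * f0 reaches the target power."""
    crit = chi2_critical(df, alpha)
    n = 2
    while power((n - 1) * f0, df, crit) < target:
        n += 1
    return n

required_n = {rho: n_for_power(f0_multivariate(0.5, rho), df=3)
              for rho in (0.0, 0.2, 0.4, 0.6, 0.8)}
print(required_n)
```

Under these assumptions the required sample sizes rise with the correlation between the dependent variables, as in the table, and fall within one or two cases of the tabled values. The number of mixture terms is adequate for the modest non-centrality values that arise here, but would need to be increased for much larger ones.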

Example 3: Repeated Measures ANOVA: The effect of the correlation between variables

Repeated measures analysis presents a number of additional challenges to the researcher, in terms of both methodological issues

The three variables have population means of 0.8, 1.0 and 1.2 respectively, and variances of 1.0. The correlations between them were fixed to be equal in all models, and were fixed to be 0, 0.2, 0.5, 0.8, or -0.2. A simple analysis was carried out, to investigate the sample size required to attain 80% power to detect a statistically significant difference, at p < 0.05, using an Mx script example 3 [see

Example 3.mx: Contains Mx syntax to run example 3, univariate.


The results of the analysis are shown in Table

Variation in power for repeated measures design given different level of correlation between measurements.

| Size of correlations between variables | Sample size required for 80% power |
|---|---|
| 0.0 | 125 |
| 0.2 | 101 |
| 0.5 | 65 |
| 0.8 | 29 |
| -0.2 | 149 |
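The effect of the correlation on the required sample size can be sketched as follows. This is an illustration rather than the Mx script, and it assumes that the null model constrains the three means to be equal while leaving the covariance matrix free, that the variances are 1.0 with a common correlation, that the non-centrality parameter is (N - 1) times the population discrepancy, and that alpha is 0.05. With 2 df the central χ^{2} tails have a closed form, which keeps the code short; the function names are hypothetical.

```python
import math

def power_df2(ncp, alpha=0.05, terms=200):
    """Power of a 2-df chi-square test with non-centrality parameter ncp.

    A non-central chi-square is a Poisson(ncp / 2) mixture of central
    chi-squares with 2 + 2j df, whose upper tails have a closed form.
    """
    half_crit = -math.log(alpha)   # critical value / 2 when df = 2
    weight = math.exp(-ncp / 2.0)  # Poisson probability of j = 0
    term = math.exp(-half_crit)    # exp(-c/2) * (c/2)^i / i! for i = 0
    tail = term                    # central upper tail with 2 df
    total = 0.0
    for j in range(terms):
        total += weight * tail
        weight *= (ncp / 2.0) / (j + 1)
        term *= half_crit / (j + 1)
        tail += term               # central upper tail with 2 + 2(j + 1) df
    return total

def f0_equal_means(means, rho):
    """Discrepancy when the means are constrained to be equal.

    Assumes unit variances and a common correlation rho, with the
    covariance matrix left free under the null model.
    """
    grand = sum(means) / len(means)
    delta = sum((m - grand) ** 2 for m in means) / (1.0 - rho)
    return math.log(1.0 + delta)

def n_for_power(f0, target=0.80):
    """Smallest N whose non-centrality (N - 1) * f0 reaches the target power."""
    n = 2
    while power_df2((n - 1) * f0) < target:
        n += 1
    return n

means = [0.8, 1.0, 1.2]
required_n = {rho: n_for_power(f0_equal_means(means, rho))
              for rho in (0.0, 0.2, 0.5, 0.8, -0.2)}
print(required_n)
```

Under these assumptions the required sample size falls as the correlation rises and is largest for the negative correlation, as in the table, with values within a case or two of those tabled.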

Concluding Remarks

This paper has presented an approach to power analysis, developed by Satorra and Saris for structural equation models, that can be adapted to a very wide range of designs. The approach has three related applications.

First, when carrying out power analyses for studies, power frequently depends in complex ways on the relationships between the variables. For example, the power to detect a difference in a repeated measures design depends upon the correlation between the variables. With this approach it is possible to give power estimates based on a 'best guess', and on upper and lower limits, for these values.

Second, for some types of studies, adequate power analysis is very complex using other approaches. For example, the power to detect a significant difference between two partial correlations is difficult to calculate.

Third, and finally, the approach can help in choosing the instruments to use in research. Many applied areas of research in health have multiple potential outcome measures; consider, for example, the range of instruments available for the assessment of quality of life. Many of these measures will have been used together in previous studies, and therefore the correlation between them may be known, or able to be estimated. The effect of these correlations on the power of the study can be investigated using this approach, which may affect the choice of measure.

For those unfamiliar with the package, and perhaps unfamiliar with SEM, the learning curve for Mx can be steep. The path diagram tool within Mx is extremely useful – the model is drawn, and restrictions can be added. The program will then use the diagram to produce the Mx syntax which can then be edited. This approach leads to faster, and more error-free, syntax. The author is happy to be contacted by email to attempt to assist with particular problems that readers may encounter. A document is available which describes how SPSS can be used to calculate the power, given the χ^{2 }of the model [see

Appendix: Description of adaptation of approach for other programs.


Finally, for readers who may be interested in further exploration of these issues, it should be noted that an alternative approach to estimating fit in SEM has been presented by MacCallum, Browne and Sugawara.

Competing Interests

None declared.

Acknowledgements

Thanks to Diane Miles and Thom Baguley, for their comments on earlier drafts of this paper, and to Keith Widaman and Frühling Rijsdijk who reviewed this paper, pointing out a number of areas where clarifications and improvements could be made.
