Abstract
Background
Meta-analysis is increasingly used as a key source of evidence synthesis to inform clinical practice. The theory and statistical foundations of meta-analysis continually evolve, providing solutions to many new and challenging problems. In practice, most meta-analyses are performed in general statistical packages or dedicated meta-analysis programs.
Results
Herein, we introduce MetaAnalyst, a novel, powerful, intuitive, and free program for the meta-analysis of a variety of problems. MetaAnalyst is implemented in C# atop the Microsoft .NET framework and features a graphical user interface. The software performs several meta-analysis and meta-regression models for binary and continuous outcomes, as well as analyses for diagnostic and prognostic test studies in the frequentist and Bayesian frameworks. Moreover, MetaAnalyst includes a flexible tool to edit and customize generated meta-analysis graphs (e.g., forest plots) and provides output in many formats (images, Adobe PDF, Microsoft Word-ready RTF). The software architecture allows rapid changes to be made to either the Graphical User Interface (GUI) or the analytic modules.
We verified the numerical precision of MetaAnalyst by comparing its output with that from standard meta-analysis routines in Stata over a large database of 11,803 meta-analyses of binary outcome data and 6,881 meta-analyses of continuous outcome data from the Cochrane Library of Systematic Reviews. Results from analyses of diagnostic and prognostic test studies have been verified in a limited number of meta-analyses versus MetaDiSc and MetaTest. Bayesian statistical analyses use the OpenBUGS calculation engine (and are thus as accurate as the standalone OpenBUGS software).
Conclusion
We have developed and validated a new program for conducting meta-analyses that combines the advantages of existing software for this task.
Background
Systematic reviews of randomized controlled trials or epidemiological studies have emerged as a key source of evidence across medical disciplines [1,2]. A central component of many systematic reviews is meta-analysis, the quantitative synthesis of information across methodologically and epidemiologically similar studies that address the same research question. Meta-analysis increases the statistical power to detect effects for which individual studies may be underpowered. Conversely, in the absence of statistically significant effects, it can increase the power to exclude clinically important differences. Most importantly, meta-analytic methodologies, particularly meta-regression, provide the framework to quantify and explore between-study heterogeneity (between-study dissimilarity) [3].
Meta-analysis is usually performed using computer programs. Herein we present a new program for the Microsoft Windows operating system, MetaAnalyst, and report on its testing versus other widely used and accepted software. MetaAnalyst features an easy and intuitive graphical user interface and has a spreadsheet-based layout. The program was developed by the Tufts Evidence-based Practice Center under contract with the US Agency for Healthcare Research and Quality (AHRQ). It is available for use by the AHRQ-designated Evidence-based Practice Centers for performing meta-analyses in their evidence reports. Additionally, the software is now being made available to all interested investigators worldwide at no cost. The latest version can be obtained from http://tuftscaes.org/meta_analyst/ (last accessed 11/12/2009).
Existing software
Meta-analysis can be performed in various general statistical and numerical analysis environments (e.g., Stata, R/S-plus, Octave/MATLAB), or in dedicated programs (e.g., the Microsoft DOS version of MetaAnalyst, Comprehensive MetaAnalysis, RevMan, MIX [4]). A recent overview [5] compared the features of six graphical user interface packages dedicated to meta-analysis.
Two of the most popular dedicated meta-analysis packages are Comprehensive MetaAnalysis and MIX. The former is a commercial product, costing $1295 for a licence, while the latter is a free plugin for the commercial Microsoft Excel package. Both feature intuitive spreadsheet interfaces for data entry, and provide numerical and graphical output in standard formats. However, both implement only basic methods for the meta-analysis of binary and continuous data (Table 1). In addition, they do not handle meta-analysis of diagnostic and prognostic test studies: for basic meta-analysis of diagnostic test studies, one would have to use yet another specialized program, e.g., MetaDiSc [6] or MetaTest (Joseph Lau). To perform more advanced analyses (such as random effects meta-regression [7], bivariate diagnostic test meta-analysis [8-10], or Bayesian analyses) one would have to carefully specify complicated model statements in a general statistical programming environment.
Table 1. Comparison of Meta-Analysis Software
As shown in Table 1, MetaAnalyst strives to combine the ease of use of standalone meta-analysis packages with the advanced analytic capabilities offered by general statistical packages.
Implementation
MetaAnalyst is written primarily in C#, and runs atop Microsoft's .NET framework. The .NET framework allowed rapid development of an intuitive Windows-based user interface. Data entry and management follow the familiar Microsoft Excel-like spreadsheet layout. We use specialized open-source software libraries to create plots (Zedgraph library for graphs and charts [11]) and reports (iTextSharp [12] document generation toolkit). Although the .NET Common Language Runtime is an open standard, and therefore theoretically platform independent, MetaAnalyst currently runs only on the Windows operating system.
The design of MetaAnalyst is based on the Model-View-Controller design pattern [13], which emphasizes separating the interface from the underlying algorithmic models. This decoupling of the 'backend' from the 'frontend' allows rapid changes to be made to the Graphical User Interface (GUI) without reworking the underlying statistical routines. Indeed, for testing purposes (discussed at length in the Results section), we bypass the frontend entirely and script tests via calls to the backend. We plan to allow advanced users to use this functionality directly, e.g., to run batch analyses. For example, Figure 1 displays sample code to perform a meta-analysis of binary data with the Peto method on the data contained in "my_data.csv".
Figure 1. Example call to the backend from scripting environment.
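For readers unfamiliar with the Peto method named above, the following is a minimal Python sketch of the standard one-step Peto pooled odds ratio. It is not the MetaAnalyst backend call shown in Figure 1 and it does not read "my_data.csv"; the function name `peto_pooled_odds_ratio` and the tuple layout of the input are assumptions made only for this illustration.

```python
import math


def peto_pooled_odds_ratio(studies, z=1.96):
    """Pool 2x2 tables with the one-step Peto method (illustrative only).

    studies: iterable of (events_treated, n_treated, events_control, n_control).
    Returns (odds_ratio, ci_lower, ci_upper).
    """
    sum_o_minus_e = 0.0  # sum of observed minus expected events (treated arm)
    sum_v = 0.0          # sum of hypergeometric variances
    for a, n1, c, n2 in studies:
        n = n1 + n2
        events = a + c
        expected = n1 * events / n
        variance = n1 * n2 * events * (n - events) / (n ** 2 * (n - 1))
        sum_o_minus_e += a - expected
        sum_v += variance
    if sum_v == 0:
        raise ValueError("no study contributes information (all variances are zero)")
    log_or = sum_o_minus_e / sum_v
    se = 1.0 / math.sqrt(sum_v)
    return (math.exp(log_or),
            math.exp(log_or - z * se),
            math.exp(log_or + z * se))
```

A call such as `peto_pooled_odds_ratio([(5, 100, 15, 100), (8, 120, 12, 118)])` returns the pooled odds ratio with an approximate 95% confidence interval; the counts here are made up.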
For Bayesian analyses, we invoke OpenBUGS [14] on the backend and then present the output to the user via the MetaAnalyst interface. Using OpenBUGS for Bayesian analyses provides two major benefits. First, OpenBUGS is a popular piece of software that has been thoroughly tested by the statistical community. Second, it incorporates a programming language that enables us to implement in MetaAnalyst any model that can be fit in OpenBUGS. We use IronPython [15], an implementation of the Python [16] programming language that runs on the .NET virtual machine, to facilitate rapid data processing and text manipulation. This is particularly useful for file I/O and for our interaction with the OpenBUGS library, which requires us to generate model, data and initial value text files dynamically and write them to disk (see Figure 2).
Figure 2. Schematic depiction of MetaAnalyst/BUGS interaction.
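The file-generation step described above can be sketched as follows. This is a hedged illustration in plain Python, not the code used by MetaAnalyst: the function names, the file names (`model.txt`, `data.txt`, `inits.txt`) and the example model (a textbook random effects model for binary data) are all assumptions. The only convention relied upon is that BUGS data and initial value files are written as S-style `list(...)` expressions.

```python
import os
import tempfile

# Illustrative BUGS model text; NOT the model shipped with MetaAnalyst.
EXAMPLE_MODEL = """model {
  for (i in 1:N) {
    rc[i] ~ dbin(pc[i], nc[i])
    rt[i] ~ dbin(pt[i], nt[i])
    logit(pc[i]) <- mu[i]
    logit(pt[i]) <- mu[i] + delta[i]
    mu[i] ~ dnorm(0.0, 1.0E-5)
    delta[i] ~ dnorm(d, tau)
  }
  d ~ dnorm(0.0, 1.0E-6)
  tau ~ dgamma(0.001, 0.001)
}
"""


def format_bugs_list(values):
    """Render a dict of scalars and flat sequences as an S-style list(...)."""
    parts = []
    for name, value in values.items():
        if isinstance(value, (list, tuple)):
            parts.append("%s = c(%s)" % (name, ", ".join(str(v) for v in value)))
        else:
            parts.append("%s = %s" % (name, value))
    return "list(" + ", ".join(parts) + ")"


def write_bugs_inputs(directory, model_text, data, inits):
    """Write model, data and initial value files; return their paths."""
    contents = {
        "model": ("model.txt", model_text),
        "data": ("data.txt", format_bugs_list(data) + "\n"),
        "inits": ("inits.txt", format_bugs_list(inits) + "\n"),
    }
    paths = {}
    for key, (filename, text) in contents.items():
        path = os.path.join(directory, filename)
        with open(path, "w") as handle:
            handle.write(text)
        paths[key] = path
    return paths


if __name__ == "__main__":
    out_dir = tempfile.mkdtemp()
    data = {"N": 2, "rt": [3, 5], "nt": [40, 50], "rc": [6, 9], "nc": [41, 52]}
    inits = {"d": 0, "tau": 1}
    print(write_bugs_inputs(out_dir, EXAMPLE_MODEL, data, inits))
```

Generating the three files from one data structure keeps the statistical model text separate from the study data, which is consistent with the claim that new models can be added without changing the interface code.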
Results
Methods Implemented
Statistical methods
Unlike other dedicated meta-analysis packages, MetaAnalyst integrates the capabilities to perform meta-analyses of binary or continuous outcomes and diagnostic or prognostic tests, combining the functionality of software such as MIX and MetaDiSc. For each of these types of outcomes, we have implemented standard meta-analysis routines, as well as some more advanced ones. Table 2 summarizes the analyses MetaAnalyst can currently perform.
Table 2. Methods available in MetaAnalyst
Currently (as of version Beta 3.1) MetaAnalyst implements only one Bayesian model for each type of data (binary, continuous and diagnostic; for model details see http://tuftscaes.org/meta_analyst/AppendixA.html, last accessed 11/12/2009). Because of the way we have interfaced MetaAnalyst with OpenBUGS (Figure 2), we can easily add additional models.
For a detailed explanation of the statistical routines used, including handling of zero cells, please see our methods document at http://tuftscaes.org/meta_analyst/metaanalyst_methods.html (last accessed 11/12/2009).
Exploratory and sensitivity analyses
MetaAnalyst automates cumulative meta-analysis, leave-one-out sensitivity analysis and subgroup analysis. In cumulative meta-analysis one reorders the studies according to a covariate (e.g., increasing year of publication) and re-estimates the summary effect at each step, i.e., each time a new study is added. It is typically a graphical analysis that plots the aggregate overall estimate at each step [17,18]. This elucidates the evolution, or pattern, of evidence over time. Leave-one-out analyses explore the influence of individual studies as follows: if there are n studies in the meta-analysis at hand, plot n summary estimates, each corresponding to leaving one of the n studies out of the calculation. This plot reveals influential studies, because the overall estimate is notably perturbed when they are left out of the analysis. Subgroup analysis is a tool for exploring the effects of a treatment on population subgroups, e.g., women older than 50 years versus younger women. This is done by conducting separate meta-analyses on the respective subgroups and plotting the overall estimates for both.
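The two procedures just described can be sketched in a few lines of Python. This is an illustration under stated assumptions, not MetaAnalyst's implementation: it pools with a simple fixed effect inverse-variance estimator, and the function names and the `(year, effect, variance)` tuple layout are hypothetical.

```python
def pool_fixed(effects, variances):
    """Fixed effect inverse-variance pooled estimate and its variance."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    estimate = sum(w * y for w, y in zip(weights, effects)) / total
    return estimate, 1.0 / total


def cumulative_meta_analysis(studies):
    """Re-estimate the summary effect each time a study is added.

    studies: list of (year, effect, variance); ordered here by year.
    Returns a list of (year_of_last_added_study, pooled_estimate).
    """
    ordered = sorted(studies, key=lambda s: s[0])
    steps = []
    for k in range(1, len(ordered) + 1):
        subset = ordered[:k]
        estimate, _ = pool_fixed([s[1] for s in subset], [s[2] for s in subset])
        steps.append((subset[-1][0], estimate))
    return steps


def leave_one_out(studies):
    """Pooled estimate with each study omitted in turn (input order)."""
    estimates = []
    for i in range(len(studies)):
        rest = studies[:i] + studies[i + 1:]
        estimate, _ = pool_fixed([s[1] for s in rest], [s[2] for s in rest])
        estimates.append(estimate)
    return estimates
```

Plotting the output of `cumulative_meta_analysis` against year gives the cumulative plot; a study whose omission moves the `leave_one_out` estimate far from the others is an influential one.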
MetaAnalyst generates different graphical output suitable to the data at hand. Figures 3 and 4 summarize the plots available.
User Interface
The GUI comprises two tabs: one for data entry and editing and the other for displaying the results of analyses (see Figure 5). The help panel at the bottom of the tab is always available and provides context-specific explanations and instructions for the user. The main data manipulation tool is a spreadsheet with a standard data-entry interface. Data can either be entered by hand, or imported from Excel (xls) or Comma Separated Value (csv) files via an import 'wizard'. MetaAnalyst uses its own custom data file format to save data, which bundles comma-separated study data with some metadata about the meta-analysis (for example, data type and covariate names).
Figure 5. Screenshot of the Data Entry screen in MetaAnalyst.
While editing data, the outcome and corresponding confidence intervals are updated dynamically. The outcome metric can be changed via a right-click menu (e.g., from odds ratio to risk difference) and the outcomes will be recomputed automatically. While Figure 5 shows binary data being manipulated, the interface is analogous for continuous and diagnostic data.
The user can also provide additional numeric and string variables that describe characteristics of the analyzed studies. By convention, user-added numeric variables are termed covariates. Typically, covariates are used as explanatory variables in meta-regression analyses. To use covariates in the analyses a user has to activate them (by right-clicking on the corresponding covariate name and choosing the respective option). When at least one covariate is activated, fixed and random effects meta-regression becomes available as an analytic option in the program's menus. If no covariate is activated, meta-regression is not available as an option. To perform meta-regression with several explanatory variables, the user simply activates the corresponding covariates.
User-added string variables are termed labels and are typically used to provide textual descriptions, or to specify subgroups for subgroup analyses. When a dataset contains at least one label, the program allows the user to perform subgroup analyses according to the categories defined by the selected label. The subgroups are automatically named according to the contents of the label. Labels are ignored in meta-regression analyses (though displayed in plots when pertinent). For example, suppose the user adds a label "country". Further suppose that studies 1 and 2 are labelled "United States" while studies 3 and 4 are labelled "India". Then a subgroup analysis performed using the "country" label will automatically plot the overall (pooled) effect for the studies labelled as being conducted in the "United States" (1 and 2), as well as the pooled estimate for those labelled as being conducted in "India" (3 and 4).
Studies can be included in or excluded from a particular analysis by selecting or deselecting the corresponding checkbox in the first column. Once the data are entered, the outcome metric is set, and any studies and covariates to be excluded from the analysis are deselected, users can perform an analysis via the drop-down 'Analysis' menu, at which point they will be prompted with the dialogue shown in Figure 6. Here users can pick the model to be used in the analysis, and specify the parameters for the selected model. The program provides editable default values for many of the options.
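As an illustration of what recomputing the outcome under a different metric involves, the sketch below derives an odds ratio and a risk difference, each with a Wald-type 95% confidence interval, from the same 2x2 counts. The function names are hypothetical and the formulas are the usual large-sample ones; this is not taken from the MetaAnalyst source, and in particular the zero-cell handling described in the methods document is not reproduced (the sketch simply refuses such tables).

```python
import math


def odds_ratio(a, n1, c, n2, z=1.96):
    """Odds ratio and Wald CI from events/total in two arms."""
    b, d = n1 - a, n2 - c
    if min(a, b, c, d) == 0:
        raise ValueError("zero cell: a continuity correction would be needed")
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    return (math.exp(log_or),
            math.exp(log_or - z * se),
            math.exp(log_or + z * se))


def risk_difference(a, n1, c, n2, z=1.96):
    """Risk difference and Wald CI from events/total in two arms."""
    p1, p2 = a / n1, c / n2
    rd = p1 - p2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return rd, rd - z * se, rd + z * se
```

Both functions take the same four counts, so switching the displayed metric only requires calling the other function on the stored row.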
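The "country" example can be sketched as follows: studies are grouped by the value of the chosen label and each group is pooled separately. This is a minimal illustration assuming a fixed effect inverse-variance estimator; the function names, the dictionary keys and the effect sizes are invented for the example and are not part of MetaAnalyst.

```python
from collections import OrderedDict


def pool_fixed(effects, variances):
    """Fixed effect inverse-variance pooled estimate."""
    weights = [1.0 / v for v in variances]
    return sum(w * y for w, y in zip(weights, effects)) / sum(weights)


def subgroup_analysis(studies, label):
    """Pool studies separately for each distinct value of a label.

    studies: list of dicts with 'effect', 'variance' and label fields.
    Returns an ordered mapping {label value: pooled estimate}.
    """
    groups = OrderedDict()
    for study in studies:
        groups.setdefault(study[label], []).append(study)
    return OrderedDict(
        (name, pool_fixed([s["effect"] for s in members],
                          [s["variance"] for s in members]))
        for name, members in groups.items()
    )


# Hypothetical data: studies 1-2 from the United States, 3-4 from India.
example = [
    {"id": 1, "country": "United States", "effect": 0.1, "variance": 0.02},
    {"id": 2, "country": "United States", "effect": 0.3, "variance": 0.02},
    {"id": 3, "country": "India", "effect": 0.5, "variance": 0.02},
    {"id": 4, "country": "India", "effect": 0.7, "variance": 0.02},
]
```

Calling `subgroup_analysis(example, "country")` yields one pooled estimate per country, keyed by the label contents, which mirrors how the subgroups are named automatically.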
Figure 6. Binary analysis specifications.
After the analysis runs to completion, the results will be displayed in the results tab (Figure 7).
Figure 7. Results tab.
The left-hand side of the results tab shows a tree populated with collapsible parent nodes for each analysis that has been run ("Analysis 1" and "Analysis 2", in the figure). Each of these parent nodes has child nodes corresponding to the various tabular and graphical outputs associated with the analysis. Clicking on one of these child nodes, e.g., "Forest Plot", scrolls the corresponding graphic into view.
All of the tables and graphics can be copied to the user's clipboard via a right-click menu (and subsequently pasted into other programs). Additionally, the tables and any text therein can be edited and formatted via an embedded editor. Moreover, the user can edit forest plots using the forest plot editor tool, as seen in Figure 8. Using this interface, the user can change which columns are displayed in the forest plot (e.g., study sizes) and in which order, as well as the symbols used for point and overall estimates, the scale of the plot (minimum and maximum values), and more.
Figure 8. Forest Plot Editing.
In addition to the interactive output shown in Figures 7 and 8, Adobe PDF and Microsoft Word-ready Rich Text Format report files are generated and saved. These include all of the tabular and graphical output. Finally, the graphics themselves are automatically output to separate image (.PNG) files, for use in other programs. Our aim was to provide the output in as many formats as possible, in order to provide flexibility to the user.
Validation
To validate the computational results, we systematically tested MetaAnalyst against the results from metan version 1.86 in Stata. We compared the output of the programs in 11,803 meta-analyses of binary outcomes and 6,881 meta-analyses of continuous outcomes from issue 4 of the Cochrane Library of Systematic Reviews, 2005. This database of meta-analyses was described elsewhere [19], and it includes meta-analyses that have very different characteristics.
Over the 11,803 analyses (over all methods and all metrics) for binary data and 6,881 analyses over continuous data, we recorded the minimum of the absolute and normalized differences between the outputs from Stata and MetaAnalyst, where the normalized difference is defined as $\Delta_{rel} = |\Theta_{Stata} - \Theta_{MA}| / \Theta_{Stata}$. Here $\Theta$ is any numerical output of the program, such as a summary effect size for each meta-analysis metric and method, its variance, and the $Q$, $\tau^2$ and $I^2$ statistics (for random effects models). We aimed to identify differences that are beyond those introduced by machine (im)precision. If there are such differences, both the absolute and normalized difference between the two numbers will be relatively large. When the numbers in question are very large, the absolute difference might be relatively large (merely because of machine imprecision) whereas their normalized difference will be very small. Conversely, for small magnitudes the normalized difference can be relatively large (in the absence of computational errors), while the absolute difference is very small.
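The comparison rule described in this paragraph can be written compactly. The sketch below is an illustration, not the authors' test script: the function name `discrepancy` is hypothetical, and taking the absolute value of the Stata result in the denominator (so that negative estimates are handled) is an assumption added here.

```python
def discrepancy(theta_stata, theta_ma):
    """Minimum of the absolute and the normalized difference.

    A genuine computational disagreement makes BOTH differences large,
    so the smaller of the two is the quantity worth recording.
    """
    abs_diff = abs(theta_stata - theta_ma)
    if theta_stata == 0:
        # Normalized difference is undefined; fall back to the absolute one.
        return abs_diff
    rel_diff = abs_diff / abs(theta_stata)  # abs() in denominator: assumption
    return min(abs_diff, rel_diff)
```

For two very large outputs that differ only in the last digits, the normalized difference is the smaller quantity; for two very small outputs, the absolute difference is. In both cases the recorded value stays small, so only disagreements that are large on both scales stand out.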
Over the binary set of meta-analyses, the maximum discrepancy was 2.9 × 10^{-6}. For continuous data analyses, the maximum discrepancy was 7.4 × 10^{-5}. These maximum discrepancies appeared in meta-analyses with extreme between-study heterogeneity, and are ascribed to rounding errors (version 1.86 of metan does not use double precision for all internal calculations as MetaAnalyst does).
As previously discussed, Bayesian analyses are run through OpenBUGS, and so the output is as thoroughly tested as OpenBUGS.
Testing for diagnostic test accuracy analyses is not as extensive, because we have not found a suitable reference scripting environment to test MetaAnalyst output against. However, the simple diagnostic test methods are based on weighted proportions (sensitivity, specificity), relative risks (likelihood ratios), odds ratios and regression (SROC, meta-regression). These methods use the same computational algorithms as those for binary data and so have been tested. Two diagnostic methods remain to be sufficiently tested: bivariate meta-analysis of sensitivity and specificity for diagnostic tests, and random effects SROC. These analyses are flagged as not thoroughly checked when they are requested by the user. However, they will soon be reconfigured to run in OpenBUGS so that they will be validated as well.
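To make the correspondence with binary-data quantities concrete, the following sketch computes the per-study diagnostic measures named above from a single 2x2 table of test results against the reference standard. The function name and argument order are assumptions for illustration; the pooling step and any zero-cell correction used by MetaAnalyst are not shown.

```python
def diagnostic_measures(tp, fp, fn, tn):
    """Per-study accuracy measures from a diagnostic 2x2 table.

    tp/fn: diseased subjects testing positive/negative.
    fp/tn: non-diseased subjects testing positive/negative.
    """
    if min(tp, fp, fn, tn) == 0:
        raise ValueError("zero cell: ratios are undefined without a correction")
    sensitivity = tp / (tp + fn)           # a proportion
    specificity = tn / (tn + fp)           # a proportion
    false_positive_rate = fp / (fp + tn)
    false_negative_rate = fn / (tp + fn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        # Likelihood ratios are ratios of two proportions (like a relative risk).
        "lr_positive": sensitivity / false_positive_rate,
        "lr_negative": false_negative_rate / specificity,
        # The diagnostic odds ratio is an ordinary odds ratio of the table.
        "diagnostic_odds_ratio": (tp * tn) / (fp * fn),
    }
```

Sensitivity and specificity are single proportions, the likelihood ratios have the same algebraic form as a relative risk, and the diagnostic odds ratio is an ordinary odds ratio, which is why routines validated on binary outcome data can be reused here.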
Discussion
In order to attain widespread use, meta-analysis software must be easy to use. In particular, requiring that users learn an entire language to run their analyses will prevent general adoption of a program. Dedicated meta-analysis programs such as MIX, Comprehensive MetaAnalysis, and MetaDiSc are appealing because they are quick to learn. On the other hand, by their very nature, such programs are less flexible than general statistical packages. For example, they have no scripting functionality, which precludes their use for large-scale empirical research or simulation studies. Further, they are not able to perform advanced analyses, such as bivariate diagnostic test meta-analyses, because they cannot maximize difficult likelihood functions, and they cannot be readily extended to include additional analytic options.
Conclusion
MetaAnalyst mitigates several of the weaknesses inherent to dedicated meta-analysis packages. It incorporates their ease of use, while providing advanced analytic methods that otherwise would have to be implemented in packages such as Stata, R and SAS by a statistical programmer.
The current version of MetaAnalyst is made available free of charge to interested researchers. It runs on any version of Windows that is compatible with the .NET platform (comprising Windows 98, ME, NT 4.0, 2000, XP and Vista). We have already started development of a cross-platform, completely open-source version of the software that uses the R statistical language and will be readily modifiable and extendable by any interested party (http://www.github.com/bwallace/OpenMetaanalyst).
Availability and requirements
An installer file for the latest version of MetaAnalyst has been provided as an additional/supplemental file for the peer reviewers [Additional File 1]. Alternatively, the latest version can be obtained from http://tuftscaes.org/meta_analyst/ (last accessed 11/12/2009). MetaAnalyst is made readily available to any scientist wishing to use it for non-commercial purposes, without any restriction (including the need for a material transfer agreement).
Additional file 1. MetaAnalyst installer. This is a Windows installer for the MetaAnalyst software described in this manuscript.
Format: MSI. Size: 8.3 MB.
Competing interests
The authors declare that they have no competing interests.
Authors' contributions
BW developed MetaAnalyst, porting some code to the program that was originally written by CS. TT and BW designed the testing of MetaAnalyst, and BW performed all analyses. All authors interpreted the results. BW wrote the first draft of the manuscript which was critically commented on by all other authors. All authors have read and approved the final manuscript.
Acknowledgements
Funding: MetaAnalyst has been developed with funding from the National Center for Research Resources (NCRR, grant number R33 RR17109) and the Agency for Healthcare Research and Quality (AHRQ, contract number 290020022 and R01HS018574).
References

1. Lau J, Schmid CH, Chalmers TC: Cumulative meta-analysis of clinical trials builds evidence for exemplary medical care. J Clin Epidemiol 1995, 48:45-57.

2. Mosteller F, Colditz GA: Understanding research synthesis (meta-analysis). Annu Rev Public Health 1996, 17:1-23.

3. Lau J, Ioannidis JP, Schmid CH: Summing up evidence: one answer is not always enough. Lancet 1998, 351:123-27.

4. Bax L, Yu LM, Ikeda N, Tsuruta H, Moons KG: Development and validation of MIX: comprehensive free software for meta-analysis of causal research data. BMC Med Res Methodol 2006, 6:50.

5. Bax L, Yu LM, Ikeda N, Moons KG: A systematic comparison of software dedicated to meta-analysis of causal studies. BMC Med Res Methodol 2007, 7:40.

6. Zamora J, Abraira V, Muriel A, Khan K, Coomarasamy A: MetaDiSc: a software for meta-analysis of test accuracy data. BMC Med Res Methodol 2006, 6:31.

7. van Houwelingen HC, Arends LR, Stijnen T: Advanced methods in meta-analysis: multivariate approach and meta-regression. Stat Med 2002, 21:589-624.

8. Harbord RM, Deeks JJ, Egger M, Whiting P, Sterne JA: A unification of models for meta-analysis of diagnostic accuracy studies. Biostatistics 2007, 8:239-51.

9. Reitsma JB, Glas AS, Rutjes AW, Scholten RJ, Bossuyt PM, Zwinderman AH: Bivariate analysis of sensitivity and specificity produces informative summary measures in diagnostic reviews. J Clin Epidemiol 2005, 58:982-90.

10. Rutter CM, Gatsonis CA: A hierarchical regression approach to meta-analysis of diagnostic test accuracy evaluations. Stat Med 2001, 20:2865-84.

11. Champion J: Zedgraph. [http://zedgraph.org] 2009.

12. iTextSharp [http://itextsharp.sourceforge.net/]

13. Gamma E, Helm R, Johnson R, Vlissides J: Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley; 1995.

15. IronPython [http://www.codeplex.com/IronPython] 2009.

16. van Rossum G: Python programming language. [http://www.python.org] 2009.

17. Lau J, Antman EM, Jimenez-Silva J, Kupelnick B, Mosteller F, Chalmers TC: Cumulative meta-analysis of therapeutic trials for myocardial infarction. N Engl J Med 1992, 327:248-54.

18. Lau J, Schmid CH, Chalmers TC: Cumulative meta-analysis of clinical trials builds evidence for exemplary medical care. J Clin Epidemiol 1995, 48:45-57.

19. Ioannidis JP, Trikalinos TA: The appropriateness of asymmetry tests for publication bias in meta-analyses: a large survey. CMAJ 2007, 176:1091-96.

20. Lau J, Ioannidis JP, Terrin N, Schmid CH, Olkin I: The case of the misleading funnel plot. BMJ 2006, 333:597-600.

21. Terrin N, Schmid CH, Lau J: In an empirical evaluation of the funnel plot, researchers could not visually identify publication bias. J Clin Epidemiol 2005, 58:894-901.