MAAMD: a workflow to standardize meta-analyses and comparison of Affymetrix microarray data
1 Department of Bioengineering, University of California, San Diego, La Jolla, CA, USA
2 San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA, USA
3 Cincinnati Children’s Hospital Research Foundation, Cincinnati, OH, USA
4 Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA
5 Department of Pharmacology, University of California, San Diego, La Jolla, CA, USA
BMC Bioinformatics 2014, 15:69 doi:10.1186/1471-2105-15-69. Published: 12 March 2014
Mandatory deposit of raw microarray data files for public access, prior to study publication, provides significant opportunities to conduct new bioinformatics analyses within and across multiple datasets. Analysis of raw microarray data files (e.g. Affymetrix CEL files) can be time-consuming and complex, and requires fundamental computational and bioinformatics skills. Analytical workflows that automate these tasks simplify processing, improve its efficiency, and standardize multiple and sequential analyses. Once installed, workflows handle the tedious steps required to run rapid intra- and inter-dataset comparisons.
We developed a workflow in Kepler to facilitate and standardize Meta-Analysis of Affymetrix Microarray Data (MAAMD). Two freely available stand-alone software tools, R and AltAnalyze, are embedded in MAAMD. The inputs to MAAMD are user-editable CSV files, which contain sample information and parameters describing the locations of input files and required tools. MAAMD was tested by analyzing four different GEO datasets from mice and Drosophila.
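To make the idea of a user-editable parameter file concrete, the following is a minimal sketch of reading such a file. It is not MAAMD's actual input format: the two-column layout, the parameter names (`r_path`, `altanalyze_path`, `input_dir`), the paths, and the helper `read_parameters` are all assumptions chosen for illustration.

```python
import csv
import io

# Hypothetical parameter file. The column names, parameter names, and paths
# are illustrative assumptions, not MAAMD's real input schema.
EXAMPLE_CSV = """parameter,value
r_path,/usr/bin/Rscript
altanalyze_path,/opt/AltAnalyze
input_dir,/data/cel_files
"""


def read_parameters(text):
    """Return a dict mapping parameter names to values from two-column CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["parameter"].strip(): row["value"].strip() for row in reader}


params = read_parameters(EXAMPLE_CSV)
print(params["input_dir"])
```

Keeping tool and data locations in a plain CSV file means a user can retarget an analysis by editing a spreadsheet cell rather than modifying the workflow itself.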
MAAMD automates data downloading, data organization, data quality control assessment, differential gene expression analysis, clustering analysis, pathway visualization, gene-set enrichment analysis, and cross-species orthologous-gene comparisons. MAAMD was used to identify gene orthologues responding to hypoxia or hyperoxia in both mice and Drosophila. The entire set of analyses for four datasets (34 microarrays in total) finished in approximately one hour.
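The cross-species comparison step can be pictured as intersecting two lists of differentially expressed genes through an orthologue map. The sketch below illustrates that general idea only; it is not the procedure implemented in MAAMD or AltAnalyze. The function `shared_orthologues`, the gene identifiers, and the mapping are hypothetical placeholders.

```python
def shared_orthologues(mouse_genes, fly_genes, fly_to_mouse):
    """Return sorted (mouse_gene, fly_gene) pairs in which both genes are
    differentially expressed and are linked by the orthologue map."""
    mouse_set = set(mouse_genes)
    pairs = []
    for fly_gene in fly_genes:
        # A fly gene may map to zero, one, or several mouse orthologues.
        for mouse_gene in fly_to_mouse.get(fly_gene, ()):
            if mouse_gene in mouse_set:
                pairs.append((mouse_gene, fly_gene))
    return sorted(pairs)


# Placeholder identifiers; these are not real gene names or results.
mouse_de = ["MouseGeneA", "MouseGeneB", "MouseGeneC"]
fly_de = ["FlyGene1", "FlyGene2"]
fly_to_mouse = {
    "FlyGene1": ["MouseGeneA"],
    "FlyGene2": ["MouseGeneD"],
    "FlyGene3": ["MouseGeneB"],
}

print(shared_orthologues(mouse_de, fly_de, fly_to_mouse))
```

Only `FlyGene1` and `MouseGeneA` survive here: `FlyGene2` maps to a mouse gene that is not differentially expressed, and `FlyGene3` is not in the fly list.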
MAAMD saves time, minimizes the required computer skills, and offers a standardized procedure for users to analyze microarray datasets and make new intra- and inter-dataset comparisons.