Database

Microarray meta-analysis database (M2DB): a uniformly pre-processed, quality controlled, and manually curated human clinical microarray database

Wei-Chung Cheng1, Min-Lung Tsai2, Cheng-Wei Chang1, Ching-Lung Huang1, Chaang-Ray Chen1, Wun-Yi Shu3, Yun-Shien Lee4,5, Tzu-Hao Wang5,6, Ji-Hong Hong7,8, Chia-Yang Li1 and Ian C Hsu1*

Author Affiliations

1 Department of Biomedical Engineering and Environmental Sciences, National Tsing Hua University, No. 101, Section 2, Kuang-Fu Road, Hsinchu, 300, Taiwan

2 Institute of Athletics, National Taiwan Sport University, No. 16, Section 1, Shuan-Shih Road, Taichung, 404, Taiwan

3 Institute of Statistics, National Tsing Hua University, No. 101, Section 2, Kuang-Fu Road, Hsinchu, 300, Taiwan

4 Department of Biotechnology, Ming Chuan University, 5 De Ming Rd., Gui Shan District, Taoyuan, 333, Taiwan

5 Genomic Medicine Research Core Laboratory, Chang Gung Memorial Hospital, No.5, Fuxing St., Taoyuan, 333, Taiwan

6 Department of Obstetrics and Gynecology, Lin-Kou Medical Center, Chang Gung Memorial Hospital and Chang Gung University, No.5, Fuxing St., Taoyuan, 333, Taiwan

7 Department of Radiation Oncology, Chang Gung Memorial Hospital, No.5, Fuxing St., Taoyuan, 333, Taiwan

8 Department of Medical Imaging and Radiological Science, Chang Gung University, No.259 Wen-Hwa 1st Road, Kwei-Shan, Taoyuan, 333, Taiwan

BMC Bioinformatics 2010, 11:421  doi:10.1186/1471-2105-11-421

Published: 10 August 2010



Over the past decade, gene expression microarray studies have greatly expanded our knowledge of the genetic mechanisms of human diseases. Meta-analysis of the substantial amounts of accumulated data, which integrates valuable information from multiple studies, is becoming more important in microarray research. However, collecting data of special interest from public microarray repositories often presents major practical problems. Moreover, including low-quality data may significantly reduce meta-analysis efficiency.


M2DB is a manually curated human microarray database designed for easy querying based on clinical information and for interactive retrieval of either raw or uniformly pre-processed data, along with a set of quality-control (QC) metrics. The database contains more than 10,000 previously published Affymetrix GeneChip arrays run on human clinical specimens. M2DB allows online querying according to a flexible combination of five clinical annotations describing disease state and sampling location. These annotations were manually curated using controlled vocabularies, based on information obtained from GEO, ArrayExpress, and published papers. For array-level quality assessment, the online query provides sets of QC metrics generated using three available QC algorithms, so arrays with poor data quality can easily be excluded in the query interface. For gene-based filtering, the query provides values from two algorithms. Raw data and three kinds of pre-processed data are available for download.
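The query model described above (combine clinical annotations, then drop arrays that fail quality control) can be illustrated with a minimal sketch. This is not M2DB's implementation or interface: the `ArrayRecord` type, its field names, the `select_arrays` helper, and the sample records are all hypothetical, and only two annotation fields are shown rather than the five the database offers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArrayRecord:
    """Hypothetical per-array record; field names are illustrative only."""
    array_id: str
    disease_state: str
    sampling_location: str
    qc_pass: bool  # assumed summary flag derived from QC metrics


def select_arrays(records, require_qc=True, **annotations):
    """Return records matching every given annotation, optionally QC-filtered."""
    selected = []
    for rec in records:
        # Exclude arrays flagged as poor quality when QC filtering is on.
        if require_qc and not rec.qc_pass:
            continue
        # Keep the record only if all requested annotations match exactly.
        if all(getattr(rec, key) == value for key, value in annotations.items()):
            selected.append(rec)
    return selected


# Made-up example records, not data from the database.
records = [
    ArrayRecord("A1", "tumor", "breast", True),
    ArrayRecord("A2", "tumor", "breast", False),
    ArrayRecord("A3", "normal", "breast", True),
    ArrayRecord("A4", "tumor", "lung", True),
]

hits = select_arrays(records, disease_state="tumor", sampling_location="breast")
print([rec.array_id for rec in hits])  # prints ['A1']
```

Passing `require_qc=False` keeps the poor-quality arrays in the result, which makes it easy to compare a meta-analysis with and without QC filtering.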


M2DB provides a user-friendly interface for selecting QC parameters, sample clinical annotations, and data formats, which helps users obtain clinical metadata. The database lowers the entry threshold for meta-analysis and offers an integrated process for carrying it out. We hope that this research will promote the further evolution of microarray meta-analysis.