Open Access Methodology article

Preferred analysis methods for Affymetrix GeneChips. II. An expanded, balanced, wholly-defined spike-in dataset

Qianqian Zhu 1,2,7, Jeffrey C Miecznikowski 2,5 and Marc S Halfon 1,3,4,6*

Author Affiliations

1 Department of Biochemistry, State University of New York at Buffalo, Buffalo, NY 14214, USA

2 Department of Biostatistics, State University of New York at Buffalo, Buffalo, NY 14214, USA

3 Department of Biology, State University of New York at Buffalo, Buffalo, NY 14260, USA

4 New York State Center of Excellence in Bioinformatics and the Life Sciences, Buffalo, NY 14203, USA

5 Department of Biostatistics, Roswell Park Cancer Institute, Buffalo, NY 14263, USA

6 Department of Molecular and Cellular Biology, Roswell Park Cancer Institute, Buffalo, NY 14263, USA

7 Current Address: Center for Human Genome Variation, Duke University, Durham, NC 27708, USA

BMC Bioinformatics 2010, 11:285  doi:10.1186/1471-2105-11-285

Published: 27 May 2010

Abstract

Background

The rise in popularity of DNA microarrays has been accompanied by a surge of proposed methods for analyzing microarray data. Fully controlled "spike-in" datasets are an invaluable but rare tool for assessing the performance of these methods.

Results

We generated a new, wholly defined Affymetrix spike-in dataset consisting of 18 microarrays. Over 5700 RNAs are spiked in at relative concentrations ranging from 1- to 4-fold, and the arrays from each condition are balanced with respect to both total RNA amount and degree of positive versus negative fold change. We use this new "Platinum Spike" dataset to evaluate microarray analysis routes and contrast the results with those achieved using our earlier Golden Spike dataset.

Conclusions

We present updated best-route methods for Affymetrix GeneChip analysis and demonstrate that the degree of "imbalance" in gene expression has a significant effect on the performance of these methods.