Open Access Methodology article

Improving reliability and absolute quantification of human brain microarray data by filtering and scaling probes using RNA-Seq

Jeremy A Miller, Vilas Menon, Jeff Goldy, Ajamete Kaykas, Chang-Kyu Lee, Kimberly A Smith, Elaine H Shen, John W Phillips, Ed S Lein and Mike J Hawrylycz*

Author Affiliations

Allen Institute for Brain Science, 551 N 34th Street, Seattle, WA 98103, USA

BMC Genomics 2014, 15:154  doi:10.1186/1471-2164-15-154

Published: 24 February 2014

Abstract

Background

High-throughput sequencing is gradually replacing microarrays as the preferred method for studying mRNA expression levels, providing nucleotide resolution and accurately measuring absolute expression levels of almost any transcript, known or novel. However, existing microarray data from clinical, pharmaceutical, and academic settings represent valuable and often underappreciated resources, and methods for assessing and improving the quality of these data are lacking.

Results

To quantitatively assess the quality of microarray probes, we directly compare RNA-Seq to Agilent microarrays by processing 231 unique samples from the Allen Human Brain Atlas using RNA-Seq. Both techniques provide highly consistent, highly reproducible gene expression measurements in adult human brain, with RNA-Seq slightly outperforming microarray results overall. We show that RNA-Seq can be used as ground truth to assess the reliability of most microarray probes, remove probes with off-target effects, and scale probe intensities to match the expression levels identified by RNA-Seq. These sequencing scaled microarray intensities (SSMIs) provide more reliable, quantitative estimates of absolute expression levels for many genes when compared with unscaled intensities. Finally, we validate this result in two human cell lines, showing that linear scaling factors can be applied across experiments using the same microarray platform.
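As an illustration only, the filter-and-scale idea described above can be sketched in a few lines of Python. The abstract does not specify the actual procedure, so everything here is an assumption: the function names, the Pearson correlation cutoff `min_r`, the per-probe least-squares fit, and the dictionary layout (probe ID mapped to a list of per-sample values, with RNA-Seq values for the matching gene in the same sample order) are hypothetical choices, not the authors' method.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def fit_linear_scaling(intensity, rnaseq):
    """Least-squares slope and intercept mapping probe intensity to RNA-Seq level."""
    mx, my = mean(intensity), mean(rnaseq)
    sxx = sum((a - mx) ** 2 for a in intensity)
    if sxx == 0:
        # A constant probe carries no information; fall back to the RNA-Seq mean.
        return 0.0, my
    sxy = sum((a - mx) * (b - my) for a, b in zip(intensity, rnaseq))
    slope = sxy / sxx
    return slope, my - slope * mx


def scale_probes(array_data, rnaseq_data, min_r=0.7):
    """Drop probes that track RNA-Seq poorly, then rescale the remaining ones.

    array_data and rnaseq_data map probe ID -> per-sample values (same order).
    min_r is an assumed, illustrative cutoff, not a value from the paper.
    """
    scaled = {}
    for probe, intensity in array_data.items():
        reference = rnaseq_data[probe]
        if pearson(intensity, reference) < min_r:
            continue  # treated as unreliable under this hypothetical filter
        slope, intercept = fit_linear_scaling(intensity, reference)
        scaled[probe] = [slope * v + intercept for v in intensity]
    return scaled
```

Under these assumptions, the slope and intercept fitted on one data set could in principle be stored per probe and reapplied to another experiment on the same microarray platform, which is the kind of transfer the cell-line validation describes.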

Conclusions

Microarrays provide consistent, reproducible gene expression measurements, which are improved using RNA-Seq as ground truth. We expect that our strategy could be used to improve probe quality for many data sets from major existing repositories.

Keywords: Allen Brain Atlas; Microarray; RNA-Seq; High-throughput sequencing; Transcriptome profiling; Reliability; Gene expression; Brain