Open Access | Methodology article

Transcript-based redefinition of grouped oligonucleotide probe sets using AceView: High-resolution annotation for microarrays

Jun Lu1, Joseph C Lee1, Marc L Salit2 and Margaret C Cam1

Genomics Core Laboratory, National Institute of Diabetes & Digestive & Kidney Diseases, National Institutes of Health, Bethesda, MD 20892 USA

Chemical Science and Technology Laboratory, National Institute of Standards and Technology, Gaithersburg, MD 20899 USA


BMC Bioinformatics 2007, 8:108 doi:10.1186/1471-2105-8-108

Published: 29 March 2007

Abstract

Background

Extracting biological information from high-density Affymetrix arrays is a multi-step process that begins with the accurate annotation of microarray probes. Shortfalls in the original Affymetrix probe annotation have been described; however, few studies have provided rigorous solutions for routine data analysis.

Results

Using AceView, a comprehensive human transcript database, we have reannotated the probes by matching them to RNA transcripts instead of genes. Based on this transcript-level annotation, a new probe set definition was created in which every probe in a probe set maps to a common set of AceView gene transcripts. In addition, using artificial data sets, we found that a minimum probe set size of 4 is necessary for reliable statistical summarization. We further demonstrate that applying the new probe set definition can detect specific transcript variants contributing to differential expression and that it improves cross-platform concordance.
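The regrouping rule described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `regroup_probes`, the probe and transcript identifiers, and the input format (a dictionary from probe ID to the set of transcripts it matches) are all hypothetical. The sketch assumes that probes sharing exactly the same transcript set form one redefined probe set, and it applies the minimum size of 4 reported above. Whether the original method also restricts grouping to probes from the same Affymetrix probe set is not stated in this abstract.

```python
from collections import defaultdict

# Minimum probe set size reported in the abstract for reliable summarization.
MIN_PROBE_SET_SIZE = 4


def regroup_probes(probe_to_transcripts, min_size=MIN_PROBE_SET_SIZE):
    """Group probes that match exactly the same set of transcripts.

    Returns two dicts keyed by frozenset of transcript IDs:
    groups with at least `min_size` probes, and groups below that size.
    """
    groups = defaultdict(list)
    for probe_id, transcripts in probe_to_transcripts.items():
        key = frozenset(transcripts)
        if not key:
            continue  # probe matches no transcript, so it joins no group
        groups[key].append(probe_id)

    kept, too_small = {}, {}
    for key, probes in groups.items():
        target = kept if len(probes) >= min_size else too_small
        target[key] = sorted(probes)
    return kept, too_small


# Hypothetical probe-to-transcript matches (identifiers are made up).
probe_to_transcripts = {
    "p1": {"geneA.a", "geneA.b"},
    "p2": {"geneA.a", "geneA.b"},
    "p3": {"geneA.b", "geneA.a"},
    "p4": {"geneA.a", "geneA.b"},
    "p5": {"geneA.a"},
    "p6": {"geneA.a"},
    "p7": set(),
}

kept, too_small = regroup_probes(probe_to_transcripts)
for transcripts, probes in kept.items():
    print(sorted(transcripts), probes)
```

In this toy input, probes `p1` to `p4` share both variants and form a probe set of usable size, while `p5` and `p6` are specific to one variant but fall below the size threshold, which is the trade-off between transcript resolution and probe set size.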

Conclusion

We conclude that our transcript-level reannotation and redefinition of probe sets complement the original Affymetrix design. Redefinitions introduce probe sets whose sizes may not support reliable statistical summarization; therefore, we advocate using our transcript-level mapping redefinition in a secondary analysis step rather than as a replacement. Knowing which specific transcripts are differentially expressed is important for designing probe/primer pairs for validation. For convenience, we have created custom chip description files (CDFs) and annotation files for our new probe set definitions that are compatible with Bioconductor, Affymetrix Expression Console, or third-party software.

