Open Access Highly Accessed Methodology article

Jetset: selecting the optimal microarray probe set to represent a gene

Qiyuan Li1, Nicolai J Birkbak1, Balazs Gyorffy3, Zoltan Szallasi1,2 and Aron C Eklund1*

Author affiliations

1 Center for Biological Sequence Analysis, Technical University of Denmark, 2800 Lyngby, Denmark

2 Children's Hospital Informatics Program at the Harvard-MIT Division of Health Sciences and Technology (CHIP@HST), Harvard Medical School, Boston, MA 02115, USA

3 Joint Research Laboratory of the Hungarian Academy of Sciences and the Semmelweis University, Semmelweis University 1st Dept of Pediatrics, H-1083 Budapest, Hungary

Citation and License

BMC Bioinformatics 2011, 12:474  doi:10.1186/1471-2105-12-474

Published: 15 December 2011

Abstract

Background

Interpretation of gene expression microarrays requires a mapping from probe set to gene. On many Affymetrix gene expression microarrays, a given gene may be detected by multiple probe sets, which may deliver inconsistent or even contradictory measurements. Therefore, obtaining an unambiguous expression estimate of a pre-specified gene can be a nontrivial but essential task.

Results

We developed scoring methods to assess each probe set for specificity, splice isoform coverage, and robustness against transcript degradation. We used these scores to select a single representative probe set for each gene, thus creating a simple one-to-one mapping between gene and probe set. To test this method, we evaluated concordance between protein measurements and gene expression values, and between sets of genes whose expression is known to be correlated. For both test cases, we identified genes that were nominally detected by multiple probe sets, and we found that the probe set chosen by our method showed stronger concordance.
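The selection step described above can be illustrated with a minimal sketch. It assumes each probe set already has three scores between 0 and 1 (specificity, splice isoform coverage, and robustness against degradation) and, purely for illustration, combines them by multiplication; the abstract does not state how the scores are combined. The probe set identifiers, gene names, and score values below are hypothetical and are not taken from the article.

```python
from typing import Dict, List, NamedTuple


class ProbeSetScore(NamedTuple):
    probe_set: str      # hypothetical probe set identifier
    gene: str           # gene the probe set nominally detects
    specificity: float  # assumed to lie in [0, 1]
    coverage: float     # splice isoform coverage, assumed in [0, 1]
    robustness: float   # robustness against degradation, assumed in [0, 1]


def overall_score(p: ProbeSetScore) -> float:
    # Assumption: the three scores are combined as a simple product.
    return p.specificity * p.coverage * p.robustness


def select_representatives(scores: List[ProbeSetScore]) -> Dict[str, str]:
    """Return a one-to-one mapping from gene to its best-scoring probe set."""
    best: Dict[str, ProbeSetScore] = {}
    for p in scores:
        current = best.get(p.gene)
        # Keep the probe set with the highest combined score for each gene.
        if current is None or overall_score(p) > overall_score(current):
            best[p.gene] = p
    return {gene: p.probe_set for gene, p in best.items()}


# Hypothetical input: GENE_A is detected by two probe sets, GENE_B by one.
example = [
    ProbeSetScore("ps_A1", "GENE_A", 0.9, 1.0, 0.8),
    ProbeSetScore("ps_A2", "GENE_A", 0.5, 1.0, 0.9),
    ProbeSetScore("ps_B1", "GENE_B", 0.7, 0.5, 0.6),
]
mapping = select_representatives(example)
print(mapping)
```

With these made-up scores, GENE_A maps to the probe set with the higher combined score, and GENE_B keeps its only candidate, so every gene ends up with exactly one representative probe set.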

Conclusions

This method provides a simple, unambiguous mapping to allow assessment of the expression levels of specific genes of interest.