Open Access Methodology article

Linking microarray reporters with protein functions

Stan Gaj1,2,3*, Arie van Erk2,4, Rachel IM van Haaften2 and Chris TA Evelo2

Author Affiliations

1 Nutrigenomics Consortium, Top Institute Food and Nutrition, Wageningen, The Netherlands

2 BiGCaT Bioinformatics, University Maastricht, Maastricht, The Netherlands

3 Department of Human Biology, Nutrition and Toxicology Research Institute Maastricht (NUTRIM), University Maastricht, Maastricht, The Netherlands

4 Laboratory of Experimental & Molecular Cardiology, The Interuniversity Cardiovascular institute of the Netherlands (ICIN), University Maastricht, Maastricht, The Netherlands

BMC Bioinformatics 2007, 8:360 doi:10.1186/1471-2105-8-360

Published: 26 September 2007

Abstract

Background

The analysis of microarray experiments requires accurate and up-to-date functional annotation of the microarray reporters to optimize the interpretation of the biological processes involved. Pathway visualization tools are used to connect gene expression data with existing biological pathways by using specific database identifiers that link reporters with elements in the pathways.

Results

This paper proposes a novel method that aims to improve microarray reporter annotation by BLASTing the original reporter sequences against a species-specific EMBL subset that was derived from, and crosslinked back to, the highly curated UniProt database. The resulting alignments were filtered using high-quality alignment criteria and then compared with the outcome of a more traditional approach, in which reporter sequences were BLASTed against EnsEMBL and the corresponding protein (UniProt) entry was located for each high-quality hit. Combining the results of both methods yielded successful annotation of > 58% of all reporter sequences with UniProt IDs on two commercial array platforms, increasing the proportion of Incyte reporters that could be coupled to Gene Ontology (GO) terms from 32.7% to 58.3% and to a local GenMAPP pathway from 9.6% to 16.7%. For Agilent, 35.3% of all reporters are now linked to GO nodes and 7.1% to local pathways.
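The filter-and-combine step described above can be illustrated with a minimal sketch. It assumes the BLAST hits are available in the standard 12-column tabular format and that a lookup table from EMBL or EnsEMBL subject IDs to UniProt IDs already exists. The thresholds, the function names and the rule that the EMBL-derived annotation takes precedence when both methods annotate the same reporter are all hypothetical; the abstract does not state the actual alignment criteria or how conflicts were resolved.

```python
from typing import Dict, Iterable

# Hypothetical quality thresholds; the abstract does not give the real criteria.
MIN_IDENTITY = 95.0  # minimum percent identity of the alignment
MIN_LENGTH = 100     # minimum alignment length in bases


def best_hits(rows: Iterable[str],
              min_identity: float = MIN_IDENTITY,
              min_length: int = MIN_LENGTH) -> Dict[str, str]:
    """Return {reporter_id: subject_id} for the best hit passing the filter.

    Each row is one line of 12-column BLAST tabular output:
    query, subject, %identity, length, mismatches, gap opens,
    qstart, qend, sstart, send, e-value, bit score.
    """
    best: Dict[str, tuple] = {}
    for line in rows:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        identity = float(fields[2])
        length = int(fields[3])
        bitscore = float(fields[11])
        if identity < min_identity or length < min_length:
            continue  # alignment does not meet the quality criteria
        # Keep only the highest-scoring acceptable hit per reporter.
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def to_uniprot(hits: Dict[str, str],
               crosslinks: Dict[str, str]) -> Dict[str, str]:
    """Translate subject IDs to UniProt IDs, dropping hits without a crosslink."""
    return {q: crosslinks[s] for q, s in hits.items() if s in crosslinks}


def combine(embl_based: Dict[str, str],
            ensembl_based: Dict[str, str]) -> Dict[str, str]:
    """Union of both annotation sets.

    Assumption: the EMBL-derived annotation wins when both methods
    annotate the same reporter.
    """
    combined = dict(ensembl_based)
    combined.update(embl_based)
    return combined
```

Taking the union is what lets the combined coverage exceed that of either method alone: a reporter annotated by only one of the two routes still receives a UniProt ID.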

Conclusion

Our methods increased the annotation quality of microarray reporter sequences and allowed us to visualize more reporters using pathway visualization tools. Even in cases where the original reporter annotation showed the correct description, the new identifiers often allowed improved pathway and Gene Ontology linking. These methods are freely available at http://www.bigcat.unimaas.nl/public/publications/Gaj_Annotation/.