Linking microarray reporters with protein functions
1 Nutrigenomics Consortium, Top Institute Food and Nutrition, Wageningen, The Netherlands
2 BiGCaT Bioinformatics, University Maastricht, Maastricht, The Netherlands
3 Department of Human Biology, Nutrition and Toxicology Research Institute Maastricht (NUTRIM), University Maastricht, Maastricht, The Netherlands
4 Laboratory of Experimental & Molecular Cardiology, The Interuniversity Cardiovascular Institute of the Netherlands (ICIN), University Maastricht, Maastricht, The Netherlands
BMC Bioinformatics 2007, 8:360 doi:10.1186/1471-2105-8-360
Published: 26 September 2007
The analysis of microarray experiments requires accurate and up-to-date functional annotation of the microarray reporters to optimize the interpretation of the biological processes involved. Pathway visualization tools are used to connect gene expression data with existing biological pathways by using specific database identifiers that link reporters with elements in the pathways.
This paper proposes a novel method that aims to improve microarray reporter annotation by BLASTing the original reporter sequences against a species-specific EMBL subset that was derived from, and cross-linked back to, the highly curated UniProt database. The resulting alignments were filtered using high-quality alignment criteria and then compared with the outcome of a more traditional approach, in which reporter sequences were BLASTed against EnsEMBL and the corresponding protein (UniProt) entry was located for each high-quality hit. Combining the results of both methods led to successful annotation of > 58% of all reporter sequences with UniProt IDs on two commercial array platforms, increasing the proportion of Incyte reporters that could be coupled to Gene Ontology terms from 32.7% to 58.3% and to a local GenMAPP pathway from 9.6% to 16.7%. For Agilent, 35.3% of all reporters are now linked to GO nodes and 7.1% to local pathways.
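The filter-and-combine step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Hit` record, the threshold values (`MIN_IDENTITY`, `MIN_COVERAGE`), the function names, and the reporter and protein identifiers are all hypothetical, since the abstract does not state the actual alignment criteria. The sketch assumes only that each method yields per-reporter BLAST hits that already carry a UniProt ID, and that the combined annotation is the union of the high-quality hits from both methods.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One BLAST hit, already resolved to a UniProt ID (hypothetical record)."""
    reporter: str
    uniprot_id: str
    identity: float      # percent identity of the alignment
    aligned_len: int     # number of aligned reporter bases
    reporter_len: int    # full length of the reporter sequence


# Hypothetical quality thresholds; the abstract does not give the real criteria.
MIN_IDENTITY = 95.0
MIN_COVERAGE = 0.9


def is_high_quality(hit: Hit) -> bool:
    """Keep hits that are both highly identical and cover most of the reporter."""
    coverage = hit.aligned_len / hit.reporter_len
    return hit.identity >= MIN_IDENTITY and coverage >= MIN_COVERAGE


def annotate(hits: list[Hit]) -> dict[str, set[str]]:
    """Map each reporter to the UniProt IDs of its high-quality hits."""
    annotation: dict[str, set[str]] = {}
    for hit in hits:
        if is_high_quality(hit):
            annotation.setdefault(hit.reporter, set()).add(hit.uniprot_id)
    return annotation


def combine(*annotations: dict[str, set[str]]) -> dict[str, set[str]]:
    """Union the reporter-to-UniProt mappings produced by several methods."""
    merged: dict[str, set[str]] = {}
    for annotation in annotations:
        for reporter, ids in annotation.items():
            merged.setdefault(reporter, set()).update(ids)
    return merged


def annotated_fraction(annotation: dict[str, set[str]], reporters: list[str]) -> float:
    """Fraction of all reporters that received at least one UniProt ID."""
    return len(set(annotation) & set(reporters)) / len(reporters)


# Toy data with made-up identifiers, for illustration only.
reporters = ["rep1", "rep2", "rep3", "rep4"]
embl_hits = [
    Hit("rep1", "UP_A", 99.0, 60, 60),   # passes both thresholds
    Hit("rep2", "UP_B", 80.0, 60, 60),   # fails on identity
]
ensembl_hits = [
    Hit("rep2", "UP_B", 98.0, 58, 60),   # passes both thresholds
    Hit("rep3", "UP_C", 97.0, 30, 60),   # fails on coverage
]

combined = combine(annotate(embl_hits), annotate(ensembl_hits))
```

In this toy case each method alone annotates one of the four reporters, while the union annotates two, which mirrors in miniature why combining the two approaches raises the share of annotated reporters.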
Our methods increased the annotation quality of microarray reporter sequences and allowed us to visualize more reporters using pathway visualization tools. Even in cases where the original reporter annotation showed the correct description, the new identifiers often allowed improved pathway and Gene Ontology linking. These methods are freely available at http://www.bigcat.unimaas.nl/public/publications/Gaj_Annotation/.