This article is part of the supplement: EADGENE and SABRE Post-analyses Workshop
sigReannot: an oligo-set re-annotation pipeline based on similarities with the Ensembl transcripts and Unigene clusters
1 Sigenae UR875 Biométrie et Intelligence Artificielle, Institut National de la Recherche Agronomique (INRA), BP 52627, 31326 Castanet-Tolosan Cedex, France
2 Sigenae INRA-UMR SENAH, 35590 St-Gilles, France
3 INRA, UMR598 génétique animale F-35000 Rennes, France
4 AGROCAMPUS OUEST, UMR598 génétique animale F-35000 Rennes, France
BMC Proceedings 2009, 3(Suppl 4):S3 doi:10.1186/1753-6561-3-S4-S3
Published: 16 July 2009
Microarrays are a powerful technology for monitoring tens of thousands of genes in a single experiment. Most microarrays now use oligo-sets. Designing the oligo-nucleotides is time consuming and error prone. Genome-wide microarray oligo-sets are designed from as large a set of transcripts as possible in order to monitor as many genes as possible. Depending on the state of genome sequencing and assembly, knowledge of the existing transcripts can vary widely, and this knowledge evolves with successive genome builds and gene builds. Once the design is done, however, the microarrays are often used for several years. The biologists working in EADGENE expressed the need for up-to-date annotation files for the oligo-sets they share, including information about the orthologous genes of model species, the Gene Ontology, the corresponding pathways and the chromosomal location.
The results of SigReannot on a chicken microarray used in the EADGENE project, compared with the initial annotations, show that 23% of the oligo-nucleotide gene annotations were not confirmed, 2% were modified and 1% were added. The value of this up-to-date annotation procedure is demonstrated through the analysis of previously published real data.
SigReannot uses the oligo-nucleotide design procedure criteria to validate the probe-gene link and the Ensembl transcripts as reference for annotation. It therefore produces a high quality annotation based on reference gene sets.
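The comparison between initial and updated annotations reported above can be illustrated with a minimal sketch. It is not the SigReannot implementation: the function name `compare_annotations`, the dictionary-based input and the exact definitions of the categories are assumptions made for illustration, and the probe and gene identifiers in the usage example are invented.

```python
from collections import Counter


def compare_annotations(initial, updated):
    """Tally how each probe's gene annotation changed between two versions.

    Both arguments are hypothetical inputs: dicts mapping a probe
    identifier to a gene identifier, with None (or a missing key)
    meaning that the probe has no gene annotation.
    """
    counts = Counter()
    for probe in initial.keys() | updated.keys():
        old = initial.get(probe)
        new = updated.get(probe)
        if old is None and new is None:
            status = "unannotated"      # no gene link in either version
        elif old is None:
            status = "added"            # gene link only in the new version
        elif new is None:
            status = "not_confirmed"    # old gene link no longer supported
        elif old == new:
            status = "confirmed"        # same gene in both versions
        else:
            status = "modified"         # gene link changed
        counts[status] += 1
    return counts


# Invented identifiers, for illustration only.
initial = {"probe1": "GENE_A", "probe2": "GENE_B", "probe3": None, "probe4": "GENE_D"}
updated = {"probe1": "GENE_A", "probe2": "GENE_X", "probe3": "GENE_C"}
summary = compare_annotations(initial, updated)
```

Dividing each count by the number of probes would give percentages of the kind quoted above, assuming the same denominator is used for every category.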