Open Access Methodology article

Accurate and unambiguous tag-to-gene mapping in serial analysis of gene expression

Rodrigo Malig1, Cristian Varela2, Eduardo Agosin3 and Francisco Melo1

1Departamento de Genética Molecular y Microbiología, Facultad de Ciencias Biológicas, Pontificia Universidad Católica de Chile, Alameda 340, Santiago, Chile

2The Australian Wine Research Institute, PO Box 197, Glen Osmond, Adelaide, SA 5064, Australia

3Departamento de Ingeniería Química y Bioprocesos, Facultad de Ingeniería, Pontificia Universidad Católica de Chile, Vicuña Mackenna 4860, Santiago, Chile

BMC Bioinformatics 2006, 7:487. doi:10.1186/1471-2105-7-487

Published: 4 November 2006

Abstract

Background

In this study, we present a robust and reliable computational method for tag-to-gene assignment in serial analysis of gene expression (SAGE). The method relies on current genome information and annotation, and it incorporates several new features and key improvements over alternative methods, all of which are important for determining gene expression levels more accurately. The method provides a complete annotation of potential virtual SAGE tags within a genome, along with an estimate of the confidence that each tag will be observed experimentally, which is used to rank tags that have multiple matches in the genome.
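
To make the notion of a virtual SAGE tag concrete, the following minimal Python sketch enumerates candidate tags from transcript sequences and flags tags shared by more than one gene. It is an illustration only, not the authors' implementation. It assumes the conventional SAGE setup, which the abstract does not specify: NlaIII as the anchoring enzyme (site CATG) and a 10-bp tag taken immediately downstream of each site. The function names and the two toy sequences are hypothetical, and the confidence estimation described above is not modelled.

```python
from collections import defaultdict

ANCHOR = "CATG"  # assumption: NlaIII anchoring-enzyme recognition site
TAG_LEN = 10     # assumption: classic 10-bp SAGE tag length


def virtual_tags(seq):
    """Return (rank, tag) pairs for one transcript.

    Rank 1 is the anchoring site closest to the 3' end. Sites too close
    to the 3' end to yield a full-length tag are skipped, but they still
    consume a rank.
    """
    seq = seq.upper()
    sites = []
    pos = seq.find(ANCHOR)
    while pos != -1:
        sites.append(pos)
        pos = seq.find(ANCHOR, pos + 1)

    tags = []
    for rank, site in enumerate(reversed(sites), start=1):
        start = site + len(ANCHOR)
        tag = seq[start:start + TAG_LEN]
        if len(tag) == TAG_LEN:
            tags.append((rank, tag))
    return tags


def tag_index(transcripts):
    """Map each virtual tag to the list of (gene, rank) pairs producing it."""
    index = defaultdict(list)
    for gene, seq in transcripts.items():
        for rank, tag in virtual_tags(seq):
            index[tag].append((gene, rank))
    return index


def ambiguous_tags(index):
    """Keep only tags that map to more than one distinct gene."""
    return {
        tag: hits
        for tag, hits in index.items()
        if len({gene for gene, _ in hits}) > 1
    }


# Toy sequences, invented for illustration only.
transcripts = {
    "geneA": "AACATGAAACCCGGGTTTACATGTTTGGGCCCAAATT",
    "geneB": "GGCATGTTTGGGCCCAAACC",
}

index = tag_index(transcripts)
for tag, hits in ambiguous_tags(index).items():
    print(tag, hits)
```

In this toy example the 3'-most tag of geneA is identical to the only tag of geneB, so an observed count for that tag cannot be assigned to a single gene by sequence alone. This is the kind of ambiguous case that a ranking of multiply matching tags is meant to resolve.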

Results

We applied this method to the Saccharomyces cerevisiae genome, producing the most thorough and accurate annotation of potential virtual SAGE tags that is available today for this organism. The usefulness of this method is exemplified by the significant reduction of ambiguous cases in existing experimental SAGE data. In addition, we report new insights from the analysis of existing SAGE data. First, we found that experimental SAGE tags mapping onto introns, intron-exon boundaries, and non-coding RNA elements are observed in all available SAGE data. Second, a significant fraction of experimental SAGE tags was found to map onto genomic regions currently annotated as intergenic. Third, a significant number of existing experimental SAGE tags for yeast have been derived from truncated cDNAs, which are synthesized when oligo-d(T) primes at internal poly-(A) regions during reverse transcription.

Conclusion

We conclude that an accurate and unambiguous tag mapping process is essential to increase the quality and the amount of information that can be extracted from SAGE experiments. This is supported by the results obtained here and also by the large impact that the erroneous interpretation of these data could have on downstream applications.

