Open Access Software

Novel definition files for human GeneChips based on GeneAnnot

Francesco Ferrari1, Stefania Bortoluzzi2, Alessandro Coppe2, Alexandra Sirota3, Marilyn Safran4, Michael Shmoish5, Sergio Ferrari1, Doron Lancet3, Gian Antonio Danieli2 and Silvio Bicciato6*

Author Affiliations

1 Department of Biomedical Sciences, University of Modena and Reggio Emilia, via G. Campi 287, 41100, Modena, Italy

2 Department of Biology, University of Padova, via G. Colombo 3, 35131, Padova, Italy

3 Department of Molecular Genetics, The Weizmann Institute of Science, Rehovot 76100, Israel

4 Department of Biological Services, The Weizmann Institute of Science, Rehovot 76100, Israel

5 Bioinformatics Knowledge Unit, The Lorry I. Lokey Interdisciplinary Center for Life Sciences and Engineering Technion, Israel Institute of Technology, Haifa, Israel

6 Department of Chemical Engineering Processes, University of Padova, via F. Marzolo 9, 35131, Padova, Italy


BMC Bioinformatics 2007, 8:446  doi:10.1186/1471-2105-8-446

Published: 15 November 2007

Abstract

Background

Improvements in genome sequence annotation have revealed discrepancies in the original probeset-to-gene assignments of Affymetrix microarrays, as well as differences between the annotations and the actual alignments of probes to transcription products. In the current generation of Affymetrix human GeneChips, most probesets include probes that match transcripts from more than one gene and probes that do not match any transcribed sequence.

Results

We developed a novel set of custom Chip Definition Files (CDFs) and the corresponding Bioconductor libraries for Affymetrix human GeneChips, based on the information contained in the GeneAnnot database. GeneAnnot-based CDFs are composed of unique custom probesets, each including only probes that match a single gene.
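
The probe selection rule described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `build_custom_probesets`, the input format (a mapping from probe ID to the set of genes the probe matches) and the probe and gene identifiers are all hypothetical, chosen only to show how keeping single-gene probes yields one custom probeset per gene.

```python
from collections import defaultdict


def build_custom_probesets(probe_hits):
    """Group probes into one custom probeset per gene.

    probe_hits: dict mapping a probe ID to the set of gene IDs whose
    transcripts the probe matches (hypothetical input format).
    Returns a dict mapping each gene ID to a sorted list of probe IDs.
    """
    probesets = defaultdict(list)
    for probe_id, genes in probe_hits.items():
        # Keep only probes matching exactly one gene; probes matching
        # several genes, or no transcribed sequence, are discarded.
        if len(genes) == 1:
            (gene,) = genes
            probesets[gene].append(probe_id)
    return {gene: sorted(probes) for gene, probes in probesets.items()}


# Illustrative data only: p2 matches two genes, p3 matches nothing.
example_hits = {
    "p1": {"GENE_A"},
    "p2": {"GENE_A", "GENE_B"},
    "p3": set(),
    "p4": {"GENE_B"},
    "p5": {"GENE_A"},
}
custom_probesets = build_custom_probesets(example_hits)
```

Because every retained probe belongs to exactly one gene, each gene ends up with a single probeset, which is the property that removes multiple, potentially discordant signals per gene.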

Conclusion

GeneAnnot-based custom CDFs address the problem of reliably reconstructing expression levels and eliminate multiple probesets per gene, which often yield discordant expression signals for the same transcript when differential gene expression is the focus of the analysis. GeneAnnot CDFs are freely distributed and fully compliant with Affymetrix standards and with all available software for gene expression analysis. The CDF libraries are available from http://www.xlab.unimo.it/GA_CDF, along with supplementary information (CDF libraries, installation guidelines and R code, CDF statistics, and analysis results).