Novel definition files for human GeneChips based on GeneAnnot
1 Department of Biomedical Sciences, University of Modena and Reggio Emilia, via G. Campi 287, 41100, Modena, Italy
2 Department of Biology, University of Padova, via G. Colombo 3, 35131, Padova, Italy
3 Department of Molecular Genetics, The Weizmann Institute of Science, Rehovot 76100, Israel
4 Department of Biological Services, The Weizmann Institute of Science, Rehovot 76100, Israel
5 Bioinformatics Knowledge Unit, The Lorry I. Lokey Interdisciplinary Center for Life Sciences and Engineering Technion, Israel Institute of Technology, Haifa, Israel
6 Department of Chemical Engineering Processes, University of Padova, via F. Marzolo 9, 35131, Padova, Italy
BMC Bioinformatics 2007, 8:446 doi:10.1186/1471-2105-8-446
Published: 15 November 2007
Improvements in genome sequence annotation have revealed discrepancies in the original probeset/gene assignments of Affymetrix microarrays, as well as differences between the annotations and the actual alignments of probes to transcription products. In the current generation of Affymetrix human GeneChips, most probesets include probes matching transcripts from more than one gene and probes which do not match any transcribed sequence.
We developed a novel set of custom Chip Definition Files (CDF) and the corresponding Bioconductor libraries for Affymetrix human GeneChips, based on the information contained in the GeneAnnot database. GeneAnnot-based CDFs are composed of unique custom-probesets, including only probes matching a single gene.
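The selection rule described above (keep only probes that match a single gene, then collect them into one custom probeset per gene) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `build_gene_probesets`, the `probe_hits` mapping, and the probe and gene identifiers are all hypothetical, and the toy input merely stands in for the probe-to-gene alignments that a resource such as GeneAnnot would supply.

```python
from collections import defaultdict


def build_gene_probesets(probe_hits):
    """Group probes into one custom probeset per gene.

    probe_hits (hypothetical input): dict mapping a probe ID to the set of
    gene IDs whose transcripts the probe matches. Probes matching no gene,
    or more than one gene, are discarded.
    """
    probesets = defaultdict(list)
    for probe_id, genes in probe_hits.items():
        if len(genes) == 1:          # keep gene-specific probes only
            (gene,) = genes          # unpack the single gene ID
            probesets[gene].append(probe_id)
    # Sort probe IDs so the output is deterministic.
    return {gene: sorted(probes) for gene, probes in probesets.items()}


# Toy data (invented for illustration only).
probe_hits = {
    "p1": {"GENE_A"},
    "p2": {"GENE_A"},
    "p3": {"GENE_A", "GENE_B"},  # matches two genes: dropped
    "p4": set(),                 # matches no transcript: dropped
    "p5": {"GENE_B"},
}

custom = build_gene_probesets(probe_hits)
print(custom)
```

In this toy case each gene ends up with exactly one probeset, and the ambiguous probe `p3` and the unmatched probe `p4` contribute to neither.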
GeneAnnot-based custom CDFs allow a reliable reconstruction of expression levels and eliminate the presence of more than one probeset per gene, which often leads to discordant expression signals for the same transcript when differential gene expression is the focus of the analysis. GeneAnnot CDFs are freely distributed and fully compliant with Affymetrix standards and with all available software for gene expression analysis. The CDF libraries are available from http://www.xlab.unimo.it/GA_CDF, along with supplementary information (CDF libraries, installation guidelines and R code, CDF statistics, and analysis results).