Establishing a major cause of discrepancy in the calibration of Affymetrix GeneChips
1 Department of Biological Sciences, University of Essex, Wivenhoe Park, Colchester, Essex, CO4 3SQ, UK
2 Department of Mathematical Sciences, University of Essex, Wivenhoe Park, Colchester, Essex, CO4 3SQ, UK
3 Department of Biochemistry, University College London, Gower Street, London, WC1E 6BT, UK
BMC Bioinformatics 2007, 8:195 doi:10.1186/1471-2105-8-195Published: 11 June 2007
Affymetrix GeneChips are a popular platform for performing whole-genome experiments on the transcriptome. There are a range of different calibration steps, and users are presented with choices of different background subtractions, normalisations and expression measures. We wished to establish which of the calibration steps resulted in the biggest uncertainty in the sets of genes reported to be differentially expressed.
Our results indicate that the sets of genes identified as being most significantly differentially expressed, as estimated by the z-score of fold change, is relatively insensitive to the choice of background subtraction and normalisation. However, the contents of the gene list are most sensitive to the choice of expression measure. This is irrespective of whether the experiment uses a rat, mouse or human chip and whether the chip definition is made using probe mappings from Unigene, RefSeq, Entrez Gene or the original Affymetrix definitions. It is also irrespective of whether both Present and Absent, or just Present, Calls from the MAS5 algorithm are used to filter genelists, and this conclusion holds for genes of differing intensities. We also reach the same conclusion after assigning genes to be differentially expressed using t-statistics, although this approach results in a large amount of false positives in the sets of genes identified due to the small numbers of replicates typically used in microarray experiments.
The major calibration uncertainty that biologists need to consider when analysing Affymetrix data is how their multiple probe values are condensed into one expression measure.