Strong position-dependent effects of sequence mismatches on signal ratios measured using long oligonucleotide microarrays
1 Biosciences Building, School of Biological Sciences, University of Liverpool, Crown Street, Liverpool, L69 7ZB, UK
2 Faculty of Life Sciences, University of Manchester, Smith Building, Oxford Road, Manchester, M13 9PT, UK
3 School of Computer Science, Kilburn Building, University of Manchester, Oxford Road, Manchester, M13 9PL, UK
4 North West Institute of Bio-Health Informatics, School of Medicine, Stopford Building, Oxford Road, Manchester, M13 9PT, UK
BMC Genomics 2008, 9:317 doi:10.1186/1471-2164-9-317
Published: 3 July 2008
Microarrays are an important and widely used tool. Applications include capturing genomic DNA for high-throughput sequencing in addition to the traditional monitoring of gene expression and identifying DNA copy number variations. Sequence mismatches between probe and target strands are known to affect the stability of the probe-target duplex, and hence the strength of the observed signals from microarrays.
We describe a large-scale investigation of microarray hybridisations to murine probes with known sequence mismatches, demonstrating that the effect of mismatches is strongly position-dependent and, for small numbers of sequence mismatches, is correlated with the maximum length of perfectly matched probe-target duplex. Length of perfect match explained 43% of the variance in log2 signal ratios between probes with one and two mismatches. This correlation does not conform to expectations based on considering the effect of mismatches purely in terms of reducing the binding energy. However, it can be explained qualitatively by considering the entropic contribution to duplex stability from configurations of differing perfect match length.
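As an illustration of the quantity discussed above, the following minimal sketch computes the maximum length of perfect match for a probe-target pair and the fraction of variance explained by a single predictor. It is not the authors' analysis code. The function names `max_perfect_match_length` and `r_squared` are hypothetical, and the sketch assumes that the two sequences are of equal length, already aligned position by position, and written in the same sense (so that a match means identical characters).

```python
from itertools import groupby


def max_perfect_match_length(probe: str, target: str) -> int:
    """Longest run of consecutive matching positions (hypothetical helper).

    Assumes the sequences are equal-length, pre-aligned and in the same sense.
    """
    if len(probe) != len(target):
        raise ValueError("sequences must have equal length")
    # True where the two sequences agree, False at a mismatch.
    matches = (p == t for p, t in zip(probe.upper(), target.upper()))
    # Group consecutive identical flags and keep the longest run of matches.
    return max(
        (sum(1 for _ in run) for is_match, run in groupby(matches) if is_match),
        default=0,
    )


def r_squared(x, y):
    """Squared Pearson correlation: variance in y explained by a linear fit on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy ** 2 / (sxx * syy)


# One central mismatch splits a 10-mer into perfect-match runs of 4 and 5.
longest = max_perfect_match_length("ACGTACGTAC", "ACGTTCGTAC")
```

Given a list of perfect match lengths and the corresponding log2 signal ratios, `r_squared(lengths, log_ratios)` would return the kind of variance-explained figure quoted above, under the assumption of a simple linear relationship.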
The results of this study have implications for array design and analysis. They highlight the significant effect that short sequence mismatches can have upon microarray hybridisation intensities, even for long oligonucleotide probes.
All microarray data presented in this study are available from the GEO database, under accession number [GEO: GSE9669].