Mathematical design of prokaryotic clone-based microarrays
1 Wageningen Centre for Food Sciences. Diedenweg 20, 6700 AN Wageningen, The Netherlands
2 TNO Quality of Life. Utrechtseweg 48, 3700 AJ Zeist, The Netherlands
3 BioDetection Systems, Kruislaan 406, 1098 SM, Amsterdam, The Netherlands
4 Wageningen University and Research Centre, Systems and Control Group, Department of Agrotechnology and Food Sciences, Bornsesteeg 59, 6708 PD Wageningen, The Netherlands
5 HAS Den Bosch, Onderwijsboulevard 221, 5200 MA, Den Bosch, The Netherlands
BMC Bioinformatics 2005, 6:238 doi:10.1186/1471-2105-6-238
Published: 28 September 2005
Clone-based microarrays, on which each spot represents a random genomic fragment, are a good alternative to open reading frame-based microarrays, especially for microorganisms for which the complete genome sequence is not available. Because generating a genomic DNA library is a random process, it is not known beforehand which genes will be represented. Nevertheless, the genome coverage of such an array, which depends on variables such as the insert size and the number of clones in the library, can be predicted mathematically. The classical formulas, which give the probability that a certain sequence is represented in a DNA library at the nucleotide level, imply that very large numbers of clones would be needed to obtain proper coverage of the genome.
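To make the nucleotide-level argument concrete, the sketch below uses the widely cited Clarke and Carbon form, N = ln(1 - P) / ln(1 - f), where f is the insert size divided by the genome size. That this is the exact classical formula meant here is an assumption, and the function names and the example genome and insert sizes are hypothetical values chosen only for illustration.

```python
import math


def clones_needed(insert_size_bp, genome_size_bp, probability):
    """Estimate the number of clones N so that a given sequence is in the
    library with the requested probability (Clarke and Carbon form,
    assumed here): N = ln(1 - P) / ln(1 - f), with f = insert / genome."""
    f = insert_size_bp / genome_size_bp
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - f))


def representation_probability(insert_size_bp, genome_size_bp, n_clones):
    """Inverse relation: probability that a given sequence is represented
    in a library of n_clones random inserts."""
    f = insert_size_bp / genome_size_bp
    return 1.0 - (1.0 - f) ** n_clones


if __name__ == "__main__":
    # Hypothetical example: 1.5 kb inserts, 3 Mb genome.
    for p in (0.90, 0.99, 0.999):
        print(p, clones_needed(1500, 3_000_000, p))
```

With small inserts relative to the genome, the required number of clones grows quickly as the target probability approaches 1, which is the practical limitation the gene-level approach is meant to address.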
This paper describes the development of two complementary equations for determining genome coverage at the gene level. The first equation predicts the fraction of genes that are represented on the array in a detectable way and that cover at least a set part (the minimal insert coverage) of the genomic fragment by which they are represented. The higher this minimal insert coverage, the larger the chance that changes in expression of a specific gene can be detected and attributed to that gene. The second equation predicts the fraction of genes that are represented in spots containing only genes from a single transcription unit; the information from such spots can be interpreted quantitatively.
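The notion of minimal insert coverage can be illustrated for a single gene and a single insert. The following is a minimal sketch of the definition only, not of the equations developed in the paper; it assumes linear coordinates with half-open intervals (start, end), and the function names and coordinates are hypothetical.

```python
def insert_fraction_covered(gene_start, gene_end, insert_start, insert_end):
    """Fraction of the insert's length that is occupied by the gene.

    Coordinates are half-open intervals [start, end) on a linear genome
    (an assumption made for this illustration)."""
    overlap = min(gene_end, insert_end) - max(gene_start, insert_start)
    return max(0, overlap) / (insert_end - insert_start)


def meets_minimal_insert_coverage(gene, insert, minimal_insert_coverage):
    """True if the gene covers at least the set part of the insert."""
    return insert_fraction_covered(*gene, *insert) >= minimal_insert_coverage


if __name__ == "__main__":
    gene = (1000, 2000)    # hypothetical gene coordinates in bp
    insert = (1500, 3000)  # hypothetical genomic fragment on one spot
    print(insert_fraction_covered(*gene, *insert))
    print(meets_minimal_insert_coverage(gene, insert, 0.25))
```

In this example the gene overlaps 500 bp of a 1500 bp insert, so it would count as represented at a minimal insert coverage of 0.25 but not at 0.5.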
Validation of these equations shows that they are reliable tools for supporting the optimal design of prokaryotic clone-based microarrays.