Methodology article (Open Access)

Can Zipf's law be adapted to normalize microarrays?

Tim Lu(1), Christine M Costello(1,3), Peter JP Croucher(1), Robert Häsler(1), Günther Deuschl(2) and Stefan Schreiber(1)*

Author Affiliations

1 Department of Medicine, Christian-Albrechts-University, Kiel, Germany

2 Department of Neurology, University Hospital Schleswig Holstein, Kiel, Germany

3 The Conway Institute for Biomolecular and Biomedical Research, University College Dublin, Ireland

BMC Bioinformatics 2005, 6:37  doi:10.1186/1471-2105-6-37

Published: 23 February 2005

Abstract

Background

Normalization is the process of removing non-biological sources of variation between array experiments. Recent investigations of gene expression databases covering a range of organisms and tissues have shown that the expression levels of the majority of expressed genes follow a power-law distribution with an exponent close to -1 (i.e. they obey Zipf's law). Having observed that our own single-channel and two-channel microarray data sets also followed a power-law distribution, we developed a normalization method based on this law and examined how it compares with existing published techniques. The resulting technique is computationally simple and intuitively appealing.
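To make the power-law observation concrete, the following minimal sketch estimates the exponent by ordinary least squares on log10(intensity) against log10(rank). It illustrates only how a slope close to -1 can be checked. It is not the normalization procedure itself, which the abstract does not specify. The function name `fit_rank_intensity_slope` and the synthetic data are assumptions made for illustration.

```python
import math

def fit_rank_intensity_slope(intensities):
    """Least-squares slope and intercept of log10(intensity) vs. log10(rank).

    Hypothetical helper for illustration; not taken from the article.
    """
    # Keep only positive values, since the logarithm is undefined otherwise,
    # and rank them from the most to the least intense spot.
    values = sorted((v for v in intensities if v > 0), reverse=True)
    if len(values) < 2:
        raise ValueError("need at least two positive intensities")

    xs = [math.log10(rank) for rank in range(1, len(values) + 1)]
    ys = [math.log10(v) for v in values]

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Synthetic intensities that follow Zipf's law exactly: intensity = 50000 / rank.
synthetic = [50000.0 / rank for rank in range(1, 201)]
slope, intercept = fit_rank_intensity_slope(synthetic)
```

On data that obey Zipf's law the fitted slope should be close to -1. Real array data would typically follow the law only over part of the rank range, so the choice of which ranks to include in the fit is a further assumption left open here.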

Results

Using pairwise comparisons on MA plots (log ratio vs. log intensity), we compared this novel method with previously published normalization techniques, namely global normalization to the mean, the quantile method, and a variation on the loess normalization method designed specifically for boutique microarrays. For single-channel microarrays, the quantile method was superior at eliminating intensity-dependent effects (banana curves), although Zipf's law normalization does reduce this effect by rotating the data distribution so that the maximal number of data points lie on the zero of the log ratio axis. For two-channel boutique microarrays, the Zipf's law normalization performed as well as, or better than, existing techniques.
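For readers unfamiliar with MA plots, the sketch below computes the two coordinates for a pair of arrays (or the two channels of one array) using the conventional definitions M = log2(x) - log2(y) and A = (log2(x) + log2(y)) / 2. The function name `ma_values` and the example intensities are assumptions made for illustration, not values from the study.

```python
import math

def ma_values(x, y):
    """Return (M, A) lists for two paired intensity vectors.

    M is the log2 ratio and A is the mean log2 intensity of each pair.
    Hypothetical helper for illustration; not taken from the article.
    """
    if len(x) != len(y):
        raise ValueError("intensity vectors must have the same length")
    m, a = [], []
    for xi, yi in zip(x, y):
        # Skip pairs with non-positive intensities: log2 is undefined there.
        if xi <= 0 or yi <= 0:
            continue
        lx, ly = math.log2(xi), math.log2(yi)
        m.append(lx - ly)
        a.append((lx + ly) / 2.0)
    return m, a

# Made-up example: the first array is uniformly twice as bright as the second.
array_1 = [200.0, 1600.0, 12800.0]
array_2 = [100.0, 800.0, 6400.0]
m, a = ma_values(array_1, array_2)
```

In this example every M value is about 1 regardless of A, which is a constant offset with no intensity dependence. A banana curve corresponds to M drifting systematically as A changes, which is the effect the compared methods aim to remove.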

Conclusion

Zipf's law normalization is a useful tool where the quantile method cannot be applied, as is the case with microarrays containing functionally specific gene sets (boutique arrays).