Research article (Open Access)

Normalization of oligonucleotide arrays based on the least-variant set of genes

Stefano Calza1,2, Davide Valentini1 and Yudi Pawitan1

1 Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden

2 Department of Biomedical Sciences and Biotechnology, University of Brescia, Italy


BMC Bioinformatics 2008, 9:140 doi:10.1186/1471-2105-9-140

Published: 5 March 2008

Abstract

Background

It is well known that the normalization step in microarray data analysis affects the downstream results. All normalization methods rely on certain assumptions, so differences in results can be traced to different sensitivities to violations of those assumptions. Illustrating this lack of robustness, in a striking spike-in experiment all existing normalization methods fail because of an imbalance between up- and down-regulated genes. It is therefore still important to develop a normalization method that is robust against violation of the standard assumptions.

Results

We develop a new algorithm based on identification of the least-variant set (LVS) of genes across the arrays. The array-to-array variation is evaluated in the robust linear model fit of pre-normalized probe-level data. The LVS genes are then used as a reference set for a non-linear normalization. The method is applicable to any existing expression summaries, such as MAS5 or RMA.
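To make the idea concrete, the following is a minimal Python sketch of a least-variant-set style normalization. It is not the authors' implementation and does not reproduce the lvs package. Several simplifications are assumptions made only for illustration: the input is a genes-by-arrays matrix of log-scale expression summaries rather than probe-level data, the array-to-array variability of each gene is approximated by its median absolute deviation across arrays instead of a robust linear model fit, and the non-linear normalization is a low-degree polynomial fitted on the selected genes. The function name `lvs_normalize_sketch` and the parameters `proportion` and `degree` are hypothetical.

```python
import numpy as np


def lvs_normalize_sketch(expr, proportion=0.4, degree=2):
    """Illustrative least-variant-set style normalization (a sketch only).

    expr       : genes x arrays matrix of log-scale expression summaries.
    proportion : hypothetical parameter, fraction of genes kept as the
                 least-variant reference set.
    degree     : hypothetical parameter, degree of the polynomial used as
                 a stand-in for the non-linear normalization curve.
    Returns the normalized matrix and the indices of the reference genes.
    """
    expr = np.asarray(expr, dtype=float)
    n_genes, n_arrays = expr.shape

    # Assumption: the median absolute deviation across arrays approximates
    # each gene's array-to-array variability.
    center = np.median(expr, axis=1, keepdims=True)
    spread = np.median(np.abs(expr - center), axis=1)

    # Keep the genes with the smallest variability as the reference set.
    n_keep = max(degree + 1, int(round(proportion * n_genes)))
    lvs_idx = np.argsort(spread)[:n_keep]

    # Reference profile: per-gene median across arrays on the reference set.
    reference = center[lvs_idx, 0]

    normalized = np.empty_like(expr)
    for j in range(n_arrays):
        # Fit a curve mapping array j onto the reference using only the
        # reference genes, then apply that curve to every gene on array j.
        coeffs = np.polyfit(expr[lvs_idx, j], reference, degree)
        normalized[:, j] = np.polyval(coeffs, expr[:, j])
    return normalized, lvs_idx
```

Because the curve is fitted only on genes that vary little across arrays, an imbalance between up- and down-regulated genes among the remaining genes does not pull the fit, which is the property the abstract highlights.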

Conclusion

We show that LVS normalization outperforms other normalization methods when the standard assumptions are not satisfied. In the complex spike-in study, LVS performs similarly to the ideal (in practice unknown) housekeeping-gene normalization. An R package called lvs is available at http://www.meb.ki.se/~yudpaw.

