Comparison of normalization methods for CodeLink Bioarray data
1 Dorothy P. and Richard P. Simmons Center for Interstitial Lung Disease, Division of Pulmonary, Allergy and Critical Care Medicine, University of Pittsburgh Medical Center, Pittsburgh, PA 15213, USA
2 Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA 15261, USA
3 Center for Automated Learning and Discovery and Language Technology Institute, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
Citation and License
BMC Bioinformatics 2005, 6:309 doi:10.1186/1471-2105-6-309Published: 28 December 2005
The quality of microarray data can seriously affect the accuracy of downstream analyses. In order to reduce variability and enhance signal reproducibility in these data, many normalization methods have been proposed and evaluated, most of which are for data obtained from cDNA microarrays and Affymetrix GeneChips. CodeLink Bioarrays are a newly emerged, single-color oligonucleotide microarray platform. To date, there are no reported studies that evaluate normalization methods for CodeLink Bioarrays.
We compared five existing normalization approaches, in terms of both noise reduction and signal retention: Median (suggested by the manufacturer), CyclicLoess, Quantile, Iset, and Qspline. These methods were applied to two real datasets (a time course dataset and a lung disease-related dataset) generated by CodeLink Bioarrays and were assessed using multiple statistical significance tests. Compared to Median, CyclicLoess and Qspline exhibit a significant and the most consistent improvement in reduction of variability and retention of signal. CyclicLoess appears to retain more signal than Qspline. Quantile reduces more variability than Median in both datasets, yet fails to consistently retain more signal in the time course dataset. Iset does not improve over Median in either noise reduction or signal enhancement in the time course dataset.
Median is insufficient either to reduce variability or to retain signal effectively for CodeLink Bioarray data. CyclicLoess is a more suitable approach for normalizing these data. CyclicLoess also seems to be the most effective method among the five different normalization strategies examined.