Comparative study of discretization methods of microarray data for inferring transcriptional regulatory networks
1 Plant Bioengineering Laboratory, Northeast Agricultural University, Harbin, China
2 State Key Lab of Agrobiotechnology and Department of Biology, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong
BMC Bioinformatics 2010, 11:520 doi:10.1186/1471-2105-11-520
Published: 19 October 2010
Microarray data discretization is a basic preprocessing step for many gene regulatory network inference algorithms. Common discretization methods from informatics are typically used to discretize microarray data. However, the choice of discretization method is often arbitrary, and no systematic comparison of different discretization methods has been conducted in the context of gene regulatory network inference from time series gene expression data.
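To make the idea of discretization concrete, the following minimal sketch maps a continuous expression profile to a small number of interval indices. It uses equal-width and equal-frequency binning purely as illustrative assumptions of "common" methods; the text above does not name the methods it refers to, and the function names and sample values here are hypothetical.

```python
def equal_width_discretize(values, n_intervals):
    """Map each value to an interval index in 0..n_intervals-1,
    using intervals of equal width between the minimum and maximum."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant profile carries no information; put everything in level 0.
        return [0] * len(values)
    width = (hi - lo) / n_intervals
    # The maximum value would land on index n_intervals, so clamp it.
    return [min(int((v - lo) / width), n_intervals - 1) for v in values]


def equal_frequency_discretize(values, n_intervals):
    """Assign interval indices so that each interval holds roughly the
    same number of values. Ties are split by rank, which is a
    simplification made for this sketch."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    levels = [0] * len(values)
    for rank, i in enumerate(order):
        levels[i] = rank * n_intervals // len(values)
    return levels


# Hypothetical expression profile of one gene over six time points.
profile = [0.1, 0.2, 0.3, 0.4, 2.0, 9.0]
print(equal_width_discretize(profile, 3))      # [0, 0, 0, 0, 0, 2]
print(equal_frequency_discretize(profile, 3))  # [0, 0, 1, 1, 2, 2]
```

On this skewed profile the two schemes give very different discrete levels, which illustrates why the choice of method can matter for downstream network inference.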
In this study, we propose a new discretization method, "bikmeans", and compare its performance with four other widely used discretization methods across different datasets, modeling algorithms and numbers of intervals. Sensitivities, specificities and total accuracies were calculated, and statistical analysis was carried out. The bikmeans method always gave high total accuracies.
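The three evaluation measures named above can be sketched as follows, using their standard confusion-matrix definitions. This is an illustration under stated assumptions, not the authors' evaluation code: it assumes directed edges, no self-regulation, and that every ordered gene pair is a candidate edge. The function name `edge_metrics` and the toy network are hypothetical.

```python
def edge_metrics(true_edges, predicted_edges, genes):
    """Return (sensitivity, specificity, total accuracy) for a predicted
    network, treating every ordered pair of distinct genes as a candidate edge."""
    candidates = {(a, b) for a in genes for b in genes if a != b}
    true_edges = set(true_edges) & candidates
    predicted_edges = set(predicted_edges) & candidates

    tp = len(true_edges & predicted_edges)   # real edges that were recovered
    fp = len(predicted_edges - true_edges)   # predicted edges that are not real
    fn = len(true_edges - predicted_edges)   # real edges that were missed
    tn = len(candidates) - tp - fp - fn      # absent edges correctly left out

    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    accuracy = (tp + tn) / len(candidates) if candidates else float("nan")
    return sensitivity, specificity, accuracy


# Hypothetical three-gene network: 6 candidate directed edges.
genes = ["A", "B", "C"]
true_edges = [("A", "B"), ("B", "C")]
predicted_edges = [("A", "B"), ("C", "A")]

sens, spec, acc = edge_metrics(true_edges, predicted_edges, genes)
print(sens, spec, acc)  # 0.5 0.75 0.666...
```

Because real regulatory networks are sparse, true negatives usually dominate the candidate set, so total accuracy is best read alongside sensitivity and specificity rather than on its own.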
Our results indicate that proper discretization methods can consistently improve gene regulatory network inference, independent of network modeling algorithms and datasets. Our new method, bikmeans, resulted in significantly better total accuracies than the other methods.