BMC Bioinformatics Volume 5
Methodology article
A novel Mixture Model Method for identification of differentially expressed genes from DNA microarray data
Kayvan Najarian (1), Maryam Zaheri (1), Ali A Rad (2), Siamak Najarian (3) and Javad Dargahi (3)
(1) Computer Science Department, University of North Carolina Charlotte, University City Blvd, Charlotte, NC, USA
(2) Computer Engineering and IT Department, Amirkabir University of Technology, Tehran, Iran
(3) Mechanical and Industrial Engineering Department, Concordia University, CONCAVE Research Centre, CR-200, Concordia University, Quebec, Canada
BMC Bioinformatics 2004, 5:201 doi:10.1186/1471-2105-5-201
Published: 16 December 2004
Abstract
Background
The main goal in analyzing microarray data is to determine the genes that are differentially expressed across two types of tissue samples or samples obtained under two experimental conditions. The mixture model method (MMM hereafter) is a nonparametric statistical method often used for microarray processing applications, but it is known to over-fit the data when the number of replicates is small. In addition, its results may not be repeatable when the number of replicates is small. In this paper, we propose a new version of MMM that ensures the results are repeatable across different runs and reduces the sensitivity of the results to the parameters.
Results
The proposed technique is applied to two different data sets: a Leukaemia data set and a data set that examines the effects of a low phosphate diet on regular and Hyp mice. In each study, the proposed algorithm successfully selects genes that are closely related to the disease state, as verified by biological information.
Conclusion
The results indicate 100% repeatability in all runs and exhibit very little sensitivity to the choice of parameters. In addition, evaluation of the proposed method on the Leukaemia data set shows a 12% improvement over the MMM in detecting the 50 expressed genes biologically identified by Thomas et al. The results attest to the successful performance of the proposed algorithm in quantitative pathogenesis of diseases and comparative evaluation of treatment methods.