Email updates

Keep up to date with the latest news and content from BMC Bioinformatics and BioMed Central.

Open Access Highly Accessed Research article

A method to identify differential expression profiles of time-course gene data with Fourier transformation

Jaehee Kim1*, Robert Todd Ogden2 and Haseong Kim3

Author Affiliations

1 Department of Statistics, Duksung Women’s University, Seoul, South Korea

2 Department of Biostatistics, Columbia University, New York, USA

3 Biochemicals and Synthetic Biology Research Center, Korea Research Institute of Bioscience & Biotechnology, Daejeon, South Korea

For all author emails, please log on.

BMC Bioinformatics 2013, 14:310  doi:10.1186/1471-2105-14-310

Published: 18 October 2013

Abstract

Background

Time course gene expression experiments are an increasingly popular method for exploring biological processes. Temporal gene expression profiles provide an important characterization of gene function, as biological systems are both developmental and dynamic. With such data it is possible to study gene expression changes over time and thereby to detect differential genes. Much of the early work on analyzing time series expression data relied on methods developed originally for static data and thus there is a need for improved methodology. Since time series expression is a temporal process, its unique features such as autocorrelation between successive points should be incorporated into the analysis.

Results

This work aims to identify genes that show different gene expression profiles across time. We propose a statistical procedure to discover gene groups with similar profiles using a nonparametric representation that accounts for the autocorrelation in the data. In particular, we first represent each profile in terms of a Fourier basis, and then we screen out genes that are not differentially expressed based on the Fourier coefficients. Finally, we cluster the remaining gene profiles using a model-based approach in the Fourier domain. We evaluate the screening results in terms of sensitivity, specificity, FDR and FNR, compare with the Gaussian process regression screening in a simulation study and illustrate the results by application to yeast cell-cycle microarray expression data with alpha-factor synchronization.

The key elements of the proposed methodology: (i) representation of gene profiles in the Fourier domain; (ii) automatic screening of genes based on the Fourier coefficients and taking into account autocorrelation in the data, while controlling the false discovery rate (FDR); (iii) model-based clustering of the remaining gene profiles.

Conclusions

Using this method, we identified a set of cell-cycle-regulated time-course yeast genes. The proposed method is general and can be potentially used to identify genes which have the same patterns or biological processes, and help facing the present and forthcoming challenges of data analysis in functional genomics.