More powerful significant testing for time course gene expression data using functional principal component analysis approaches
Department of Biostatistics and Computational Biology, University of Rochester, 601 Elmwood Avenue, Rochester, NY, 14642, USA
BMC Bioinformatics 2013, 14:6 doi:10.1186/1471-2105-14-6Published: 16 January 2013
One of the fundamental problems in time course gene expression data analysis is to identify genes associated with a biological process or a particular stimulus of interest, like a treatment or virus infection. Most of the existing methods for this problem are designed for data with longitudinal replicates. But in reality, many time course gene experiments have no replicates or only have a small number of independent replicates.
We focus on the case without replicates and propose a new method for identifying differentially expressed genes by incorporating the functional principal component analysis (FPCA) into a hypothesis testing framework. The data-driven eigenfunctions allow a flexible and parsimonious representation of time course gene expression trajectories, leaving more degrees of freedom for the inference compared to that using a prespecified basis. Moreover, the information of all genes is borrowed for individual gene inferences.
The proposed approach turns out to be more powerful in identifying time course differentially expressed genes compared to the existing methods. The improved performance is demonstrated through simulation studies and a real data application to the Saccharomyces cerevisiae cell cycle data.