This article is part of the supplement: Eighth International Conference on Bioinformatics (InCoB2009): Computational Biology
A model selection approach to discover age-dependent gene expression patterns using quantile regression models
1 School of Information Technologies, The University of Sydney, NSW 2006, Australia
2 Muscle Research Unit, Bosch Institute, Discipline of Anatomy and Histology, The University of Sydney, NSW 2006, Australia
3 Sydney Bioinformatics and Centre for Mathematical Biology, The University of Sydney, NSW 2006, Australia
4 NICTA, Australian Technology Park, Eveleigh, NSW 2015, Australia
BMC Genomics 2009, 10(Suppl 3):S16 doi:10.1186/1471-2164-10-S3-S16
Published: 3 December 2009
Understanding the molecular regulatory mechanisms behind mammalian ageing is a long-standing biological challenge. Drawing on the many ageing microarray datasets now available, a number of studies have shown that it is possible to identify genes with age-dependent differential expression (DE) or differential variability (DV) patterns. Most of these studies identify "interesting" genes using a linear regression approach, which is known to perform poorly in the presence of outliers or when the underlying age-dependent pattern is non-linear. A more robust and flexible approach is therefore needed to identify genes with various age-dependent expression patterns.
Here we present a novel model selection approach to discover genes with linear or non-linear age-dependent gene expression patterns from microarray data. To identify DE genes, our method fits three quantile regression models (constant, linear and piecewise linear models) to the expression profile of each gene, and selects the least complex model that best fits the available data. Similarly, DV genes are identified by fitting and comparing two quantile regression models (non-DV and the DV models) to the expression profile of each gene. We show that our approach is much more robust than the standard linear regression approach in discovering age-dependent patterns. We also applied our approach to analyze two human brain ageing datasets and found many biologically interesting gene expression patterns, including some very interesting DV patterns, that have been overlooked in the original studies. Furthermore, we propose that our model selection approach can be extended to discover DE and DV genes from microarray datasets with discrete class labels, by considering different quantile regression models.
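The DE part of this idea can be illustrated with a minimal sketch, which is not the authors' implementation. It is deliberately simplified: it compares only the constant and linear median-regression models (the piecewise linear model is omitted), and it scores them with a Schwarz-type criterion, log(mean check loss) + p·log(n)/(2n), which is an assumption here because the abstract does not state the selection criterion used. All function names (`check_loss`, `fit_constant`, `fit_linear`, `sic`, `select_model`) and the toy data are hypothetical. The fits use brute-force search, relying on the fact that an optimal quantile regression line passes through at least two data points, so the sketch is only practical for small sample sizes.

```python
import math
from itertools import combinations


def check_loss(residuals, tau=0.5):
    """Mean quantile ("pinball") loss of the residuals at quantile level tau."""
    return sum(r * (tau - (r < 0)) for r in residuals) / len(residuals)


def fit_constant(x, y, tau=0.5):
    """Constant model y = a. An optimal a is always one of the observed values."""
    a = min(y, key=lambda c: check_loss([yi - c for yi in y], tau))
    return a, 0.0


def fit_linear(x, y, tau=0.5):
    """Linear model y = a + b*x, found by searching all lines through two points."""
    best, best_loss = None, math.inf
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # two points with the same age do not define a line
        b = (y[j] - y[i]) / (x[j] - x[i])
        a = y[i] - b * x[i]
        loss = check_loss([yk - (a + b * xk) for xk, yk in zip(x, y)], tau)
        if loss < best_loss:
            best, best_loss = (a, b), loss
    if best is None:
        raise ValueError("need at least two distinct x values")
    return best


def sic(x, y, params, n_params, tau=0.5):
    """Schwarz-type criterion (assumed here): lower is better."""
    a, b = params
    n = len(y)
    loss = check_loss([yi - (a + b * xi) for xi, yi in zip(x, y)], tau)
    # Guard against log(0) when a model fits the data exactly.
    return math.log(max(loss, 1e-12)) + n_params * math.log(n) / (2 * n)


def select_model(x, y, tau=0.5):
    """Return the name of the lowest-scoring model and all scores."""
    scores = {
        "constant": sic(x, y, fit_constant(x, y, tau), 1, tau),
        "linear": sic(x, y, fit_linear(x, y, tau), 2, tau),
    }
    # min() keeps the first minimum, so an exact tie favours the simpler model.
    return min(scores, key=scores.get), scores


# Toy expression profile: a linear age trend, small wiggle, one gross outlier.
ages = list(range(20, 80, 3))
expr = [2 + 0.1 * a + 0.05 * ((7 * i) % 5 - 2) for i, a in enumerate(ages)]
expr[3] += 10.0  # outlier
print(select_model(ages, expr)[0])
```

Because the fits minimise absolute rather than squared deviations, the single outlier in the toy profile has limited influence on the fitted line. A DV analysis in the same spirit would fit upper and lower quantiles (different `tau` values) rather than the median.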
In this paper, we present a novel application of quantile regression models to identify genes that have interesting linear or non-linear age-dependent expression patterns. One important contribution of this paper is to introduce a model selection approach to DE and DV gene identification, which is most commonly tackled by null hypothesis testing approaches. We show that our approach is robust in analyzing real and simulated datasets. We believe that our approach is applicable in many ageing or time-series data analysis tasks.