Open Access Research article

Identifying common prognostic factors in genomic cancer studies: A novel index for censored outcomes

Sigrid Rouam1,2*, Thierry Moreau3 and Philippe Broët1,2

Author Affiliations

1 Computational and Mathematical Biology, Genome Institute of Singapore, Singapore 138672, Singapore

2 Univ Paris-Sud, JE2492, Villejuif, F-94807 France

3 Inserm, U780, Villejuif, F-94807 France; Univ Paris-Sud, Villejuif, F-94807 France

BMC Bioinformatics 2010, 11:150  doi:10.1186/1471-2105-11-150

Published: 24 March 2010

Abstract

Background

With the growing number of public repositories for high-throughput genomic data, it is of great interest to combine the results produced by independent research groups. Such a combination allows the identification of common genomic factors across multiple cancer types and provides new insights into the disease process. In the framework of the proportional hazards model, classical procedures rank genes according to either the estimated hazard ratio or the p-value obtained from a test of no association between survival and gene expression level. These procedures are not suitable for gene selection across multiple genomic datasets with different sample sizes. We propose a novel index for identifying genes with a common effect across heterogeneous genomic studies. The index is designed to remain stable whatever the sample size, and it has a straightforward interpretation as the percentage of separability between patients according to their survival times and gene expression measurements.
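The sample-size dependence of p-value ranking can be illustrated with a minimal sketch. It is not the index proposed in the article. It assumes the usual large-sample approximation for a proportional hazards model, Var(beta_hat) ≈ 1 / (d · sd²), where d is the number of observed events and sd is the standard deviation of the expression covariate. The function name `wald_p_value` and the chosen hazard ratio of 1.5 are hypothetical and serve only as an illustration.

```python
import math


def wald_p_value(log_hazard_ratio, n_events, covariate_sd=1.0):
    """Two-sided Wald p-value for a log hazard ratio.

    Assumption (not from the article): Var(beta_hat) ~ 1 / (d * sd^2),
    with d the number of observed events.
    """
    standard_error = 1.0 / (math.sqrt(n_events) * covariate_sd)
    z = log_hazard_ratio / standard_error
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


# The same true effect (hazard ratio 1.5) observed in studies of different sizes.
same_effect = math.log(1.5)
for n_events in (25, 100, 400):
    print(n_events, wald_p_value(same_effect, n_events))
```

Under this approximation the hazard ratio is identical in the three hypothetical studies, yet the p-value decreases as the number of events grows, so a gene can rank very differently from one dataset to another for reasons unrelated to the strength of its effect.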

Results

The simulation results show that the proposed index is not substantially affected by the sample size of the study or by censoring. They also show that its separability performance is higher than that of indices of predictive accuracy relying on the likelihood function. A simulated example illustrates the good operating characteristics of our index. In addition, we demonstrate that it is linked to the score statistic and possesses a biologically relevant interpretation.

The practical use of the index is illustrated by identifying genes with common effects across eight independent genomic cancer studies of different sample sizes. The meta-selection identifies four genes (ESPL1, KIF4A, HJURP, LRIG1) that are biologically relevant to the carcinogenesis process and have a prognostic impact on survival outcome across various solid tumors.

Conclusion

The proposed index is a promising tool for identifying factors having a prognostic impact across a collection of heterogeneous genomic datasets of various sizes.