DART: Denoising Algorithm based on Relevance network Topology improves molecular pathway activity inference
1 KCL-UCL Comprehensive Cancer Imaging Center, Guy's Campus, London SE1 1UL, UK
2 Statistical Genomics Group, Paul O'Gorman Building, UCL Cancer Institute, University College London, 72 Huntley Street, London WC1E 6BT, UK
3 Richard Dimbleby Department of Cancer Research, Randall Division & Division of Cancer Studies, King's College London, WC2R 2LS, UK
4 Research Oncology, 3rd Floor, Bermondsey Wing, Guy's Hospital, Great Maze Pond, London, SE1 9RT, UK
5 Breakthrough Breast Research Unit, Guy's Hospital, King's Health Partners Academic Health Sciences Centres, London, SE1 9RT, UK
6 Department of Radiology, St Thomas' Hospital, London SE1 7EH, UK
BMC Bioinformatics 2011, 12:403 doi:10.1186/1471-2105-12-403
Published: 19 October 2011
Inferring molecular pathway activity is an important step towards reducing the complexity of genomic data, understanding the heterogeneity in clinical outcome, and obtaining molecular correlates of cancer imaging traits. Increasingly, approaches to pathway activity inference combine molecular profiles (e.g. gene or protein expression) with independent and highly curated structural interaction data (e.g. protein interaction networks) or, more generally, with prior-knowledge pathway databases. However, it is unclear how best to use this prior pathway knowledge in the context of the molecular profiles of any given study.
We present an algorithm called DART (Denoising Algorithm based on Relevance network Topology), which filters out noise before estimating pathway activity. Using simulated and real multidimensional cancer genomic data, we compare DART to other algorithms that do not assess the relevance of the prior pathway information, and demonstrate that pathway activity predictions improve substantially when the prior pathway information is denoised first. We also show that genes encoding hubs in expression correlation networks are more reliable markers of pathway activity. Applying the Netpath resource of signalling pathways to breast cancer gene expression data, we further demonstrate that DART leads to more robust inferences about pathway activity correlations. Finally, we show that DART identifies a hypothesized association between oestrogen signalling and mammographic density in ER+ breast cancer.
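The denoising idea can be illustrated with a minimal Python sketch. It is not the published DART implementation: the function name `dart_sketch`, the fixed correlation threshold, and the degree-based weighting are all assumptions made for illustration, and the sketch skips any component-selection step a full implementation might apply. It builds a correlation (relevance) network over the genes of one pathway, keeps only edges whose correlation sign agrees with the prior up/down annotation, and then averages z-scored expression with weights proportional to each gene's degree in the pruned network, so that hubs contribute most.

```python
import numpy as np

def dart_sketch(expr, prior_sign, corr_threshold=0.5):
    """Illustrative denoising sketch (hypothetical helper, not the authors' code).

    expr           : genes x samples array for the genes of one pathway
                     (assumes no gene has zero variance)
    prior_sign     : +1 / -1 per gene, the prior direction on pathway activation
    corr_threshold : assumed cut-off on |Pearson correlation| for an edge
    """
    expr = np.asarray(expr, dtype=float)
    prior_sign = np.asarray(prior_sign, dtype=float)

    # z-score each gene across samples
    z = (expr - expr.mean(axis=1, keepdims=True)) / expr.std(axis=1, keepdims=True)

    # relevance network: gene pairs with a strong correlation
    corr = np.corrcoef(expr)
    strong = np.abs(corr) >= corr_threshold
    np.fill_diagonal(strong, False)

    # keep an edge only if the correlation sign matches the prior:
    # same-direction genes should correlate positively, opposite ones negatively
    consistent = np.sign(corr) == np.outer(prior_sign, prior_sign)
    adj = strong & consistent

    degree = adj.sum(axis=1)
    if degree.sum() == 0:
        # no gene is supported by the data; return a flat activity profile
        return np.zeros(expr.shape[1]), degree

    # degree-weighted, direction-aware average of the z-scores
    weights = degree / degree.sum()
    activity = (weights * prior_sign) @ z
    return activity, degree

# Toy data: genes 0-1 go up and genes 2-3 go down with a latent activity,
# gene 4 is pure noise, and gene 5 tracks the activity against its prior.
rng = np.random.default_rng(0)
latent = rng.normal(size=200)
expr = np.vstack([
    latent + 0.3 * rng.normal(size=200),
    latent + 0.3 * rng.normal(size=200),
    -latent + 0.3 * rng.normal(size=200),
    -latent + 0.3 * rng.normal(size=200),
    rng.normal(size=200),
    latent + 0.3 * rng.normal(size=200),
])
prior = np.array([1, 1, -1, -1, 1, -1])
activity, degree = dart_sketch(expr, prior)
```

In this toy setting the noise gene and the gene that contradicts its prior annotation end up with no edges and therefore no weight, which is the behaviour the denoising step is meant to produce.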
Evaluating whether the prior information in pathway databases is consistent with molecular tumour profiles may substantially improve the subsequent inference of pathway activity in clinical tumour specimens. This denoising strategy should be incorporated into approaches that attempt to infer pathway activity from prior pathway models.