Open Access Methodology article

A novel approach to detect hot-spots in large-scale multivariate data

Jianhua Wu1, Keith M Kendrick2 and Jianfeng Feng1,3*

Author Affiliations

1 Department of Computer Science, Warwick University, Coventry CV4 7AL, UK

2 Cognitive and Behavioural Neuroscience, The Babraham Institute, Cambridge CB2 4AT, UK

3 Department of Mathematics, Hunan Normal University, 410081, PR China

BMC Bioinformatics 2007, 8:331  doi:10.1186/1471-2105-8-331

Published: 11 September 2007



Progressive advances in the measurement of complex multifactorial components of biological processes, spanning both spatial and temporal domains, have produced data sets so large that conventional statistical approaches struggle to identify the variables (genes, proteins, neurons etc.) whose activity changes significantly in response to a stimulus. The set of all changed variables is termed the hot-spots. The detection of such hot-spots is considered to be an NP-hard problem, but by first establishing its theoretical foundation we have been able to develop an algorithm that provides a solution.
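
To give a feel for why hot-spot detection is hard to attack by brute force, the sketch below counts the candidate subsets of variables. It is only an illustration, not the authors' algorithm. It assumes the problem is framed as a search over every non-empty subset of variables, and the function names and variable labels are hypothetical.

```python
from itertools import combinations


def count_candidate_subsets(n_variables: int) -> int:
    """Return the number of non-empty subsets of n variables, i.e. 2**n - 1."""
    return 2 ** n_variables - 1


def enumerate_candidate_subsets(variables):
    """Yield every non-empty subset of the given variables.

    Assumption for illustration: each subset is one candidate hot-spot set.
    This is only feasible for a handful of variables.
    """
    for size in range(1, len(variables) + 1):
        for subset in combinations(variables, size):
            yield subset


if __name__ == "__main__":
    # Hypothetical variable labels; real data sets hold thousands of variables.
    for subset in enumerate_candidate_subsets(["g1", "g2", "g3"]):
        print(subset)

    # The count doubles with every added variable.
    for n in (10, 20, 40):
        print(n, count_candidate_subsets(n))
```

With only 40 variables there are already more than 10^12 candidate subsets, so exhaustive enumeration is out of reach for data sets with thousands of variables.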


Our results show that a first-order phase transition is observable, whose critical point separates the hot-spot set from the remaining variables. The algorithm is also found to be more successful than existing approaches in identifying statistically significant hot-spots, both in simulated data sets and in real large-scale multivariate data sets from gene arrays, electrophysiological recordings and functional magnetic resonance imaging experiments.


In summary, this new statistical algorithm should provide a powerful new analytical tool to extract the maximum information from complex biological multivariate data.