
Open Access Correspondence

Challenges of the information age: the impact of false discovery on pathway identification

Colin J Rog, Srinivasa C Chekuri and Mary E Edgerton*

Author Affiliations

M.D. Anderson Cancer Center, Department of Pathology, 1515 Holcombe Blvd, Houston, TX, 77030, USA


BMC Research Notes 2012, 5:647  doi:10.1186/1756-0500-5-647

Published: 21 November 2012



Pathways whose members have known relevance to a disease are often used to support hypotheses generated from gene expression and proteomic studies. Using cancer as an example, this article explores the pitfalls of searching pathways databases for support when the genes and proteins in question could be false discoveries.


The frequency with which networks could be generated was measured using 100 instances each of randomly selected five-gene and ten-gene sets as input to MetaCore, a commercial pathways database. A PubMed search enumerated the cancer-related literature published for any gene in the networks. With a maximum of three, two, and one intervening steps allowed between input genes to populate the network, networks were generated with frequencies of 97%, 77%, and 7% using ten-gene sets and 73%, 27%, and 1% using five-gene sets. PubMed reported an average of 4225 cancer-related articles per network gene.
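The sampling procedure described above can be illustrated with a minimal sketch. It does not use MetaCore, which is a commercial product; instead it assumes a hypothetical interaction graph held as a plain adjacency dictionary. It also assumes, purely for illustration, that a network counts as "generated" when at least two input genes are joined by a path with no more than the allowed number of intervening genes. The function names `linked_pair_exists` and `network_frequency`, and the toy graph, are not from the article.

```python
import random
from collections import deque


def linked_pair_exists(graph, genes, max_intervening):
    """Return True if any two input genes are joined by a path that passes
    through at most `max_intervening` other genes (assumed criterion)."""
    targets = set(genes)
    max_edges = max_intervening + 1  # k intervening genes means k + 1 edges
    for start in genes:
        seen = {start}
        queue = deque([(start, 0)])
        while queue:
            node, dist = queue.popleft()
            if dist == max_edges:
                continue  # do not expand beyond the allowed path length
            for nxt in graph.get(node, ()):
                if nxt in seen:
                    continue
                if nxt in targets:
                    return True
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return False


def network_frequency(graph, universe, set_size, max_intervening,
                      trials=100, seed=0):
    """Fraction of randomly drawn gene sets that yield a network."""
    rng = random.Random(seed)
    hits = sum(
        linked_pair_exists(graph, rng.sample(universe, set_size), max_intervening)
        for _ in range(trials)
    )
    return hits / trials


# Hypothetical toy graph: a simple undirected chain A-B-C-D-E.
toy_graph = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}

if __name__ == "__main__":
    for steps in (3, 2, 1):
        freq = network_frequency(toy_graph, list(toy_graph), 2, steps)
        print(f"max intervening steps = {steps}: frequency = {freq:.2f}")
```

On a densely connected graph, the frequency rises quickly as the number of permitted intervening steps increases, which is the pattern the reported percentages show for MetaCore.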


The ease with which networks and supporting literature were found for randomly selected genes can be attributed to richly populated pathways databases and the intense interest in the molecular basis of cancer. As information sources become enriched, they are more likely to generate plausible mechanisms for false discoveries.

Keywords: Databases; Pathways; Genes; Networks; Bioinformatics; Cancer pathways