Open Access Research article

Finding function: evaluation methods for functional genomic data

Chad L Myers1,2, Daniel R Barrett1,2, Matthew A Hibbs1,2, Curtis Huttenhower1,2 and Olga G Troyanskaya1,2*

Author Affiliations

1 Department of Computer Science, Princeton University, Princeton, NJ 08544, USA

2 Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA

BMC Genomics 2006, 7:187 doi:10.1186/1471-2164-7-187

Published: 25 July 2006

Abstract

Background

Accurate evaluation of the quality of genomic or proteomic data and computational methods is vital to our ability to use them for formulating novel biological hypotheses and directing further experiments. There is currently no standard approach to evaluation in functional genomics. Our analysis of existing approaches shows that they are inconsistent and contain substantial functional biases that render the resulting evaluations misleading both quantitatively and qualitatively. These problems make it essentially impossible to compare computational methods or large-scale experimental datasets and also result in conclusions that generalize poorly in most biological applications.

Results

We reveal issues with current evaluation methods here and suggest new approaches to evaluation that facilitate accurate and representative characterization of genomic methods and data. Specifically, we describe a functional genomics gold standard based on curation by expert biologists and demonstrate its use as an effective means of evaluation of genomic approaches. Our evaluation framework and gold standard are freely available to the community through our website.

Conclusion

Proper methods for evaluating genomic data and computational approaches will determine how much we, as a community, are able to learn from the wealth of available data. We propose one possible solution to this problem here but emphasize that this topic warrants broader community discussion.