This article is part of the supplement: Proceedings from the Great Lakes Bioinformatics Conference 2011

Open Access Proceedings

THINK Back: KNowledge-based Interpretation of High Throughput data

Fernando Farfán1*, Jun Ma2, Maureen A Sartor3, George Michailidis1,4 and Hosagrahar V Jagadish1,3

Author Affiliations

1 Computer Science and Engineering Department, University of Michigan, Ann Arbor, MI, USA

2 Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA

3 Center for Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA

4 Department of Statistics, University of Michigan, Ann Arbor, MI, USA

BMC Bioinformatics 2012, 13(Suppl 2):S4  doi:10.1186/1471-2105-13-S2-S4

Published: 13 March 2012

Abstract

Results of high throughput experiments can be challenging to interpret. Current approaches have relied on bulk processing the set of expression levels, in conjunction with easily obtained external evidence, such as co-occurrence. While such techniques can be used to reason probabilistically, they are not designed to shed light on what any individual gene, or a network of genes acting together, may be doing. Our belief is that today we have the information extraction ability and the computational power to perform more sophisticated analyses that consider the individual situation of each gene. The use of such techniques should lead to qualitatively superior results.

The specific aim of this project is to develop computational techniques to generate a small number of biologically meaningful hypotheses based on observed results from high throughput microarray experiments, gene sequences, and next-generation sequencing data. Through the use of relevant known biomedical knowledge, as represented in published literature and public databases, we can generate meaningful hypotheses that will aid biologists in interpreting their experimental data.

We are currently developing novel approaches that exploit the rich information encapsulated in biological pathway graphs. Our methods perform a thorough and rigorous analysis of biological pathways, using factors such as the topology of the pathway graph and the frequency with which genes appear in different pathways. Compared with existing methods that consider only part of the information captured by biological pathways, they provide more meaningful hypotheses to describe the biological phenomena captured by high throughput experiments.
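
To make the two factors named above concrete, the following is a minimal sketch and not the scoring scheme used in this work. It assumes a pathway is given as a list of directed gene-to-gene edges, and it uses a hypothetical weight, (1 + out-degree) divided by the number of pathways containing the gene, so that genes with more downstream targets count more and genes shared by many pathways count less. The function names, the weight formula, and the toy data are all illustrative assumptions.

```python
from collections import defaultdict


def pathway_frequency(pathways):
    """Count how many pathways each gene appears in."""
    counts = defaultdict(int)
    for edges in pathways.values():
        genes = {gene for edge in edges for gene in edge}
        for gene in genes:
            counts[gene] += 1
    return dict(counts)


def out_degree(edges):
    """Number of direct downstream targets per gene within one pathway."""
    degree = defaultdict(int)
    for source, target in edges:
        degree[source] += 1
        degree.setdefault(target, 0)  # make sure leaf genes are listed too
    return dict(degree)


def score_pathway(edges, expression, freq):
    """Weighted mean of absolute expression changes over a pathway.

    Hypothetical weight: (1 + out-degree) / pathway frequency.
    Genes without a measurement contribute an expression change of 0.
    """
    total = 0.0
    weight_sum = 0.0
    for gene, degree in out_degree(edges).items():
        weight = (1 + degree) / freq[gene]
        total += weight * abs(expression.get(gene, 0.0))
        weight_sum += weight
    return total / weight_sum if weight_sum else 0.0


# Toy data (invented for illustration only).
pathways = {
    "P1": [("A", "B"), ("B", "C")],
    "P2": [("A", "D")],
}
expression = {"A": 2.0, "B": 1.0, "C": 0.0, "D": -3.0}

freq = pathway_frequency(pathways)
scores = {name: score_pathway(edges, expression, freq)
          for name, edges in pathways.items()}
```

In this toy setting, gene A appears in both pathways, so its weight is halved in each, while gene B, which is specific to P1 and has a downstream target, carries the largest weight there. A real analysis would replace the weight with a statistically justified measure and would assess significance, which this sketch does not attempt.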