This article is part of the supplement: Proceedings of the Tenth Annual MCBIOS Conference
Towards human-computer synergetic analysis of large-scale biological data
1 Department of Computer Science, San Francisco State University, San Francisco, CA, USA
2 Center for Discovery and Innovation in Parasitic Diseases, University of California, San Francisco, CA, USA
3 Open University Program, San Francisco State University, CA, USA
4 Department of Obstetrics, Gynecology, and Reproductive Sciences, University of California, San Francisco, CA, USA
5 Romberg Tiburon Center and Department of Biology, San Francisco State University, San Francisco, CA, USA
6 Department of Integrative Biology, University of California Berkeley, CA, USA
7 Department of Anatomy and Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, CA, USA
BMC Bioinformatics 2013, 14(Suppl 14):S10 doi:10.1186/1471-2105-14-S14-S10Published: 9 October 2013
Advances in technology have led to the generation of massive amounts of complex and multifarious biological data in areas ranging from genomics to structural biology. The volume and complexity of such data leads to significant challenges in terms of its analysis, especially when one seeks to generate hypotheses or explore the underlying biological processes. At the state-of-the-art, the application of automated algorithms followed by perusal and analysis of the results by an expert continues to be the predominant paradigm for analyzing biological data. This paradigm works well in many problem domains. However, it also is limiting, since domain experts are forced to apply their instincts and expertise such as contextual reasoning, hypothesis formulation, and exploratory analysis after the algorithm has produced its results. In many areas where the organization and interaction of the biological processes is poorly understood and exploratory analysis is crucial, what is needed is to integrate domain expertise during the data analysis process and use it to drive the analysis itself.
In context of the aforementioned background, the results presented in this paper describe advancements along two methodological directions. First, given the context of biological data, we utilize and extend a design approach called experiential computing from multimedia information system design. This paradigm combines information visualization and human-computer interaction with algorithms for exploratory analysis of large-scale and complex data. In the proposed approach, emphasis is laid on: (1) allowing users to directly visualize, interact, experience, and explore the data through interoperable visualization-based and algorithmic components, (2) supporting unified query and presentation spaces to facilitate experimentation and exploration, (3) providing external contextual information by assimilating relevant supplementary data, and (4) encouraging user-directed information visualization, data exploration, and hypotheses formulation. Second, to illustrate the proposed design paradigm and measure its efficacy, we describe two prototype web applications. The first, called XMAS (E
The proposed design philosophy, combines visualization, algorithmic components and cognitive expertise into a seamless processing-analysis-exploration framework that facilitates sense-making, exploration, and discovery. Using XMAS, we present case studies that analyze transcriptional data from two highly complex domains: gene expression in the placenta during human pregnancy and reaction of marine organisms to heat stress. With PSPACE, we demonstrate how complex structure-function relationships can be explored. These results demonstrate the novelty, advantages, and distinctions of the proposed paradigm. Furthermore, the results also highlight how domain insights can be combined with algorithms to discover meaningful knowledge and formulate evidence-based hypotheses during the data analysis process. Finally, user studies against comparable systems indicate that both XMAS and PSPACE deliver results with better interpretability while placing lower cognitive loads on the users. XMAS is available at: http://tintin.sfsu.edu:8080/xmas webcite. PSPACE is available at: http://pspace.info/ webcite.