AUREA: an open-source software system for accurate and user-friendly identification of relative expression molecular signatures
1 Institute for Systems Biology, Seattle, WA, USA
2 Department of Computer Science and Engineering, University of Washington, Seattle, WA, USA
3 Department of Bioengineering, University of Illinois, Urbana, IL, USA
4 Center for Biophysics and Computational Biology, University of Illinois, Urbana, IL, USA
5 Department of Computer Science, University of Illinois, Urbana, IL
BMC Bioinformatics 2013, 14:78 doi:10.1186/1471-2105-14-78Published: 5 March 2013
Public databases such as the NCBI Gene Expression Omnibus contain extensive and exponentially increasing amounts of high-throughput data that can be applied to molecular phenotype characterization. Collectively, these data can be analyzed for such purposes as disease diagnosis or phenotype classification. One family of algorithms that has proven useful for disease classification is based on relative expression analysis and includes the Top-Scoring Pair (TSP), k-Top-Scoring Pairs (k-TSP), Top-Scoring Triplet (TST) and Differential Rank Conservation (DIRAC) algorithms. These relative expression analysis algorithms hold significant advantages for identifying interpretable molecular signatures for disease classification, and have been implemented previously on a variety of computational platforms with varying degrees of usability. To increase the user-base and maximize the utility of these methods, we developed the program AUREA (Adaptive Unified Relative Expression Analyzer)—a cross-platform tool that has a consistent application programming interface (API), an easy-to-use graphical user interface (GUI), fast running times and automated parameter discovery.
Herein, we describe AUREA, an efficient, cohesive, and user-friendly open-source software system that comprises a suite of methods for relative expression analysis. AUREA incorporates existing methods, while extending their capabilities and bringing uniformity to their interfaces. We demonstrate that combining these algorithms and adaptively tuning parameters on the training sets makes these algorithms more consistent in their performance and demonstrate the effectiveness of our adaptive parameter tuner by comparing accuracy across diverse datasets.
We have integrated several relative expression analysis algorithms and provided a unified interface for their implementation while making data acquisition, parameter fixing, data merging, and results analysis ‘point-and-click’ simple. The unified interface and the adaptive parameter tuning of AUREA provide an effective framework in which to investigate the massive amounts of publically available data by both ‘in silico’ and ‘bench’ scientists. AUREA can be found at http://price.systemsbiology.net/AUREA/ webcite.