Open Access Software

Signature Evaluation Tool (SET): a Java-based tool to evaluate and visualize the sample discrimination abilities of gene expression signatures

Chih-Hung Jen (1), Tsun-Po Yang (2,5), Chien-Yi Tung (2), Shu-Han Su (2), Chi-Hung Lin (1,2,4), Ming-Ta Hsu (1,3) and Hsei-Wei Wang (1,2,4)*

Author Affiliations

1 Microarray & Gene Expression Analysis Core Facility, VGH National Yang-Ming University Genome Research Center, Taipei, Taiwan

2 Institute of Microbiology and Immunology, National Yang-Ming University, Taipei, Taiwan

3 Institute of Biochemistry and Molecular Biology, National Yang-Ming University, Taipei, Taiwan

4 Department of Teaching and Research, Taipei City Hospital, Taipei, Taiwan

5 Current address: EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK

BMC Bioinformatics 2008, 9:58  doi:10.1186/1471-2105-9-58

Published: 28 January 2008

Abstract

Background

Identifying specific gene expression signatures that distinguish sample groups is a dominant field in cancer research. Although a number of tools have been developed to identify optimal gene expression signatures, the number of signature genes obtained is often too large to be applied clinically. Furthermore, experimental verification is sometimes limited by the availability of wet-lab materials such as antibodies and reagents. A tool to evaluate the discrimination power of candidate genes is therefore in high demand among clinical researchers.

Results

Signature Evaluation Tool (SET) is a Java-based tool that adopts Golub's weighted voting algorithm and incorporates a visual presentation of the prediction strength for each array sample. SET provides a flexible and easy-to-follow platform for evaluating the discrimination power of a gene signature. Here, we demonstrate the application of SET for several purposes: (1) for signatures consisting of a large number of genes, SET offers the ability to rapidly narrow down the number of genes; (2) for a given signature (from third-party analyses or user-defined), SET can re-evaluate and re-adjust its discrimination power by repeatedly selecting and de-selecting genes; (3) for multiple microarray datasets, SET can evaluate the classification capability of a signature across datasets; and (4) by providing a module to visualize the prediction strength for each sample, SET allows users to re-evaluate the discrimination power on mis-grouped or less-certain samples. Information obtained from these applications could be useful in prognostic analyses or clinical management decisions.
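
For readers unfamiliar with weighted voting, the following is a minimal sketch of the scheme as it is commonly described for two-class problems. It is not SET's source code. Each signature gene casts a vote weighted by its signal-to-noise ratio between the two training classes, and the prediction strength is the margin of victory divided by the total votes. The class and method names (WeightedVotingSketch, predict, Prediction), the use of the sample standard deviation, and the choice to skip genes with zero variance are all assumptions made for illustration.

```java
/** Illustrative two-class weighted voting; names and details are hypothetical, not taken from SET. */
final class WeightedVotingSketch {

    /** Predicted class (1 or 2) and prediction strength in [0, 1] for one sample. */
    static final class Prediction {
        final int predictedClass;
        final double strength;

        Prediction(int predictedClass, double strength) {
            this.predictedClass = predictedClass;
            this.strength = strength;
        }
    }

    static double mean(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    /** Sample standard deviation (n - 1 denominator); an assumption of this sketch. */
    static double standardDeviation(double[] values, double mean) {
        if (values.length < 2) {
            return 0.0;
        }
        double sumSquares = 0.0;
        for (double v : values) {
            sumSquares += (v - mean) * (v - mean);
        }
        return Math.sqrt(sumSquares / (values.length - 1));
    }

    /**
     * Classifies one sample with the given signature genes.
     *
     * @param class1 training expression values, indexed [gene][sample], for class 1
     * @param class2 training expression values, indexed [gene][sample], for class 2
     * @param sample expression values of the sample to classify, indexed [gene]
     */
    static Prediction predict(double[][] class1, double[][] class2, double[] sample) {
        double votesForClass1 = 0.0;
        double votesForClass2 = 0.0;

        for (int g = 0; g < sample.length; g++) {
            double mean1 = mean(class1[g]);
            double mean2 = mean(class2[g]);
            double spread = standardDeviation(class1[g], mean1)
                    + standardDeviation(class2[g], mean2);
            if (spread == 0.0) {
                continue; // assumption: a gene with no variance casts no vote
            }
            double weight = (mean1 - mean2) / spread; // signal-to-noise ratio
            double boundary = (mean1 + mean2) / 2.0;  // midpoint between class means
            double vote = weight * (sample[g] - boundary);

            if (vote > 0) {
                votesForClass1 += vote;  // positive votes favour class 1
            } else {
                votesForClass2 += -vote; // negative votes favour class 2
            }
        }

        double total = votesForClass1 + votesForClass2;
        int winner = votesForClass1 >= votesForClass2 ? 1 : 2;
        double strength = total == 0.0
                ? 0.0
                : Math.abs(votesForClass1 - votesForClass2) / total;
        return new Prediction(winner, strength);
    }
}
```

A prediction strength near 1 means that almost all signature genes agree on the class, while a value near 0 marks a less-certain sample of the kind that purpose (4) above targets for re-evaluation.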

Conclusion

We present SET, a tool for evaluating and visualizing the sample-discrimination ability of a given gene expression signature. It provides a filtering function for signature identification and sits between class prediction (or feature selection) tools and clinical analyses. The simplicity, flexibility and brevity of SET could make it an invaluable tool for marker identification in clinical research.