Two-transcript gene expression classifiers in the diagnosis and prognosis of human diseases
1 Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
2 Department of Bioengineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
3 Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
4 Department of Applied Mathematics and Statistics & Institute for Computational Medicine, Johns Hopkins University, Baltimore, Maryland 21218, USA
5 Department of Pathology, University of Texas MD Anderson Cancer Center, Houston, Texas 77030, USA
6 Babraham Institute, Cambridge, CB22 3AT, UK
BMC Genomics 2009, 10:583 doi:10.1186/1471-2164-10-583
Published: 5 December 2009
Identification of molecular classifiers from genome-wide gene expression analysis is an important practice for the investigation of biological systems in the post-genomic era, and one with great potential for near-term clinical impact. The 'Top-Scoring Pair' (TSP) classification method identifies pairs of genes whose relative expression correlates strongly with phenotype. In this study, we sought to assess the effectiveness of the TSP approach in the identification of diagnostic classifiers for a number of human diseases including bacterial and viral infection, cardiomyopathy, diabetes, Crohn's disease, and transformed ulcerative colitis. We examined transcriptional profiles from both solid tissues and blood-borne leukocytes.
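The core idea of a top-scoring-pair classifier can be illustrated with a minimal sketch. It assumes two classes labelled 0 and 1 and scores each gene pair (i, j) by how differently the event "gene i is expressed below gene j" occurs in the two classes. The function names and the toy expression values are hypothetical illustrations, not data or code from this study, and tie-breaking refinements are omitted.

```python
from itertools import combinations

def pair_frequencies(expr, labels, i, j):
    """Fraction of samples in class 0 and in class 1 where gene i < gene j."""
    freqs = []
    for cls in (0, 1):
        rows = [row for row, y in zip(expr, labels) if y == cls]
        freqs.append(sum(row[i] < row[j] for row in rows) / len(rows))
    return freqs

def top_scoring_pair(expr, labels):
    """Return (score, i, j, class_if_less) for the best-separating gene pair."""
    best = None
    for i, j in combinations(range(len(expr[0])), 2):
        p0, p1 = pair_frequencies(expr, labels, i, j)
        score = abs(p0 - p1)
        if best is None or score > best[0]:
            # class_if_less: the class in which "gene i < gene j" is more common
            best = (score, i, j, 0 if p0 > p1 else 1)
    return best

def predict(sample, pair):
    """Classify one sample from the relative order of the two chosen genes."""
    _, i, j, class_if_less = pair
    return class_if_less if sample[i] < sample[j] else 1 - class_if_less

# Toy data (made up): rows are samples, columns are genes.
expr = [
    [2.0, 5.0, 3.0], [1.0, 4.0, 0.5], [3.0, 6.0, 7.0],  # class 0
    [5.0, 2.0, 3.0], [4.0, 1.0, 0.5], [6.0, 3.0, 7.0],  # class 1
]
labels = [0, 0, 0, 1, 1, 1]

pair = top_scoring_pair(expr, labels)
print(pair)                             # (1.0, 0, 1, 0)
print(predict([1.2, 3.3, 9.0], pair))   # 0
```

Because the decision depends only on which of two transcripts is more abundant within a sample, the rule is unaffected by any monotonic rescaling applied to that sample's measurements.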
The algorithm identified multiple predictive gene pairs for each phenotype, with cross-validation accuracy ranging from 70 to nearly 100 percent, and high sensitivity and specificity observed in most classification tasks. Performance compared favourably with that of pre-existing transcription-based classifiers, and in some cases was comparable to the accuracy of current clinical diagnostic procedures. Several diseases of solid tissues could be reliably diagnosed through classifiers based on the blood-borne leukocyte transcriptome. The TSP classifier thus represents a simple yet robust method to differentiate between diverse phenotypic states based on gene expression profiles.
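How cross-validation accuracy, sensitivity and specificity might be estimated for such a classifier is sketched below. The sketch assumes leave-one-out cross-validation, which is one common scheme and not necessarily the one used in this study; class 1 is treated as "disease" and class 0 as "control". A compact pair-selection routine is restated so the example runs on its own. All names and expression values are hypothetical.

```python
from itertools import combinations

def fit_tsp(expr, labels):
    """Choose the pair (i, j) maximising |P(x_i < x_j | 0) - P(x_i < x_j | 1)|."""
    best = None
    for i, j in combinations(range(len(expr[0])), 2):
        p = []
        for cls in (0, 1):
            rows = [r for r, y in zip(expr, labels) if y == cls]
            p.append(sum(r[i] < r[j] for r in rows) / len(rows))
        score = abs(p[0] - p[1])
        if best is None or score > best[0]:
            best = (score, i, j, 0 if p[0] > p[1] else 1)
    return best

def predict_tsp(model, sample):
    _, i, j, class_if_less = model
    return class_if_less if sample[i] < sample[j] else 1 - class_if_less

def loocv_metrics(expr, labels):
    """Leave-one-out CV; each class needs at least two samples."""
    tp = tn = fp = fn = 0
    for k in range(len(expr)):
        # Refit on all samples except k, so the pair is never chosen using sample k.
        model = fit_tsp(expr[:k] + expr[k + 1:], labels[:k] + labels[k + 1:])
        pred = predict_tsp(model, expr[k])
        if labels[k] == 1:
            if pred == 1:
                tp += 1
            else:
                fn += 1
        else:
            if pred == 0:
                tn += 1
            else:
                fp += 1
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(expr),
    }

# Toy data (made up): rows are samples, columns are genes.
expr = [
    [2.0, 5.0, 3.0], [1.0, 4.0, 0.5], [3.0, 6.0, 7.0], [5.5, 5.0, 5.2],  # controls
    [5.0, 2.0, 3.0], [4.0, 1.0, 6.0], [6.0, 3.0, 2.5], [7.0, 4.0, 8.0],  # disease
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

metrics = loocv_metrics(expr, labels)
print(metrics)  # {'sensitivity': 1.0, 'specificity': 0.75, 'accuracy': 0.875}
```

Selecting the gene pair inside each fold, rather than once on the full data set, keeps the held-out sample from influencing the classifier that is tested on it.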
Two-transcript classifiers have the potential to reliably classify diverse human diseases, through analysis of both local diseased tissue and the immunological response assayed through blood-borne leukocytes. The experimental simplicity of this method results in measurements that can be easily translated to clinical practice.