Bayesian probit regression model for the diagnosis of pulmonary fibrosis: proof-of-principle
1 Department of Medicine, Division of Pulmonary, Allergy and Critical Care Medicine, Duke University Medical Center, Durham, North Carolina, USA
2 Department of Biostatistics and Bioinformatics, Duke University Medical Center, Durham, North Carolina, USA
3 Institute for Genome Science and Policy, Duke University Medical Center, Durham, North Carolina, USA
4 Department of Surgery, Division of Cardiovascular and Thoracic Surgery, Duke University Medical Center, Durham, North Carolina, USA
5 Department of Immunology, Duke University Medical Center, Durham, North Carolina, USA
6 Department of Pathology, Duke University Medical Center, Durham, North Carolina, USA
BMC Medical Genomics 2011, 4:70 doi:10.1186/1755-8794-4-70
Published: 5 October 2011
The accurate diagnosis of idiopathic pulmonary fibrosis (IPF) is a major clinical challenge. We developed a model to diagnose IPF by applying Bayesian probit regression (BPR) modelling to gene expression profiles of whole lung tissue.
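For orientation, the core of a probit regression model can be written in a generic form. The abstract does not specify the model's details, so this is only a sketch under assumptions: here y_i is the phenotype of sample i (1 for IPF, 0 for normal), x_i is a vector of expression-derived predictors, β is the coefficient vector, and Φ is the standard normal cumulative distribution function. The Gaussian prior with scale τ is a common illustrative choice, not necessarily the prior used in this study.

```latex
\[
\Pr(y_i = 1 \mid \mathbf{x}_i, \boldsymbol{\beta}) = \Phi\!\left(\mathbf{x}_i^{\top}\boldsymbol{\beta}\right),
\qquad
\boldsymbol{\beta} \sim \mathcal{N}\!\left(\mathbf{0}, \tau^{2}\mathbf{I}\right)
\quad \text{(assumed prior)}
\]
```

In a Bayesian treatment, the posterior distribution of β given the training samples yields a predicted probability of IPF for each new sample, which can then be thresholded to assign a phenotype.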
Whole lung tissue was obtained from patients with IPF undergoing surgical lung biopsy or lung transplantation. Controls were obtained from normal organ donors. We performed cluster analyses to explore differences within our dataset. No significant difference was found between samples obtained from different lobes of the same patient, whereas a significant difference was found between samples obtained at biopsy and those obtained at explant. Following preliminary analysis of the complete dataset, we selected three subsets for the development of diagnostic gene signatures: the first signature was developed from all IPF samples (compared with controls); the second from the subset of IPF samples obtained at biopsy; and the third from IPF explants. To assess the validity of each signature, we used an independent cohort of IPF and normal samples. Each signature was used to predict phenotype (IPF versus normal) in samples from the validation cohort. We compared the models' predictions with the true phenotype of each validation sample, and then calculated sensitivity, specificity and accuracy.
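The validation step described above (comparing predicted with true phenotype, then computing sensitivity, specificity and accuracy) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `diagnostic_metrics`, the 1 = IPF / 0 = normal encoding, and the example labels are all hypothetical and do not reproduce any data from the study.

```python
from typing import Dict, Sequence


def diagnostic_metrics(true_labels: Sequence[int],
                       predicted_labels: Sequence[int]) -> Dict[str, float]:
    """Compute sensitivity, specificity and accuracy for binary labels.

    Assumed encoding (hypothetical): 1 = IPF, 0 = normal.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("true_labels and predicted_labels must have equal length")
    if len(true_labels) == 0:
        raise ValueError("at least one sample is required")

    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # IPF called IPF
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # IPF called normal
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal called normal
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # normal called IPF

    # A rate is undefined (NaN) when its class is absent from the cohort.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    accuracy = (tp + tn) / len(pairs)

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
    }


if __name__ == "__main__":
    # Made-up labels for illustration only; not data from the study.
    true_phenotype = [1, 1, 1, 1, 0, 0, 0, 0]
    predicted_phenotype = [1, 1, 1, 0, 0, 0, 0, 1]
    print(diagnostic_metrics(true_phenotype, predicted_phenotype))
```

Sensitivity is the fraction of true IPF samples the signature labels as IPF, specificity is the fraction of normal samples it labels as normal, and accuracy is the fraction of all validation samples classified correctly.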
Surprisingly, all three signatures were reasonably valid predictors of diagnosis, differing only slightly in test sensitivity, specificity and overall accuracy.
This study represents the first use of BPR on whole lung tissue; previously, BPR was primarily used to develop predictive models for cancer. This also represents the first report of an independently validated IPF gene expression signature. In summary, BPR is a promising tool for the development of gene expression signatures from non-neoplastic lung tissue. In the future, BPR might be used to develop definitive diagnostic gene signatures for IPF, prognostic gene signatures for IPF or gene signatures for other non-neoplastic lung disorders such as bronchiolitis obliterans.