Open Access Methodology article

A multi-class predictor based on a probabilistic model: application to gene expression profiling-based diagnosis of thyroid tumors

Naoto Yukinawa1, Shigeyuki Oba1, Kikuya Kato2*, Kazuya Taniguchi2, Kyoko Iwao-Koizumi2, Yasuhiro Tamaki3, Shinzaburo Noguchi3 and Shin Ishii1

Author Affiliations

1 Laboratory of Theoretical Life Science, Graduate School of Information Sciences, Nara Institute of Science and Technology, 8916-5 Takayama, Nara, 630-0101, Japan

2 Osaka Medical Center for Cancer and Cardiovascular Diseases, 1-3-2, Nakamichi, Higashinari-ku, Osaka, 537-8511, Japan

3 Department of Surgical Oncology, Osaka University Medical School, 2-2 Yamadaoka, Suita-ku, Osaka, 565-0871, Japan

BMC Genomics 2006, 7:190  doi:10.1186/1471-2164-7-190

Published: 27 July 2006

Abstract

Background

Although microscopic diagnosis plays the decisive role in cancer diagnostics, in some cases it does not satisfy the clinical need. Differential diagnosis of malignant and benign thyroid tissues is one such case, and a supplementary method, such as diagnosis by gene expression profiling, is desirable.

Results

We performed gene expression profiling of four thyroid tissue types, i.e., papillary carcinoma, follicular carcinoma, follicular adenoma, and normal thyroid, using adaptor-tagged competitive PCR, a high-throughput RT-PCR technique. For differential diagnosis, we applied a novel multi-class predictor that produces probabilistic outputs. Multi-class predictors were constructed using various combinations of binary classifiers. The learning set included 119 samples, and the predictors were evaluated by strict leave-one-out cross validation. Trials included the classical combinations, i.e., one-to-one and one-to-the-rest, but predictors using more combinations exhibited better prediction accuracy. This trend was consistent with results on other gene expression data sets. The performance of the selected predictor was then tested with an independent set consisting of 49 samples. The resulting test prediction accuracy was 85.7%.
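As a rough illustration of what "combinations of binary classifiers" can mean for four classes, the sketch below enumerates candidate binary problems. It assumes, which the abstract does not state, that each binary classifier assigns every class to a positive group (+1), a negative group (-1), or leaves it out (0), and that a problem and its mirror image count as one. The function names and the encoding are hypothetical and are not taken from the article.

```python
from itertools import product

# The four tissue types named in the abstract.
CLASSES = [
    "papillary carcinoma",
    "follicular carcinoma",
    "follicular adenoma",
    "normal thyroid",
]


def exhaustive_codes(k):
    """Enumerate every distinct binary problem over k classes.

    Assumption (not from the article): a code is a tuple over {+1, 0, -1};
    it must contain at least one +1 and one -1, and a code and its
    negation are treated as the same problem, so only codes whose first
    nonzero entry is +1 are kept.
    """
    codes = []
    for code in product((1, 0, -1), repeat=k):
        nonzero = [c for c in code if c != 0]
        if 1 in nonzero and -1 in nonzero and nonzero[0] == 1:
            codes.append(code)
    return codes


def is_one_vs_one(code):
    """Exactly two classes take part, one on each side."""
    return sum(1 for c in code if c != 0) == 2


def is_one_vs_rest(code):
    """Every class takes part, and one side holds a single class."""
    return 0 not in code and (code.count(1) == 1 or code.count(-1) == 1)


all_codes = exhaustive_codes(len(CLASSES))
one_vs_one = [c for c in all_codes if is_one_vs_one(c)]
one_vs_rest = [c for c in all_codes if is_one_vs_rest(c)]

print(len(one_vs_one), len(one_vs_rest), len(all_codes))  # prints: 6 4 25
```

Under these assumptions, the classical schemes use 6 and 4 binary problems respectively, while the full enumeration yields 25, which gives a sense of how much larger an exhaustive combination is.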

Conclusion

Molecular diagnosis of thyroid tissues by gene expression profiling is feasible, and the current level of accuracy is promising for an automatic diagnostic tool that complements present medical procedures. A multi-class predictor with an exhaustive combination of binary classifiers could achieve higher prediction accuracy than predictors with classical combinations and other predictors such as multi-class SVM. The probabilistic outputs of the predictor offer more detailed information for each sample, which enables visualization of each sample in low-dimensional classification spaces. These new concepts should help to improve multi-class classification, including that of cancer tissues.
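One generic way to see how probabilistic outputs lend themselves to low-dimensional visualization is to treat a four-class probability vector as barycentric coordinates inside a tetrahedron, so that each sample becomes a point in three dimensions. The sketch below is only an illustration under that assumption; the abstract does not describe the authors' actual visualization, and the vertex placement and the function name `to_simplex_point` are hypothetical.

```python
# Class order follows the abstract; the vertex coordinates are an
# arbitrary choice of a regular tetrahedron centred at the origin.
CLASSES = [
    "papillary carcinoma",
    "follicular carcinoma",
    "follicular adenoma",
    "normal thyroid",
]
VERTICES = [
    (1.0, 1.0, 1.0),
    (1.0, -1.0, -1.0),
    (-1.0, 1.0, -1.0),
    (-1.0, -1.0, 1.0),
]


def to_simplex_point(probs):
    """Map class probabilities to a 3-D point inside the tetrahedron.

    The point is the probability-weighted average of the class vertices,
    so a confident prediction lands near one vertex and an ambiguous one
    lands near the centre.
    """
    if len(probs) != len(VERTICES):
        raise ValueError("expected one probability per class")
    if min(probs) < 0 or abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must be non-negative and sum to 1")
    return tuple(
        sum(p * vertex[axis] for p, vertex in zip(probs, VERTICES))
        for axis in range(3)
    )


# Made-up probabilities for one sample, used only to show the call.
example = to_simplex_point([0.7, 0.2, 0.1, 0.0])
print(example)
```

A sample with uniform probabilities maps to the origin, and a sample assigned to a single class with probability 1 maps exactly onto that class's vertex.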