This article is part of the series Cancer bioinformatics: bioinformatic methods, network biomarkers and precision medicine.

Open Access Highly Accessed Research article

A molecular computational model improves the preoperative diagnosis of thyroid nodules

Sara Tomei1*, Ivo Marchetti2, Katia Zavaglia1, Francesca Lessi1, Alessandro Apollo1, Paolo Aretini1, Giancarlo Di Coscio2, Generoso Bevilacqua1,2 and Chiara Mazzanti1

Author affiliations

1 Division of Surgical, Molecular, and Ultrastructural Pathology, Section of Molecular Pathology, University of Pisa and Pisa University Hospital, Via Roma 57, Pisa, 56100, Italy

2 Section of Cytopathology, University of Pisa and Pisa University Hospital, Via Roma 57, Pisa, 56100, Italy

Citation and License

BMC Cancer 2012, 12:396  doi:10.1186/1471-2407-12-396


The electronic version of this article is the complete one and can be found online at: http://www.biomedcentral.com/1471-2407/12/396


Received: 29 January 2012
Accepted: 31 July 2012
Published: 7 September 2012

© 2012 Tomei et al.; licensee BioMed Central Ltd.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Background

Thyroid nodules with indeterminate cytological features on fine needle aspiration (FNA) cytology have a 20% risk of thyroid cancer. The aim of the current study was to determine the diagnostic utility of an 8-gene assay to distinguish benign from malignant thyroid neoplasms.

Methods

The mRNA expression level of 9 genes (KIT, SYNGR2, C21orf4, Hs.296031, DDI2, CDH1, LSM7, TC1, NATH) was analysed by quantitative PCR (q-PCR) in 93 FNA cytological samples. To evaluate the diagnostic utility of all the genes analysed, we assessed the area under the curve (AUC) for each gene individually and in combination. BRAF exon 15 status was determined by pyrosequencing. An 8-gene computational model (Neural Network Bayesian Classifier) was built and a multiple-variable analysis was then performed to assess the correlation between the markers.

Results

The AUC for each significant marker ranged between 0.625 and 0.900; thus all the significant markers, alone and in combination, can be used to distinguish between malignant and benign FNA samples. The classifier made up of KIT, CDH1, LSM7, C21orf4, DDI2, TC1, Hs.296031 and BRAF had a predictive power of 88.8%. It proved useful for risk stratification of the most critical cytological group, the indeterminate lesions, for which there is the greatest need for accurate diagnostic markers.

Conclusion

The genetic classification obtained with this model is highly accurate at differentiating malignant from benign thyroid lesions and might be a useful adjunct in the preoperative management of patients with thyroid nodules.

Keywords:
Thyroid; Fine-needle aspiration (FNA); Area under the curve (AUC); Computational model; Preoperative diagnosis

Background

Thyroid nodules represent a very common problem. The majority (>95%) of them are benign; however, malignancy risk increases with female gender, nodule size, extremes of age (<30 and >60 years), personal or family history of thyroid malignancy and radiation exposure [1].

The advent of thyroid ultrasound has allowed an increasing number of nodules to be diagnosed, and it is now recognized that nodules are present in an estimated 50% of the general population and are detected at a subclinical level. However, since only 10% of these nodules will be a true malignancy, preoperative tests to differentiate benign from malignant nodules are needed [2,3].

Currently, fine-needle aspiration (FNA) cytology is the most accurate and cost effective diagnostic test to exclude a thyroid cancer diagnosis. In general, a thyroid nodule on FNA cytology can be classified as benign, malignant, suspicious, indeterminate, or non-diagnostic [4].

Unfortunately, about 30% of FNAs are indeterminate and often require a diagnostic thyroidectomy to establish the diagnosis on permanent histological examination. Only 20% of diagnostic thyroidectomies in patients with indeterminate FNA cytology demonstrate malignant lesions on permanent histology, and these patients often require a completion thyroidectomy [5].

Therefore, because of this obvious limitation of FNA cytology in the preoperative diagnosis, there is a clinical need for reliable preoperative molecular markers to distinguish benign from malignant thyroid nodules.

A 10-gene assay (KIT, SYNGR2, C21orf4, Hs.296031, Hs.24183, FAM13A1, C11orf8, KIAA1128, IMPACT, CDH1) and a 6-gene assay (KIT, LSM7, SYNGR2, C21orf4, Hs.296031, Hs.24183) have been proposed to have high diagnostic accuracy in distinguishing thyroid nodules [6]. Those assays were developed from microarray analyses of tumor specimens obtained after surgical removal of thyroid nodules. Since only 20% of patients undergoing surgery have malignant lesions, preoperative tests are needed to avoid unnecessary surgery. Gene expression profiling studies have identified many other possible markers with high accuracy; however, the clinical application of these markers is limited to post-surgical samples. FNA cytology represents a useful tool in the preoperative evaluation of a thyroid nodule, especially because the amount of tumor cells per sample is known. In a previous paper [7] we showed the clinical relevance of KIT expression to the diagnosis of thyroid tumors, using RNA extracted from preoperative cytological FNA specimens. Although KIT expression increased the diagnostic accuracy by 15% compared with cytology alone, some samples still remained indeterminate.

The aim of the current study was to build a q-PCR-based computational model able to preoperatively diagnose benign and malignant thyroid tumors on the basis of the expression profiles of the genes mentioned above (KIT, SYNGR2, C21orf4, Hs.296031, Hs.24183, CDH1, LSM7), plus two other genes (TC1, NATH) known from the literature to be involved in thyroid carcinogenesis [8-10]. In addition, since BRAF sequencing is so far the best molecular test used in the preoperative assessment of thyroid nodule malignancy, we also built a model including BRAF mutational status.

In recent years, a new class of techniques known as Bayesian Neural Networks (BNN) has been proposed as a supplement or alternative to standard statistical techniques. For the purpose of predicting medical outcomes, a BNN can be considered a computer-intensive classification method; in addition, BNNs do not require explicit distributional assumptions (such as normality) [11].

As we previously described, KIT is down-regulated in malignant thyroid tumors compared to benign ones. SYNGR2 has been characterized as an integral vesicle membrane protein [12], and the only data available indicate its up-regulation in fetal mouse ovaries [13]. LSM7 belongs to the family of Sm-like proteins, which are involved in pre-messenger RNA splicing and decapping [14]. The interaction of LSM7 with the TACC1 complex may participate in breast cancer oncogenesis [15]. C21orf4 encodes a predicted trans-membrane protein (Tmem50b) and is one of the few genes significantly over-expressed during cerebellar development in a Down syndrome mouse model [16]. The role of SYNGR2, LSM7 and C21orf4 in thyroid carcinogenesis has not yet been explored. E-cadherin (CDH1) expression is reduced in thyroid carcinomas [17], and its promoter has been found to be hypermethylated in thyroid neoplasms [18]. Hs.24183 (now Hs.145049) has been identified as part of the 3’UTR of the DDI2 (DNA-damage inducible 1 homolog 2) gene in H. sapiens, but no data exist about its role in the thyroid. For Hs.296031 the only information available refers to gene sequence and mapping; no gene or protein function is known yet. In contrast, the expression of the thyroid cancer-1 (TC1) gene has been related to malignant transformation in the thyroid, and the potential use of TC1 gene expression as a marker of malignancy has also been shown in the literature [19]. NATH (N-acetyl transferase human) is involved in protein acetylation, an important post-translational modification regulating oncogenesis, apoptosis and the cell cycle. NATH has been found to be over-expressed at the mRNA level in papillary thyroid carcinomas compared to non-neoplastic thyroid tissue [8].

In this study we used 87 FNA cytological samples to build several preoperative computational models, in order to find the most discriminative one, and 6 unknown samples to test them.

A correlation analysis between the markers was also performed in order to investigate their biological importance and to find a link that could give us a better understanding of the molecular mechanisms underlying thyroid cancer development.

Methods

Thyroid specimens

Preoperative thyroid FNA slides from a total of 93 patients carrying thyroid lesions (49 malignant, 38 benign, 6 unknown) were selected from archived materials of the Section of Cytopathology, Division of Surgical, Molecular and Ultrastructural Pathology, S. Chiara Hospital, Pisa. For ethical reasons we used only cases with two or more slides per patient, and the molecular analysis was performed on only one of the available smears. In all cases FNA was performed under ultrasonographic guidance. All smears were reviewed by a senior cytopathologist. Diagnosis was carried out on the basis of the following criteria, broadly suggested in the literature: smear background, cell arrangements, cell shape, nuclear/cytoplasmic features, presence of nucleoli and mitosis. The histological diagnosis ultimately established the malignancy or benignity of the 93 thyroid lesions.

Ethical board

This study was approved by the Internal Review Board of the University of Pisa. All patients gave their consent to participate in the study.

RNA and DNA isolation

Archival FNA slides stained with Papanicolaou technique were kept in xylene for 1 to 3 days, depending on the time of storage, in order to detach slide coverslips. The slides were then hydrated in a graded series of ethanol followed by a wash in distilled H2O for 1 minute. The slides were finally air dried. RNA extraction was performed using a commercial kit (High Pure RNA Paraffin kit, Roche). The lysis solution was poured on the slide to scrape off the cytological stained sample. Whole scraped tissue was then collected in a microcentrifuge tube and processed for RNA extraction. The quantity/quality of RNA was estimated with Nanodrop 1000 spectrophotometer using 1 μl of undiluted RNA solution. RNA was treated with DNase Ι recombinant, RNase-free (Roche). RNA was reverse transcribed in a final volume of 20 μl, containing 5X RT buffer, 10 mM dNTPs, 50 ng/μl Random Primers, 0.1 M DTT, 40 U/μl RNaseOUT, 50 μM oligo(dT), DEPC-Treated Water, 15 U/μl Cloned AMV reverse transcriptase (Invitrogen, Carlsbad, CA).

DNA was isolated directly from stained cells using a commercial kit (Nucleospin, Macherey-Nagel, Düren, Germany) according to the manufacturer’s instructions.

Gene expression analysis

The level of KIT, SYNGR2, C21orf4, Hs.296031, DDI2, CDH1, LSM7, TC1, NATH expression was analysed by quantitative PCR (q-PCR) on the Rotor-Gene 6000 real time rotary analyzer (Corbett, Life Science, Australia) following the manufacturer’s instructions. An endogenous reference gene (B2M, beta 2 microglobulin) was used to normalize each gene expression level. PCR products were previously sequenced using the Applied Biosystems 3130xl Genetic Analyzer (Foster City, CA) to confirm gene sequence. PCR was performed in 25 μl final volume, containing 5 μl of cDNA, 12.5 μl of MESA GREEN qRT-PCR MasterMix Plus (EUROGENTEC, San Diego, CA), 40 pmol of each primer (Invitrogen, Carlsbad, CA) per reaction with the following cycling conditions: initial denaturation 95°C for 5 min; 40 cycles at 95°C for 15 sec, 61°C for 40 sec, 72°C for 40 sec; final step 25°C for 1 min. Primers were selected using Primer3 software:

KIT F: 5’-GCACCTGCTGCTGAAATGTATGACATAAT-3’

KIT R: 5’-TTTGCTAAGTTGGAGTAAATATGATTGG-3’

SYNGR2 F: 5’-ATCTTCTCCTGGGGTGTGCT-3’

SYNGR2 R: 5’-AGGGTGGCTGTTGGTAGTTG-3’

C21orf4 F: 5’-GACAACAGTGGCTGTGTTTTAAG-3’

C21orf4 R: 5’-GCATTGGATACAGCATTTATCAT-3’

Hs.296031 F: 5’-TGCCAAGGAGCTTTATAGAA-3’

Hs.296031 R: 5’-ATGACGGCATGTACCAACCA-3’

DDI2 F: 5’-TGCAGTTCCCAAACTTACCC-3’

DDI2 R: 5’-CAGCAACATATCTCGGAGCA-3’

CDH1 F: 5’-GCATTGCCACATACACTCTC-3’

CDH1 R: 5’-AGCACCTTCCATGACAGAC-3’

LSM7 F: 5’-GACGATCCGGGTAAAGTTCCA-3’

LSM7 R: 5’-AGGTTGAGGAGTGGGTCGAA-3’

TC1 F: 5’-AAATCTTCTGACTAATGCTAAAACG-3’

TC1 R: 5’-TTATTGTTGCATGACATTTGC-3’

NATH F: 5’-AAGAAACCAAAGGGGAACTT-3’

NATH R: 5’-TAATAGGCCCAGTTTTCAGG-3’

B2M F: 5’-CATTCCTGAAGCTGACAGCATTC-3’

B2M R: 5’-TGCTGGATGACGTGAGTAAACC-3’

Standard curves were generated for each gene for data analysis. To verify primer specificity, melting curve analysis was performed. Fluorescence data were acquired during the extension phase. After 40 cycles, a melting curve for each gene was generated by increasing the temperature from 50°C to 99°C (1°C for each step), while the fluorescence was measured. For each experiment a no-template reaction was included as a negative control.

For each cDNA sample the ratio between the expression value of the gene of interest and the expression value of B2M was calculated. Mean values and standard deviations of malignant and benign groups were calculated as well.
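The normalization described above can be sketched in a few lines of Python. This is only an illustration: the quantities below are hypothetical values standing in for amounts read off the standard curves, the function names are ours, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev


def normalized_expression(target_quantity, b2m_quantity):
    """Ratio of the target gene quantity to the B2M reference quantity."""
    if b2m_quantity <= 0:
        raise ValueError("B2M quantity must be positive")
    return target_quantity / b2m_quantity


def group_summary(ratios):
    """Mean and sample standard deviation of the normalized ratios of one group."""
    return mean(ratios), stdev(ratios)


# Hypothetical (target, B2M) quantities for one marker, NOT data from the study.
benign = [(8.0, 4.0), (9.0, 3.0), (10.0, 5.0)]
malignant = [(1.0, 4.0), (2.0, 4.0), (3.0, 4.0)]

benign_ratios = [normalized_expression(t, r) for t, r in benign]
malignant_ratios = [normalized_expression(t, r) for t, r in malignant]

benign_mean, benign_sd = group_summary(benign_ratios)
malignant_mean, malignant_sd = group_summary(malignant_ratios)
```

The two group summaries are the quantities that would then be compared between benign and malignant samples.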

BRAF status

BRAF sequence was screened for V600E mutation by pyrosequencing. DNA was first amplified using “Rotor-Gene 6000” (Corbett Research) and then sequenced using PyroMark Q96 ID system.

PCR was performed with the following conditions: initial denaturation 95°C for 3 min; 40 cycles at 95°C for 30 sec, 55°C for 30 sec, 72°C for 30 sec; final step 60°C for 5 min with TaKaRa Ex Taq (Qiagen). PCR amplification and mutational analysis were performed in accordance with the Diatech manual (Anti-EGFR MoAb response BRAF status).

Statistical analysis

Gene expression analysis

The Mann–Whitney test was used to determine differences between groups in the mRNA expression levels of KIT, LSM7, C21orf4, DDI2, SYNGR2, TC1 and Hs.296031, and Student’s t-test for those of CDH1 and NATH. All the analyses were performed using Statgraphics Centurion (V. 15, StatPoint, Inc.).

ROC analysis

To determine the diagnostic accuracy of the molecular computational model, we calculated the area under the curve (AUC) of the receiver operating characteristic (ROC) curve for each gene individually and in combination using logistic regression analysis (Medcalc 11, Medcalc Software, Stata Software).
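For a single marker, the AUC can be illustrated without any statistics package: it equals the probability that a randomly chosen malignant sample receives a higher score than a randomly chosen benign one, with ties counted as one half. The sketch below uses this rank-based definition rather than the logistic regression procedure of the software named above, and the expression values are hypothetical. Because a marker such as KIT is lower in malignant samples, the negated expression is used as the malignancy score.

```python
def auc(scores_malignant, scores_benign):
    """Rank-based AUC: P(malignant score > benign score), ties count 0.5."""
    if not scores_malignant or not scores_benign:
        raise ValueError("both groups must contain at least one sample")
    wins = 0.0
    for m in scores_malignant:
        for b in scores_benign:
            if m > b:
                wins += 1.0
            elif m == b:
                wins += 0.5
    return wins / (len(scores_malignant) * len(scores_benign))


# Hypothetical normalized expression values, NOT data from the study.
kit_malignant = [0.2, 0.4, 0.9]
kit_benign = [0.8, 1.5, 2.0]

# Lower expression suggests malignancy, so negate to obtain a malignancy score.
kit_auc = auc([-v for v in kit_malignant], [-v for v in kit_benign])
```

With these toy values, 8 of the 9 malignant/benign pairs are ordered correctly, so the AUC is 8/9.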

BNN classifier

Several computational models (Neural Network Bayesian Classifiers) were built in order to find the best combination of markers able to discriminate benign from malignant thyroid samples using Statgraphics Centurion (V. 15, StatPoint, Inc.).

Molecular diagnosis

Fisher’s test was used to compare samples correctly classified by the BNN model according to their probability score (> 90% and <90%). The diagnostic gain was then calculated after applying molecular tests (BRAF, KIT and BNN model).

Correlation analysis

In order to evaluate the biological importance of the markers analysed, a multiple-variable correlation analysis was performed between the markers (Partek software).

Results

Gene expression levels

KIT, CDH1, LSM7, C21orf4 and DDI2 mRNA expression levels were significantly different between benign and malignant tumors: p(KIT) < 0.0001; p(CDH1) = 0.004; p(LSM7) = 0.03; p(C21orf4) = 0.01; p(DDI2) = 0.0001. No statistically significant difference was found for NATH, SYNGR2, TC1 or Hs.296031. Among the markers, all but TC1 were expressed at higher levels in benign samples than in malignant ones (Figure 1A).

Figure 1. Expression mean of 49 malignant (red) and 38 benign (green) samples for each marker (A). ROC analysis for KIT, CDH1, LSM7, C21orf4, DDI2 separately. Among the markers, KIT was the most powerful in discriminating benign from malignant thyroid tumors (AUC = 0.9) (B). ROC analysis for KIT, CDH1, LSM7, C21orf4, DDI2, and BRAF status in combination (AUC = 0.88) (C).

BRAF status

Among the 49 malignant samples, 28 carried the V600E mutation. All the benign samples were wild-type. Sensitivity and specificity of BRAF test were 57 and 100%, respectively.
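The sensitivity and specificity reported above follow directly from these counts. As a short check in Python (the function names are ours):

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of malignant samples that the test calls positive."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Fraction of benign samples that the test calls negative."""
    return true_negatives / (true_negatives + false_positives)


# Counts from the text: 28 of 49 malignant samples carried V600E,
# and all 38 benign samples were wild-type.
braf_sensitivity = sensitivity(true_positives=28, false_negatives=49 - 28)
braf_specificity = specificity(true_negatives=38, false_positives=0)
```

Here 28/49 is approximately 0.571, which rounds to the reported 57%, and 38/38 gives 100%.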

ROC analyses

We employed receiver operating characteristic (ROC) curve analyses to determine model robustness for predicting malignancy in thyroid samples using the expression of each gene individually (Figure 1B, Table 1). Among the markers, KIT showed the highest AUC (0.9). We also performed a ROC analysis for the statistically significant markers (KIT, CDH1, LSM7, C21orf4, DDI2) and BRAF status in combination; the AUC was 0.8824, the sensitivity 91% and the specificity 63% (Figure 1C). Although the AUC was quite similar to that of KIT alone, the predictive power increased when the markers were combined together.

Table 1. ROC analysis for each marker individually

Neural networks

The expression data of the markers were used to build Bayesian Neural Networks (BNN) in order to estimate the probability of thyroid malignancy.

We built several BNNs in order to find the most predictive one. This procedure uses a Probabilistic Neural Network (PNN) to classify cases into malignant and benign categories based on 9 input variables (KIT, LSM7, C21orf4, DDI2, SYNGR2, TC1, Hs.296031, CDH1, NATH). It implements a nonparametric method for assigning each observation to either the benign or the malignant group based on the observed expression variables.

The Neural Network Bayesian Classifier made up of all markers had a predictive power of 80%, while the classifier made up of KIT, CDH1, LSM7, C21orf4, DDI2, TC1 and Hs.296031 had a predictive power of 87.7%.
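To give an idea of how a probabilistic neural network of the kind mentioned above assigns a class probability, the following minimal Python sketch estimates each class density with Gaussian kernels centred on the training samples and then applies Bayes' rule. It is not the Statgraphics implementation: the kernel width `sigma`, the equal priors, the two-feature vectors and all values are assumptions made purely for illustration.

```python
import math


def pnn_posterior(x, training, sigma=0.5, priors=None):
    """Posterior class probabilities from Gaussian (Parzen) kernel densities.

    training maps each class label to a list of feature vectors.
    sigma is a hypothetical smoothing parameter, not a value from the study.
    """
    if priors is None:
        priors = {label: 1.0 / len(training) for label in training}
    weighted = {}
    for label, rows in training.items():
        kernel_sum = 0.0
        for row in rows:
            squared_distance = sum((a - b) ** 2 for a, b in zip(x, row))
            kernel_sum += math.exp(-squared_distance / (2.0 * sigma ** 2))
        # Average kernel response of the class, weighted by its prior.
        weighted[label] = priors[label] * kernel_sum / len(rows)
    total = sum(weighted.values())
    if total == 0.0:
        # All kernels underflowed: fall back to the priors.
        return dict(priors)
    return {label: value / total for label, value in weighted.items()}


# Hypothetical two-marker expression vectors, NOT data from the study.
training = {
    "benign": [[2.0, 1.8], [2.2, 2.1]],
    "malignant": [[0.3, 0.5], [0.4, 0.2]],
}
posterior = pnn_posterior([0.35, 0.4], training)
predicted = max(posterior, key=posterior.get)
```

In this toy example the query sample lies close to the malignant training vectors, so nearly all of the posterior probability is assigned to the malignant class.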

The analysis was then conducted on 6 unknown samples. The pathological diagnosis for each sample was kept blinded until after the analysis was completed. When the blind was broken, we found that 5 of the 6 unknown samples were diagnosed by the model in concordance with the diagnosis determined by standard pathological criteria.

We also built a neural network classifier made up of the markers used in the most predictive model (KIT, CDH1, LSM7, C21orf4, DDI2, TC1 and Hs.296031) plus BRAF status. This classifier had a predictive power of 88.8% and, more importantly, it correctly classified all 6 unknown samples when the blind was broken (Table 2). When applying the BNN model, no classification errors occurred when the probability of diagnosis was higher than 90%, thus allowing us to use this model as a reliable predictor for samples with a probability score >90% (p < 0.0001).

Table 2. Probability values of the prediction model for the unknown samples

Role of molecular diagnosis in increasing the diagnostic accuracy of FNAC

We stratified the samples according to both the histological and the cytological diagnosis (Table 3) and then calculated the diagnostic gain obtained by applying BRAF molecular analysis, the KIT expression model and the BNN model to the indeterminate samples (Table 4).

Table 3. Histological and cytological diagnosis of 87 thyroid nodules

Table 4. Role of molecular tests in the preoperative diagnosis

Among the indeterminate samples (IFP and SPTC) at the cytological level, 11 SPTC were correctly diagnosed as malignant by the BRAF test, 4 additional samples were correctly classified by the KIT model as 1 malignant and 3 benign, and 9 additional samples were diagnosed by the BNN model as 1 malignant and 8 benign. As shown in Table 4, when applying the molecular analysis, 13 malignant samples were moved to the diagnostic group of PTC and the total number of PTC rose from 30 (61%) to 43 (88%), a malignancy diagnostic gain of 27%. Similarly, 11 IFP samples were moved to the diagnostic group of BN and the total number of BN rose from 19 (50%) to 30 (79%), a benignity diagnostic gain of 29%.

Finally, if we consider both PTC and BN diagnoses, the overall diagnostic gain is 28%, with a statistically significant p-value of 0.0001.
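The diagnostic gains quoted above can be reproduced from the sample counts. The short Python check below uses only the numbers given in the text (49 malignant and 38 benign samples); the function name is ours.

```python
def diagnostic_gain(correct_before, correct_after, group_size):
    """Increase in the fraction of correctly diagnosed samples in a group."""
    return (correct_after - correct_before) / group_size


# Malignant group: PTC diagnoses rose from 30 to 43 out of 49.
ptc_gain = diagnostic_gain(30, 43, 49)

# Benign group: BN diagnoses rose from 19 to 30 out of 38.
bn_gain = diagnostic_gain(19, 30, 38)

# Both groups together: from 49 to 73 correct diagnoses out of 87.
overall_gain = diagnostic_gain(30 + 19, 43 + 30, 49 + 38)
```

These evaluate to 13/49, 11/38 and 24/87, which round to the reported 27%, 29% and 28%.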

Correlation analysis

A multiple-variable analysis was performed to evaluate the correlation between the markers. Knowledge of the correlation between the markers could give us a better understanding of the mechanisms underlying thyroid cancer biology. In fact, statistical correlation may reflect biological correlation between markers.

Pearson’s correlations between pairs of variables are reported in Figure 2.

Figure 2. Similarity matrix of KIT, SYNGR2, C21orf4, Hs.296031, Hs.24183, CDH1, LSM7, TC1 and NATH based on Pearson’s correlation coefficient.
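Each entry of such a similarity matrix is a Pearson coefficient between the expression values of two markers across samples. A minimal Python sketch of the coefficient and of a small matrix follows; the marker names are taken from the text, but the expression values are hypothetical and chosen only so that the result is easy to verify.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    variance_x = sum((a - mean_x) ** 2 for a in x)
    variance_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / math.sqrt(variance_x * variance_y)


# Hypothetical normalized expression per sample, NOT data from the study.
expression = {
    "KIT": [1.0, 2.0, 3.0, 4.0],
    "CDH1": [2.0, 4.0, 6.0, 8.0],
    "TC1": [8.0, 6.0, 4.0, 2.0],
}

# Symmetric matrix of pairwise coefficients, keyed by marker pair.
similarity = {
    (a, b): pearson(expression[a], expression[b])
    for a in expression
    for b in expression
}
```

In this toy data CDH1 is a positive multiple of KIT (coefficient 1) and TC1 decreases linearly as KIT increases (coefficient -1).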

Discussion

Many candidate markers of thyroid cancer have been identified in microarray studies that require analytic and clinical validation in a cohort large enough to permit evaluation of their clinical utility. q-PCR has become a highly reliable technique that allows precise quantification of gene expression levels identified by microarray studies from various laboratories [20-22]. Moreover, q-PCR has been in clinical use as a diagnostic test in various fields of medicine.

Currently, the diagnosis of thyroid nodules relies primarily on cytology. For the majority of patients with PTC, FNA-based cytology can make a diagnosis with high accuracy [3]. However, there is a significant proportion of neoplasms in which the FNA-based preoperative cytological diagnosis fails.

The primary aim of this study was to find a diagnostically accurate preoperative assay able to distinguish benign from malignant thyroid neoplasms. We found 5 of the 9 proposed gene markers (KIT, LSM7, C21orf4, DDI2, CDH1) to be differentially expressed in malignant and benign thyroid samples with a significant p-value (<0.05).

Of particular interest is the down-regulation of KIT and CDH1 in malignant samples.

We previously showed that silencing of KIT gene expression is associated with the malignant phenotype of thyroid nodules and that KIT expression may represent a useful tool in the preoperative management of thyroid lesions [7]. KIT is a well-known proto-oncogene. Other studies obtained findings similar to ours [6,23]. We speculated that in some cell types KIT expression positively regulates mitogenesis and is selected for in neoplastic transformation; in other tissues (such as thyroid tissue) KIT is involved in morphogenesis and differentiation and is, therefore, negatively selected in the course of tumor progression. Although the functional consequences of this modulation are unknown so far, KIT is likely to be relevant in regulating thyrocyte differentiation and survival. Further work is needed to elucidate the biological meaning of KIT down-regulation in PTCs.

CDH1 encodes E-cadherin. We found a down-regulation of CDH1 expression in malignant samples, in agreement with the literature. Loss of E-cadherin function or expression has been implicated in cancer progression and metastasis [24-26]. In fact, E-cadherin down-regulation decreases the strength of cellular adhesion within a tissue, resulting in an increase in cellular motility. This in turn may allow cancer cells to cross the basement membrane and invade surrounding tissues.

Regarding TC1, several studies have reported higher expression of TC1 in thyroid malignancies compared to benign nodules. In agreement with the literature, we observed a tendency for TC1 to be overexpressed in our cohort of malignant samples, though this was not statistically significant. TC1 has been shown to interact with Chibby (Cby) [27], an antagonist of β-catenin-mediated transcription; by relieving the suppression exerted by Cby, TC1 enhances the signaling pathway. TC1 regulation of Cby is therefore of considerable biological significance in the Wnt/β-catenin pathway. Indeed, TC1 up-regulates β-catenin target genes implicated in invasiveness and aggressive behaviour of cancer.

For the other markers it is difficult to speculate since their function and role in thyroid carcinogenesis are still largely unknown. Additional functional studies are needed to elucidate their role in thyroid cancer initiation and progression.

When assessing the diagnostic utility of the markers, KIT, LSM7, C21orf4, DDI2, and CDH1 had a high diagnostic accuracy. Thus, all the significant markers, alone and in combination, can be used to distinguish between malignant and benign FNA samples.

Recently, a new class of techniques known as Bayesian Neural Networks (BNN) has been used as a supplement or alternative to standard statistical techniques [11]. Since they do not require explicit distributional assumptions, BNNs have been employed for the classification of medical outcomes [11]. We developed a Bayesian Artificial Neural Network model based on data collected from FNA samples. Bayesian classification has been applied across the spectrum of medicine, from optimizing pharmacotherapy dosing [28,29], predicting cancer screening [30] and diagnostic test results [31,32], to determining injury severity [33], assessing operative risk [34] and predicting surgical outcomes [35-38]. We built several neural networks, and the most predictive one was made up of KIT, CDH1, LSM7, C21orf4, DDI2, TC1 and Hs.296031, with a predictive power of 87.7%. The network was then validated on 6 unknown samples. The model gave the correct diagnosis for 5 of the 6 unknown samples tested, based on a comparison with the gold-standard pathological diagnosis as determined by clinical pathologists.

It is important to note that we also included two non-significant markers (TC1, Hs.296031) in the model, because their contribution to the predictive power seemed to be relevant. In fact, some variables, although not individually significant, may increase the discriminative power of a model and refine its predictions.

The classifier that also included BRAF mutational status had a predictive power of 88.8% and successfully discriminated the unknown samples when the blind was broken (Table 2); thus gene expression analysis combined with BRAF mutational analysis may represent a very useful test to preoperatively discriminate benign from malignant thyroid tumors.

The probability of the predicted diagnosis for almost all the samples ranged between 95% and 100%; thus, although the general prediction value is 88.8%, the predictive power for an individual sample can reach 100%. These data also strengthen the importance of the 8-marker model as an adjunctive tool for the preoperative diagnosis of thyroid nodules.

We also stratified the samples depending on both the histological and cytological diagnoses (Table 3). The diagnostic gain obtained by applying BRAF molecular analysis, KIT expression model and BNN model was then calculated.

When the BNN model was applied, no classification errors occurred when the probability of diagnosis was higher than 90%, thus allowing us to use this model as a reliable predictor for samples with a probability score >90% (p < 0.0001).

We then calculated the diagnostic gain after applying molecular tests (Table 4).

Among the uncertain samples (IFP and SPTC) at the cytological level, 11 were correctly diagnosed by the BRAF test, 4 additional samples by the KIT model and 9 additional samples by the BNN model. It is important to point out that IFP lesions are often very difficult to diagnose even at frozen section, and in this study we developed a molecular approach that is able to correctly classify 46% (11/24) of IFP lesions as definitely benign. With these molecular approaches, such patients would therefore have been assigned to the follow-up group instead of being sent to surgery. Thus, the combined use of the molecular tests produced a diagnostic gain of 28% (Table 4). In essence, what we propose in this paper is to use BRAF molecular analysis (after an uncertain cytological diagnosis) to assess the malignancy of thyroid nodules in the first place, then the KIT model for the nodules that remain indeterminate, and finally the 8-gene model to establish the diagnosis of the nodules that would otherwise remain suspicious (Figure 3). The combined power of these tools could increase the percentage of thyroid nodules correctly classified while decreasing the number that remain indeterminate.

Figure 3. Diagram showing the preoperative assessment of thyroid malignancy.
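The stepwise use of the three molecular tests proposed above can be written as a simple decision cascade. The Python sketch below is only a schematic reading of that workflow: the sample fields, the stub KIT and BNN models and the KIT cut-offs are hypothetical placeholders, and only the 90% probability threshold is taken from the text.

```python
def triage(sample, kit_model, bnn_model, bnn_threshold=0.90):
    """Apply BRAF, then the KIT model, then the BNN model to one sample.

    Returns a (diagnosis, deciding_test) pair.
    """
    # Step 1: a BRAF V600E mutation is taken as evidence of malignancy.
    if sample["braf_v600e"]:
        return "malignant", "BRAF"
    # Step 2: the KIT model returns a label, or None if it cannot decide.
    kit_call = kit_model(sample)
    if kit_call is not None:
        return kit_call, "KIT"
    # Step 3: accept the BNN call only above the probability threshold.
    label, probability = bnn_model(sample)
    if probability > bnn_threshold:
        return label, "BNN"
    return "indeterminate", "none"


def demo_kit_model(sample):
    """Hypothetical stand-in for the KIT model; cut-offs are illustrative only."""
    if sample["kit"] < 0.5:
        return "malignant"
    if sample["kit"] > 2.0:
        return "benign"
    return None


def demo_bnn_model(sample):
    """Hypothetical stand-in that reads a precomputed label and probability."""
    return sample["bnn_label"], sample["bnn_probability"]


example = {
    "braf_v600e": False,
    "kit": 1.0,
    "bnn_label": "benign",
    "bnn_probability": 0.97,
}
result = triage(example, demo_kit_model, demo_bnn_model)
```

In this example the sample is BRAF wild-type and undecided by the stub KIT model, so the call is made by the BNN step because its probability exceeds 90%.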

All these findings strengthen the importance of molecular pathology, in which morphology and molecular alterations together represent a powerful approach to diagnosis. Along these lines, this study aimed to assess the diagnostic potential of the 8-gene expression model as an adjunctive tool in the preoperative management of thyroid nodules. We demonstrated that the 8-gene expression model provides increased diagnostic power to the molecular pathology approach based on BRAF mutation and KIT expression analysis.

We also performed a multiple-variable analysis among all the markers, independently of the diagnostic classification, in order to evaluate a possible functional correlation among them (Figure 2). In the literature there is no evidence of biological correlation among the well-studied markers; however, it is interesting to note that the unknown marker Hs.296031 statistically correlates with NATH, C21orf4, DDI2, SYNGR2 and TC1. This may also reflect a biological correlation; further studies are needed to explore this phenomenon.

Conclusion

The genetic classification obtained with the model presented here is highly accurate and may provide a tool to overcome the difficulties in today’s preoperative diagnosis of thyroid malignancies. We hope that the quantitative nature of this test will make it a useful, objective, gene-based adjunct to the preoperative diagnosis of a disease that currently relies solely on cytology.

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

ST carried out the study, analysed the data and wrote the manuscript draft. CM and GB conceived of the manuscript, participated in its design, coordination, analysis and interpretation of data and supervised the writing of the manuscript. PA participated in the statistical data analysis. All the authors made intellectual contributions and approved the final manuscript.

Pre-publication history

The pre-publication history for this paper can be accessed here:

http://www.biomedcentral.com/1471-2407/12/396/prepub