J Pathol Transl Med. 2025 Jan;59(1):39-49. doi: 10.4132/jptm.2024.09.14.

Diagnosis of invasive encapsulated follicular variant papillary thyroid carcinoma by protein-based machine learning

Affiliations
  • 1 Department of Pathology, Faculty of Medicine, Chulalongkorn University, Bangkok, Thailand
  • 2 Department of Pathology, University of Yamanashi, Chuo City, Japan
  • 3 Functional Proteomics Technology Laboratory, National Center for Genetic Engineering and Biotechnology, National Science and Technology Development Agency, Pathumthani, Thailand
  • 4 Chulalongkorn GenePRO Center, Faculty of Medicine, Chulalongkorn University, Bangkok, Thailand
  • 5 Department of Oral Biology, Faculty of Dentistry, Mahidol University, Bangkok, Thailand
  • 6 Precision Pathology of Neoplasia Research Group, Department of Pathology, Faculty of Medicine, Chulalongkorn University, Bangkok, Thailand

Abstract

Background
Although the criteria for follicular-pattern thyroid tumors are well established, diagnosing these lesions remains challenging in some cases. In the recent World Health Organization Classification of Endocrine and Neuroendocrine Tumors (5th edition), the invasive encapsulated follicular variant of papillary thyroid carcinoma was reclassified as its own entity. Because this variant shares morphological characteristics with low-risk follicular-pattern tumors, it is crucial to differentiate the two. Proteomics holds significant promise for detecting and quantifying protein biomarkers. We investigated the potential value of a protein biomarker panel defined by machine learning for identifying the invasive encapsulated follicular variant of papillary thyroid carcinoma, initially using formalin-fixed paraffin-embedded samples.
Methods
We developed a supervised machine-learning model and tested its performance using proteomics data from 46 thyroid tissue samples.
Results
We applied a random forest classifier utilizing five protein biomarkers (ZEB1, NUP98, C2C2L, NPAP1, and KCNJ3). This classifier achieved an area under the curve (AUC) of 1.00 and an accuracy of 1.00 in training samples for distinguishing the invasive encapsulated follicular variant of papillary thyroid carcinoma from non-malignant samples. Additionally, we performed single-protein/gene receiver operating characteristic analysis for differentiating the invasive encapsulated follicular variant of papillary thyroid carcinoma from other samples within The Cancer Genome Atlas projects, which yielded an AUC >0.5.
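As an illustration of what a single-marker AUC measures, the following is a minimal sketch in Python. It is not the authors' pipeline: the function name `roc_auc`, the abundance values, and the labels are all hypothetical. It computes the AUC as the probability that a randomly chosen positive sample has a higher marker value than a randomly chosen negative sample, so 0.5 corresponds to chance and 1.00 to perfect separation.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical abundance values for one marker (illustrative only, not study data).
abundance = [2.1, 3.4, 2.9, 1.2, 1.8, 3.0, 0.9, 1.5]
# 1 = IEFVPTC, 0 = non-IEFVPTC (labels are also made up for this sketch).
is_iefvptc = [1, 1, 1, 0, 0, 0, 0, 1]

auc = roc_auc(abundance, is_iefvptc)
print(f"single-marker AUC = {auc:.2f}")
```

On these made-up values, 12 of the 16 positive/negative pairs are ranked correctly, which is why an AUC only slightly above 0.5 indicates weak discrimination while 1.00 means every pair is ordered correctly.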
Conclusions
We demonstrated that integration of high-throughput proteomics with machine learning can effectively differentiate the invasive encapsulated follicular variant of papillary thyroid carcinoma from other follicular pattern thyroid tumors.

Keywords

Follicular pattern thyroid tumors; Thyroid carcinoma; Machine learning; Proteomics; Histological diagnosis

Figure

  • Fig. 1. (A) The study design, featuring the training and internal testing phases of our model. (B) The screening process used to pinpoint proteins that most effectively distinguish between IEFVPTC and non-IEFVPTC. IEFVPTC, invasive encapsulated follicular variant of papillary thyroid carcinoma; SMOTE, Synthetic Minority Oversampling Technique; MUD, model univariate deviance.

  • Fig. 2. (A) Model univariate deviance (MUD) plot of the optimal cumulative number of proteins. (B) A heatmap with hierarchical clustering of the five selected proteins used to train the model. IEFVPTC, invasive encapsulated follicular variant of papillary thyroid carcinoma.

  • Fig. 3. Receiver operating characteristic analyses of our model for differentiating invasive encapsulated follicular variant of papillary thyroid carcinoma (IEFVPTC) from non-IEFVPTC in the training (A) and internal test (B) sets. Also shown are the calibration plots of our model in the training (C) and internal test (D) phases and the confusion matrices for the training (E) and internal test (F) phases. AUC, area under the curve.

  • Fig. 4. Sensitivity analysis of our model when the input is disturbed by 30% (upper), 40% (middle), and 50% (lower). This indicates model robustness under different conditions. AUC, area under the curve; IEFVPTC, invasive encapsulated follicular variant of papillary thyroid carcinoma.
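
The perturbation-based sensitivity analysis described for Fig. 4 can be sketched roughly as follows. This is a hedged illustration, not the authors' method: the record does not say how the input was disturbed, so the sketch assumes multiplicative uniform noise of up to ±30%, ±40%, or ±50% per feature. The data are synthetic, and `score` (a simple feature mean) is a stand-in for a trained classifier such as the random forest. All names are hypothetical.

```python
import random


def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def perturb(rows, level, rng):
    """Assumed noise model: scale each value by a factor in [1 - level, 1 + level]."""
    return [[x * (1.0 + rng.uniform(-level, level)) for x in row] for row in rows]


def score(row):
    """Stand-in for a trained model's output: the mean of the five features."""
    return sum(row) / len(row)


def auc_under_noise(rows, labels, level, seed=1, repeats=50):
    """Average AUC over repeated random perturbations at one noise level."""
    rng = random.Random(seed)
    aucs = []
    for _ in range(repeats):
        noisy = perturb(rows, level, rng)
        aucs.append(roc_auc([score(r) for r in noisy], labels))
    return sum(aucs) / len(aucs)


# Synthetic data: 20 "positive" and 20 "negative" samples, five features each.
data_rng = random.Random(0)
labels = [1] * 20 + [0] * 20
rows = [[data_rng.gauss(3.0 if y else 2.0, 0.3) for _ in range(5)] for y in labels]

baseline = roc_auc([score(r) for r in rows], labels)
results = {level: auc_under_noise(rows, labels, level) for level in (0.0, 0.3, 0.4, 0.5)}
for level, auc in results.items():
    print(f"noise ±{int(level * 100)}%: mean AUC = {auc:.3f}")
```

A model whose mean AUC stays close to the unperturbed value as the noise level rises would be considered robust in this sense; the noise model and number of repeats are design choices that a real analysis would need to state explicitly.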

