J Korean Med Sci. 2021 Sep;36(35):e224. doi: 10.3346/jkms.2021.36.e224.

Word Embedding Reveals Cyfra 21-1 as a Biomarker for Chronic Obstructive Pulmonary Disease

Affiliations
  • 1Department of Internal Medicine, Kangwon National University Hospital, Chuncheon, Korea
  • 2Department of Internal Medicine, School of Medicine, Kangwon National University, Chuncheon, Korea
  • 3Department of Radiology, School of Medicine, Kangwon National University, Chuncheon, Korea
  • 4Environmental Health Center, Kangwon National University Hospital, Chuncheon, Korea
  • 5Department of Internal Medicine, Soonchunhyang University Bucheon Hospital, Bucheon, Korea
  • 6Department of Convergence Software, Hallym University, Chuncheon, Korea

Abstract

Background
Although patients with chronic obstructive pulmonary disease (COPD) experience high morbidity and mortality worldwide, few biomarkers are available for COPD. Here, we analyzed potential biomarkers for the diagnosis of COPD by using word embedding.
Methods
To identify biomarkers likely to be associated with COPD, we selected 26 respiratory disease-related biomarkers. The degree of similarity between each selected biomarker and COPD was measured by word embedding: similarity was inferred from word embedding models trained on a large medical corpus, and the biomarkers with the highest similarity were identified. We used Word2Vec, Canonical Correlation Analysis, and Global Vector for word embedding. We then evaluated the associations of the selected biomarkers with COPD parameters in a cohort of patients with COPD.
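As a rough illustration of the similarity step, the sketch below ranks candidate terms by cosine similarity to a target term vector. Cosine similarity is an assumption here (the abstract does not name the similarity measure), and the function names, marker names, and 3-dimensional vectors are hypothetical placeholders, not embeddings or results from the study.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def rank_by_similarity(target, candidates):
    """Return (name, similarity) pairs sorted from most to least similar."""
    scored = [(name, cosine_similarity(target, vec))
              for name, vec in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical toy vectors; real embeddings would come from a model
# trained on a medical corpus and would have far more dimensions.
copd_vector = [1.0, 0.0, 1.0]
toy_candidates = {
    "marker_a": [1.0, 0.0, 1.0],
    "marker_b": [1.0, 1.0, 0.0],
    "marker_c": [0.0, 1.0, 0.0],
}

ranking = rank_by_similarity(copd_vector, toy_candidates)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In practice the same ranking would be repeated for each embedding method, and candidates scoring high across methods would be carried forward to the cohort analysis.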
Results
Cytokeratin 19 fragment (Cyfra 21-1) was selected because of its high similarity to COPD and its significant correlation with the COPD phenotype. Serum Cyfra 21-1 levels did not differ significantly between patients with COPD and controls (4.3 ± 5.9 vs. 3.9 ± 3.6 ng/mL, P = 0.611). The emphysema index was significantly correlated with the serum Cyfra 21-1 level (correlation coefficient = 0.219, P = 0.015).
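For readers unfamiliar with how such a correlation coefficient is obtained, the following is a minimal sketch of Pearson's r. The abstract does not state which coefficient was used, so Pearson's r is an assumption; the function name and the input values are made-up placeholders chosen to give an easily checked result, not measurements from the cohort.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length, at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Placeholder values only: a perfectly linear pair and its reverse.
placeholder_x = [1.0, 2.0, 3.0, 4.0]
placeholder_y = [2.0, 4.0, 6.0, 8.0]

r_up = pearson_r(placeholder_x, placeholder_y)
r_down = pearson_r(placeholder_x, list(reversed(placeholder_y)))
print(r_up, r_down)
```

A coefficient of 0.219, as reported above, indicates a weak positive association even though it is statistically significant.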
Conclusion
Word embedding may be used to discover biomarkers for COPD, and Cyfra 21-1 may be used as a biomarker for emphysema. Additional studies are needed to validate Cyfra 21-1 as a biomarker for COPD.

Keywords

Chronic Obstructive Pulmonary Disease; Biomarker; Word Embedding; Cyfra 21-1
