A Machine Learning Model for Prostate Cancer Prediction in Korean Men

Choi, Sukjung; So, Beomgi; Oh, Shane; Park, Hongzoo; Lee, Sang Wook; Song, Geehyun; Lee, Jong Min; Jo, Jung Ki; Kim, Seon Hyeok; Lee, Si Eun; Cho, Eun-Bi; Jung, Jae Hung; Kim, Jeong Hyun

J Urol Oncol. 2024 Nov;22(3):201-210. 10.22465/juo.244800400020.

Choi S ¹
So B ²
Oh S ²
Park H ¹
Lee SW ¹
Song G ³
Lee JM ¹
Jo JK ⁴
Kim SH ²
Lee SE ²
Cho EB ⁵
Jung JH ⁶
Kim JH ¹

Affiliations

¹Department of Urology, Kangwon National University School of Medicine, Chuncheon, Korea
²LifeSemantics Inc., Seoul, Korea
³Department of Urology, National Medical Center, Seoul, Korea
⁴Department of Urology, Hanyang University College of Medicine, Seoul, Korea
⁵Department of Biomedical Research Institute and Biobank, Kangwon National University Hospital, Chuncheon, Korea
⁶Department of Precision Medicine and Urology, Yonsei University Wonju College of Medicine, Wonju, Korea

KMID: 2561832
DOI: http://doi.org/10.22465/juo.244800400020

Abstract

Purpose
Unnecessary prostate biopsies for detecting prostate cancer (PCa) should be minimized. Therefore, this study developed a machine learning (ML) model to predict PCa in Korean men and evaluated its usability.
Materials and Methods
We retrospectively analyzed clinical data from 928 patients who underwent prostate biopsies at Kangwon National University Hospital between May 2013 and May 2023. Of these, 377 (40.6%) were diagnosed with PCa, and 551 (59.4%) did not have cancer. For external validation, clinical data from 385 patients aged 48–89 years who underwent prostate biopsies at Wonju Severance Christian Hospital from September 2005 to September 2023 were also included. Twenty-two clinical features were considered in developing an ML model to predict PCa, of which 15 were retained based on their contributions to model performance. A meta-learner was constructed using logistic regression to predict the probability of PCa, and the classifier was trained and validated on randomly extracted training and test sets at an 8:2 ratio.
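As a minimal sketch of the random 8:2 partition described above, the following splits 928 patient indices into training and test sets. The function name `split_indices`, the fixed seed, and the rounding rule are assumptions for illustration; the abstract does not state how the authors implemented the split.

```python
import random

def split_indices(n_patients, train_fraction=0.8, seed=0):
    """Randomly partition patient indices into training and test sets.

    Hypothetical helper: the seed and the floor-based rounding are
    illustrative choices, not details reported in the abstract.
    """
    indices = list(range(n_patients))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    n_train = int(n_patients * train_fraction)
    return indices[:n_train], indices[n_train:]

# 928 is the development cohort size reported in the abstract.
train_idx, test_idx = split_indices(928)
print(len(train_idx), len(test_idx))
```

Under these assumptions the split yields 742 training and 186 test patients; a different rounding rule or a stratified split would give slightly different sizes.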
Results
The prostate health index, prostate volume, age, nodule on digital rectal examination, and prostate-specific antigen were the top 5 features for predicting PCa. The area under the receiver operating characteristic curve (AUC) of the meta-learner logistic regression model was 0.89, and the accuracy, sensitivity, and specificity were 0.828, 0.711, and 0.909, respectively. Our model also showed excellent prediction performance for high-grade PCa (Gleason score of 7 or higher), with an AUC of 0.903. Furthermore, when evaluated on the external cohort clinical data, the model achieved an AUC of 0.863.
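The metrics reported above can be computed from predicted probabilities as in the sketch below. The labels and scores are toy values invented for illustration, not study data, and the 0.5 decision threshold is an assumption; the abstract does not state the threshold used.

```python
def binary_metrics(y_true, y_score, threshold=0.5):
    """Accuracy, sensitivity, and specificity at a given threshold."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true-positive rate
        "specificity": tn / (tn + fp),  # true-negative rate
    }

def auc(y_true, y_score):
    """AUC as the probability that a random positive case outranks
    a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example (hypothetical values): 1 = cancer, 0 = no cancer.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_score = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
metrics = binary_metrics(y_true, y_score)
auc_value = auc(y_true, y_score)
print(metrics, auc_value)
```

Sensitivity and specificity depend on the chosen threshold, whereas the AUC summarizes ranking quality across all thresholds, which is why the two can diverge as they do in the reported results.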
Conclusions
Our ML model excelled in predicting PCa, specifically clinically significant PCa. Although extensive cross-validation in other clinical cohorts is needed, this ML model is a promising option for future diagnostics.

Keywords

Diagnosis; Prostatic neoplasms; Machine learning
