Investig Clin Urol.  2024 Nov;65(6):551-558. 10.4111/icu.20240159.

Application of deep learning for semantic segmentation in robotic prostatectomy: Comparison of convolutional neural networks and visual transformers

Affiliations
  • 1Department of Urology, Kangnam Sacred Heart Hospital, Hallym University College of Medicine, Seoul, Korea
  • 2STARLABS Corp., Seoul, Korea
  • 3Department of Urology, Dongtan Sacred Heart Hospital, Hallym University College of Medicine, Hwaseong, Korea
  • 4Department of Urology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea

Abstract

Purpose
Semantic segmentation is a fundamental part of the surgical application of deep learning. Traditionally, segmentation in vision tasks has been performed using convolutional neural networks (CNNs), but the transformer architecture has recently been introduced and widely investigated. We aimed to investigate the performance of deep learning models in segmentation in robot-assisted radical prostatectomy (RARP) and identify which of the architectures is superior for segmentation in robotic surgery.
Materials and Methods
Intraoperative images during RARP were obtained. The dataset was randomly split into training and validation data. Segmentation of the surgical instruments, bladder, prostate, vas and seminal vesicle was performed using three CNN models (DeepLabv3, MANet, and U-Net++) and three transformers (SegFormer, BEiT, and DPT), and their performances were analyzed.
Results
The overall segmentation performance during RARP varied across different model architectures. For the CNN models, DeepLabV3 achieved a mean Dice score of 0.938, MANet scored 0.944, and U-Net++ reached 0.930. For the transformer architectures, SegFormer attained a mean Dice score of 0.919, BEiT scored 0.916, and DPT achieved 0.940. The performance of CNN models was superior to that of transformer models in segmenting the prostate, vas, and seminal vesicle.
Conclusions
Deep learning models provided accurate segmentation of the surgical instruments and anatomical structures observed during RARP. Both CNN and transformer models showed reliable predictions in the segmentation task; however, CNN models may be more suitable than transformer models for organ segmentation and may be more applicable in unusual cases. Further research with large datasets is needed.

Keyword

Artificial intelligence; Computer vision systems; Deep learning; Prostatectomy
Full Text Links
  • ICU
Actions
Cited
CITED
export Copy
Close
Share
  • Twitter
  • Facebook
Similar articles
Copyright © 2024 by Korean Association of Medical Journal Editors. All rights reserved.     E-mail: koreamed@kamje.or.kr