Application of deep learning for semantic segmentation in robotic prostatectomy:
Comparison of convolutional neural networks and visual transformers

Pak, Sahyun; Park, Sung Gon; Park, Jeonghyun; Choi, Hong Rock; Lee, Jun Ho; Lee, Wonchul; Cho, Sung Tae; Lee, Young Goo; Ahn, Hanjong

Abstract

Purpose
Semantic segmentation is a fundamental part of the surgical application of deep learning. Traditionally, segmentation in vision tasks has been performed using convolutional neural networks (CNNs), but the transformer architecture has recently been introduced and widely investigated. We aimed to investigate the performance of deep learning models in segmentation in robot-assisted radical prostatectomy (RARP) and identify which of the architectures is superior for segmentation in robotic surgery.
Materials and Methods
Intraoperative images during RARP were obtained. The dataset was randomly split into training and validation data. Segmentation of the surgical instruments, bladder, prostate, vas and seminal vesicle was performed using three CNN models (DeepLabv3, MANet, and U-Net++) and three transformers (SegFormer, BEiT, and DPT), and their performances were analyzed.
Results
The overall segmentation performance during RARP varied across different model architectures. For the CNN models, DeepLabV3 achieved a mean Dice score of 0.938, MANet scored 0.944, and U-Net++ reached 0.930. For the transformer architectures, SegFormer attained a mean Dice score of 0.919, BEiT scored 0.916, and DPT achieved 0.940. The performance of CNN models was superior to that of transformer models in segmenting the prostate, vas, and seminal vesicle.
Conclusions
Deep learning models provided accurate segmentation of the surgical instruments and anatomical structures observed during RARP. Both CNN and transformer models showed reliable predictions in the segmentation task; however, CNN models may be more suitable than transformer models for organ segmentation and may be more applicable in unusual cases. Further research with large datasets is needed.

Application of deep learning for semantic segmentation in robotic prostatectomy: Comparison of convolutional neural networks and visual transformers

Abstract

Keyword

Cited

Save citations to file

Email citations