Clin Exp Otorhinolaryngol. 2023 Feb;16(1):28-36. doi: 10.21053/ceo.2022.00675.

Deep Learning Techniques for Ear Diseases Based on Segmentation of the Normal Tympanic Membrane

Affiliations
  • 1. Gang-won Research Institute of ICT Convergence, Gangneung-Wonju National University, Gangneung, Korea
  • 2. Department of Otorhinolaryngology, Yonsei University Wonju College of Medicine, Wonju, Korea
  • 3. Research Institute of Hearing Enhancement, Yonsei University Wonju College of Medicine, Wonju, Korea

Abstract


Objectives
Otitis media is a common infection worldwide. Owing to the limited number of ear specialists and the rapid development of telemedicine, several trials have been conducted to develop novel diagnostic strategies that improve diagnostic accuracy and the screening of patients with otologic diseases based on abnormal otoscopic findings. Although these strategies have demonstrated high diagnostic accuracy for the tympanic membrane (TM), their insufficient explainability limits their deployment in clinical practice.
Methods
We used a deep convolutional neural network (CNN) model based on the segmentation of a normal TM into five substructures (malleus, umbo, cone of light, pars flaccida, and annulus) to identify abnormalities in otoscopic ear images. A Mask R-CNN model was trained on the labeled images. Subsequently, we evaluated the diagnostic performance of combinations of the five substructures using a three-layer fully connected neural network to determine whether ear disease was present.
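The second stage described above (a three-layer fully connected network applied to a chosen combination of substructures) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which features are passed from the segmentation stage to the classifier, so per-substructure detection scores in [0, 1] are assumed here, and the hidden-layer sizes (8 and 4), the random untrained weights, and all function names are hypothetical.

```python
import itertools
import math
import random

SUBSTRUCTURES = ["malleus", "umbo", "cone_of_light", "pars_flaccida", "annulus"]


def all_combinations(names):
    """Every non-empty subset of the substructures (2**5 - 1 = 31 subsets)."""
    return [combo
            for size in range(1, len(names) + 1)
            for combo in itertools.combinations(names, size)]


def init_layer(n_in, n_out, rng):
    """Random weights and zero biases for one fully connected layer (untrained)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(x, layer, activation):
    """Apply one fully connected layer followed by an activation."""
    weights, biases = layer
    return [activation(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def select_features(scores, combo):
    """Assumed input: one score per substructure; keep only those in the combination."""
    return [scores[name] for name in combo]


def predict_abnormal(features, layers):
    """Three fully connected layers: two ReLU hidden layers, one sigmoid output."""
    hidden = dense(features, layers[0], relu)
    hidden = dense(hidden, layers[1], relu)
    return dense(hidden, layers[2], sigmoid)[0]


# Usage with made-up scores for a single image (not data from the study).
rng = random.Random(0)
combo = ("malleus", "umbo", "cone_of_light")
layers = [init_layer(len(combo), 8, rng), init_layer(8, 4, rng), init_layer(4, 1, rng)]
scores = {"malleus": 0.97, "umbo": 0.91, "cone_of_light": 0.12,
          "pars_flaccida": 0.88, "annulus": 0.95}
p_abnormal = predict_abnormal(select_features(scores, combo), layers)
```

In a real pipeline the weights would be learned from labeled normal and abnormal images, and one such classifier would be evaluated for each entry returned by `all_combinations`.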
Results
We obtained receiver operating characteristic (ROC) curves, under the optimal conditions, for detecting the presence or absence of eardrum disease using each substructure separately and using combinations of substructures. The highest area under the curve (AUC, 0.911) was found for the combination of the malleus, cone of light, and umbo, compared with AUCs of 0.737–0.873 for the individual substructures. Thus, an algorithm using these five important normal anatomical structures could prove to be explainable and effective in screening abnormal TMs.
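For readers unfamiliar with the metric, the area under the ROC curve equals the probability that a randomly chosen abnormal case receives a higher score than a randomly chosen normal case, with ties counted as one half. The sketch below computes it that way; the function name and the toy labels and scores are illustrative only and are not data from the study.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of abnormal (1) and normal (0) cases.

    Each abnormal/normal pair contributes 1 if the abnormal case scores
    higher, 0.5 on a tie, and 0 otherwise.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one case of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example: 3 abnormal and 3 normal cases; 8 of the 9 pairs are ranked correctly.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)  # 8 / 9
```

The pairwise form is quadratic in the number of cases, which is fine for illustration; library implementations typically sort the scores instead.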
Conclusion
This automated algorithm can improve diagnostic accuracy by discriminating between normal and abnormal TMs and can facilitate appropriate and timely referral consultations to improve patients’ quality of life in the context of primary care.

Keywords

Tympanic Membrane; Deep Learning; Mask R-CNN; Otitis Media; Otoendoscopy

Figures

  • Fig. 1. (A, B) Normal anatomic substructures of the tympanic membrane. (C) Otoendoscopy images and the two diagnostic classes of normal and abnormal tympanic membranes, the latter including nine disease subgroups. AOM, acute otitis media; SOM, otitis media with serous effusion; MOM, otitis media with mucoid effusion; COM w/o P, chronic otitis media without perforation; COM w P, chronic otitis media with perforation; Traumatic TM, traumatic drum perforation; Sclerosis TM, tympanosclerosis; Tube, tympanostomy tube inserted status; Chole, congenital cholesteatoma.

  • Fig. 2. Pre-processing with “LabelMe.” (A) A schematic flow of image analysis. (B) The contours of the five substructures (malleus with lateral process and handle, whole annulus, pars flaccida, umbo, and cone of light) were labeled manually by a specialized otologist. (C) Sample images showing the delineation of the five substructures on a normal tympanic membrane. (D) Example of the results for the five substructures analyzed with Mask R-CNN.

  • Fig. 3. Comparison of intersection over union (IoU) values across the subgroups for each of the five substructures (malleus, annulus, cone of light, umbo, and pars flaccida). AOM, acute otitis media; SOM, otitis media with serous effusion; MOM, otitis media with mucoid effusion; COM w/o P, chronic otitis media without perforation; COM w P, chronic otitis media with perforation; Traumatic TM, traumatic drum perforation; Sclerosis TM, tympanosclerosis; Tube, tympanostomy tube inserted status; Chole, congenital cholesteatoma.

  • Fig. 4. Results of Mask R-CNN. (A) Fine-tuning according to the learning rate (0.01, 0.001, 0.0001, 0.00001, and scheduled). (B) Fine-tuning according to the layers trained: stage 1 (network heads), stage 2 (ResNet stage 4 and above), and stage 3 (all layers). Stage 2 showed the lowest validation loss and the lowest computational cost. (C) Receiver operating characteristic (ROC) curves of the three-layer fully connected neural network algorithm according to each substructure. (D) ROC curves according to combinations of the substructures. (E) Precision and recall curves for each substructure. (F) Precision and recall curves for the combined substructures. We could obtain good prediction results with combinations of the other four substructures. We could also distinguish abnormal tympanic membranes (TMs) from normal TMs using the malleus, cone of light, and umbo, with a satisfactory result (area under the curve [AUC], 0.911).

  • Fig. 5. The matrix of precision, recall, F1, and support values between the normal tympanic membranes (TMs) and the combined group of SOM, COM w P, and traumatic TM. (A) Matrix of raw cases sorted between the true and predicted classes. (B) Matrix of proportions for precision, recall, F1, and support between the normal and the combined groups. The combined group of SOM, COM w P, and traumatic TM had the most significant values (precision, 0.950; recall, 0.960) compared to the normal TM group. SOM, otitis media with serous effusion; COM w P, chronic otitis media with perforation; Traumatic TM, traumatic drum perforation.
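The evaluation quantities named in the captions for Fig. 3 and Fig. 5 have standard definitions, sketched below for reference. IoU is the overlap between a predicted and a manually labeled mask divided by their combined area; precision, recall, F1, and support are derived from the counts of true and predicted classes. The function names and the small masks and label lists are illustrative assumptions, not values from the study.

```python
def iou(mask_a, mask_b):
    """Intersection over union of two binary masks (equal-sized 2D lists of 0/1)."""
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            if a and b:
                intersection += 1
            if a or b:
                union += 1
    # Two empty masks have no defined overlap; 0.0 is returned by convention here.
    return intersection / union if union else 0.0


def class_metrics(y_true, y_pred, positive):
    """Precision, recall, F1, and support for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall, "f1": f1, "support": tp + fn}


# Toy masks: 3 overlapping pixels, 5 pixels in the union.
predicted = [[0, 1, 1],
             [0, 1, 1],
             [0, 0, 0]]
labeled = [[0, 0, 1],
           [0, 1, 1],
           [0, 1, 0]]
overlap = iou(predicted, labeled)  # 3 / 5

# Toy labels: 4 abnormal and 4 normal cases, one error in each direction.
y_true = ["abnormal"] * 4 + ["normal"] * 4
y_pred = ["abnormal", "abnormal", "abnormal", "normal",
          "normal", "normal", "normal", "abnormal"]
report = class_metrics(y_true, y_pred, positive="abnormal")
```

Support is simply the number of true cases in the class, which is why it appears alongside the three ratios in a per-class report.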

