Korean J Dermatol.
2024 Mar;62(3):143-151.
ChatGPT’s Potential in Dermatology Knowledge Retrieval:
An Analysis Using Korean Dermatology Residency Examination Questions
Affiliations
1Department of Dermatology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea
Abstract
Background
The rapid evolution of artificial intelligence (AI), particularly large language models such as OpenAI’s ChatGPT, has significantly impacted various domains, including healthcare. Dermatology is one field that could benefit from the integration of ChatGPT.
Objective
To analyze the performance of ChatGPT in dermatology, with particular focus on understanding the model’s capabilities and limitations when addressing dermatology questions.
Methods
We used 144 questions from the Korean Dermatology Residency Evaluation Examinations. The questions were formatted consistently and submitted to both the GPT-4 and GPT-3.5 models. Performance was qualitatively assessed based on accuracy, completeness, and logical reasoning, and the reasons for wrong answers were analyzed.
Results
The overall correctness rate was 70.8% for GPT-4 and 59.0% for GPT-3.5. In terms of accuracy, 70.1% of the explanations were completely accurate. A total of 90.3% of the responses provided by GPT-4 were complete in terms of content. Illogical reasoning was detected in only 4.9% of the answers. Of the wrong answers, 69.0% were attributed to information errors, followed by logical errors (16.7%). There was no significant difference in correctness rates among question types, but the correctness rate decreased significantly as question difficulty increased. ChatGPT showed near-perfect consistency, with 97.9% agreement and a Cohen’s kappa of 0.958.
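For readers unfamiliar with the consistency statistic reported above, Cohen’s kappa corrects the observed agreement p_o between two sets of answers for the agreement p_e expected by chance: kappa = (p_o - p_e) / (1 - p_e). The following is a minimal sketch in Python, not the authors’ analysis code. The function name `cohens_kappa` and the toy answer lists are hypothetical and do not reproduce the study data.

```python
from collections import Counter


def cohens_kappa(run_a, run_b):
    """Cohen's kappa for two equal-length lists of categorical answers.

    Hypothetical helper for illustration; not taken from the study.
    """
    if len(run_a) != len(run_b) or not run_a:
        raise ValueError("runs must be non-empty and of equal length")
    n = len(run_a)

    # Observed agreement: share of questions answered identically in both runs.
    observed = sum(a == b for a, b in zip(run_a, run_b)) / n

    # Chance agreement: for each answer category, multiply the two marginal
    # proportions, then sum over all categories.
    counts_a = Counter(run_a)
    counts_b = Counter(run_b)
    categories = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        # Both runs used a single identical category; kappa is undefined,
        # so perfect agreement is returned by convention in this sketch.
        return 1.0
    return (observed - expected) / (1 - expected)


# Toy data (assumed, not from the study): answers chosen in two repeated runs.
first_run = ["A", "A", "B", "B"]
second_run = ["A", "B", "B", "B"]
kappa = cohens_kappa(first_run, second_run)
```

In this toy example the two runs agree on three of four questions (p_o = 0.75) and the chance agreement is 0.5, giving a kappa of 0.5. Values above roughly 0.8 are conventionally described as near-perfect agreement.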
Conclusion
This study demonstrated the potential of ChatGPT in dermatology knowledge retrieval, outperforming dermatology residents in terms of correctness rate, completeness, and accuracy. There were more information errors than logical errors, and the inaccuracy caused by these errors was identified as the main limiting factor of ChatGPT.