Imaging Sci Dent. 2024 Sep;54(3):271-275. doi: 10.5624/isd.20240037.

Performance of ChatGPT 3.5 and 4 on U.S. dental examinations: the INBDE, ADAT, and DAT

Affiliations
  • 1Dentofacial Deformities Research Center, Research Institute of Dental Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
  • 2Department of Trauma and Craniofacial Reconstruction, Queen Mary College, London, England
  • 3Department of Oral and Maxillofacial Radiology, Dental School, Islamic Azad University of Medical Sciences, Tehran, Iran
  • 4School of Dentistry, Tehran University of Medical Science, Tehran, Iran
  • 5Department of Biostatistics, Dental Research Center, Golestan University of Medical Sciences, Gorgan, Iran
  • 6Department of Operative Dentistry, University of Southern California, CA, USA
  • 7Discipline of Oral Surgery, Medicine and Diagnostics, School of Dentistry, Faculty of Medicine and Health, Westmead Centre for Oral Health, The University of Sydney, Sydney, Australia
  • 8Department of Prosthodontics and Dental Implantology, King Faisal University, Al Ahsa, Kingdom of Saudi Arabia
  • 9Department of Oral and Maxillofacial Pathology, School of Dentistry, Shahid Beheshti University of Medical Sciences, Tehran, Iran
  • 10Department of Oral and Maxillofacial Radiology, School of Dentistry, Shahid Beheshti University of Medical Sciences, Tehran, Iran

Abstract

Purpose
Recent advancements in artificial intelligence (AI), particularly tools such as ChatGPT developed by OpenAI, a U.S.-based AI research organization, have transformed the healthcare and education sectors. This study investigated the effectiveness of ChatGPT in answering dentistry exam questions, demonstrating its potential to enhance professional practice and patient care.
Materials and Methods
This study assessed the performance of ChatGPT 3.5 and 4 on U.S. dental examinations - specifically, the Integrated National Board Dental Examination (INBDE), Advanced Dental Admission Test (ADAT), and Dental Admission Test (DAT) - excluding image-based questions. Questions were submitted using customized prompts, and ChatGPT's answers were evaluated against the official answer sheets.
Results
ChatGPT 3.5 and 4 were tested with 253 questions from the INBDE, ADAT, and DAT exams. For the INBDE, both versions achieved 80% accuracy in knowledge-based questions and 66-69% in case history questions. In ADAT, they scored 66-83% in knowledge-based and 76% in case history questions. ChatGPT 4 excelled on the DAT, with 94% accuracy in knowledge-based questions, 57% in mathematical analysis items, and 100% in comprehension questions, surpassing ChatGPT 3.5’s rates of 83%, 31%, and 82%, respectively. The difference was significant for knowledge-based questions (P = 0.009). Both versions showed similar patterns in incorrect responses.
Conclusion
Both ChatGPT 3.5 and 4 effectively handled knowledge-based, case history, and comprehension questions, with ChatGPT 4 being more reliable and surpassing the performance of 3.5. ChatGPT 4’s perfect score in comprehension questions underscores its trainability in specific subjects. However, both versions exhibited weaker performance in mathematical analysis, suggesting this as an area for improvement.

Keywords

Artificial Intelligence; Deep Learning; Dentistry; Education, Dental