Evaluating the image recognition capabilities of GPT-4V and Gemini Pro in the Japanese national dental examination

Cited by: 0
Authors
Fukuda, Hikaru
Morishita, Masaki [1 ,2 ,3 ]
Muraoka, Kosuke [3 ]
Yamaguchi, Shino [4 ]
Nakamura, Taiji [5 ]
Yoshioka, Izumi [6 ]
Awano, Shuji [3 ]
Ono, Kentaro [7 ]
Affiliations
[1] Kyushu Dent Univ, Dept Sci Phys Funct, Div Maxillofacial Surg, Kitakyushu, Japan
[2] Kyushu Dent Univ Hosp, Hlth Informat Management Off, Kitakyushu, Japan
[3] Kyushu Dent Univ, Dept Oral Funct, Div Clin Educ Dev & Res, Kitakyushu, Japan
[4] Kyushu Dent Univ, Sch Oral Hlth Sci, Kitakyushu, Japan
[5] Kyushu Dent Univ, Dept Oral Funct, Div Periodontol, Kitakyushu, Japan
[6] Kyushu Dent Univ, Dept Sci Phys Funct, Div Oral Med, Kitakyushu, Japan
[7] Kyushu Dent Univ, Dept Hlth Promot, Div Physiol, Kitakyushu, Japan
Keywords
ChatGPT-4V; Gemini Pro; Japanese national dental examination; Large language models
DOI
10.1016/j.jds.2024.06.015
Chinese Library Classification (CLC)
R78 [Stomatology]
Discipline code
1003
Abstract
Background/purpose: OpenAI's GPT-4V and Google's Gemini Pro are large language models (LLMs) equipped with image recognition capabilities. They have the potential to be used in future medical diagnosis and treatment and to serve as valuable educational support tools for students. This study compared and evaluated the image recognition capabilities of GPT-4V and Gemini Pro using questions from the Japanese National Dental Examination (JNDE) to investigate their potential as educational support tools.

Materials and methods: We analyzed 160 questions from the 116th JNDE, administered in March 2023, using ChatGPT-4V and Gemini Pro, both of which have image recognition functions. Standardized prompts were used for both LLMs, and statistical analysis was conducted using Fisher's exact test and the Mann-Whitney U test.

Results: On the 160 JNDE questions, the accuracy rates of GPT-4V and Gemini Pro were 35.0% and 28.1%, respectively; GPT-4V scored higher, although the difference was not statistically significant. Across dental specialties, the accuracy rates of GPT-4V were generally higher than those of Gemini Pro, with some areas showing equal accuracy. Accuracy tended to decrease as the number of images in a question increased, suggesting that the number of images influenced the correctness of the responses.

Conclusion: The overall superior performance of GPT-4V compared with Gemini Pro may be attributable to the continuous updates in OpenAI's model. This research demonstrates the potential of LLMs as educational support tools in dentistry, while also highlighting areas that require improvement.

(c) 2025 Association for Dental Sciences of the Republic of China. Publishing services by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Pages: 368-372
Page count: 5