Evaluating the image recognition capabilities of GPT-4V and Gemini Pro in the Japanese national dental examination

Cited by: 0
Authors
Fukuda, Hikaru
Morishita, Masaki [1 ,2 ,3 ]
Muraoka, Kosuke [3 ]
Yamaguchi, Shino [4 ]
Nakamura, Taiji [5 ]
Yoshioka, Izumi [6 ]
Awano, Shuji [3 ]
Ono, Kentaro [7 ]
Affiliations
[1] Kyushu Dent Univ, Dept Sci Phys Funct, Div Maxillofacial Surg, Kitakyushu, Japan
[2] Kyushu Dent Univ Hosp, Hlth Informat Management Off, Kitakyushu, Japan
[3] Kyushu Dent Univ, Dept Oral Funct, Div Clin Educ Dev & Res, Kitakyushu, Japan
[4] Kyushu Dent Univ, Sch Oral Hlth Sci, Kitakyushu, Japan
[5] Kyushu Dent Univ, Dept Oral Funct, Div Periodontol, Kitakyushu, Japan
[6] Kyushu Dent Univ, Dept Sci Phys Funct, Div Oral Med, Kitakyushu, Japan
[7] Kyushu Dent Univ, Dept Hlth Promot, Div Physiol, Kitakyushu, Japan
Keywords
ChatGPT-4V; Gemini Pro; Japanese national dental examination; Large language models
DOI
10.1016/j.jds.2024.06.015
Chinese Library Classification (CLC)
R78 [Stomatology]
Discipline code
1003
Abstract
Background/purpose: OpenAI's GPT-4V and Google's Gemini Pro are large language models (LLMs) equipped with image recognition capabilities. They have the potential to be used in future medical diagnosis and treatment and to serve as valuable educational support tools for students. This study compared and evaluated the image recognition capabilities of GPT-4V and Gemini Pro using questions from the Japanese National Dental Examination (JNDE) to investigate their potential as educational support tools.

Materials and methods: We analyzed 160 questions from the 116th JNDE, administered in March 2023, using ChatGPT-4V and Gemini Pro, both of which have image recognition functions. Standardized prompts were used for both LLMs, and statistical analysis was conducted using Fisher's exact test and the Mann-Whitney U test.

Results: On the 160 JNDE questions, the accuracy rates of GPT-4V and Gemini Pro were 35.0% and 28.1%, respectively; GPT-4V scored higher, although the difference was not statistically significant. Across dental specialties, the accuracy rates of GPT-4V were generally higher than those of Gemini Pro, with some areas showing equal accuracy. Accuracy tended to decrease as the number of images in a question increased, suggesting that the number of images influenced the correctness of the responses.

Conclusion: The overall superior performance of GPT-4V compared with Gemini Pro may be attributable to the continuous updates in OpenAI's model. This research demonstrates the potential of LLMs as educational support tools in dentistry, while also highlighting areas that require improvement.

(c) 2025 Association for Dental Sciences of the Republic of China. Publishing services by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Pages: 368-372
Page count: 5