Usefulness of the large language model ChatGPT (GPT-4) as a diagnostic tool and information source in dermatology

Cited by: 1

Authors:

Nielsen, Jacob P. S. [1,4]

Gronhoj, Christian [1]

Skov, Lone [2,3]

Gyldenlove, Mette [2,3]

Affiliations:

[1] Copenhagen Univ Hosp, Dept Otorhinolaryngol Head & Neck Surg & Audiol, Copenhagen, Denmark

[2] Copenhagen Univ Hosp Herlev & Gentofte, Dept Dermatol & Allergy, Copenhagen, Denmark

[3] Univ Copenhagen, Fac Hlth & Med Sci, Dept Clin Med, Copenhagen, Denmark

[4] Copenhagen Univ Hosp, Dept Otorhinolaryngol Head & Neck Surg & Audiol, Rigshosp, Blegdamsvej 9, DK-2100 Copenhagen, Denmark

Source:

JEADV CLINICAL PRACTICE | 2024 / Volume 3 / Issue 05

Keywords:

AI; artificial intelligence; Chatbot; ChatGPT; clinical dermatology; GPT-4; information source; Large Language Model; LLM; skin disease;

DOI:

10.1002/jvc2.459

Chinese Library Classification (CLC) number:

R75 [Dermatology and Venereology];

Subject classification code:

100206 ;

Abstract:

Background: The field of artificial intelligence is rapidly evolving. As an easily accessible platform with vast user engagement, the Chat Generative Pre-Trained Transformer (ChatGPT) holds great promise in medicine, with the latest version, GPT-4, capable of analyzing clinical images.

Objectives: To evaluate ChatGPT as a diagnostic tool and information source in clinical dermatology.

Methods: A total of 15 clinical images were selected from the Danish web atlas, Danderm, depicting various common and rare skin conditions. The images were uploaded to ChatGPT version GPT-4, which was prompted with 'Please provide a description, a potential diagnosis, and treatment options for the following dermatological condition'. The generated responses were assessed by senior registrars in dermatology and consultant dermatologists in terms of accuracy, relevance, and depth (scale 1-5), and in addition, the image quality was rated (scale 0-10). Demographic and professional information about the respondents was registered.

Results: A total of 23 physicians participated in the study. The majority of the respondents were consultant dermatologists (83%), and 48% had more than 10 years of training. The overall image quality had a median rating of 10 out of 10 [interquartile range (IQR): 9-10]. The overall median rating of the ChatGPT generated responses was 2 (IQR: 1-4), while overall median ratings in terms of relevance, accuracy, and depth were 2 (IQR: 1-4), 3 (IQR: 2-4) and 2 (IQR: 1-3), respectively.

Conclusions: Despite the advancements in ChatGPT, including newly added image processing capabilities, the chatbot demonstrated significant limitations in providing reliable and clinically useful responses to illustrative images of various dermatological conditions.

Pages: 1570-1575

Number of pages: 6

