GPT-4 Performance for Neurologic Localization

Cited by: 3
Authors
Lee, Jung-Hyun [1, 2, 3]
Choi, Eunhee [4]
McDougal, Robert [5, 6, 7, 8]
Lytton, William W. [1, 2, 9]
Affiliations
[1] SUNY Downstate Hlth Sci Univ, Dept Neurol, Brooklyn, NY 11203 USA
[2] Kings Cty Hosp, Dept Neurol, Brooklyn, NY 11203 USA
[3] Maimonides Hosp, Dept Neurol, Brooklyn, NY 11219 USA
[4] Lincoln Med Ctr, Dept Internal Med, Bronx, NY USA
[5] Yale Univ, Yale Sch Publ Hlth, Dept Biostat, New Haven, CT USA
[6] Yale Univ, Yale Sch Med, Program Computat Biol & Bioinformat, New Haven, CT USA
[7] Yale Univ, Wu Tsai Inst, Yale Sch Med, New Haven, CT USA
[8] Yale Univ, Yale Sch Med, Sect Biomed Informat & Data Sci, New Haven, CT USA
[9] SUNY Downstate Hlth Sci Univ, Dept Physiol & Pharmacol, Brooklyn, NY USA
DOI
10.1212/CPJ.0000000000200293
Chinese Library Classification
R74 [Neurology and Psychiatry]
Abstract
Background and Objectives: In health care, large language models such as Generative Pretrained Transformers (GPTs), trained on extensive text datasets, have potential applications in reducing health care disparities across regions and populations. Previous software developed for lesion localization has been limited in scope. This study aims to evaluate the capability of GPT-4 for lesion localization based on clinical presentation.

Methods: GPT-4 was prompted using history and neurologic physical examination (H&P) from published cases of acute stroke followed by questions for clinical reasoning with answering for "single or multiple lesions," "side," and "brain region" using Zero-Shot Chain-of-Thought and Text Classification prompting. GPT-4 output on 3 separate trials for each of 46 cases was compared with imaging-based localization.

Results: GPT-4 successfully processed raw text from H&P to generate accurate neuroanatomical localization and detailed clinical reasoning. Performance metrics across trial-based analysis for specificity, sensitivity, precision, and F1-score were 0.87, 0.74, 0.75, and 0.74, respectively, for side; 0.94, 0.85, 0.84, and 0.85, respectively, for brain region. Class labels within the brain region were similarly high for all regions except the cerebellum and were also similar when considering all 3 trials to examine metrics by case. Errors were due to extrinsic causes (inadequate information in the published cases) and intrinsic causes (failures of logic or inadequate knowledge base).

Discussion: This study reveals capabilities of GPT-4 in the localization of acute stroke lesions, showing a potential future role as a clinical tool in neurology.
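
The Results report specificity, sensitivity, precision, and F1-score for classification labels such as "side." As a minimal sketch of how such metrics are commonly derived for one class label in a one-vs-rest fashion, the Python below may help. The function name `one_vs_rest_metrics`, the label strings, and the toy data are all hypothetical and are not taken from the study; the abstract does not state how the authors aggregated per-class values.

```python
def one_vs_rest_metrics(y_true, y_pred, label):
    """Compute one-vs-rest metrics for a single class label.

    y_true: reference labels (e.g. imaging-based localization)
    y_pred: predicted labels (e.g. model output)
    All inputs here are illustrative assumptions, not study data.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    tn = sum(1 for t, p in pairs if t != label and p != label)

    def ratio(num, den):
        # Return 0.0 when the denominator is zero to avoid a division error.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)   # also called recall
    specificity = ratio(tn, tn + fp)
    precision = ratio(tp, tp + fp)
    f1 = ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Toy example with made-up labels (not from the paper).
y_true = ["left", "left", "right", "right", "bilateral", "left"]
y_pred = ["left", "right", "right", "right", "bilateral", "left"]
metrics_left = one_vs_rest_metrics(y_true, y_pred, "left")
print(metrics_left)
```

Averaging these per-class values across all labels (a macro average) is one common way to obtain a single figure per metric; whether the study used that or another aggregation is not stated in the abstract.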
Pages: 8