Language Models for Hierarchical Classification of Radiology Reports With Attention Mechanisms, BERT, and GPT-4

Cited by: 2
Authors
Olivato, Matteo [1]
Putelli, Luca [1]
Arici, Nicola [1]
Emilio Gerevini, Alfonso [1]
Lavelli, Alberto [2]
Serina, Ivan [1]
Affiliations
[1] Univ Brescia, Dept Informat Engn, I-25121 Brescia, Italy
[2] Fdn Bruno Kessler, I-38123 Trento, Italy
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Radiology; Task analysis; Computed tomography; Biological system modeling; Training; Lung; Data models; Deep learning; Large language models; Attention mechanism; BERT; BioBIT; deep learning; GPT-4; large language models; natural language processing; Italian language; prompt engineering; radiology reports; Italian radiology reports; text classification;
DOI
10.1109/ACCESS.2024.3402066
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Radiology reports are a valuable source of textual information used to improve clinical care and support research. In recent years, deep learning techniques have been shown to be effective in classifying radiology reports. This article investigates the use of deep learning techniques with attention mechanisms to achieve better performance in the classification of radiology reports. We focus on various Natural Language Processing approaches, such as LSTM with attention, BERT, and GPT-4, evaluated on a dataset of chest tomography reports regarding neoplastic diseases collected from an Italian hospital. In particular, we compare the results with a previous machine learning system, showing that models based on attention mechanisms can achieve higher performance. The attention mechanism also allows us to identify the most relevant portions of text used by the model to make its predictions. We show that our model achieves state-of-the-art results on the hierarchical classification of radiology reports. Moreover, we evaluate the performance of GPT-4 on the classification of these reports in a zero-shot setup through prompt engineering, showing interesting results even with a small context and a non-English language. Our findings suggest that deep learning techniques with attention mechanisms may be successful in the classification of radiology reports even in non-English languages for which large text corpora are not available.
Pages: 69710-69727
Number of pages: 18