Classifying Cancer Stage with Open-Source Clinical Large Language Models

Cited by: 0
Authors
Chang, Chia-Hsuan [1 ]
Lucas, Mary M. [1 ]
Lu-Yao, Grace [2 ]
Yang, Christopher C. [1 ]
Affiliations
[1] Drexel Univ, Coll Comp & Informat, Philadelphia, PA 19104 USA
[2] Thomas Jefferson Univ, Sidney Kimmel Canc Ctr, Dept Med Oncol, Philadelphia, PA 19107 USA
Funding
National Science Foundation (USA);
Keywords
DOI
10.1109/ICHI61247.2024.00018
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Cancer stage classification is important for making treatment and care management plans for oncology patients. Information on staging is often included in unstructured form in clinical, pathology, radiology and other free-text reports in the electronic health record system, requiring extensive work to parse and obtain. To facilitate the extraction of this information, previous NLP approaches rely on labeled training datasets, which are labor-intensive to prepare. In this study, we demonstrate that without any labeled training data, open-source clinical large language models (LLMs) can extract pathologic tumor-node-metastasis (pTNM) staging information from real-world pathology reports. Our experiments compare LLMs and a BERT-based model fine-tuned using the labeled data. Our findings suggest that while LLMs still exhibit subpar performance in Tumor (T) classification, with the appropriate adoption of prompting strategies, they can achieve comparable performance on Metastasis (M) classification and improved performance on Node (N) classification.
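
The zero-shot setup described in the abstract can be illustrated with a minimal sketch: build a prompt from a pathology report, then parse the pTNM labels out of the model's free-text answer. This is not the authors' implementation. The prompt wording, the simplified label sets, and the function names `build_prompt` and `parse_stage` are all assumptions made for illustration, and the call to an actual LLM is deliberately left out.

```python
import re

# Hypothetical, simplified label sets; the abstract does not give the
# exact label space used in the study.
T_LABELS = ("T1", "T2", "T3", "T4")
N_LABELS = ("N0", "N1", "N2", "N3")
M_LABELS = ("M0", "M1")


def build_prompt(report_text: str) -> str:
    """Assemble a zero-shot prompt (illustrative wording, no labeled examples)."""
    return (
        "You are reading a pathology report.\n"
        "State the pathologic T, N, and M categories in the form "
        "'T: <label>, N: <label>, M: <label>'.\n\n"
        f"Report:\n{report_text}\n\nAnswer:"
    )


def parse_stage(response: str) -> dict:
    """Extract the first T, N, and M label found in a model response.

    Returns None for an axis when no known label is present.
    """
    result = {}
    for axis, labels in (("T", T_LABELS), ("N", N_LABELS), ("M", M_LABELS)):
        # Not preceded by a letter (an optional 'p' prefix is allowed),
        # and not followed by another digit.
        pattern = r"(?<![A-Za-z])p?(" + "|".join(labels) + r")(?!\d)"
        match = re.search(pattern, response, flags=re.IGNORECASE)
        result[axis] = match.group(1).upper() if match else None
    return result
```

In use, the string returned by `build_prompt` would be sent to whichever model is being evaluated, and `parse_stage` would be applied to its reply. The parser accepts both the compact form (`pT2N0M0`) and the labeled form (`T: T2, N: N0, M: M0`).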
Pages: 76-82
Page count: 7