Classifying Cancer Stage with Open-Source Clinical Large Language Models

Authors
Chang, Chia-Hsuan [1]
Lucas, Mary M. [1]
Lu-Yao, Grace [2]
Yang, Christopher C. [1]
Affiliations
[1] Drexel Univ, Coll Comp & Informat, Philadelphia, PA 19104 USA
[2] Thomas Jefferson Univ, Sidney Kimmel Canc Ctr, Dept Med Oncol, Philadelphia, PA 19107 USA
Funding
National Science Foundation (USA)
DOI
10.1109/ICHI61247.2024.00018
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Cancer stage classification is important for making treatment and care management plans for oncology patients. Information on staging is often included in unstructured form in clinical, pathology, radiology and other free-text reports in the electronic health record system, requiring extensive work to parse and obtain. To facilitate the extraction of this information, previous NLP approaches rely on labeled training datasets, which are labor-intensive to prepare. In this study, we demonstrate that without any labeled training data, open-source clinical large language models (LLMs) can extract pathologic tumor-node-metastasis (pTNM) staging information from real-world pathology reports. Our experiments compare LLMs and a BERT-based model fine-tuned using the labeled data. Our findings suggest that while LLMs still exhibit subpar performance in Tumor (T) classification, with the appropriate adoption of prompting strategies, they can achieve comparable performance on Metastasis (M) classification and improved performance on Node (N) classification.
Pages: 76-82
Page count: 7