Classifying Cancer Stage with Open-Source Clinical Large Language Models

Citations: 0

Authors:

Chang, Chia-Hsuan [1]

Lucas, Mary M. [1]

Lu-Yao, Grace [2]

Yang, Christopher C. [1]

Affiliations:

[1] Drexel Univ, Coll Comp & Informat, Philadelphia, PA 19104 USA

[2] Thomas Jefferson Univ, Sidney Kimmel Canc Ctr, Dept Med Oncol, Philadelphia, PA 19107 USA

Source:

2024 IEEE 12TH INTERNATIONAL CONFERENCE ON HEALTHCARE INFORMATICS, ICHI 2024 | 2024

Funding:

U.S. National Science Foundation;

Keywords:

DOI:

10.1109/ICHI61247.2024.00018

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Cancer stage classification is important for making treatment and care management plans for oncology patients. Information on staging is often included in unstructured form in clinical, pathology, radiology and other free-text reports in the electronic health record system, requiring extensive work to parse and obtain. To facilitate the extraction of this information, previous NLP approaches rely on labeled training datasets, which are labor-intensive to prepare. In this study, we demonstrate that without any labeled training data, open-source clinical large language models (LLMs) can extract pathologic tumor-node-metastasis (pTNM) staging information from real-world pathology reports. Our experiments compare LLMs and a BERT-based model fine-tuned using the labeled data. Our findings suggest that while LLMs still exhibit subpar performance in Tumor (T) classification, with the appropriate adoption of prompting strategies, they can achieve comparable performance on Metastasis (M) classification and improved performance on Node (N) classification.


Pages: 76-82

Page count: 7
