Constructing synthetic datasets with generative artificial intelligence to train large language models to classify acute renal failure from clinical notes

Cited by: 1

Authors:

Litake, Onkar [1]

Park, Brian H. [1]

Tully, Jeffrey L. [1]

Gabriel, Rodney A. [1,2]

Affiliations:

[1] Univ Calif San Diego, Dept Anesthesiol, Div Perioperat Informat, 9400 Campus Point Dr, La Jolla, CA 92037 USA

[2] Univ Calif San Diego Hlth, Dept Biomed Informat, La Jolla, CA 92037 USA

Source:

JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION | 2024 / Volume 31 / Issue 06

Keywords:

large language models; artificial intelligence; generative AI; ChatGPT

DOI:

10.1093/jamia/ocae081

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Objectives: To compare performances of a classifier that leverages language models when trained on synthetic versus authentic clinical notes.
Materials and Methods: A classifier using language models was developed to identify acute renal failure. Four types of training data were compared: (1) notes from MIMIC-III; and (2, 3, and 4) synthetic notes generated by ChatGPT of varied text lengths of 15 (GPT-15 sentences), 30 (GPT-30 sentences), and 45 (GPT-45 sentences) sentences, respectively. The area under the receiver operating characteristics curve (AUC) was calculated from a test set from MIMIC-III.
Results: With RoBERTa, the AUCs were 0.84, 0.80, 0.84, and 0.76 for the MIMIC-III, GPT-15, GPT-30, and GPT-45 sentences training sets, respectively.
Discussion: Training language models to detect acute renal failure from clinical notes resulted in similar performances when using synthetic versus authentic training data.
Conclusion: The use of training data derived from protected health information may not be needed.

Pages: 1404-1410

Page count: 7
