A survey of datasets in medicine for large language models

Authors
Zhang, Deshiwei [1]
Xue, Xiaojuan [2]
Gao, Peng [3]
Jin, Zhijuan [4]
Hu, Menghan [2]
Wu, Yue [5]
Ying, Xiayang [6]
Affiliations
[1] Southeast Univ, Sch Civil Engn, Nanjing 210096, Jiangsu, Peoples R China
[2] East China Normal Univ, Shanghai Key Lab Multidimens Informat Proc, 500 Dongchuan Rd, Shanghai 200241, Peoples R China
[3] Tongji Univ, Shanghai Peoples Hosp 10, Sch Med, Dept Ophthalmol, Shanghai 200072, Peoples R China
[4] Shanghai Jiao Tong Univ, Shanghai Childrens Med Ctr, Sch Med, Dept Dev & Behav Pediat, Shanghai 200127, Peoples R China
[5] Shanghai Jiao Tong Univ, Peoples Hosp 9, Sch Med, Dept Ophthalmol, Shanghai 200011, Peoples R China
[6] Shanghai Jiao Tong Univ, Ruijin Hosp, Pancreat Dis Ctr, Sch Med, Dept Gen Surg, 197 Ruijin 2nd Rd, Shanghai 200001, Peoples R China
Source
INTELLIGENCE & ROBOTICS | 2024 / Vol. 4 / Issue 04
Keywords
Large language models (LLMs); NLP; dataset in medicine; Q&A system in medicine;
DOI
10.20517/ir.2024.27
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
With the advent of models such as ChatGPT, large language models (LLMs) have demonstrated unprecedented capabilities in understanding and generating natural language, presenting novel opportunities and challenges within the medical domain. While many studies have focused on the use of LLMs in medicine, comprehensive reviews of the datasets used in this field remain scarce. This survey seeks to address this gap by providing a comprehensive overview of the medical datasets that fuel LLMs, highlighting their unique characteristics and the critical roles they play at different stages of LLM development: pre-training, fine-tuning, and evaluation. Ultimately, this survey aims to underline the significance of datasets in realizing the full potential of LLMs to innovate and improve healthcare outcomes.
Pages: 457 - 478
Page count: 22