Pre-trained Language Models in Biomedical Domain: A Systematic Survey

Cited by: 32
Authors
Wang, Benyou [1 ]
Xie, Qianqian [2 ]
Pei, Jiahuan [3 ]
Chen, Zhihong [1 ]
Tiwari, Prayag [4 ]
Li, Zhao [5 ]
Fu, Jie [6 ]
Affiliations
[1] Chinese Univ Hong Kong, Shenzhen, Peoples R China
[2] Univ Manchester, Dept Comp Sci, Manchester, Lancs, England
[3] Univ Amsterdam, Amsterdam, Netherlands
[4] Halmstad Univ, Sch Informat Technol, Halmstad, Sweden
[5] Univ Texas Hlth Sci Ctr Houston, Houston, TX 77030 USA
[6] Univ Montreal, Montreal, PQ, Canada
Keywords
Biomedical domain; pre-trained language models; natural language processing; TRANSFORMERS; RESOURCE; CORPUS
DOI
10.1145/3611651
Chinese Library Classification (CLC)
TP301 [Theory, Methods]
Subject Classification Code
081202
Abstract
Pre-trained language models (PLMs) have become the de facto paradigm for most natural language processing tasks. This also benefits the biomedical domain: researchers from the informatics, medicine, and computer science communities have proposed various PLMs trained on biomedical datasets, e.g., biomedical text, electronic health records, and protein and DNA sequences, for various biomedical tasks. However, the cross-disciplinary nature of biomedical PLMs hinders their spread among communities; some existing works are isolated from each other, lacking comprehensive comparison and discussion. It is nontrivial to produce a survey that not only systematically reviews recent advances in biomedical PLMs and their applications but also standardizes terminology and benchmarks. This article summarizes recent progress on pre-trained language models in the biomedical domain and their applications in downstream biomedical tasks. In particular, we discuss the motivations for PLMs in the biomedical domain and introduce the key concepts of pre-trained language models. We then propose a taxonomy of existing biomedical PLMs that systematically categorizes them from various perspectives. In addition, their applications in downstream biomedical tasks are discussed exhaustively. Last, we illustrate various limitations and future trends, aiming to provide inspiration for future research.
Pages: 52
Related Papers (50 in total)
  • [1] Pre-trained language models with domain knowledge for biomedical extractive summarization
    Xie Q.
    Bishop J.A.
    Tiwari P.
    Ananiadou S.
    Knowledge-Based Systems, 2022, 252
  • [2] Pre-trained language models in medicine: A survey
    Luo, Xudong
    Deng, Zhiqi
    Yang, Binxia
    Luo, Michael Y.
    ARTIFICIAL INTELLIGENCE IN MEDICINE, 2024, 154
  • [3] A Systematic Survey of Chemical Pre-trained Models
    Xia, Jun
    Zhu, Yanqiao
    Du, Yuanqi
    Li, Stan Z.
    PROCEEDINGS OF THE THIRTY-SECOND INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2023, 2023, : 6787 - 6795
  • [4] Pre-trained models for natural language processing: A survey
    Qiu XiPeng
    Sun TianXiang
    Xu YiGe
    Shao YunFan
    Dai Ning
    Huang XuanJing
    SCIENCE CHINA-TECHNOLOGICAL SCIENCES, 2020, 63 (10) : 1872 - 1897
  • [5] MediSwift: Efficient Sparse Pre-trained Biomedical Language Models
    Thangarasa, Vithursan
    Salem, Mahmoud
    Saxena, Shreyas
    Leong, Kevin
    Hestness, Joel
    Lie, Sean
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, 2024, : 214 - 230
  • [6] Continual knowledge infusion into pre-trained biomedical language models
    Jha, Kishlay
    Zhang, Aidong
    BIOINFORMATICS, 2022, 38 (02) : 494 - 502
  • [7] Pre-trained Biomedical Language Models for Clinical NLP in Spanish
    Pio Carrino, Casimiro
    Llop, Joan
    Pamies, Marc
    Gutierrez-Fandino, Asier
    Armengol-Estape, Jordi
    Silveira-Ocampo, Joaquin
    Gonzalez-Agirre, Aitor
    Valencia, Alfonso
    Villegas, Marta
    PROCEEDINGS OF THE 21ST WORKSHOP ON BIOMEDICAL LANGUAGE PROCESSING (BIONLP 2022), 2022, : 193 - 199
  • [8] A Survey of Knowledge Enhanced Pre-Trained Language Models
    Hu, Linmei
    Liu, Zeyi
    Zhao, Ziwang
    Hou, Lei
    Nie, Liqiang
    Li, Juanzi
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2024, 36 (04) : 1413 - 1430