Transformer-based Named Entity Recognition for Clinical Cancer Drug Toxicity by Positive-unlabeled Learning and KL Regularizers

Cited by: 1
Authors
Xie, Weixin [1]
Xu, Jiayu [1]
Zhao, Chengkui [1, 2]
Li, Jin [1, 3]
Han, Shuangze [1]
Shao, Tianyu [4]
Wang, Limei [3]
Feng, Weixing [1]
Affiliations
[1] Harbin Engn Univ, Coll Intelligent Syst Sci & Engn, Harbin 150001, Peoples R China
[2] Shanghai Unicar Therapy Biomed Technol Co Ltd, Shanghai, Peoples R China
[3] Hainan Med Univ, Coll Biomed Informat & Engn, Bioinformat Major Dis Sci Innovat Grp, Key Lab Trop Translat Med,Minist Educ, Haikou, Peoples R China
[4] Chongqing Univ Posts & Telecommun, Sch Software Engn, Chongqing, Peoples R China
Funding
Natural Science Foundation of Heilongjiang Province
Keywords
KL regularizers; clinical drug toxicity; named entity recognition (NER); positive-unlabeled learning (PULearning); adaptive sampling; cancer drug; INFORMATION; ALGORITHM; CORPUS
DOI
10.2174/0115748936278299231213045441
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Background: With increasing rates of polypharmacy, surveillance of clinical drug toxicity has become an important concern. Named Entity Recognition (NER) is essential for extracting drug safety information from the biomedical literature. In recent years, deep learning models have advanced significantly on NER tasks. Nonetheless, the effectiveness of these NER techniques relies on large volumes of annotated data, and producing such annotations is labor-intensive and inefficient.
Methods: This study introduces an approach that does not rely on manually annotated data. It employs a transformer-based technique known as Positive-Unlabeled Learning (PULearning), which incorporates adaptive learning and is applied to a clinical cancer drug toxicity corpus. To improve prediction precision, we employ relative position embeddings within the transformer encoder. Additionally, we formulate a composite loss function that integrates two Kullback-Leibler (KL) regularizers to align with PULearning assumptions. The outcomes demonstrate that our approach attains the targeted performance for NER tasks while relying solely on unlabeled data and named entity dictionaries.
Conclusion: Our model achieves an overall NER F1 of 0.819. Specifically, it attains F1 of 0.841, 0.801, and 0.815 for DRUG, CANCER, and TOXI entities, respectively. A comprehensive analysis of the results validates the effectiveness of our approach in comparison to existing PULearning methods on biomedical NER tasks. Additionally, a visualization of the associations among the three identified entity types is provided, offering a reference for querying their interrelationships.
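Illustrative sketch (not taken from the paper): the abstract does not give the form of the composite loss, so the following is only a generic example of how a positive-unlabeled risk can be combined with a KL regularizer. It assumes a sigmoid surrogate loss, the widely used non-negative PU risk estimator, and a single Bernoulli KL term that pulls the predicted positive rate on unlabeled tokens toward an assumed class prior. The paper uses two KL regularizers whose definitions are not stated here; the function names, the `prior` value, and `kl_weight` are hypothetical.

```python
import math

def sigmoid_loss(score, label):
    """Sigmoid surrogate loss; small when the sign of score agrees with label (+1 or -1)."""
    return 1.0 / (1.0 + math.exp(label * score))

def mean(values):
    return sum(values) / len(values)

def nn_pu_risk(pos_scores, unl_scores, prior):
    """Non-negative PU risk from scores of dictionary-matched (positive) and unlabeled tokens."""
    pos_as_pos = mean([sigmoid_loss(s, +1) for s in pos_scores])
    pos_as_neg = mean([sigmoid_loss(s, -1) for s in pos_scores])
    unl_as_neg = mean([sigmoid_loss(s, -1) for s in unl_scores])
    # The negative-class risk is estimated from unlabeled data, corrected for the
    # positives hidden in it, and clamped at zero so it cannot go negative.
    neg_risk = max(0.0, unl_as_neg - prior * pos_as_neg)
    return prior * pos_as_pos + neg_risk

def bernoulli_kl(p, q, eps=1e-8):
    """KL divergence KL(Bernoulli(p) || Bernoulli(q)), with clipping for stability."""
    p = min(max(p, eps), 1.0 - eps)
    q = min(max(q, eps), 1.0 - eps)
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))

def composite_loss(pos_scores, unl_scores, prior, kl_weight=0.1):
    """PU risk plus one KL term (an assumption; the paper's regularizers may differ)."""
    predicted_rate = mean([1.0 / (1.0 + math.exp(-s)) for s in unl_scores])
    return nn_pu_risk(pos_scores, unl_scores, prior) + kl_weight * bernoulli_kl(prior, predicted_rate)

# Toy scores: positives score high, most unlabeled tokens score low.
loss = composite_loss([3.0, 2.0], [-2.0, -3.0, 2.5], prior=0.3)
```

In this sketch the regularizer only discourages the model from labeling far more or far fewer unlabeled tokens as entities than the assumed prior suggests; it is zero when the predicted rate equals the prior.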
Pages: 738-751
Number of pages: 14