Transformer-based Named Entity Recognition for Clinical Cancer Drug Toxicity by Positive-unlabeled Learning and KL Regularizers

Cited by: 1
Authors
Xie, Weixin [1]
Xu, Jiayu [1]
Zhao, Chengkui [1, 2]
Li, Jin [1, 3]
Han, Shuangze [1]
Shao, Tianyu [4]
Wang, Limei [3]
Feng, Weixing [1]
Affiliations
[1] Harbin Engn Univ, Coll Intelligent Syst Sci & Engn, Harbin 150001, Peoples R China
[2] Shanghai Unicar Therapy Biomed Technol Co Ltd, Shanghai, Peoples R China
[3] Hainan Med Univ, Coll Biomed Informat & Engn, Bioinformat Major Dis Sci Innovat Grp, Key Lab Trop Translat Med,Minist Educ, Haikou, Peoples R China
[4] Chongqing Univ Posts & Telecommun, Sch Software Engn, Chongqing, Peoples R China
Funding
Natural Science Foundation of Heilongjiang Province;
Keywords
KL regularizers; clinical drug toxicity; named entity recognition (NER); positive-unlabeled learning (PULearning); adaptive sampling; cancer drug; INFORMATION; ALGORITHM; CORPUS;
DOI
10.2174/0115748936278299231213045441
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Background: With increasing rates of polypharmacy, the vigilant surveillance of clinical drug toxicity has emerged as an important concern. Named Entity Recognition (NER) stands as an indispensable undertaking, essential for the extraction of valuable insights regarding drug safety from the biomedical literature. In recent years, significant advancements have been achieved in the deep learning models on NER tasks. Nonetheless, the effectiveness of these NER techniques relies on the availability of substantial volumes of annotated data, which is labor-intensive and inefficient.
Methods: This study introduces a novel approach that diverges from the conventional reliance on manually annotated data. It employs a transformer-based technique known as Positive-Unlabeled Learning (PULearning), which incorporates adaptive learning and is applied to the clinical cancer drug toxicity corpus. To improve the precision of prediction, we employ relative position embeddings within the transformer encoder. Additionally, we formulate a composite loss function that integrates two Kullback-Leibler (KL) regularizers to align with PULearning assumptions. The outcomes demonstrate that our approach attains the targeted performance for NER tasks, solely relying on unlabeled data and named entity dictionaries.
Conclusion: Our model achieves an overall NER performance with an F1 of 0.819. Specifically, it attains F1 of 0.841, 0.801 and 0.815 for DRUG, CANCER, and TOXI entities, respectively. A comprehensive analysis of the results validates the effectiveness of our approach in comparison to existing PULearning methods on biomedical NER tasks. Additionally, a visualization of the associations among three identified entities is provided, offering a valuable reference for querying their interrelationships.
Pages: 738 - 751
Page count: 14
Related papers
29 in total
  • [1] Distantly Supervised Named Entity Recognition using Positive-Unlabeled Learning
    Peng, Minlong
    Xing, Xiaoyu
    Zhang, Qi
    Fu, Jinlan
    Huang, Xuanjing
    57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, : 2409 - 2419
  • [2] An MRC and adaptive positive-unlabeled learning framework for incompletely labeled named entity recognition
    Zhang, Fu
    Ma, Liangdong
    Wang, Jiapeng
    Cheng, Jingwei
    INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, 2022, 37 (11) : 9580 - 9597
  • [3] Transformer-Based Named Entity Recognition for Parsing Clinical Trial Eligibility Criteria
    Tian, Shubo
    Erdengasileng, Arslan
    Yang, Xi
    Guo, Yi
    Wu, Yonghui
    Zhang, Jinfeng
    Bian, Jiang
    He, Zhe
    12TH ACM CONFERENCE ON BIOINFORMATICS, COMPUTATIONAL BIOLOGY, AND HEALTH INFORMATICS (ACM-BCB 2021), 2021,
  • [4] Biometric identity recognition based on contrastive positive-unlabeled learning
    Sun, Le
    Hua, Yiwen
    Muhammad, Ghulam
    JOURNAL OF INFORMATION SECURITY AND APPLICATIONS, 2024, 83
  • [5] Bootstrapping Named Entity Recognition in E-Commerce with Positive Unlabeled Learning
    Zhang, Hanchu
    Hennig, Leonhard
    Alt, Christoph
    Hu, Changjian
    Meng, Yao
    Wang, Chao
    WORKSHOP ON E-COMMERCE AND NLP (ECNLP 3), 2020, : 1 - 6
  • [6] Positive-Unlabeled Learning for inferring drug interactions based on heterogeneous attributes
    Hameed, Pathima Nusrath
    Verspoor, Karin
    Kusljic, Snezana
    Halgamuge, Saman
    BMC BIOINFORMATICS, 2017, 18
  • [7] Positive-Unlabeled Learning for inferring drug interactions based on heterogeneous attributes
    Pathima Nusrath Hameed
    Karin Verspoor
    Snezana Kusljic
    Saman Halgamuge
    BMC Bioinformatics, 18
  • [8] Transformer-based approach for joint handwriting and named entity recognition in historical document
    Rouhou, Ahmed Cheikh
    Dhiaf, Marwa
    Kessentini, Yousri
    Ben Salem, Sinda
    PATTERN RECOGNITION LETTERS, 2022, 155 : 128 - 134
  • [9] Named Entity Recognition in Cyber Threat Intelligence Using Transformer-based Models
    Evangelatos, Pavlos
    Iliou, Christos
    Mavropoulos, Thanassis
    Apostolou, Konstantinos
    Tsikrika, Theodora
    Vrochidis, Stefanos
    Kompatsiaris, Ioannis
    PROCEEDINGS OF THE 2021 IEEE INTERNATIONAL CONFERENCE ON CYBER SECURITY AND RESILIENCE (IEEE CSR), 2021, : 348 - 353
  • [10] Enhanced Chinese Named Entity Recognition with Transformer-Based Multi-feature Fusion
    Zhang, Xiaoli
    Zhang, Quan
    Liang, Kun
    Wang, Haoyu
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT III, ICIC 2024, 2024, 14864 : 132 - 141