Robust Chinese Clinical Named Entity Recognition with information bottleneck and adversarial training

Cited by: 0
Authors
He, Yunfei [1 ]
Zhang, Zhiqiang [2 ]
Shen, Jinlong [1 ]
Li, Yuling [1 ]
Zhang, Yiwen [3 ]
Ding, Weiping [4 ,5 ]
Yang, Fei [1 ]
Affiliations
[1] Anhui Med Univ, Sch Biomed Engn, Hefei 230601, Anhui, Peoples R China
[2] Bengbu First Peoples Hosp, Med Equipment Engn Dept, Bengbu 233000, Anhui, Peoples R China
[3] Anhui Univ, Sch Comp Sci & Technol, Hefei 230601, Anhui, Peoples R China
[4] Nantong Univ, Sch Artificial Intelligence & Comp Sci, Nantong 226019, Jiangsu, Peoples R China
[5] City Univ Macau, Fac Data Sci, Macau 999078, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Chinese Clinical Named Entity Recognition; Multifaceted text representation; Information bottleneck; Hilbert-Schmidt independence criterion; Adversarial training; NETWORKS;
DOI
10.1016/j.asoc.2024.112409
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Chinese Clinical Named Entity Recognition (CCNER) aims to extract entities with specific medical significance from Chinese clinical texts and is an important part of medical data mining. Some existing CCNER models assume clean text data and rely on complex architectures to improve accuracy. However, because Chinese clinical entity semantics are complex and annotation requires professional expertise, Chinese clinical texts are prone to irregular misrepresentations and sparse entity labeling. As a result, the text features extracted by CCNER models can be noisy or incomplete, which seriously threatens the robustness of recognition in real-world scenarios. To address these problems, we propose the Robust Chinese Clinical Named Entity Recognition model (RCCNER). RCCNER comprises three essential components: multifaceted text representation, robust feature extraction, and robust model training. For multifaceted text representation, the model enhances consistency and collaboration between feature representations by integrating word embedding, radical embedding, and dictionary embedding to help withstand textual noise. Then, guided by the information bottleneck and the Hilbert-Schmidt independence criterion, robust feature extraction compresses the dependency between the text representation and the extracted features while enhancing the dependency between the extracted features and the labels, which provides reliable text features for robust recognition. Robust model training leverages adversarial training to reduce RCCNER's sensitivity to noise disturbances and sparse entity labeling, thereby reinforcing its robustness in entity recognition. Together, text representation, text feature extraction, and model training jointly improve RCCNER's noise immunity. Several experiments on two popular public datasets validate the effectiveness and robustness of RCCNER.
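The abstract does not give the exact objective, so the following is only a minimal sketch of how an information-bottleneck term based on the Hilbert-Schmidt independence criterion (HSIC) is commonly written. It uses the standard biased empirical HSIC estimator with a Gaussian kernel. The function names (`rbf_kernel`, `hsic`, `hsic_bottleneck_loss`), the kernel choice, the bandwidth `sigma`, and the trade-off weight `beta` are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def rbf_kernel(x, sigma=1.0):
    """Gaussian (RBF) Gram matrix for the rows of x, shape (n, d) -> (n, n)."""
    sq_norms = np.sum(x ** 2, axis=1, keepdims=True)
    sq_dists = sq_norms + sq_norms.T - 2.0 * (x @ x.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative values
    return np.exp(-sq_dists / (2.0 * sigma ** 2))


def hsic(x, y, sigma=1.0):
    """Biased empirical HSIC: trace(K H L H) / (n - 1)^2.

    Larger values indicate stronger statistical dependence between x and y.
    """
    n = x.shape[0]
    k = rbf_kernel(x, sigma)
    l = rbf_kernel(y, sigma)
    h = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.trace(k @ h @ l @ h) / (n - 1) ** 2


def hsic_bottleneck_loss(x, z, y, beta=1.0, sigma=1.0):
    """Hypothetical bottleneck objective (an assumption, not the paper's loss).

    Penalizes dependence between input representation x and features z,
    and rewards dependence between features z and labels y.
    """
    return hsic(x, z, sigma) - beta * hsic(z, y, sigma)
```

Minimizing such a loss over the feature extractor would push the features `z` to discard input detail that does not help predict the labels `y`, which matches the compress-and-preserve behavior the abstract describes. In a real NER model, `x` and `z` would be batches of token representations and `y` an encoding of the entity tags (for example one-hot vectors), and the loss would be computed in a framework with automatic differentiation rather than in NumPy.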
Pages: 15
Related papers
50 in total
  • [31] Chinese Governmental Named Entity Recognition
    Liu, Qi
    Wang, Dong
    Zhou, Meilin
    Li, Peng
    Qi, Baoyuan
    Wang, Bin
    INFORMATION RETRIEVAL TECHNOLOGY (AIRS 2018), 2018, 11292 : 16 - 28
  • [32] Named Entity Recognition Model of Traditional Chinese Medicine Medical Texts based on Contextual Semantic Enhancement and Adversarial Training
    Ma, Yuekun
    Wen, Moyan
    Liu, He
    IAENG International Journal of Computer Science, 2024, 51 (08) : 1137 - 1143
  • [33] Enriching Word Information Representation for Chinese Cybersecurity Named Entity Recognition
    Yang, Dongying
    Lian, Tao
    Zheng, Wen
    Zhao, Cai
    NEURAL PROCESSING LETTERS, 2023, 55 (06) : 7689 - 7707
  • [34] A review of Chinese named entity recognition
    Cheng, Jieren
    Liu, Jingxin
    Xu, Xinbin
    Xia, Dongwan
    Liu, Le
    Sheng, Victor S.
    KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS, 2021, 15 (06) : 2012 - 2030
  • [35] A Chinese named entity recognition method combined with relative position information
    Gan, Ling
    Huang, Chengming
    2021 ASIA-PACIFIC CONFERENCE ON COMMUNICATIONS TECHNOLOGY AND COMPUTER SCIENCE (ACCTCS 2021), 2021, : 250 - 254
  • [36] Chinese Named Entity Recognition Based on External Knowledge and Position Information
    Li, Yuan
    Luosang, Gadeng
    Jiang, Weili
    Computer Engineering and Applications, 2024, 60 (22) : 162 - 171
  • [37] Enriching Word Information Representation for Chinese Cybersecurity Named Entity Recognition
    Yang, Dongying
    Lian, Tao
    Zheng, Wen
    Zhao, Cai
    Neural Processing Letters, 2023, 55 : 7689 - 7707
  • [38] MISS: Multiple information span scoring for Chinese named entity recognition
    Yang, Liyi
    Xing, Shuli
    Mao, Guojun
    COMPUTER SPEECH AND LANGUAGE, 2025, 92
  • [39] The interactive fusion of characters and lexical information for Chinese named entity recognition
    Wang, Ye
    Wang, Zheng
    Yu, Hong
    Wang, Guoyin
    Lei, Dajiang
    ARTIFICIAL INTELLIGENCE REVIEW, 2024, 57 (09)
  • [40] Exploiting Hybrid Subword Information for Chinese Historical Named Entity Recognition
    Yan, Chengxi
    Wang, Jun
    2020 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2020, : 4795 - 4801