Robust Chinese Clinical Named Entity Recognition with information bottleneck and adversarial training

Cited by: 0
Authors
He, Yunfei [1]
Zhang, Zhiqiang [2]
Shen, Jinlong [1]
Li, Yuling [1]
Zhang, Yiwen [3]
Ding, Weiping [4, 5]
Yang, Fei [1]
Affiliations
[1] Anhui Med Univ, Sch Biomed Engn, Hefei 230601, Anhui, Peoples R China
[2] Bengbu First Peoples Hosp, Med Equipment Engn Dept, Bengbu 233000, Anhui, Peoples R China
[3] Anhui Univ, Sch Comp Sci & Technol, Hefei 230601, Anhui, Peoples R China
[4] Nantong Univ, Sch Artificial Intelligence & Comp Sci, Nantong 226019, Jiangsu, Peoples R China
[5] City Univ Macau, Fac Data Sci, Macau 999078, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Chinese Clinical Named Entity Recognition; Multifaceted text representation; Information bottleneck; Hilbert-Schmidt independence criterion; Adversarial training; NETWORKS;
DOI
10.1016/j.asoc.2024.112409
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Chinese Clinical Named Entity Recognition (CCNER) aims to extract entities with specific medical significance from Chinese clinical texts and is an important part of medical data mining. Some existing CCNER models assume clean text data and rely on complex architectures to improve accuracy. However, because Chinese clinical entity semantics are complex and annotation requires professional expertise, Chinese clinical texts often contain irregular misrepresentations and sparse entity labels. As a result, the text features extracted by CCNER models can be noisy or incomplete, which seriously threatens the robustness of recognition in real-world scenarios. To address these problems, we propose the Robust Chinese Clinical Named Entity Recognition model (RCCNER). RCCNER comprises three essential components: multifaceted text representation, robust feature extraction, and robust model training. For multifaceted text representation, the model integrates word embedding, radical embedding, and dictionary embedding, which enhances consistency and collaboration between the feature representations and helps withstand textual noise. Then, guided by the information bottleneck and the Hilbert-Schmidt independence criterion, robust feature extraction compresses the dependency between the text representation and the extracted features while enhancing the dependency between the extracted features and the labels, which provides reliable text features for robust recognition. Robust model training uses adversarial training to reduce RCCNER's sensitivity to noise disturbances and sparse entity labeling, thereby reinforcing its robustness in entity recognition. In this way, RCCNER improves noise immunity jointly through text representation, text feature extraction, and model training. Several experiments on two popular public datasets validate the effectiveness and robustness of RCCNER.
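The feature-extraction objective described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a Gaussian kernel, the standard biased empirical HSIC estimator tr(KHLH)/(n-1)^2, and a hypothetical trade-off weight `beta`. The function names (`gaussian_kernel`, `center`, `hsic`, `hsic_bottleneck_loss`) are illustrative only. The loss penalizes dependency between the input representation `x` and the extracted features `z`, and rewards dependency between `z` and the labels `y`.

```python
import math


def gaussian_kernel(rows, sigma=1.0):
    """Pairwise Gaussian (RBF) kernel matrix for a list of feature vectors."""
    n = len(rows)
    return [
        [
            math.exp(
                -sum((a - b) ** 2 for a, b in zip(rows[i], rows[j]))
                / (2 * sigma ** 2)
            )
            for j in range(n)
        ]
        for i in range(n)
    ]


def center(K):
    """Double-center a kernel matrix: H K H with H = I - (1/n) 11^T."""
    n = len(K)
    row_mean = [sum(row) / n for row in K]
    col_mean = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    total_mean = sum(row_mean) / n
    return [
        [K[i][j] - row_mean[i] - col_mean[j] + total_mean for j in range(n)]
        for i in range(n)
    ]


def hsic(xs, ys, sigma=1.0):
    """Biased empirical HSIC estimate: tr(K H L H) / (n - 1)^2."""
    n = len(xs)
    Kc = center(gaussian_kernel(xs, sigma))
    Lc = center(gaussian_kernel(ys, sigma))
    # Both centered matrices are symmetric, so the trace of their product
    # equals the sum of their elementwise products.
    return sum(Kc[i][j] * Lc[i][j] for i in range(n) for j in range(n)) / (n - 1) ** 2


def hsic_bottleneck_loss(x, z, y, beta=1.0):
    """Assumed objective: compress dependency on x, keep dependency on y.

    beta is a hypothetical trade-off weight, not a value from the paper.
    """
    return hsic(x, z) - beta * hsic(z, y)
```

In a training setting, `z` would be the output of the feature extractor for a mini-batch and the loss would be computed with a differentiable tensor library; the pure-Python version above only shows the arithmetic.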
Pages: 15
Related papers
50 in total
  • [21] Cross-Lingual Named Entity Recognition Based on Attention and Adversarial Training
    Wang, Hao
    Zhou, Lekai
    Duan, Jianyong
    He, Li
    APPLIED SCIENCES-BASEL, 2023, 13 (04)
  • [22] Adversarial Multi-task Learning for Efficient Chinese Named Entity Recognition
    Yan, Yibo
    Zhu, Peng
    Cheng, Dawei
    Yang, Fangzhou
    Luo, Yifeng
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2023, 22 (07)
  • [23] Chinese named entity recognition in the furniture domain based on ERNIE and adversarial learning
    Song, Yang
    Jia, Yanhe
    Zhang, Jian
    INTERNATIONAL JOURNAL OF WEB INFORMATION SYSTEMS, 2024
  • [24] Adversarial Training Lattice LSTM for Named Entity Recognition of Rail Fault Texts
    Su, Shuai
    Qu, Jia
    Cao, Yuan
    Li, Ruoqing
    Wang, Guang
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2022, 23 (11) : 21201 - 21215
  • [25] Adversarial Adaptation for French Named Entity Recognition
    Choudhry, Arjun
    Khatri, Inder
    Gupta, Pankaj
    Gupta, Aaryan
    Nicol, Maxime
    Meurs, Marie-Jean
    Vishwakarma, Dinesh Kumar
    ADVANCES IN INFORMATION RETRIEVAL, ECIR 2023, PT II, 2023, 13981 : 386 - 395
  • [26] SeqAttack: On Adversarial Attacks for Named Entity Recognition
    Simoncini, Walter
    Spanakis, Gerasimos
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021): PROCEEDINGS OF SYSTEM DEMONSTRATIONS, 2021 : 308 - 318
  • [27] Robust Chinese Named Entity Recognition Based on Fusion Graph Embedding
    Song, Xuhui
    Yu, Hongtao
    Li, Shaomei
    Wang, Huansha
    ELECTRONICS, 2023, 12 (03)
  • [28] A comprehensive study of named entity recognition in Chinese clinical text
    Lei, Jianbo
    Tang, Buzhou
    Lu, Xueqin
    Gao, Kaihua
    Jiang, Min
    Xu, Hua
    JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION, 2014, 21 (05) : 808 - 814
  • [29] Chinese Clinical Named Entity Recognition with ALBERT and MHA Mechanism
    Li, Dongmei
    Long, Jiao
    Qu, Jintao
    Zhang, Xiaoping
    EVIDENCE-BASED COMPLEMENTARY AND ALTERNATIVE MEDICINE, 2022, 2022
  • [30] Unsupervised cross-domain named entity recognition using entity-aware adversarial training
    Peng, Qi
    Zheng, Changmeng
    Cai, Yi
    Wang, Tao
    Xie, Haoran
    Li, Qing
    NEURAL NETWORKS, 2021, 138 (138) : 68 - 77