Robust Chinese Clinical Named Entity Recognition with information bottleneck and adversarial training

Authors
He, Yunfei [1]
Zhang, Zhiqiang [2]
Shen, Jinlong [1]
Li, Yuling [1]
Zhang, Yiwen [3]
Ding, Weiping [4, 5]
Yang, Fei [1]
Affiliations
[1] Anhui Med Univ, Sch Biomed Engn, Hefei 230601, Anhui, Peoples R China
[2] Bengbu First Peoples Hosp, Med Equipment Engn Dept, Bengbu 233000, Anhui, Peoples R China
[3] Anhui Univ, Sch Comp Sci & Technol, Hefei 230601, Anhui, Peoples R China
[4] Nantong Univ, Sch Artificial Intelligence & Comp Sci, Nantong 226019, Jiangsu, Peoples R China
[5] City Univ Macau, Fac Data Sci, Macau 999078, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Chinese Clinical Named Entity Recognition; Multifaceted text representation; Information bottleneck; Hilbert-Schmidt independence criterion; Adversarial training; NETWORKS;
DOI
10.1016/j.asoc.2024.112409
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Chinese Clinical Named Entity Recognition (CCNER) aims to extract entities with specific medical significance from Chinese clinical texts and is an important part of medical data mining. Some existing CCNER models implicitly assume clean text data and rely on complex architectures to improve accuracy. However, because Chinese clinical entity semantics are complex and annotation requires professional expertise, Chinese clinical texts often contain irregular or erroneous expressions and sparse entity labels. As a result, the text features extracted by CCNER models can be noisy or incomplete, which seriously threatens the robustness of recognition in real-world scenarios. To address these problems, we propose the Robust Chinese Clinical Named Entity Recognition model (RCCNER). RCCNER comprises three essential components: multifaceted text representation, robust feature extraction, and robust model training. For multifaceted text representation, the model integrates word embedding, radical embedding, and dictionary embedding, which enhances consistency and collaboration between feature representations and helps withstand textual noise. Robust feature extraction, guided by the information bottleneck and the Hilbert-Schmidt independence criterion, then compresses the dependency between the text representation and the extracted features while enhancing the dependency between the extracted features and the labels, providing reliable text features for robust recognition. Robust model training uses adversarial training to reduce RCCNER's sensitivity to noise disturbances and sparse entity labeling, reinforcing its robustness in entity recognition. RCCNER thus improves noise immunity jointly through text representation, text feature extraction, and model training. Experiments on two popular public datasets validate the effectiveness and robustness of RCCNER.
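The feature-extraction objective described in the abstract can be illustrated with a generic Hilbert-Schmidt independence criterion (HSIC) estimator. The following is a minimal sketch, not the RCCNER implementation: it assumes Gaussian kernels, the standard biased estimator tr(KHLH)/(n-1)^2, and a hypothetical trade-off weight `beta`; the function names and the exact form of the loss are illustrative assumptions, since the abstract does not specify them.

```python
import math


def gaussian_gram(points, sigma=1.0):
    """Gram matrix of a Gaussian (RBF) kernel over a list of vectors."""
    return [
        [math.exp(-sum((a - b) ** 2 for a, b in zip(p, q)) / (2 * sigma ** 2))
         for q in points]
        for p in points
    ]


def center(gram):
    """Double-center a symmetric Gram matrix: H K H with H = I - (1/n) 11^T."""
    n = len(gram)
    row_mean = [sum(row) / n for row in gram]
    total_mean = sum(row_mean) / n
    return [
        [gram[i][j] - row_mean[i] - row_mean[j] + total_mean for j in range(n)]
        for i in range(n)
    ]


def hsic(xs, ys, sigma=1.0):
    """Biased empirical HSIC between paired samples xs and ys."""
    n = len(xs)
    kc = center(gaussian_gram(xs, sigma))
    lc = center(gaussian_gram(ys, sigma))
    # tr(K H L H) equals the elementwise sum of the two centered matrices,
    # because both are symmetric and H is idempotent.
    trace = sum(kc[i][j] * lc[i][j] for i in range(n) for j in range(n))
    return trace / (n - 1) ** 2


def hsic_bottleneck_loss(inputs, features, labels, beta=1.0, sigma=1.0):
    """Assumed bottleneck-style loss: penalize dependence between inputs and
    features, reward dependence between features and labels."""
    return hsic(inputs, features, sigma) - beta * hsic(features, labels, sigma)
```

Minimizing such a loss pushes the extracted features to discard input detail that does not help predict the labels, which is the intuition behind using HSIC as a tractable stand-in for mutual information in an information bottleneck.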
Pages: 15