Distantly-Supervised Named Entity Recognition with Noise-Robust Learning and Language Model Augmented Self-Training

Authors
Meng, Yu [1]
Zhang, Yunyi [1]
Huang, Jiaxin [1]
Wang, Xuan [1]
Zhang, Yu [1]
Ji, Heng [1]
Han, Jiawei [1]
Affiliations
[1] Univ Illinois, Champaign, IL 61820 USA
Funding
U.S. National Science Foundation
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We study the problem of training named entity recognition (NER) models using only distantly-labeled data, which can be automatically obtained by matching entity mentions in the raw text with entity types in a knowledge base. The biggest challenge of distantly-supervised NER is that the distant supervision may induce incomplete and noisy labels, rendering the straightforward application of supervised learning ineffective. In this paper, we propose (1) a noise-robust learning scheme, comprising a new loss function and a noisy label removal step, for training NER models on distantly-labeled data, and (2) a self-training method that uses contextualized augmentations created by pre-trained language models to improve the generalization ability of the NER model. On three benchmark datasets, our method achieves superior performance, outperforming existing distantly-supervised NER models by significant margins.
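To make the idea of distant labeling concrete, the following is a minimal sketch, not the authors' pipeline. It assumes a toy knowledge base (`TOY_KB`) and a hypothetical helper `distant_label` that tags tokens by greedy longest-match lookup. The example sentence is chosen so that one entity is absent from the knowledge base, which illustrates the incomplete labels the abstract refers to.

```python
from typing import Dict, List, Tuple

# Toy knowledge base (an assumption for illustration): lowercased token spans -> entity type.
TOY_KB: Dict[Tuple[str, ...], str] = {
    ("marie", "curie"): "PER",
    ("paris",): "LOC",
}


def distant_label(
    tokens: List[str],
    kb: Dict[Tuple[str, ...], str],
    max_len: int = 5,
) -> List[str]:
    """Assign BIO tags by greedy longest-match lookup of token spans in `kb`."""
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first, then shrink.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = tuple(t.lower() for t in tokens[i:i + n])
            if span in kb:
                entity_type = kb[span]
                tags[i] = "B-" + entity_type
                for j in range(i + 1, i + n):
                    tags[j] = "I-" + entity_type
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return tags


tokens = "Marie Curie and Pierre Curie worked in Paris".split()
labels = distant_label(tokens, TOY_KB)
for token, label in zip(tokens, labels):
    print(f"{token}\t{label}")
```

In this sketch "Pierre Curie" is tagged `O` because it is missing from the toy knowledge base, so a model trained naively on these labels would be taught that a true entity is a non-entity.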
Pages: 10367-10378
Number of pages: 12