Distantly-Supervised Named Entity Recognition with Noise-Robust Learning and Language Model Augmented Self-Training

Cited by: 0
Authors
Meng, Yu [1]
Zhang, Yunyi [1]
Huang, Jiaxin [1]
Wang, Xuan [1]
Zhang, Yu [1]
Ji, Heng [1]
Han, Jiawei [1]
Affiliations
[1] Univ Illinois, Champaign, IL 61820 USA
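Funding
National Science Foundation (USA)
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Theory of artificial intelligence]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract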
We study the problem of training named entity recognition (NER) models using only distantly-labeled data, which can be automatically obtained by matching entity mentions in the raw text with entity types in a knowledge base. The biggest challenge of distantly-supervised NER is that the distant supervision may induce incomplete and noisy labels, rendering the straightforward application of supervised learning ineffective. In this paper, we propose (1) a noise-robust learning scheme, comprising a new loss function and a noisy label removal step, for training NER models on distantly-labeled data, and (2) a self-training method that uses contextualized augmentations created by pre-trained language models to improve the generalization ability of the NER model. On three benchmark datasets, our method achieves superior performance, outperforming existing distantly-supervised NER models by significant margins.
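
The distant-labeling step described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `distant_label`, the toy dictionary `TOY_KB`, the BIO tagging scheme, and the greedy longest-match strategy are all assumptions chosen for illustration. It shows how matching against a knowledge base yields incomplete labels, since any mention missing from the dictionary is silently tagged as a non-entity.

```python
from typing import Dict, List, Tuple


def distant_label(
    tokens: List[str],
    kb: Dict[Tuple[str, ...], str],
    max_len: int = 5,
) -> List[str]:
    """Assign BIO tags by greedy longest-match of token spans against `kb`.

    `kb` is a hypothetical stand-in for a knowledge base: it maps a
    lowercased token tuple to an entity type.
    """
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span starting at position i first.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = tuple(t.lower() for t in tokens[i:i + n])
            if span in kb:
                etype = kb[span]
                tags[i] = "B-" + etype
                for j in range(i + 1, i + n):
                    tags[j] = "I-" + etype
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return tags


# Toy dictionary (illustrative only); "Jiawei Han" is deliberately absent.
TOY_KB = {
    ("heng", "ji"): "PER",
    ("illinois",): "LOC",
}

tokens = ["Heng", "Ji", "and", "Jiawei", "Han", "work", "in", "Illinois"]
tags = distant_label(tokens, TOY_KB)
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

In this toy run, "Jiawei Han" receives the tag `O` because it is not in the dictionary, which is the kind of incomplete label the abstract identifies as the main obstacle to applying standard supervised learning.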
Pages: 10367-10378
Page count: 12