Distantly-Supervised Named Entity Recognition with Noise-Robust Learning and Language Model Augmented Self-Training

Cited by: 0
Authors
Meng, Yu [1]
Zhang, Yunyi [1]
Huang, Jiaxin [1]
Wang, Xuan [1]
Zhang, Yu [1]
Ji, Heng [1]
Han, Jiawei [1]
Affiliations
[1] Univ Illinois, Champaign, IL 61820 USA
Funding
National Science Foundation (USA)
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We study the problem of training named entity recognition (NER) models using only distantly-labeled data, which can be automatically obtained by matching entity mentions in the raw text with entity types in a knowledge base. The biggest challenge of distantly-supervised NER is that the distant supervision may induce incomplete and noisy labels, rendering the straightforward application of supervised learning ineffective. In this paper, we propose (1) a noise-robust learning scheme, comprising a new loss function and a noisy label removal step, for training NER models on distantly-labeled data, and (2) a self-training method that uses contextualized augmentations created by pre-trained language models to improve the generalization ability of the NER model. On three benchmark datasets, our method achieves superior performance, outperforming existing distantly-supervised NER models by significant margins.
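As a rough illustration of how distant labels can arise and why they are incomplete and noisy, the following minimal sketch tags tokens by greedy longest-match lookup against a toy knowledge base. It is not the authors' pipeline: the function `distant_label`, the dictionary `TOY_KB`, the BIO tag scheme, and the example sentences are all assumptions introduced here for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical toy knowledge base: lowercased token tuples mapped to entity types.
TOY_KB: Dict[Tuple[str, ...], str] = {
    ("jiawei", "han"): "PER",
    ("illinois",): "LOC",
    ("washington",): "LOC",
}


def distant_label(
    tokens: List[str],
    kb: Dict[Tuple[str, ...], str],
    max_len: int = 3,
) -> List[str]:
    """Assign BIO tags by greedy longest-match lookup in a knowledge base."""
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first, then shrink.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = tuple(t.lower() for t in tokens[i:i + n])
            if span in kb:
                etype = kb[span]
                tags[i] = "B-" + etype
                for j in range(i + 1, i + n):
                    tags[j] = "I-" + etype
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return tags


# Incomplete labels: "Heng Ji" is a person but is missing from the toy KB.
print(distant_label("Jiawei Han and Heng Ji work in Illinois".split(), TOY_KB))

# Noisy labels: "Washington" is used as a person here but the KB says LOC.
print(distant_label("Washington signed the bill".split(), TOY_KB))
```

In the first sentence the unmatched mention silently receives `O` tags, and in the second the context-free lookup assigns the wrong type. These are the two kinds of label error that the abstract says make plain supervised training on distant labels ineffective.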
Pages: 10367-10378
Page count: 12