A Class-Rebalancing Self-Training Framework for Distantly-Supervised Named Entity Recognition

Cited by: 0
Authors
Li, Qi [1 ,2 ]
Xie, Tingyu [1 ,2 ]
Peng, Peng [2 ]
Wang, Hongwei [1 ,2 ]
Wang, Gaoang [1 ,2 ]
Affiliations
[1] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou, Zhejiang, Peoples R China
[2] Zhejiang Univ, ZJU UIUC Inst, Hangzhou, Zhejiang, Peoples R China
DOI: not available
Abstract
Distant supervision reduces the reliance on human annotation in named entity recognition tasks. Class-level imbalance in distant annotation is a realistic and unexplored problem, and the popular method of self-training cannot handle class-level imbalanced learning. More importantly, self-training is dominated by the high-performance class when selecting candidates, and the bias of the generated pseudo labels degrades the low-performance class. To address this class-level performance imbalance, we propose a class-rebalancing self-training framework for improving distantly-supervised named entity recognition. In candidate selection, a class-wise flexible threshold is designed to fully explore classes other than the high-performance class. In label generation, a hybrid pseudo label that injects the distant label is adopted to provide direct semantic information for the low-performance class. Experiments on five flat and two nested datasets show that our model achieves state-of-the-art results. We also conduct extensive analyses of the effectiveness of the flexible threshold and the hybrid pseudo label.
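The two ideas named in the abstract, a class-wise flexible threshold and a hybrid pseudo label, can be illustrated with a minimal sketch. The abstract gives no formulas, so everything below is an assumption made for illustration and not the authors' method: the threshold scaling borrows a FlexMatch-style mapping `beta / (2 - beta)`, the hybrid label is a plain convex mix of the model's prediction and the one-hot distant label, and the names `classwise_thresholds`, `select_candidates`, `hybrid_pseudo_label`, `base_tau`, `min_tau`, and `alpha` are all hypothetical.

```python
from collections import Counter


def classwise_thresholds(probs, num_classes, base_tau=0.9, min_tau=0.5):
    """Assumed scheme: lower the threshold for classes the model is rarely confident on."""
    counts = Counter()
    for p in probs:
        k = max(range(num_classes), key=lambda c: p[c])  # predicted class
        if p[k] >= base_tau:
            counts[k] += 1
    top = max(counts.values(), default=0)
    if top == 0:
        # No confident prediction yet: fall back to the base threshold everywhere.
        return [base_tau] * num_classes
    taus = []
    for c in range(num_classes):
        beta = counts[c] / top  # relative "learning status" of class c, in [0, 1]
        taus.append(max(min_tau, base_tau * beta / (2.0 - beta)))
    return taus


def select_candidates(probs, thresholds):
    """Keep (index, class) pairs whose confidence clears that class's own threshold."""
    selected = []
    for i, p in enumerate(probs):
        k = max(range(len(p)), key=lambda c: p[c])
        if p[k] >= thresholds[k]:
            selected.append((i, k))
    return selected


def hybrid_pseudo_label(prob, distant_label, alpha=0.5):
    """Assumed mix: alpha * one-hot(distant label) + (1 - alpha) * model prediction."""
    if distant_label is None:  # no distant annotation for this token
        return list(prob)
    return [
        alpha * (1.0 if c == distant_label else 0.0) + (1.0 - alpha) * p
        for c, p in enumerate(prob)
    ]


# Toy usage: three classes, class 0 is the high-performance class.
probs = [[0.95, 0.03, 0.02], [0.92, 0.05, 0.03], [0.2, 0.7, 0.1], [0.3, 0.3, 0.4]]
taus = classwise_thresholds(probs, num_classes=3)
picked = select_candidates(probs, taus)
soft = hybrid_pseudo_label(probs[2], distant_label=1)
```

With a single global threshold of 0.9, only the two class-0 predictions would be selected in this toy input; the per-class thresholds also admit the class-1 prediction, which is the kind of rebalancing the abstract describes.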
Pages: 11054-11068
Page count: 15
Related papers
50 in total
  • [1] Distantly-Supervised Named Entity Recognition with Noise-Robust Learning and Language Model Augmented Self-Training
    Meng, Yu
    Zhang, Yunyi
    Huang, Jiaxin
    Wang, Xuan
    Zhang, Yu
    Ji, Heng
    Han, Jiawei
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 10367 - 10378
  • [2] CReST: A Class-Rebalancing Self-Training Framework for Imbalanced Semi-Supervised Learning
    Wei, Chen
    Sohn, Kihyuk
    Mellina, Clayton
    Yuille, Alan
    Yang, Fan
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 10852 - 10861
  • [3] Semi-supervised Object Detection with Adaptive Class-Rebalancing Self-Training
    Zhang, Fangyuan
    Pan, Tianxiang
    Wang, Bin
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELVETH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 3252 - 3261
  • [4] Improving Distantly-Supervised Named Entity Recognition with Self-Collaborative Denoising Learning
    Zhang, Xinghua
    Yu, Bowen
    Liu, Tingwen
    Zhang, Zhenyu
    Sheng, Jiawei
    Xue, Mengge
    Xu, Hongbo
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 10746 - 10757
  • [5] Improving Distantly-Supervised Named Entity Recognition with Self-Collaborative Denoising Learning
    Zhang, Xinghua
    Yu, Bowen
    Liu, Tingwen
    Zhang, Zhenyu
    Sheng, Jiawei
    Xue, Mengge
    Xu, Hongbo
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021, 2021, : 1518 - 1529
  • [6] Noise-Robust Training with Dynamic Loss and Contrastive Learning for Distantly-Supervised Named Entity Recognition
    Ma, Zhiyuan
    Du, Jintao
    Zhou, Shuheng
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023), 2023, : 10119 - 10128
  • [7] SANTA: Separate Strategies for Inaccurate and Incomplete Annotation Noise in Distantly-Supervised Named Entity Recognition
    Si, Shuzheng
    Cai, Zefan
    Zeng, Shuang
    Feng, Guoqiang
    Lin, Jiaxing
    Chang, Baobao
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, 2023, : 3883 - 3896
  • [8] Class-Imbalanced-Aware Distantly Supervised Named Entity Recognition
    Mao, Yuren
    Hao, Yu
    Liu, Weiwei
    Lin, Xuemin
    Cao, Xin
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (09) : 12117 - 12129
  • [9] A class-rebalancing self-training semisupervised learning for imbalanced data lithology identification
    Yin, Shitao
    Lin, Xiaochun
    Zhang, Zhifeng
    Li, Xiang
    GEOPHYSICS, 2024, 89 (01) : WA1 - WA11
  • [10] A class-rebalancing self-training semisupervised learning for imbalanced data lithology identification
    Yin S.
    Lin X.
    Zhang Z.
    Li X.
    Geophysics, 2023, 89 (01) : WA1 - WA11