Semi-supervised geological disasters named entity recognition using few labeled data

Cited by: 8
Authors
Lei, Xinya [1, 2]
Song, Weijing [1, 2]
Fan, Runyu [1, 2]
Feng, Ruyi [1, 2]
Wang, Lizhe [1, 2]
Affiliations
[1] China Univ Geosci, Sch Comp Sci, Wuhan 430074, Peoples R China
[2] Hubei Key Lab Intelligent Geoinformat Proc, Wuhan 430074, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Geological disasters named entity recognition; Semi-supervised learning; Self-training; Pre-trained BERT model; Named entity recognition;
DOI
10.1007/s10707-022-00474-1
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Geological disaster Named Entity Recognition (NER) aims to recognize entities that reflect disaster event information in unstructured text, in order to construct a geohazard knowledge graph that can serve as a reference for disaster emergency response. Without training on large-scale labeled data, current NER methods based on deep learning models cannot identify specific geological disaster entities in geological disaster situation reports, yet manually labeling such reports is tedious and time-consuming. We therefore present Semi-GDNER, a semi-supervised geological disaster NER approach that can effectively extract six kinds of geological disaster entities when only a small amount of manually labeled data, together with unlabeled in-domain data, is available. The approach has two stages: (1) transferring the parameters of the pre-trained BERT-base model to the BERT layer of the backbone model BERT-BiLSTM-CRF and training the backbone model with a few labeled data; (2) continuing to train the backbone model while expanding the training set with unlabeled data through a self-training (ST) strategy. To reduce noise in the second stage, only pseudo-labeled samples with high confidence are added to the training set in each ST iteration. Experiments on our constructed geological disaster NER dataset show that our approach achieves a higher F1 score (0.88) than other NER approaches (five supervised NER approaches and a semi-supervised NER approach whose ST strategy expands the training set with all pseudo-labeled data), demonstrating its effectiveness. Furthermore, experiments on four general Chinese NER datasets show that the framework is transferable.
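The second stage described in the abstract (self-training that admits only high-confidence pseudo-labels) can be illustrated with a minimal sketch. This is not the authors' implementation. The paper's backbone is BERT-BiLSTM-CRF, whereas the toy lexicon tagger below is a hypothetical stand-in so that the loop runs without any dependencies. The names `self_train`, `fit_lexicon`, and `predict_lexicon`, the threshold value, and the confidence measure (the share of tokens already seen in training) are all assumptions made for illustration; the abstract does not say how confidence is computed.

```python
from collections import Counter, defaultdict


def fit_lexicon(train):
    """Toy stand-in for the backbone model: map each token to its most frequent tag."""
    counts = defaultdict(Counter)
    for tokens, tags in train:
        for tok, tag in zip(tokens, tags):
            counts[tok][tag] += 1
    return {tok: c.most_common(1)[0][0] for tok, c in counts.items()}


def predict_lexicon(model, tokens):
    """Tag a sentence; confidence (an assumption) is the share of known tokens."""
    tags = tuple(model.get(tok, "O") for tok in tokens)
    if not tokens:
        return tags, 0.0
    known = sum(tok in model for tok in tokens)
    return tags, known / len(tokens)


def self_train(fit, predict, labeled, unlabeled, threshold=0.7, max_iters=5):
    """Self-training loop: retrain after adding only confident pseudo-labels."""
    train = list(labeled)
    pool = list(unlabeled)
    model = fit(train)                      # initial training on the few labeled samples
    for _ in range(max_iters):
        confident, remaining = [], []
        for tokens in pool:
            tags, conf = predict(model, tokens)
            if conf >= threshold:
                confident.append((tokens, tags))   # accept the pseudo-label
            else:
                remaining.append(tokens)           # keep for a later iteration
        if not confident:                   # nothing new to learn from, so stop
            break
        train.extend(confident)
        pool = remaining
        model = fit(train)                  # retrain on the expanded training set
    return model, train


labeled = [(("landslide", "hit", "Wuhan"), ("B-DIS", "O", "B-LOC"))]
unlabeled = [
    ("landslide", "hit", "Wuhan", "yesterday"),
    ("flood", "struck", "city", "today"),
]
model, train = self_train(fit_lexicon, predict_lexicon, labeled, unlabeled)
print(len(train))
```

In this toy run the first unlabeled sentence clears the threshold and joins the training set, while the second is left out because none of its tokens are known. Setting `threshold=0.0` corresponds to the baseline the abstract compares against, where all pseudo-labeled data are added.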
Pages: 263-288
Number of pages: 26