A Web Semantic-Based Text Analysis Approach for Enhancing Named Entity Recognition Using PU-Learning and Negative Sampling

Cited by: 0
Authors
Zhang, Shunqin [1]
Zhang, Sanguo [1]
He, Wenduo [2]
Zhang, Xuan [2]
Affiliations
[1] Univ Chinese Acad Sci, Sch Math Sci, Beijing, Peoples R China
[2] Tsinghua Univ, Inst Network Sci & Cyberspace INSC, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Negative Sampling; NER; PU-Learning; Robustness; Self-Denoising; Token-Level; Two-Step Procedure; Unlabeled Entity Problem; CLASSIFICATION;
DOI
10.4018/IJSWIS.335113
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Named entity recognition (NER) models are largely developed on well-annotated data. However, in many scenarios the entities are not fully annotated, which leads to serious performance degradation. To address this issue, the authors propose a robust NER approach that combines a novel PU-learning algorithm with negative sampling. Unlike many existing studies, the proposed method adopts a two-step procedure for handling unlabeled entities, which strengthens its ability to mitigate their impact. Moreover, the algorithm is highly versatile and can be integrated into any token-level NER model with ease. The effectiveness of the proposed method is verified on several classic NER models and datasets, demonstrating its strong ability to handle unlabeled entities. Finally, the authors achieve competitive performance on synthetic and real-world datasets.
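As a rough illustration of the negative-sampling idea named in the abstract, the sketch below keeps every annotated entity token for training but draws only a random subset of the unlabeled ("O") tokens as negatives, so that a missed entity is less likely to be penalized. This is not the authors' algorithm, whose details are not given in this record; the function name `sample_training_mask`, the `neg_rate` parameter, and the BIO-style tags are assumptions made for the example.

```python
import random


def sample_training_mask(labels, neg_rate=0.3, seed=0):
    """Return a 0/1 mask over tokens (hypothetical helper, not from the paper).

    Annotated entity tokens are always kept (mask 1). Tokens tagged "O" are
    treated as unlabeled, and only a random fraction `neg_rate` of them is
    kept as sampled negatives; the rest are excluded from the loss (mask 0).
    """
    rng = random.Random(seed)
    unlabeled = [i for i, tag in enumerate(labels) if tag == "O"]
    # Keep at least one negative when any unlabeled token exists.
    k = min(len(unlabeled), max(1, round(neg_rate * len(unlabeled)))) if unlabeled else 0
    negatives = set(rng.sample(unlabeled, k))
    return [1 if tag != "O" or i in negatives else 0 for i, tag in enumerate(labels)]


labels = ["B-PER", "I-PER", "O", "O", "O", "O", "B-LOC", "O", "O", "O", "O", "O"]
mask = sample_training_mask(labels, neg_rate=0.3, seed=0)
```

In a token-level model, a mask of this kind would typically be multiplied into the per-token loss so that only the kept tokens contribute to the gradient.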
Pages: 23
Related papers
50 in total
  • [31] Identifying Text Reuse Using WordNet-based Extended Named Entity Recognition
    Lee, Eunji
    Kim, Pankoo
    PROCEEDINGS OF THE 2018 CONFERENCE ON RESEARCH IN ADAPTIVE AND CONVERGENT SYSTEMS (RACS 2018), 2018, : 199 - 202
  • [32] A Named Entity Recognition Approach for Electronic Medical Records Using BERT Semantic Enhancement and BiLSTM
    Lai, Xuewei
    Jie, Qingqing
    INTERNATIONAL JOURNAL ON SEMANTIC WEB AND INFORMATION SYSTEMS, 2023, 19 (01)
  • [33] GRAM-CNN: a deep learning approach with local context for named entity recognition in biomedical text
    Zhu, Qile
    Li, Xiaolin
    Conesa, Ana
    Pereira, Cecile
    BIOINFORMATICS, 2018, 34 (09) : 1547 - 1554
  • [34] Ensemble Learning of Named Entity Recognition Algorithms using Multilayer Perceptron for the Multilingual Web of Data
    Speck, Rene
    Ngomo, Axel-Cyrille Ngonga
    K-CAP 2017: PROCEEDINGS OF THE KNOWLEDGE CAPTURE CONFERENCE, 2017,
  • [35] Named Entity Recognition for Amharic Using Stack-Based Deep Learning
    Sikdar, Utpal Kumar
    Gambac, Bjorn
    COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING (CICLING 2017), PT I, 2018, 10761 : 276 - 287
  • [36] A Deep Learning Based Approach for Biomedical Named Entity Recognition Using Multitasking Transfer Learning with BiLSTM, BERT and CRF
    Pooja H.
    Jagadeesh M.P.P.
    SN Computer Science, 5 (5)
  • [37] Enhancing Legal Named Entity Recognition Using RoBERTa-GCN with CRF: A Nuanced Approach for Fine-Grained Entity Recognition
    Jain, Arihant
    Sharma, Raksha
    ADVANCES IN INFORMATION RETRIEVAL, ECIR 2024, PT III, 2024, 14610 : 261 - 267
  • [38] Named Entity Recognition in Mammography Radiology Reports using a Multilingual Transfer Learning Approach
    Salazar Cabrera, Esteban Ricardo
    Santos Diaz, Alejandro
    Menasalvas, Ernestina
    Tamez Pena, Jose Gerardo
    Robles, Victor
    2024 IEEE 37TH INTERNATIONAL SYMPOSIUM ON COMPUTER-BASED MEDICAL SYSTEMS, CBMS 2024, 2024, : 273 - 277
  • [39] Visual Semantic-Based Representation Learning Using Deep CNNs for Scene Recognition
    Gupta, Shikha
    Sharma, Krishan
    Dinesh, Dileep Aroor
    Thenkanidiyoor, Veena
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2021, 17 (02)
  • [40] A Comparative Study of Biomedical Named Entity Recognition Methods Based Machine Learning Approach
    Rais, Mohammed
    Lachkar, Abdelmonaime
    Lachkar, Abdelhamid
    El Alaoui Ouatik, Said
    2014 THIRD IEEE INTERNATIONAL COLLOQUIUM IN INFORMATION SCIENCE AND TECHNOLOGY (CIST'14), 2014, : 329 - 334