A Web Semantic-Based Text Analysis Approach for Enhancing Named Entity Recognition Using PU-Learning and Negative Sampling

Cited by: 0
Authors
Zhang, Shunqin [1 ]
Zhang, Sanguo [1 ]
He, Wenduo [2 ]
Zhang, Xuan [2 ]
Affiliations
[1] Univ Chinese Acad Sci, Sch Math Sci, Beijing, Peoples R China
[2] Tsinghua Univ, Inst Network Sci & Cyberspace INSC, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Negative Sampling; NER; PU-Learning; Robustness; Self-Denoising; Token-Level; Two-Step Procedure; Unlabeled Entity Problem; Classification
DOI
10.4018/IJSWIS.335113
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Named entity recognition (NER) models are largely developed on well-annotated data. In many scenarios, however, entities are not fully annotated, which leads to serious performance degradation. To address this issue, the authors propose a robust NER approach that combines a novel PU-learning algorithm with negative sampling. Unlike many existing studies, the proposed method handles unlabeled entities with a two-step procedure, which strengthens its ability to mitigate their impact. The algorithm is also highly versatile and can be integrated into any token-level NER model with ease. Its effectiveness is verified on several classic NER models and datasets, where it demonstrates a strong ability to handle unlabeled entities. Finally, the authors achieve competitive performance on both synthetic and real-world datasets.
Pages: 23