Multicore based least confidence query sampling strategy to speed up active learning approach for named entity recognition

Cited by: 0
Authors
Ankit Agrawal
Sarsij Tripathi
Manu Vardhan
Affiliations
[1] National Institute of Technology Raipur
[2] Motilal Nehru National Institute of Technology Allahabad
Source
Computing | 2023 / Vol. 105
Keywords
Least confidence; Active learning; Named entity recognition; Speed up; 68U15; 68U01; 68W10; 68T50
DOI
Not available
Abstract
Large amounts of new data are now readily available from many sources to collect and store. A central problem is labeling these data correctly for machine learning applications. Active learning is a form of machine learning that is widely used to address this problem by significantly reducing the need for labeled data. It aims to select the most appropriate samples from the unlabeled data; these are labeled by an oracle and then passed to the active learner, which is trained incrementally. Several query sampling strategies exist for selecting these samples. A main drawback of active learning is that it is very time-consuming. This work therefore proposes a new multi-core-based algorithm that speeds up active learning by using all the computational resources available in the system. The experiments address named entity recognition, which labels sequences of words in unstructured text by classifying them into predefined categories. The proposed algorithm is evaluated in terms of both performance and execution time on three named entity recognition corpora from distinct biomedical domains. The evaluation results show a considerable improvement in execution time for the proposed active learning algorithm over the existing active learning approach.
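The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea: least confidence scoring of an unlabeled pool, split across worker processes. It is not the authors' implementation. The function names (`least_confidence`, `score_chunk`, `select_queries`), the input format (per-token label distributions), and the assumption that tokens are independent are all hypothetical choices made for illustration.

```python
import math
from concurrent.futures import ProcessPoolExecutor


def least_confidence(token_probs):
    """Score one sentence as 1 - (approximate) probability of its best labeling.

    token_probs is a list of per-token label distributions. Multiplying the
    per-token maxima treats tokens as independent, which is an assumption of
    this sketch and not something stated in the abstract.
    """
    best_path_prob = math.prod(max(dist) for dist in token_probs)
    return 1.0 - best_path_prob


def score_chunk(chunk):
    """Score a chunk of (index, token_probs) pairs inside one worker."""
    return [(idx, least_confidence(probs)) for idx, probs in chunk]


def select_queries(pool, batch_size, workers=1):
    """Return the indices of the batch_size least confident sentences."""
    indexed = list(enumerate(pool))
    n_chunks = max(1, workers)
    # Round-robin split so every worker receives a similar share of the pool.
    chunks = [indexed[i::n_chunks] for i in range(n_chunks)]
    if workers <= 1:
        results = list(map(score_chunk, chunks))
    else:
        with ProcessPoolExecutor(max_workers=workers) as executor:
            results = list(executor.map(score_chunk, chunks))
    scored = [pair for part in results for pair in part]
    # A higher score means lower confidence; ties fall back to pool order.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return [idx for idx, _ in scored[:batch_size]]
```

A call such as `select_queries(pool, batch_size=2, workers=4)` would score the pool in four processes and return the two sentences to send to the oracle. On platforms that spawn rather than fork worker processes, that call should sit under an `if __name__ == "__main__":` guard.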
Pages: 979-997
Page count: 18