Clustering Based Active Learning for Biomedical Named Entity Recognition

Cited by: 0
Authors
Han, Xu [1]
Kwoh, Chee Keong [1]
Kim, Jung-jae [2]
Affiliations
[1] Nanyang Technol Univ, Sch Comp Engn, 50 Nanyang Ave, Singapore 639798, Singapore
[2] Inst Infocomm Res, 1 Fusionopolis Way, Singapore 138632, Singapore
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Theory of artificial intelligence];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The recognition and extraction of biomedical names is an essential task in biomedical information extraction. However, the need to prepare large annotated corpora hinders the training of Named Entity Recognition (NER) systems. Active learning reduces the manual annotation work needed in supervised learning tasks. In this work, we propose a novel clustering-based active learning method for the biomedical NER task. We show that the underlying NER system trained with the proposed method outperforms the same system trained with other state-of-the-art active learning methods, including density-based, Gibbs-error-based and entropy-based approaches, as well as random selection. We compare variants of the proposed method and find that the best design uses a vector representation of named entities, selects documents that are both 'representative' and 'informative', and applies the Shared Nearest Neighbor (SNN) clustering approach. In particular, the best variant of the proposed method achieves a deficiency gain of 36.3% over random selection.
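Illustrative sketch (not taken from the paper): the abstract names three ingredients, namely vector representations, SNN clustering, and selection of documents that are both representative and informative, but gives no details. The Python below is one possible reading under loud assumptions: each document is already a feature vector, SNN similarity is the count of shared k-nearest neighbours, clusters are connected components of the thresholded SNN graph, "representative" means belonging to a large cluster, and "informative" means high predictive entropy. The function names and the parameters `k` and `min_shared` are hypothetical.

```python
import numpy as np

def snn_similarity(X, k):
    """Shared-nearest-neighbour similarity: entry (i, j) counts the points
    that appear in the k-nearest-neighbour lists of both i and j."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbour
    knn = np.argsort(dist, axis=1, kind="stable")[:, :k]
    member = np.zeros((n, n), dtype=int)
    member[np.arange(n)[:, None], knn] = 1  # member[i, j] = 1 if j in kNN(i)
    return member @ member.T

def snn_clusters(sim, min_shared):
    """Assumed clustering rule: connected components of the graph that links
    two points when they share at least `min_shared` neighbours."""
    n = sim.shape[0]
    labels = -np.ones(n, dtype=int)
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in np.flatnonzero(sim[i] >= min_shared):
                if labels[j] == -1:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels

def entropy(probs):
    """Row-wise Shannon entropy of predicted class probabilities."""
    p = np.clip(np.asarray(probs, dtype=float), 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=1)

def select_batch(X, probs, k, min_shared, batch_size):
    """Visit clusters from largest to smallest (assumed proxy for
    'representative') and take the highest-entropy member of each
    (assumed proxy for 'informative'). Returns at most one index per
    cluster, so the result may be shorter than batch_size."""
    labels = snn_clusters(snn_similarity(X, k), min_shared)
    uncertainty = entropy(probs)
    ids, sizes = np.unique(labels, return_counts=True)
    picked = []
    for c in ids[np.argsort(-sizes, kind="stable")]:
        members = np.flatnonzero(labels == c)
        picked.append(int(members[np.argmax(uncertainty[members])]))
        if len(picked) == batch_size:
            break
    return picked
```

In an active learning loop, the returned indices would be sent for manual annotation, the NER model retrained, and `probs` recomputed before the next round; how the paper actually combines the two criteria is not stated in the abstract.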
Pages: 1253-1260
Number of pages: 8