Biomedical Flat and Nested Named Entity Recognition: Methods, Challenges, and Advances

Cited by: 1
Authors
Park, Yesol [1]
Son, Gyujin [2]
Rho, Mina [1, 2, 3]
Affiliations
[1] Hanyang Univ, Dept Comp Sci, Seoul 04763, South Korea
[2] Hanyang Univ, Dept Artificial Intelligence, Seoul 04763, South Korea
[3] Hanyang Univ, Dept Biomed Informat, Seoul 04763, South Korea
Source
APPLIED SCIENCES-BASEL | 2024 / Vol. 14 / Issue 20
Keywords
named entity recognition; biomedical named entity recognition; flat named entity recognition; nested named entity recognition; flat and nested named entity recognition; natural language processing; corpus
DOI
10.3390/app14209302
Chinese Library Classification number
O6 [Chemistry]
Discipline classification code
0703
Abstract
Biomedical named entity recognition (BioNER) aims to identify biomedical entities (e.g., diseases, chemicals, and genes) in text and classify them into predefined classes. This process serves as an important initial step in extracting biomedical information from textual sources. Based on the structure of the entities they address, BioNER tasks are divided into two categories: flat NER, where entities do not overlap, and nested NER, where entities may be embedded within other entities. While early studies primarily addressed flat NER, recent advances in neural models have enabled more sophisticated approaches to nested NER, which is gaining relevance in the biomedical field, where entity relationships are often complex and hierarchically structured. This review therefore focuses on the latest progress in approaches based on large-scale pre-trained language models, which have significantly improved NER performance. State-of-the-art flat NER models have achieved average F1-scores of 84% on BC2GM, 89% on NCBI Disease, and 92% on BC4CHEM, while nested NER models have reached 80% on the GENIA dataset, indicating room for improvement. In addition, we discuss persistent challenges, including inconsistent annotation of named entities across corpora and the limited availability of annotated entities for various entity types, particularly for multi-type or nested NER. To the best of our knowledge, this paper is the first comprehensive review of pre-trained language model-based flat and nested BioNER models, providing a categorical analysis of the methods and related challenges for future research and development in the field.
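The flat/nested distinction in the abstract can be made concrete with a minimal Python sketch. It assumes entities are represented as token-offset spans `(start, end_exclusive, label)`; the helper names `is_flat` and `nested_pairs`, the example spans, and their labels are illustrative assumptions, not taken from the paper or from any of the corpora it names.

```python
from typing import List, Tuple

# Assumed representation: (start_token, end_token_exclusive, label).
Span = Tuple[int, int, str]


def is_flat(spans: List[Span]) -> bool:
    """Return True if no two spans overlap (the flat NER setting)."""
    ordered = sorted(spans)
    # After sorting by start offset, any overlap shows up between neighbours.
    return all(a[1] <= b[0] for a, b in zip(ordered, ordered[1:]))


def nested_pairs(spans: List[Span]) -> List[Tuple[Span, Span]]:
    """Return (inner, outer) pairs where inner lies strictly inside outer."""
    pairs = []
    for inner in spans:
        for outer in spans:
            same_bounds = (inner[0], inner[1]) == (outer[0], outer[1])
            contained = outer[0] <= inner[0] and inner[1] <= outer[1]
            if contained and not same_bounds:
                pairs.append((inner, outer))
    return pairs


# Illustrative tokens: ["IL-2", "gene", "expression"]
# A nested annotation: a protein mention inside a longer DNA mention.
nested_example = [(0, 1, "protein"), (0, 2, "DNA")]

# A flat annotation: two mentions that do not share any token.
flat_example = [(0, 1, "Chemical"), (3, 4, "Disease")]

print(is_flat(flat_example))
print(is_flat(nested_example))
print(nested_pairs(nested_example))
```

A flat tagger can emit at most one label per token, so an annotation like `nested_example` cannot be represented without dropping one of the two spans; this is the limitation that nested NER methods address.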
Pages: 23
Related papers
50 in total
  • [1] Efficient methods for biomedical named entity recognition
    Chan, Shing-Kit
    Lam, Wai
    PROCEEDINGS OF THE 7TH IEEE INTERNATIONAL SYMPOSIUM ON BIOINFORMATICS AND BIOENGINEERING, VOLS I AND II, 2007, : 729 - 735
  • [2] A Boundary Assembling Method for Nested Biomedical Named Entity Recognition
    Chen, Yanping
    Hu, Ying
    Li, Yijing
    Huang, Ruizhang
    Qin, Yongbin
    Wu, Yuefei
    Zheng, Qinghua
    Chen, Ping
    IEEE ACCESS, 2020, 8 : 214141 - 214152
  • [3] Study of Named Entity Recognition methods in biomedical field
    Sniegula, Anna
    Poniszewska-Maranda, Aneta
    Chomatek, Lukasz
    10TH INT CONF ON EMERGING UBIQUITOUS SYST AND PERVAS NETWORKS (EUSPN-2019) / THE 9TH INT CONF ON CURRENT AND FUTURE TRENDS OF INFORMAT AND COMMUN TECHNOLOGIES IN HEALTHCARE (ICTH-2019) / AFFILIATED WORKSHOPS, 2019, 160 : 260 - 265
  • [4] Towards the Named Entity Recognition Methods in Biomedical Field
    Sniegula, Anna
    Poniszewska-Maranda, Aneta
    Chomatek, Lukasz
    SOFSEM 2020: THEORY AND PRACTICE OF COMPUTER SCIENCE, 2020, 12011 : 375 - 387
  • [5] BANNER: An executable survey of advances in biomedical named entity recognition
    Department of Computer Science and Engineering, Arizona State University, United States
    Unknown
    Pac. Symp. Biocomputing, PSB, (652-663):
  • [6] PENNER: Pattern-enhanced Nested Named Entity Recognition in Biomedical Literature
    Wang, Xuan
    Zhang, Yu
    Li, Qi
    Wu, Cathy H.
    Han, Jiawei
    PROCEEDINGS 2018 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2018, : 540 - 547
  • [7] Nested Named Entity Recognition: A Survey
    Wang, Yu
    Tong, Hanghang
    Zhu, Ziye
    Li, Yun
    ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2022, 16 (06)
  • [8] A Flat-Span Contrastive Learning Method for Nested Named Entity Recognition
    Liu, Yaodi
    Zhang, Kun
    Tong, Rong
    Cai, Chenxi
    Chen, Dianying
    Wu, Xiaohe
    2024 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING, IALP 2024, 2024, : 37 - 42
  • [9] Computational Reproducibility of Named Entity Recognition methods in the biomedical domain
    Garcia-Serrano, Ana
    Hennig, Sebastian
    Nuernberger, Andreas
    PROCESAMIENTO DEL LENGUAJE NATURAL, 2021, (66): : 141 - 152
  • [10] A review of biomedical named entity recognition
    Chang, Lu
    Zhang, Ruihuan
    Lv, Jia
    Zhou, Weiguang
    Bai, Yunli
    JOURNAL OF COMPUTATIONAL METHODS IN SCIENCES AND ENGINEERING, 2022, 22 (03) : 893 - 900