Biomedical Flat and Nested Named Entity Recognition: Methods, Challenges, and Advances

Cited by: 1

Authors:

Park, Yesol [1]

Son, Gyujin [2]

Rho, Mina [1,2,3]

Affiliations:

[1] Hanyang Univ, Dept Comp Sci, Seoul 04763, South Korea

[2] Hanyang Univ, Dept Artificial Intelligence, Seoul 04763, South Korea

[3] Hanyang Univ, Dept Biomed Informat, Seoul 04763, South Korea

Source:

APPLIED SCIENCES-BASEL | 2024 / Vol. 14 / Issue 20

Keywords:

named entity recognition; biomedical named entity recognition; flat named entity recognition; nested named entity recognition; flat and nested named entity recognition; natural language processing; CORPUS;

DOI:

10.3390/app14209302

Chinese Library Classification (CLC) number:

O6 [Chemistry];

Discipline classification code:

0703;

Abstract:

Biomedical named entity recognition (BioNER) aims to identify biomedical entities (e.g., diseases, chemicals, and genes) in text and classify them into predefined classes. This process serves as an important initial step in extracting biomedical information from textual sources. Based on the structure of the entities addressed, BioNER tasks are divided into two categories: flat NER, where entities do not overlap, and nested NER, where entities may be embedded within one another. While early studies primarily addressed flat NER, recent advances in neural models have enabled more sophisticated approaches to nested NER, which is gaining relevance in the biomedical field, where entity relationships are often complex and hierarchically structured. This review therefore focuses on the latest progress in approaches based on large-scale pre-trained language models, which have significantly improved NER performance. State-of-the-art flat NER models have achieved average F1-scores of 84% on BC2GM, 89% on NCBI Disease, and 92% on BC4CHEM, while nested NER models have reached 80% on the GENIA dataset, indicating room for improvement. In addition, we discuss persistent challenges, including inconsistent annotation of named entities across corpora and the limited availability of annotated named entities for various entity types, particularly for multi-type or nested NER. To the best of our knowledge, this paper is the first comprehensive review of pre-trained language model-based flat and nested BioNER models, providing a categorical analysis of the methods and related challenges for future research and development in the field.
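
The distinction between flat and nested NER, and the span-level F1-score used to report results, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the `Span` type, the helper names `nested_pairs` and `span_f1`, the example phrase, and its annotations are hypothetical, and the F1 shown is a simple exact-match variant rather than the official scorer of any benchmark.

```python
from typing import List, NamedTuple, Tuple


class Span(NamedTuple):
    """A labeled entity span over token indices (hypothetical type)."""
    start: int  # inclusive token index
    end: int    # exclusive token index
    label: str


def nested_pairs(spans: List[Span]) -> List[Tuple[Span, Span]]:
    """Return (inner, outer) pairs where inner lies within outer.

    An empty result means the annotation is flat in the containment sense.
    """
    pairs = []
    for i, inner in enumerate(spans):
        for j, outer in enumerate(spans):
            if i != j and outer.start <= inner.start and inner.end <= outer.end:
                pairs.append((inner, outer))
    return pairs


def span_f1(gold: List[Span], pred: List[Span]) -> float:
    """Exact-match F1: a prediction counts only if boundaries and label match."""
    gold_set, pred_set = set(gold), set(pred)
    tp = len(gold_set & pred_set)
    precision = tp / len(pred_set) if pred_set else 0.0
    recall = tp / len(gold_set) if gold_set else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative, made-up annotation of the phrase "the IL-2 receptor gene":
# a protein mention embedded inside a longer DNA mention.
tokens = ["the", "IL-2", "receptor", "gene"]
gold = [Span(1, 3, "protein"), Span(1, 4, "DNA")]

# A flat tagger can emit at most one label per token, so here it is
# assumed to recover only the outer entity.
pred = [Span(1, 4, "DNA")]

pairs = nested_pairs(gold)
score = span_f1(gold, pred)
```

In this example the gold annotation contains one nested pair, so a model restricted to non-overlapping output cannot reach full recall on it, which is one reason nested NER is treated as a separate task.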


Pages: 23
