Nested Entity Recognition Approach in Chinese Medical Text

Authors
Yan J.-H. [1]
Zong C.-Q. [1, 2]
Xu J.-A. [1]
Affiliations
[1] School of Computer and Information Technology, Beijing Jiaotong University, Beijing
[2] National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing
Source
Ruan Jian Xue Bao/Journal of Software | 2024 / Vol. 35 / No. 06
Keywords
boundary detection; Chinese text; entity recognition; medical field; nested entity recognition
DOI
10.13328/j.cnki.jos.006927
Abstract
Entity recognition is a key technology for information extraction. Compared with ordinary text, Chinese medical text contains a large number of nested entities, which makes entity recognition harder. Previous entity recognition methods often ignore the entity nesting rules specific to medical text and apply sequence labeling directly. This paper therefore proposes a Chinese entity recognition method that incorporates entity nesting rules. During training, the method recasts entity recognition as a joint task of entity boundary recognition and boundary head-tail relation recognition. During decoding, it filters the candidate results with entity nesting rules summarized from actual medical text, so that the recognized entities conform to the way inner and outer entities combine in real text. Experiments on a public medical text entity recognition dataset show that the proposed method is significantly superior to existing methods on nested entities, and that its overall accuracy is 0.5% higher than that of the state-of-the-art methods. © 2024 Chinese Academy of Sciences. All rights reserved.
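The decoding-time filtering described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Span` type, the `score` field, the greedy highest-score-first strategy, and the entries in `ALLOWED_NESTING` are all assumptions made for illustration, since the abstract does not list the actual nesting rules or say how conflicts are resolved.

```python
from typing import NamedTuple


class Span(NamedTuple):
    start: int    # index of the first character (inclusive)
    end: int      # index of the last character (inclusive)
    label: str    # entity type
    score: float  # hypothetical model confidence for this span


# Hypothetical (outer type, inner type) pairs that are allowed to nest.
# The real rules are summarized from medical text in the paper.
ALLOWED_NESTING = {("disease", "body"), ("symptom", "body")}


def contains(outer: Span, inner: Span) -> bool:
    """True if `inner` lies strictly inside `outer` (bounds not identical)."""
    same_bounds = (outer.start, outer.end) == (inner.start, inner.end)
    return outer.start <= inner.start and inner.end <= outer.end and not same_bounds


def filter_by_nesting_rules(candidates, allowed=ALLOWED_NESTING):
    """Greedily keep high-scoring spans whose nesting agrees with `allowed`."""
    kept = []
    for cand in sorted(candidates, key=lambda s: s.score, reverse=True):
        ok = True
        for k in kept:
            if contains(k, cand):
                ok = (k.label, cand.label) in allowed
            elif contains(cand, k):
                ok = (cand.label, k.label) in allowed
            if not ok:
                break
        if ok:
            kept.append(cand)
    return sorted(kept, key=lambda s: (s.start, s.end))


# Candidate spans as they might come out of boundary and head-tail pairing.
candidates = [
    Span(0, 5, "disease", 0.9),
    Span(0, 3, "body", 0.8),
    Span(2, 5, "drug", 0.6),  # nested in the disease span, but not an allowed pair
]
result = filter_by_nesting_rules(candidates)
print([s.label for s in result])
```

In this sketch the "drug" span is discarded because ("disease", "drug") is not among the assumed allowed pairs, while the "body" span nested inside the "disease" span is kept. Partially overlapping spans are left untouched here; a fuller decoder would need its own policy for them.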
Pages: 2923-2935
Page count: 12