Identifying Medical Named Entities with Word Information

Cited by: 0
Authors
Ben Y. [1 ]
Pang X. [2 ]
Affiliations
[1] School of Mathematics and Statistics, Huazhong University of Science and Technology, Wuhan
[2] Archives of Wuhan University of Science and Technology, Wuhan
Funding
National Natural Science Foundation of China
Keywords
Chinese Named Entity Recognition; MacBERT; Online Medical Consultation; Weighted Cross Entropy; Word Information Embedding;
DOI
10.11925/infotech.2096-3467.2022.0547
Abstract
[Objective] This paper utilizes word information to identify and infer key clinical features in online consultation records and to address the difficulty of recognizing named-entity boundaries. [Methods] First, we constructed a new model based on MacBERT and conditional random fields. Then, we embedded word position and part of speech as dialogue text information via speaker-role embedding. Finally, we used weighted multi-class cross-entropy to address entity category imbalance. [Results] We conducted an empirical study on online consultation records from Chunyu Doctor. The F1 value of the proposed model on the named entity recognition task was 74.35%, nearly 2 percentage points higher than using the MacBERT model directly. [Limitations] We did not design a dedicated model for Chinese word segmentation. [Conclusions] Our new model, with more dimensional features, can effectively improve the recognition of key clinical features. © 2023 Data Analysis and Knowledge Discovery. All rights reserved.
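The weighted multi-class cross-entropy mentioned in the abstract can be sketched as follows. The label set, weight values, and function name are illustrative assumptions, not details from the paper; the paper's actual weighting scheme is not specified here.

```python
import math

def weighted_cross_entropy(probs, target, weights):
    """Weighted multi-class cross-entropy for a single token.

    probs   -- predicted probability for each class (softmax output, sums to 1)
    target  -- index of the gold class
    weights -- per-class weight; rare entity classes get larger weights
               so that errors on them contribute more to the loss (assumed
               inverse-frequency style weighting, for illustration only)
    """
    return -weights[target] * math.log(probs[target])

# Hypothetical 3-class tag set: O (frequent), B-SYMPTOM, I-SYMPTOM (rare).
weights = [0.5, 2.0, 2.0]   # assumed weights favoring the rare entity tags
probs = [0.7, 0.2, 0.1]     # example model output for one token

loss_rare = weighted_cross_entropy(probs, 1, weights)  # gold tag is rare
loss_freq = weighted_cross_entropy(probs, 0, weights)  # gold tag is frequent
print(loss_rare > loss_freq)  # rare-class errors are penalized more
```

Averaging this per-token loss over a batch gives a training objective in which under-represented entity categories are not drowned out by the dominant non-entity tag.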
Pages: 123-132 (9 pages)