Improving the Named Entity Recognition of Chinese Electronic Medical Records by Combining Domain Dictionary and Rules

Cited by: 19
Authors
Chen, Xianglong [1]
Ouyang, Chunping [1]
Liu, Yongbin [1]
Bu, Yi [2]
Affiliations
[1] Univ South China, Sch Comp, Hengyang 421001, Peoples R China
[2] Indiana Univ, Luddy Sch Informat Comp & Engn, Ctr Complex Networks & Syst Res, Bloomington, IN 47408 USA
Funding
National Natural Science Foundation of China
Keywords
entity recognition; electronic medical records; Bi-LSTM-CRF; rules; domain dictionary
DOI
10.3390/ijerph17082687
Chinese Library Classification number
X [Environmental Science, Safety Science]
Subject classification codes
08; 0830
Abstract
Electronic medical records are an integral part of medical texts, and entity recognition in these records has motivated many studies and many entity extraction methods. This paper proposes an entity extraction model for Chinese Electronic Medical Records (CEMR). In the input layer, the model uses word embeddings and dictionary feature embeddings as input vectors, where the word embedding consists of a character representation and a word representation. The input vectors are then fed to a bidirectional long short-term memory (Bi-LSTM) network to capture contextual features. Finally, a conditional random field (CRF) is employed to capture dependencies between neighboring tags. On a body classification task, the F1 value reached 90.65%; on an anatomic region recognition task, the F1 value reached 93.89%. On both tasks, the model outperformed state-of-the-art models such as Bi-LSTM-CRF, Bi-LSTM-Attention, and Vote. The experiments indicate that the model handles low-frequency entities and unknown entities well; with a small training dataset, it showed a 2-4% improvement in F1 value over the basic Bi-LSTM-CRF models. Additionally, on the anatomic region recognition task, 12 rules designed by the authors and a domain dictionary were applied alongside the proposed entity extraction model. In this task, the weighted F1 value for extracting the three specific entity types reached 84.36%.
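The abstract does not say how the dictionary features are derived, so the following is only a minimal sketch of one common approach: forward maximum matching against a domain dictionary, emitting a per-character position tag (B/M/E/S for begin/middle/end/single, O for no match) that could later be embedded and concatenated with the character and word representations. The function name `dictionary_features`, the tag scheme, the `max_len` parameter, and the sample lexicon are all assumptions made for illustration, not details taken from the paper.

```python
from typing import List, Set


def dictionary_features(text: str, lexicon: Set[str], max_len: int = 6) -> List[str]:
    """Tag each character by its position inside a dictionary match.

    Hypothetical helper: uses forward maximum matching, which is an
    assumption and not necessarily the method used in the paper.
    """
    tags = ["O"] * len(text)
    i = 0
    while i < len(text):
        match_len = 0
        # Try the longest candidate first so longer entries win.
        for length in range(min(max_len, len(text) - i), 0, -1):
            if text[i:i + length] in lexicon:
                match_len = length
                break
        if match_len == 0:
            i += 1  # no dictionary entry starts here
            continue
        if match_len == 1:
            tags[i] = "S"  # single-character entry
        else:
            tags[i] = "B"
            for j in range(i + 1, i + match_len - 1):
                tags[j] = "M"
            tags[i + match_len - 1] = "E"
        i += match_len  # skip past the matched span
    return tags


# Illustrative lexicon of anatomical terms (made up for this example).
sample_lexicon = {"胃", "十二指肠"}
print(dictionary_features("胃及十二指肠未见异常", sample_lexicon))
# ['S', 'O', 'B', 'M', 'M', 'E', 'O', 'O', 'O', 'O']
```

Under this sketch, each tag would be mapped to a small trainable embedding before entering the Bi-LSTM; that wiring is omitted here because the abstract gives no dimensions or layer details.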
Pages: 16
Related papers
50 in total
  • [22] Named entity recognition of Chinese electronic medical records based on a hybrid neural network and medical MC-BERT
    Chen, Peng
    Zhang, Meng
    Yu, Xiaosheng
    Li, Songpu
    BMC MEDICAL INFORMATICS AND DECISION MAKING, 2022, 22 (01)
  • [23] A hybrid approach for named entity recognition in Chinese electronic medical record
    Ji, Bin
    Liu, Rui
    Li, Shasha
    Yu, Jie
    Wu, Qingbo
    Tan, Yusong
    Wu, Jiaju
    BMC MEDICAL INFORMATICS AND DECISION MAKING, 2019, 19 (Suppl 2)
  • [25] Utilizing Chinese Dictionary Information in Named Entity Recognition
    Hu, Yun
    Liao, Mingxue
    Lv, Pin
    Zheng, Changwen
    COGNITIVE SYSTEMS AND SIGNAL PROCESSING, PT II, 2019, 1006 : 17 - 26
  • [26] An attention-based deep learning model for clinical named entity recognition of Chinese electronic medical records
    Li, Luqi
    Zhao, Jie
    Hou, Li
    Zhai, Yunkai
    Shi, Jinming
    Cui, Fangfang
    BMC MEDICAL INFORMATICS AND DECISION MAKING, 2019, 19 (01)
  • [27] Overview of CCKS 2020 Task 3: Named Entity Recognition and Event Extraction in Chinese Electronic Medical Records
    Li, Xia
    Wen, Qinghua
    Lin, Hu
    Jiao, Zengtao
    Zhang, Jiangtao
    DATA INTELLIGENCE, 2021, 3 (03) : 376 - 388
  • [30] Medical Named Entity Recognition with Domain Knowledge
    Pei W.
    Sun S.
    Li X.
    Lu J.
    Yang L.
    Wu Y.
    Data Analysis and Knowledge Discovery, 2023, 7 (03) : 142 - 154