Improving the Named Entity Recognition of Chinese Electronic Medical Records by Combining Domain Dictionary and Rules

Cited by: 19
Authors
Chen, Xianglong [1]
Ouyang, Chunping [1]
Liu, Yongbin [1]
Bu, Yi [2]
Affiliations
[1] Univ South China, Sch Comp, Hengyang 421001, Peoples R China
[2] Indiana Univ, Luddy Sch Informat Comp & Engn, Ctr Complex Networks & Syst Res, Bloomington, IN 47408 USA
Funding
National Natural Science Foundation of China
Keywords
entity recognition; electronic medical records; Bi-LSTM-CRF; rules; domain dictionary
DOI
10.3390/ijerph17082687
Chinese Library Classification (CLC) number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
Electronic medical records are an integral part of medical texts, and entity recognition on them has prompted many studies and many entity extraction methods. This paper proposes an entity extraction model for Chinese Electronic Medical Records (CEMR). In the input layer, the model uses a word embedding and a dictionary feature embedding as input vectors, where the word embedding consists of a character representation and a word representation. The input vectors are then fed to a bidirectional long short-term memory (Bi-LSTM) network to capture contextual features. Finally, a conditional random field (CRF) captures dependencies between neighboring tags. In experiments, the F1 value reached 90.65% on the body classification task and 93.89% on the anatomic region recognition task. On both tasks, the model outperformed state-of-the-art models such as Bi-LSTM-CRF, Bi-LSTM-Attention, and Vote. The experiments also show that the model handles low-frequency entities and unknown entities well; with a small training dataset, it improved the F1 value by 2-4% over the basic Bi-LSTM-CRF models. Additionally, on the anatomic region recognition task, 12 rules designed by the authors and a domain dictionary were applied alongside the proposed entity extraction model. With these additions, the weighted F1 value for extracting the three specific entity types reached 84.36%.
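The abstract does not say how the dictionary features are derived, so the following is only a minimal sketch of one common approach: greedy forward maximum matching against a domain lexicon, emitting a B/I/E/S/O tag per character. The function name `dictionary_features`, the `max_len` parameter, the tag scheme, and the toy lexicon are all assumptions made for illustration, not the authors' implementation. In a model of the kind described, such tags would typically be mapped to an embedding and concatenated with the character and word representations before the Bi-LSTM layer.

```python
from typing import List, Set


def dictionary_features(text: str, lexicon: Set[str], max_len: int = 6) -> List[str]:
    """Tag each character B/I/E/S if it lies inside a lexicon match, else O.

    Assumption: greedy forward maximum matching, i.e. at each position the
    longest lexicon entry starting there (up to max_len characters) wins.
    """
    tags = ["O"] * len(text)
    i = 0
    while i < len(text):
        match_len = 0
        # Try the longest candidate first, then shrink the window.
        for n in range(min(max_len, len(text) - i), 0, -1):
            if text[i:i + n] in lexicon:
                match_len = n
                break
        if match_len == 0:
            i += 1  # no entry starts here; leave the tag as O
            continue
        if match_len == 1:
            tags[i] = "S"  # single-character entry
        else:
            tags[i] = "B"
            for j in range(i + 1, i + match_len - 1):
                tags[j] = "I"
            tags[i + match_len - 1] = "E"
        i += match_len  # skip past the matched entry
    return tags


# Hypothetical toy lexicon and sentence, for illustration only.
lexicon = {"左肺", "左肺上叶", "胃"}
sentence = "左肺上叶见结节"
features = dictionary_features(sentence, lexicon)
print(list(zip(sentence, features)))
```

Maximum matching prefers the longer entry "左肺上叶" over its prefix "左肺", which is one simple way to resolve overlapping dictionary hits; other schemes (for example, recording all matches per character) are equally plausible.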
Pages: 16
Related papers
50 items in total
  • [31] Medical named entity recognition of Chinese electronic medical records based on stacked Bidirectional Long Short-Term Memory
    Zhu, Zhichao
    Li, Jianqiang
    Zhao, Qing
    Wei, Yu-Chih
    Jia, Yanhe
    2021 IEEE 45TH ANNUAL COMPUTERS, SOFTWARE, AND APPLICATIONS CONFERENCE (COMPSAC 2021), 2021, : 1930 - 1935
  • [32] Deep learning for named entity recognition on Chinese electronic medical records: Combining deep transfer learning with multitask bi-directional LSTM RNN
    Dong, Xishuang
    Chowdhury, Shanta
    Qian, Lijun
    Li, Xiangfang
    Guan, Yi
    Yang, Jinfeng
    Yu, Qiubin
    PLOS ONE, 2019, 14 (05):
  • [33] Named Entity Recognition in Chinese Electronic Medical Record Using Attention Mechanism
    Li, Menglong
    Zhang, Yu
    Huang, Mengxing
    Chen, Jing
    Feng, Wenlong
    2019 INTERNATIONAL CONFERENCE ON INTERNET OF THINGS (ITHINGS) AND IEEE GREEN COMPUTING AND COMMUNICATIONS (GREENCOM) AND IEEE CYBER, PHYSICAL AND SOCIAL COMPUTING (CPSCOM) AND IEEE SMART DATA (SMARTDATA), 2019, : 649 - 654
  • [34] Chinese Electronic Medical Record Named Entity Recognition based on FastBERT method
    Tuo, Jianyong
    Liu, Zhanzhan
    Chen, Qing
    Ma, Xin
    Wang, Youqing
    PROCEEDINGS OF THE 33RD CHINESE CONTROL AND DECISION CONFERENCE (CCDC 2021), 2021, : 2405 - 2410
  • [35] Extracting clinical named entity for pituitary adenomas from Chinese electronic medical records
    Fang, An
    Hu, Jiahui
    Zhao, Wanqing
    Feng, Ming
    Fu, Ji
    Feng, Shanshan
    Lou, Pei
    Ren, Huiling
    Chen, Xianlai
    BMC MEDICAL INFORMATICS AND DECISION MAKING, 2022, 22 (01)
  • [36] Named entity recognition in Chinese medical records based on cascaded conditional random field
    College of Communication Engineering, Jilin University, Changchun 130012, China
    Unknown, 130032, China
    Unknown, AB T9S3A3, Canada
    Jilin Daxue Xuebao (Gongxueban), 6 (1843-1848)
  • [37] Extracting clinical named entity for pituitary adenomas from Chinese electronic medical records
    An Fang
    Jiahui Hu
    Wanqing Zhao
    Ming Feng
    Ji Fu
    Shanshan Feng
    Pei Lou
    Huiling Ren
    Xianlai Chen
    BMC Medical Informatics and Decision Making, 22
  • [38] Medical Named Entity Recognition from Un-labelled Medical Records based on Pre-trained Language Models and Domain Dictionary
    Wen, Chaojie
    Chen, Tao
    Jia, Xudong
    Zhu, Jiang
    DATA INTELLIGENCE, 2021, 3 (03) : 402 - 417
  • [39] IMPROVING CHINESE NAMED ENTITY RECOGNITION WITH LEXICAL INFORMATION
    Fu, Guo-Hong
    PROCEEDINGS OF 2009 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-6, 2009, : 3487 - 3491
  • [40] Named entity recognition over electronic health records through a combined dictionary-based approach
    Pomares Quimbaya, Alexandra
    Sierra Munera, Alejandro
    Gonzalez Rivera, Rafael Andres
    Daza Rodriguez, Julian Camilo
    Munoz Velandia, Oscar Mauricio
    Garcia Pena, Angel Alberto
    Labbe, Cyril
    INTERNATIONAL CONFERENCE ON ENTERPRISE INFORMATION SYSTEMS/INTERNATIONAL CONFERENCE ON PROJECT MANAGEMENT/INTERNATIONAL CONFERENCE ON HEALTH AND SOCIAL CARE INFORMATION SYSTEMS AND TECHNOLOGIES, CENTERIS/PROJMAN / HCIST 2016, 2016, 100 : 55 - 61