Hybrid medical named entity recognition using document structure and surrounding context

Authors
Mohamed Yassine Landolsi
Lotfi Ben Romdhane
Lobna Hlaoua
Affiliations
[1] ISITCom, MARS Research Lab LR17ES05, SDM Research Group
[2] University of Sousse
Keywords
Medical text mining; Named entity recognition; Machine learning; Information extraction; Electronic medical records; Section identification
DOI
Not available
Abstract
Medical specialists now produce a huge volume of electronic medical documents in natural language, and these documents contain information needed for several medical tasks. However, reading them to find specific information is tedious. Automatic information extraction has therefore become an essential and challenging task, especially Named Entity Recognition (NER). NER is crucial for extracting valuable information used in medical tasks such as clinical decision support and drug safety surveillance. Capturing sufficient context is important for effective NER, yet some important context information is not well exploited in the literature. Usually, a standard sequence segmentation such as sentence segmentation is used, which may not cover sufficient context. In this paper, we propose a supervised NER method called MedSINE (Medical Section Identification to enhance the Named Entity tagging), which treats NER as a sequence tagging task using a Bidirectional Long Short-Term Memory neural network with a Conditional Random Field (BiLSTM-CRF). To this end, we exploit layout information to segment the text into chunk sequences and to extract the parent sections of each word as features that provide sufficient context. In addition, we use clinical Bidirectional Encoder Representations from Transformers (BERT) word embeddings, Part of Speech (PoS) tags, and entity surrounding sequence features. Experiments were conducted on a manually annotated dataset of real Summary of Product Characteristics (SmPC) medical documents in PDF format and on the Colorado Richly Annotated Full Text (CRAFT) corpus. 
Our model achieved an F1-measure of 89.49% and 73.52% in terms of strict matching evaluation using the SmPC and CRAFT datasets, respectively. The results show that employing the sequence of parent sections improves the F1-measure by 4.71% in terms of strict matching evaluation.
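The strict matching evaluation mentioned in the abstract is commonly understood as counting a predicted entity as correct only when both its boundaries and its type equal those of a gold entity. The following is a minimal sketch of that idea, assuming BIO-encoded tags; the tagging scheme, the function names `bio_to_spans` and `strict_f1`, and the `DRUG`/`DOSE` labels are illustrative assumptions and are not taken from the paper.

```python
from typing import List, Set, Tuple

# (start token index, end token index exclusive, entity type)
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a BIO tag sequence.

    An I- tag that does not continue an open entity of the same type
    is ignored (assumption: stray I- tags do not start an entity).
    """
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing entity
        continues = tag.startswith("I-") and label == tag[2:]
        if not continues:
            if label is not None:
                spans.add((start, i, label))
                start, label = None, None
            if tag.startswith("B-"):
                start, label = i, tag[2:]
    return spans


def strict_f1(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under exact boundary-and-type matching."""
    gold_spans, pred_spans = bio_to_spans(gold), bio_to_spans(pred)
    true_pos = len(gold_spans & pred_spans)
    precision = true_pos / len(pred_spans) if pred_spans else 0.0
    recall = true_pos / len(gold_spans) if gold_spans else 0.0
    f1 = 2 * precision * recall / (precision + recall) if true_pos else 0.0
    return precision, recall, f1


# Hypothetical example: the DRUG entity is predicted with a wrong right
# boundary, so only the DOSE entity counts as a strict match.
gold_tags = ["B-DRUG", "I-DRUG", "O", "B-DOSE", "O"]
pred_tags = ["B-DRUG", "O", "O", "B-DOSE", "O"]
print(strict_f1(gold_tags, pred_tags))
```

In this toy case one of two predicted entities matches one of two gold entities, so precision, recall, and F1 are each 0.5. A relaxed (overlap-based) evaluation would also credit the partially matched `DRUG` span, which is why strict scores are typically the lower of the two.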
Pages: 5011 - 5041
Page count: 30