Comparing Different Methods for Named Entity Recognition in Portuguese Neurology Text

Authors
Fábio Lopes
César Teixeira
Hugo Gonçalo Oliveira
Affiliations
[1] University of Coimbra, Center for Informatics and Systems, Department of Informatics Engineering
Keywords
Natural language processing; Machine learning; Named entity recognition; Portuguese clinical text
DOI
Not available
Abstract
Electronic Medical Records (EMRs) are written in an unstructured way, often in natural language. Information Extraction (IE) can acquire knowledge from such texts, including the automatic recognition of meaningful entities through Named Entity Recognition (NER) models. While most previous work has targeted English, this study tests different methods on Portuguese text, specifically in the Neurology domain, and draws conclusions from the results. The paper compares Conditional Random Fields (CRF), bidirectional Long Short-Term Memory with Conditional Random Fields (BiLSTM-CRF), and a BiLSTM-CRF with residual learning connections, using both Portuguese texts from medical journals and texts from the Neurology Service of the Coimbra Hospital and Universitary Centre (CHUC). It also compares BiLSTM-CRF models using word embeddings (WEs) trained on clinical text with models using WEs trained on general-language text. On texts extracted from the medical journal, the deep learning models achieved F1-scores of nearly 83% under relaxed evaluation and 75% under strict evaluation. On texts collected from the hospital, the same models achieved F1-scores of nearly 71% and 62%, respectively. The work concludes that deep learning models outperform shallow learning models and that in-domain WEs yield better results than general-language WEs, even when the latter are trained on much more text. The results also show that information can be extracted from hospital clinical texts with models trained on clinical cases taken from medical journals, which are openly available. Nevertheless, a healthcare technician is still needed to check that the information has been extracted correctly.
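The abstract reports F1-scores under both relaxed and strict evaluation without defining them. The sketch below illustrates one common reading, which is an assumption and not necessarily the authors' protocol: strict matching requires identical span boundaries and entity type, while relaxed matching accepts any overlapping span of the same type. The `Span` layout, the `f1_score` helper, and the entity labels in the example are hypothetical.

```python
from typing import List, Tuple

# Hypothetical span layout: (start_token, end_token_exclusive, entity_type)
Span = Tuple[int, int, str]


def _overlaps(a: Span, b: Span) -> bool:
    """True if two spans share an entity type and at least one token."""
    return a[2] == b[2] and a[0] < b[1] and b[0] < a[1]


def f1_score(gold: List[Span], pred: List[Span], relaxed: bool = False) -> float:
    """Entity-level F1.

    Assumption: strict = exact boundaries and type must match;
    relaxed = any overlap with a same-type span counts as a match.
    """
    if relaxed:
        matched_pred = sum(any(_overlaps(p, g) for g in gold) for p in pred)
        matched_gold = sum(any(_overlaps(g, p) for p in pred) for g in gold)
    else:
        gold_set, pred_set = set(gold), set(pred)
        matched_pred = sum(p in gold_set for p in pred)
        matched_gold = sum(g in pred_set for g in gold)

    precision = matched_pred / len(pred) if pred else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative annotations with made-up entity labels.
gold = [(0, 2, "Condition"), (5, 6, "Therapeutic")]
pred = [(0, 1, "Condition"), (5, 6, "Therapeutic")]

strict_f1 = f1_score(gold, pred)                 # first span has wrong boundary
relaxed_f1 = f1_score(gold, pred, relaxed=True)  # overlap is enough
```

Under this reading the relaxed score can never be lower than the strict one, which is consistent with the gap between the two figures reported in the abstract.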