Disambiguation of biomedical acronyms based on a bidirectional recurrent neural network of character-level features

Authors
Kai R. [1]
Na L. [1]
Wei X. [1]
Shi-Wen W. [2]
Affiliations
[1] College of Computer Science, South-Central University for Nationalities, Wuhan
[2] Université Toulouse-Jean-Jaurès, Toulouse
Keywords
Abbreviation; Bi-LSTM; Biomedical; WSD
DOI
10.25103/jestr.126.13
Abstract
Polysemous acronyms are very common in biomedicine, and their senses vary with context. This ambiguity can significantly impair machine understanding of full texts. Most existing studies on acronym disambiguation in the biomedical domain rely on word-level contextual features. These methods require abundant external resources for model training, and their accuracy may drop sharply when such resources are lacking. In this study, biomedical acronym disambiguation was investigated using a character-level feature model, so that disambiguation can be achieved with very limited external corpora. First, sentences containing ambiguous acronyms were extracted through retrieval, and the context feature vectors were initialized using character-level features. Second, these initial vectors were fed into a bidirectional long short-term memory (Bi-LSTM) neural network model for training. Finally, acronyms were disambiguated by applying Softmax classification to the outputs of the neural network model. The results of the character-level feature model were also compared with those of word-level feature models. Results demonstrate that the character-level feature neural network algorithm reaches an average accuracy of 85.82% on a dataset of 106 common biomedical acronyms. Thus, the character-level feature neural network algorithm is superior to the traditional methods, which use a large number of external resources. This study confirms that the disambiguation method based on character-level features is applicable to biomedical acronym disambiguation when relevant data are limited. © 2019 School of Science, IHU.
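The first and last steps described in the abstract (character-level initialization of the context and Softmax classification over candidate senses) can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names, the example sentences, the padding length, and the sense scores are all hypothetical, and the Bi-LSTM that would sit between the two steps is omitted entirely.

```python
import math

def build_char_vocab(sentences):
    """Map each distinct character to an integer id; 0 is reserved for padding."""
    chars = sorted({ch for s in sentences for ch in s})
    return {ch: i + 1 for i, ch in enumerate(chars)}

def encode_chars(sentence, vocab, max_len):
    """Turn a sentence into a fixed-length list of character ids.

    Characters missing from the vocabulary share id 0 with padding,
    a simplification made only for this sketch.
    """
    ids = [vocab.get(ch, 0) for ch in sentence[:max_len]]
    return ids + [0] * (max_len - len(ids))

def softmax(scores):
    """Numerically stable softmax over a list of real-valued scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up context sentences containing an ambiguous acronym.
sentences = ["the PCR assay was positive", "PCR fell after treatment"]
vocab = build_char_vocab(sentences)
encoded = encode_chars(sentences[0], vocab, max_len=32)

# Hypothetical scores for two candidate senses; in the described method
# these would come from the outputs of the trained Bi-LSTM.
sense_scores = [2.0, 0.5]
probs = softmax(sense_scores)
predicted = max(range(len(probs)), key=probs.__getitem__)
```

Working at the character level means the input vocabulary is built from the sentences themselves, which fits the abstract's point that the approach needs few external resources.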
Pages: 105-112
Page count: 7