Disambiguation of biomedical acronyms based on a bidirectional recurrent neural network of character-level features

Cited by: 0

Authors:

Kai R. [1]

Na L. [1]

Wei X. [1]

Shi-Wen W. [2]

Affiliations:

[1] College of Computer Science, South-Central University for Nationalities, Wuhan

[2] Université Toulouse-Jean-Jaurès, Toulouse

Source:

Journal of Engineering Science and Technology Review | 2019, Vol. 12, No. 6

Keywords:

Abbreviation; Bi-LSTM; Biomedical; WSD;

DOI:

10.25103/jestr.126.13

Abstract:

Polysemous acronyms are very common in biomedicine: the same acronym takes different senses in different contexts. This ambiguity can significantly impair machine understanding of the full text. Most existing studies on acronym disambiguation in the biomedical domain rely on word-level contextual features. These methods require abundant external resources for model training, and their disambiguation accuracy may drop sharply when such resources are lacking. This study investigates the disambiguation of biomedical acronyms with a character-level feature model, so that disambiguation remains feasible when external corpora are very limited. First, sentences containing ambiguous acronyms were extracted through retrieval, and the feature vectors of the context were initialized from character-level features. Second, these initial vectors were fed into a bidirectional long short-term memory (Bi-LSTM) neural network model for training. Finally, the outputs of the neural network model were passed to a Softmax classifier to select the sense of the acronym. The results of the character-level feature model were also compared with those of word-level feature models. The character-level feature neural network algorithm reaches an average accuracy of 85.82% on a dataset of 106 common biomedical acronyms. Thus, the character-level feature neural network algorithm is superior to the traditional methods, which use a large number of external resources. This study confirms that the disambiguation method based on character-level features is applicable to the disambiguation of biomedical acronyms when relevant data are limited. © 2019 School of Science, IHU.
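The three steps described in the abstract (character-level initialization, a bidirectional LSTM, Softmax classification) can be illustrated with a minimal forward-pass sketch in plain Python. This is not the authors' implementation: the acronym "PCA", its two senses, the example sentences, the layer sizes, and all names (`build_char_vocab`, `LSTM`, `classify`) are assumptions made for illustration, and the weights are random and untrained, so the printed probabilities carry no meaning beyond showing the data flow.

```python
import math
import random

def build_char_vocab(sentences):
    """Map every character seen to an integer id; id 0 is reserved for unknown characters."""
    chars = sorted({ch for s in sentences for ch in s})
    return {ch: i + 1 for i, ch in enumerate(chars)}

def encode(sentence, vocab):
    """Turn a sentence into a list of character ids (character-level features)."""
    return [vocab.get(ch, 0) for ch in sentence]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

class LSTM:
    """A single-layer LSTM with random, untrained weights (illustration only)."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        # One weight matrix and bias per gate: input, forget, output, candidate (g).
        self.w = {g: rand_matrix(hidden_size, input_size + hidden_size, rng) for g in "ifog"}
        self.b = {g: [0.0] * hidden_size for g in "ifog"}

    def run(self, inputs):
        """Process a sequence of vectors and return the final hidden state."""
        h = [0.0] * self.hidden_size
        c = [0.0] * self.hidden_size
        for x in inputs:
            z = x + h  # concatenate the current input with the previous hidden state
            pre = {g: [a + b for a, b in zip(matvec(self.w[g], z), self.b[g])] for g in "ifog"}
            i = [sigmoid(v) for v in pre["i"]]
            f = [sigmoid(v) for v in pre["f"]]
            o = [sigmoid(v) for v in pre["o"]]
            cand = [math.tanh(v) for v in pre["g"]]
            c = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, cand)]
            h = [ov * math.tanh(cv) for ov, cv in zip(o, c)]
        return h

def classify(sentence, vocab, embed, fwd, bwd, out_w, senses):
    """Character ids -> embeddings -> forward and backward LSTM -> softmax over senses."""
    vectors = [embed[i] for i in encode(sentence, vocab)]
    # Bidirectional: one pass left-to-right, one right-to-left, states concatenated.
    h = fwd.run(vectors) + bwd.run(vectors[::-1])
    return dict(zip(senses, softmax(matvec(out_w, h))))

# Hypothetical example data (not taken from the paper's dataset).
rng = random.Random(0)
sentences = [
    "The patient was given PCA for pain control.",
    "PCA was applied to the expression data.",
]
senses = ["patient-controlled analgesia", "principal component analysis"]

EMBED, HIDDEN = 8, 6  # arbitrary sizes chosen for the sketch
vocab = build_char_vocab(sentences)
embed = rand_matrix(len(vocab) + 1, EMBED, rng)  # row 0 is the unknown character
fwd, bwd = LSTM(EMBED, HIDDEN, rng), LSTM(EMBED, HIDDEN, rng)
out_w = rand_matrix(len(senses), 2 * HIDDEN, rng)

probs = classify(sentences[0], vocab, embed, fwd, bwd, out_w, senses)
print(probs)
```

In a trained model the embedding table, the LSTM weights, and the output layer would be learned from labeled sentences; a framework such as PyTorch or TensorFlow would normally replace the hand-written cell above.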

Pages: 105-112

Page count: 7
