Comparative study of text representation and learning for Persian named entity recognition

Cited by: 2

Authors:

Pour, Mohammad Mahdi Abdollah [1]

Momtazi, Saeedeh [1]

Affiliations:

[1] Amirkabir Univ Technol, Comp Engn Dept, Tehran, Iran

Source:

ETRI JOURNAL | 2022 / Vol. 44 / Issue 05

Keywords:

contextualized representation; NER; Persian language processing;

DOI:

10.4218/etrij.2021-0269

Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

Transformer models have had a great impact on natural language processing (NLP) in recent years by realizing outstanding and efficient contextualized language models. Recent studies have used transformer-based language models for various NLP tasks, including Persian named entity recognition (NER). However, in complex tasks, for example, NER, it is difficult to determine which contextualized embedding will produce the best representation for the tasks. Considering the lack of comparative studies to investigate the use of different contextualized pretrained models with sequence modeling classifiers, we conducted a comparative study about using different classifiers and embedding models. In this paper, we use different transformer-based language models tuned with different classifiers, and we evaluate these models on the Persian NER task. We perform a comparative analysis to assess the impact of text representation and text classification methods on Persian NER performance. We train and evaluate the models on three different Persian NER datasets, that is, MoNa, Peyma, and Arman. Experimental results demonstrate that XLM-R with a linear layer and conditional random field (CRF) layer exhibited the best performance. This model achieved phrase-based F-measures of 70.04, 86.37, and 79.25 and word-based F scores of 78, 84.02, and 89.73 on the MoNa, Peyma, and Arman datasets, respectively. These results represent state-of-the-art performance on the Persian NER task.


Pages: 794-804

Page count: 11

