Comparative study of text representation and learning for Persian named entity recognition

Cited by: 2
Authors
Pour, Mohammad Mahdi Abdollah [1]
Momtazi, Saeedeh [1]
Affiliations
[1] Amirkabir Univ Technol, Comp Engn Dept, Tehran, Iran
Keywords
contextualized representation; NER; Persian language processing
DOI
10.4218/etrij.2021-0269
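Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic and communication technology]
Subject classification code
0808; 0809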
Abstract
Transformer models have had a great impact on natural language processing (NLP) in recent years by enabling outstanding and efficient contextualized language models. Recent studies have used transformer-based language models for various NLP tasks, including Persian named entity recognition (NER). However, for complex tasks such as NER, it is difficult to determine which contextualized embedding produces the best representation. Given the lack of comparative studies on combining different contextualized pretrained models with sequence modeling classifiers, we conduct a comparative study of different classifiers and embedding models. In this paper, we use different transformer-based language models tuned with different classifiers and evaluate them on the Persian NER task, assessing the impact of text representation and text classification methods on Persian NER performance. We train and evaluate the models on three Persian NER datasets: MoNa, Peyma, and Arman. Experimental results show that XLM-R with a linear layer and a conditional random field (CRF) layer performs best. This model achieves phrase-based F-measures of 70.04, 86.37, and 79.25 and word-based F-measures of 78, 84.02, and 89.73 on the MoNa, Peyma, and Arman datasets, respectively. These results represent state-of-the-art performance on the Persian NER task.
Pages: 794-804
Number of pages: 11
Related papers
50 in total
  • [31] One Class per Named Entity: Exploiting Unlabeled Text for Named Entity Recognition
    Wong, Yingchuan
    Ng, Hwee Tou
    20TH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2007, : 1763 - 1768
  • [32] Various approaches to text representation for named entity disambiguation
    Lasek, Ivo
    Vojtas, Peter
    INTERNATIONAL JOURNAL OF WEB INFORMATION SYSTEMS, 2013, 9 (03) : 242 - +
  • [33] Cost-aware active learning for named entity recognition in clinical text
    Wei, Qiang
    Chen, Yukun
    Salimi, Mandana
    Denny, Joshua C.
    Mei, Qiaozhu
    Lasko, Thomas A.
    Chen, Qingxia
    Wu, Stephen
    Franklin, Amy
    Cohen, Trevor
    Xu, Hua
    JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION, 2019, 26 (11) : 1314 - 1322
  • [34] Continual Learning for Named Entity Recognition
    Monaikul, Natawut
    Castellucci, Giuseppe
    Filice, Simone
    Rokhlenko, Oleg
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 13570 - 13577
  • [35] BiLSTM-CRF for Persian Named-Entity Recognition
    Poostchi, Hanieh
    Borzeshi, Ehsan Zare
    Piccardi, Massimo
    PROCEEDINGS OF THE ELEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2018), 2018, : 4427 - 4431
  • [36] Persian Named Entity Recognition by Gray Wolf Optimization Algorithm
    Forouzandeh, Aynaz
    Feizi-Derakhshi, Mohammad-Reza
    Gholami-Dastgerdi, Pejman
    SCIENTIFIC PROGRAMMING, 2022, 2022
  • [37] Ensemble Learning for Named Entity Recognition
    Speck, Rene
    Ngomo, Axel-Cyrille Ngonga
    SEMANTIC WEB - ISWC 2014, PT I, 2014, 8796 : 519 - 534
  • [38] Chinese Named Entity Recognition Based on Multi-Level Representation Learning
    Li, Weijun
    Ding, Jianping
    Liu, Shixia
    Liu, Xueyang
    Su, Yilei
    Wang, Ziyi
    APPLIED SCIENCES-BASEL, 2024, 14 (19):
  • [39] Joint Learning of Named Entity Recognition and Entity Linking
    Martins, Pedro Henrique
    Marinho, Zita
    Martins, Andre F. T.
    57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019): STUDENT RESEARCH WORKSHOP, 2019, : 190 - 196
  • [40] Nested named entity recognition in historical archive text
    Byrne, Kate
    ICSC 2007: INTERNATIONAL CONFERENCE ON SEMANTIC COMPUTING, PROCEEDINGS, 2007, : 589 - 596