AraXLNet: pre-trained language model for sentiment analysis of Arabic

Cited by: 0
Authors
Alhanouf Alduailej
Abdulrahman Alothaim
Affiliations
[1] King Saud University, Department of Information Systems, College of Computer and Information Sciences
Keywords
Sentiment analysis; Language models; NLP; XLNet; AraXLNet; Text mining
DOI: not available
Abstract
Arabic is a complex, low-resource language, and these limitations make it challenging to produce accurate text classification systems for tasks such as sentiment analysis. The main goal of sentiment analysis is to determine the overall orientation of a given text: positive, negative, or neutral. Recently, language models have greatly improved the accuracy of text classification in English. Such models are pre-trained on a large dataset and then fine-tuned on downstream tasks. In particular, XLNet has achieved state-of-the-art results on diverse natural language processing (NLP) tasks in English. In this paper, we hypothesize that parallel success can be achieved in Arabic. The paper supports this hypothesis by producing the first XLNet-based language model for Arabic, called AraXLNet, and demonstrating its use in Arabic sentiment analysis to improve the prediction accuracy of such tasks. The results showed that the proposed model, AraXLNet, with the Farasa segmenter achieved accuracies of 94.78%, 93.01%, and 85.77% on the Arabic sentiment analysis task using multiple benchmark datasets. This outperformed AraBERT, which obtained 84.65%, 92.13%, and 85.05% on the same datasets, respectively. The improved accuracy of the proposed model was evident across multiple benchmark datasets, offering a promising advancement in Arabic text classification tasks.
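The accuracies reported above are three-class classification accuracies over positive/negative/neutral labels. As a minimal illustrative sketch (not code from the paper; the label values and example data are hypothetical), the metric can be computed as:

```python
# Three-class sentiment accuracy: the fraction of predictions that
# exactly match the gold labels ("positive", "negative", or "neutral").
# The example data below is illustrative, not from the paper's datasets.

def accuracy(gold, predicted):
    """Return the fraction of predictions that match the gold labels."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must be the same length")
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)

gold = ["positive", "negative", "neutral", "positive", "negative"]
pred = ["positive", "negative", "positive", "positive", "neutral"]
print(f"accuracy = {accuracy(gold, pred):.2%}")  # 3 of 5 correct -> 60.00%
```

In the paper's setting, the gold labels would come from a benchmark dataset and the predictions from the fine-tuned AraXLNet (or AraBERT) classifier; the metric itself is model-agnostic.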
Related papers
50 results
  • [41] Chinese Fine-Grained Sentiment Classification Based on Pre-trained Language Model and Attention Mechanism
    Zhou, Faguo
    Zhang, Jing
    Song, Yanan
    SMART COMPUTING AND COMMUNICATION, 2022, 13202 : 37 - 47
  • [42] Context Analysis for Pre-trained Masked Language Models
    Lai, Yi-An
    Lalwani, Garima
    Zhang, Yi
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020, : 3789 - 3804
  • [43] Enhancing Language Generation with Effective Checkpoints of Pre-trained Language Model
    Park, Jeonghyeok
    Zhao, Hai
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021, : 2686 - 2694
  • [44] BERT for Sentiment Analysis: Pre-trained and Fine-Tuned Alternatives
    Souza, Frederico Dias
    de Oliveira e Souza Filho, Joao Baptista
    COMPUTATIONAL PROCESSING OF THE PORTUGUESE LANGUAGE, PROPOR 2022, 2022, 13208 : 209 - 218
  • [45] Roman Urdu Sentiment Analysis Using Pre-trained DistilBERT and XLNet
    Azhar, Nikhar
    Latif, Seemab
    2022 FIFTH INTERNATIONAL CONFERENCE OF WOMEN IN DATA SCIENCE AT PRINCE SULTAN UNIVERSITY (WIDS-PSU 2022), 2022, : 75 - 78
  • [46] LETS: A Label-Efficient Training Scheme for Aspect-Based Sentiment Analysis by Using a Pre-Trained Language Model
    Shim, Heereen
    Lowet, Dietwig
    Luca, Stijn
    Vanrumste, Bart
    IEEE ACCESS, 2021, 9 : 115563 - 115578
  • [47] An Enhanced Sentiment Analysis Framework Based on Pre-Trained Word Embedding
    Mohamed, Ensaf Hussein
    Moussa, Mohammed ElSaid
    Haggag, Mohamed Hassan
    INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE AND APPLICATIONS, 2020, 19 (04)
  • [48] Aspect Based Sentiment Analysis using French Pre-Trained Models
    Essebbar, Abderrahman
    Kane, Bamba
    Guinaudeau, Ophelie
    Chiesa, Valeria
    Quenel, Ilhem
    Chau, Stephane
    ICAART: PROCEEDINGS OF THE 13TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE - VOL 1, 2021, : 519 - 525
  • [49] SsciBERT: a pre-trained language model for social science texts
    Si Shen
    Jiangfeng Liu
    Litao Lin
    Ying Huang
    Lin Zhang
    Chang Liu
    Yutong Feng
    Dongbo Wang
    Scientometrics, 2023, 128 : 1241 - 1263
  • [50] A Pre-trained Clinical Language Model for Acute Kidney Injury
    Mao, Chengsheng
    Yao, Liang
    Luo, Yuan
    2020 8TH IEEE INTERNATIONAL CONFERENCE ON HEALTHCARE INFORMATICS (ICHI 2020), 2020, : 531 - 532