AraXLNet: pre-trained language model for sentiment analysis of Arabic

Cited: 0
Authors
Alhanouf Alduailej
Abdulrahman Alothaim
Affiliations
[1] King Saud University, Department of Information Systems, College of Computer and Information Sciences
Keywords
Sentiment analysis; Language models; NLP; XLNet; AraXLNet; Text mining;
DOI: not available
Abstract
Arabic is a complex language with limited resources, and these limitations make it challenging to produce accurate text classification, including sentiment analysis. The main goal of sentiment analysis is to determine the overall orientation of a given text: positive, negative, or neutral. Recently, language models have greatly improved the accuracy of text classification in English. These models are pre-trained on a large dataset and then fine-tuned on downstream tasks. In particular, XLNet has achieved state-of-the-art results on diverse natural language processing (NLP) tasks in English. In this paper, we hypothesize that parallel success can be achieved in Arabic. The paper supports this hypothesis by producing the first XLNet-based language model for Arabic, called AraXLNet, and demonstrating its use in Arabic sentiment analysis to improve the prediction accuracy of such tasks. The results showed that the proposed model, AraXLNet, with the Farasa segmenter achieved accuracies of 94.78%, 93.01%, and 85.77% on the Arabic sentiment analysis task using multiple benchmark datasets. This outperformed AraBERT, which obtained 84.65%, 92.13%, and 85.05% on the same datasets, respectively. The improved accuracy of the proposed model was evident across multiple benchmark datasets, offering a promising advance in Arabic text classification.
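The pre-train/fine-tune pipeline described in the abstract ends with a sequence-classification head that emits one score (logit) per sentiment class; the predicted label is the argmax of the softmax over those scores. As an illustrative sketch only (this is not the authors' code; the three-way label ordering and the example logits are assumptions), the final decision step can be written as:

```python
import math

# Assumed label ordering for a three-way sentiment head.
LABELS = ["negative", "neutral", "positive"]

def softmax(logits):
    """Convert raw class logits into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_sentiment(logits):
    """Map per-class logits to the most probable sentiment label."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs

# Hypothetical logits, as a fine-tuned classification head might produce
# for a positive review; the "positive" class has the largest score.
label, probs = predict_sentiment([0.2, -1.1, 2.4])
```

In the actual AraXLNet pipeline, the logits would come from a fine-tuned XLNet sequence-classification head applied to Farasa-segmented input text; the sketch above only shows how class scores are turned into a sentiment label.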
Related papers (50 total)
  • [21] Revisiting Pre-trained Language Models and their Evaluation for Arabic Natural Language Processing
    Huawei Technologies Co., Ltd.
    Proc. Conf. Empir. Methods Nat. Lang. Process. (EMNLP): 3135 - 3151
  • [22] Explainable Pre-Trained Language Models for Sentiment Analysis in Low-Resourced Languages
    Mabokela, Koena Ronny
    Primus, Mpho
    Celik, Turgay
    BIG DATA AND COGNITIVE COMPUTING, 2024, 8 (11)
  • [23] Adder Encoder for Pre-trained Language Model
    Ding, Jianbang
    Zhang, Suiyun
    Li, Linlin
    CHINESE COMPUTATIONAL LINGUISTICS, CCL 2023, 2023, 14232 : 339 - 347
  • [24] A Study of Vietnamese Sentiment Classification with Ensemble Pre-trained Language Models
    Thin, Dang Van
    Hao, Duong Ngoc
    Nguyen, Ngan Luu-Thuy
    VIETNAM JOURNAL OF COMPUTER SCIENCE, 2024, 11 (01) : 137 - 165
  • [25] Morphosyntactic Tagging with Pre-trained Language Models for Arabic and its Dialects
    Inoue, Go
    Khalifa, Salam
    Habash, Nizar
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), 2022, : 1708 - 1719
  • [26] Spanish Pre-Trained CaTrBETO Model for Sentiment Classification in Twitter
    Pijal, Washington
    Armijos, Arianna
    Llumiquinga, Jose
    Lalvay, Sebastian
    Allauca, Steven
    Cuenca, Erick
    2022 THIRD INTERNATIONAL CONFERENCE ON INFORMATION SYSTEMS AND SOFTWARE TECHNOLOGIES, ICI2ST, 2022, : 93 - 98
  • [27] Pre-Trained Language Model-Based Deep Learning for Sentiment Classification of Vietnamese Feedback
    Loc, Cu Vinh
    Viet, Truong Xuan
    Viet, Tran Hoang
    Thao, Le Hoang
    Viet, Nguyen Hoang
    INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE AND APPLICATIONS, 2023, 22 (03)
  • [28] Robust Sentiment Classification of Metaverse Services Using a Pre-trained Language Model with Soft Voting
    Lee, Haein
    Jung, Hae Sun
    Lee, Seon Hong
    Kim, Jang Hyun
    KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS, 2023, 17 (09): : 2334 - 2347
  • [29] Surgicberta: a pre-trained language model for procedural surgical language
    Bombieri, Marco
    Rospocher, Marco
    Ponzetto, Simone Paolo
    Fiorini, Paolo
    INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS, 2024, 18 (01) : 69 - 81
  • [30] Neural Transfer Learning For Vietnamese Sentiment Analysis Using Pre-trained Contextual Language Models
    An Pha Le
    Tran Vu Pham
    Thanh-Van Le
    Huynh, Duy V.
    2021 IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLIED NETWORK TECHNOLOGIES (ICMLANT II), 2021, : 84 - 88