AraXLNet: pre-trained language model for sentiment analysis of Arabic

Cited by: 0
Authors
Alhanouf Alduailej
Abdulrahman Alothaim
Affiliations
[1] King Saud University, Department of Information Systems, College of Computer and Information Sciences
Keywords
Sentiment analysis; Language models; NLP; XLNet; AraXLNet; Text mining;
DOI
Not available
Abstract
Arabic is a complex, low-resource language; these limitations make it challenging to produce accurate text classification tasks such as sentiment analysis. The main goal of sentiment analysis is to determine the overall orientation of a given text: positive, negative, or neutral. Recently, language models have substantially improved the accuracy of text classification in English. These models are pre-trained on a large dataset and then fine-tuned on downstream tasks. In particular, XLNet has achieved state-of-the-art results on diverse natural language processing (NLP) tasks in English. In this paper, we hypothesize that parallel success can be achieved in Arabic. The paper supports this hypothesis by producing the first XLNet-based language model for Arabic, called AraXLNet, and demonstrating its use in Arabic sentiment analysis to improve the prediction accuracy of such tasks. The results showed that the proposed model, AraXLNet, with the Farasa segmenter achieved accuracies of 94.78%, 93.01%, and 85.77% on the Arabic sentiment analysis task using multiple benchmark datasets. These results outperformed AraBERT, which obtained 84.65%, 92.13%, and 85.05% on the same datasets, respectively. The improved accuracy of the proposed model was evident across multiple benchmark datasets, offering a promising advance in Arabic text classification.
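As the abstract describes, the downstream sentiment task reduces to mapping a text's encoded representation to one of three orientations, with models compared by accuracy on benchmark datasets. A minimal, self-contained sketch of that decision and scoring step (the logits here stand in for the output of a fine-tuned encoder such as AraXLNet; the function names are illustrative and not taken from the paper):

```python
import math

# The three orientations the abstract names for sentiment analysis.
LABELS = ["negative", "neutral", "positive"]

def softmax(logits):
    # Numerically stable softmax over the three sentiment classes.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(logits):
    # Map classifier-head logits to the most probable orientation.
    probs = softmax(logits)
    return LABELS[max(range(len(probs)), key=probs.__getitem__)]

def accuracy(predictions, gold_labels):
    # The benchmark metric reported in the paper: fraction of correct labels.
    return sum(p == g for p, g in zip(predictions, gold_labels)) / len(gold_labels)
```

For example, `predict([0.2, -1.0, 3.1])` returns `"positive"`, and `accuracy` over a test set yields the kind of percentage figures (e.g., 94.78%) used to compare AraXLNet against AraBERT.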
Related papers (50 total)
  • [31] Aspect-Based Sentiment Analysis in Hindi Language by Ensembling Pre-Trained mBERT Models
    Pathak, Abhilash
    Kumar, Sudhanshu
    Roy, Partha Pratim
    Kim, Byung-Gyu
    ELECTRONICS, 2021, 10 (21)
  • [32] Sentiment analysis based on improved pre-trained word embeddings
    Rezaeinia, Seyed Mahdi
    Rahmani, Rouhollah
    Ghodsi, Ali
    Veisi, Hadi
    EXPERT SYSTEMS WITH APPLICATIONS, 2019, 117 : 139 - 147
  • [33] Fusion Pre-trained Emoji Feature Enhancement for Sentiment Analysis
    Chen, Jie
    Yao, Zhiqiang
    Zhao, Shu
    Zhang, Yanping
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2023, 22 (04)
  • [34] Aspect-Based Sentiment Analysis of Social Media Data With Pre-Trained Language Models
    Troya, Anina
    Pillai, Reshmi Gopalakrishna
    Rivero, Cristian Rodriguez
    Genc, Zulkuf
    Kayal, Subhradeep
    Araci, Dogu
    2021 5TH INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND INFORMATION RETRIEVAL, NLPIR 2021, 2021, : 8 - 17
  • [35] Pre-trained Language Model for Biomedical Question Answering
    Yoon, Wonjin
    Lee, Jinhyuk
    Kim, Donghyeon
    Jeong, Minbyul
    Kang, Jaewoo
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2019, PT II, 2020, 1168 : 727 - 740
  • [36] BERTweet: A pre-trained language model for English Tweets
    Dat Quoc Nguyen
    Thanh Vu
    Anh Tuan Nguyen
    PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING: SYSTEM DEMONSTRATIONS, 2020, : 9 - 14
  • [37] ViDeBERTa: A powerful pre-trained language model for Vietnamese
    Tran, Cong Dao
    Pham, Nhut Huy
    Nguyen, Anh
    Hy, Truong Son
    Vu, Tu
    17TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EACL 2023, 2023, : 1071 - 1078
  • [38] Misspelling Correction with Pre-trained Contextual Language Model
    Hu, Yifei
    Ting, Xiaonan
    Ko, Youlim
    Rayz, Julia Taylor
    PROCEEDINGS OF 2020 IEEE 19TH INTERNATIONAL CONFERENCE ON COGNITIVE INFORMATICS & COGNITIVE COMPUTING (ICCI*CC 2020), 2020, : 144 - 149
  • [39] CLIP-Llama: A New Approach for Scene Text Recognition with a Pre-Trained Vision-Language Model and a Pre-Trained Language Model
    Zhao, Xiaoqing
    Xu, Miaomiao
    Silamu, Wushour
    Li, Yanbing
    SENSORS, 2024, 24 (22)
  • [40] Chinese-Korean Weibo Sentiment Classification Based on Pre-trained Language Model and Transfer Learning
    Wang, Hengxuan
    Zhang, Zhenguo
    Cui, Xu
    Cui, Rongyi
    2022 IEEE 2ND INTERNATIONAL CONFERENCE ON COMPUTER COMMUNICATION AND ARTIFICIAL INTELLIGENCE (CCAI 2022), 2022, : 49 - 54