A comparative study of cross-lingual sentiment analysis

Cited by: 6
Authors
Priban, Pavel [1 ,2 ]
Smid, Jakub [1 ]
Steinberger, Josef [1 ]
Mistera, Adam [1 ]
Affiliations
[1] Univ West Bohemia, Fac Appl Sci, Dept Comp Sci & Engn, Univ 8, Plzen 30100, Czech Republic
[2] NTIS New Technol Informat Soc, Univ 8, Plzen 30100, Czech Republic
Keywords
Sentiment analysis; Zero-shot cross-lingual classification; Linear transformation; Transformers; Large language models; Transfer learning
DOI
10.1016/j.eswa.2024.123247
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents a detailed comparative study of zero-shot cross-lingual sentiment analysis. Namely, we use modern multilingual Transformer-based models and linear transformations combined with CNN and LSTM neural networks. We evaluate their performance in Czech, French, and English. We aim to compare and assess the models' ability to transfer knowledge across languages and discuss the trade-off between their performance and training/inference speed. We build strong monolingual baselines comparable with the current state-of-the-art approaches, achieving state-of-the-art results in Czech (96.0% accuracy) and French (97.6% accuracy). Next, we compare our results with the latest large language models (LLMs), i.e., Llama 2 and ChatGPT. We show that the large multilingual Transformer-based XLM-R model consistently outperforms all other cross-lingual approaches in zero-shot cross-lingual sentiment classification, surpassing them by at least 3%. Next, we show that the smaller Transformer-based models are comparable in performance to older but much faster methods based on linear transformations. The best-performing model with linear transformation achieved an accuracy of 92.1% on the French dataset, compared to 90.3% obtained by the smaller XLM-R model. Notably, this performance is achieved with only about 1% of the training time required for the XLM-R model. This underscores the potential of linear transformations as a pragmatic alternative to resource-intensive and slower Transformer-based models in real-world applications. The LLMs achieved impressive results that are on par with, or better by 1%-3% than, the other approaches, but with additional hardware requirements and limitations. Overall, this study contributes to the understanding of cross-lingual sentiment analysis and provides valuable insights into the strengths and limitations of cross-lingual approaches for sentiment analysis.
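The "linear transformation" approach the abstract contrasts with XLM-R is commonly realized as an orthogonal (Procrustes) mapping that aligns a source-language embedding space to a target-language one, so a classifier trained on one language can be applied zero-shot to another. The following is a minimal sketch of that idea using synthetic vectors; the data and dimensions are illustrative assumptions, not the paper's actual embeddings or setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy bilingual dictionary: n word pairs with d-dimensional embeddings.
n, d = 100, 8
X = rng.normal(size=(n, d))                    # source-language embeddings
Q_true, _ = np.linalg.qr(rng.normal(size=(d, d)))
Y = X @ Q_true                                 # target embeddings (a rotated copy, for illustration)

# Orthogonal Procrustes: W = argmin ||XW - Y||_F  subject to  W^T W = I,
# solved in closed form via the SVD of X^T Y.
U, _, Vt = np.linalg.svd(X.T @ Y)
W = U @ Vt

# After mapping, source embeddings live in the target space; a sentiment
# classifier trained on target-language features can score them directly.
alignment_error = np.linalg.norm(X @ W - Y)
```

Because the mapping is a single d×d matrix fitted in closed form, training is orders of magnitude cheaper than fine-tuning a multilingual Transformer, which is consistent with the roughly hundredfold speed advantage the abstract reports.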
Pages: 39
Related papers
50 records
  • [21] An Approach to Cross-lingual Sentiment Lexicon Construction
    Chang, Chia-Hsuan
    Wu, Ming-Lun
    Hwang, San-Yih
    2019 IEEE INTERNATIONAL CONGRESS ON BIG DATA (IEEE BIGDATA CONGRESS 2019), 2019, : 129 - 131
  • [22] Cross-lingual sentiment classification with stacked autoencoders
    Zhou, Guangyou
    Zhu, Zhiyuan
    He, Tingting
    Hu, Xiaohua Tony
    KNOWLEDGE AND INFORMATION SYSTEMS, 2016, 47 (01) : 27 - 44
  • [23] Active Learning for Cross-Lingual Sentiment Classification
    Li, Shoushan
    Wang, Rong
    Liu, Huanhuan
    Huang, Chu-Ren
    NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, NLPCC 2013, 2013, 400 : 236 - 246
  • [24] A multimodal approach to cross-lingual sentiment analysis with ensemble of transformer and LLM
    Miah, Md Saef Ullah
    Kabir, Md Mohsin
    Bin Sarwar, Talha
    Safran, Mejdl
    Alfarhood, Sultan
    Mridha, M. F.
SCIENTIFIC REPORTS, 2024, 14 (01)
  • [25] A Systematic Review of Cross-Lingual Sentiment Analysis: Tasks, Strategies, and Prospects
    Zhao, Chuanjun
    Wu, Meiling
    Yang, Xinyi
    Zhang, Wenyue
    Zhang, Shaoxia
    Wang, Suge
    Li, Deyu
    ACM COMPUTING SURVEYS, 2024, 56 (07)
  • [26] Comparative analysis of book tags: a cross-lingual perspective
    Lu, Chao
    Zhang, Chengzhi
    He, Daqing
ELECTRONIC LIBRARY, 2016, 34 (04): 666 - 682
  • [27] Data Quality Controlling for Cross-Lingual Sentiment Classification
    Li, Shoushan
    Xue, Yunxia
    Wang, Zhongqing
    Lee, Sophia Yat Mei
    Huang, Chu-Ren
    2013 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP 2013), 2013, : 125 - 128
  • [28] A Cross-Lingual Approach for Building Multilingual Sentiment Lexicons
    Naderalvojoud, Behzad
    Qasemizadeh, Behrang
    Kallmeyer, Laura
    Sezer, Ebru Akcapinar
    TEXT, SPEECH, AND DIALOGUE (TSD 2018), 2018, 11107 : 259 - 266
  • [29] Semi-supervised Learning on Cross-Lingual Sentiment Analysis with Space Transfer
    He, Xiaonan
    Zhang, Hui
    Chao, Wenhan
    Wang, Daqing
    2015 IEEE FIRST INTERNATIONAL CONFERENCE ON BIG DATA COMPUTING SERVICE AND APPLICATIONS (BIGDATASERVICE 2015), 2015, : 371 - 377
  • [30] A Knowledge-Enhanced Adversarial Model for Cross-lingual Structured Sentiment Analysis
    Zhang, Qi
    Zhou, Jie
    Chen, Qin
    Bai, Qingchun
    Xiao, Jun
    He, Liang
    2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022,