A comparative study of cross-lingual sentiment analysis

Cited by: 6
Authors
Priban, Pavel [1 ,2 ]
Smid, Jakub [1 ]
Steinberger, Josef [1 ]
Mistera, Adam [1 ]
Affiliations
[1] Univ West Bohemia, Fac Appl Sci, Dept Comp Sci & Engn, Univ 8, Plzen 30100, Czech Republic
[2] NTIS New Technol Informat Soc, Univ 8, Plzen 30100, Czech Republic
Keywords
Sentiment analysis; Zero-shot cross-lingual classification; Linear transformation; Transformers; Large language models; Transfer learning
DOI
10.1016/j.eswa.2024.123247
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents a detailed comparative study of zero-shot cross-lingual sentiment analysis. Namely, we use modern multilingual Transformer-based models and linear transformations combined with CNN and LSTM neural networks. We evaluate their performance in Czech, French, and English. We aim to compare and assess the models' ability to transfer knowledge across languages and discuss the trade-off between their performance and training/inference speed. We build strong monolingual baselines comparable with the current state-of-the-art approaches, achieving state-of-the-art results in Czech (96.0% accuracy) and French (97.6% accuracy). Next, we compare our results with the latest large language models (LLMs), i.e., Llama 2 and ChatGPT. We show that the large multilingual Transformer-based XLM-R model consistently outperforms all other cross-lingual approaches in zero-shot cross-lingual sentiment classification, surpassing them by at least 3%. Next, we show that the smaller Transformer-based models are comparable in performance to older but much faster methods based on linear transformations. The best-performing model with linear transformation achieved an accuracy of 92.1% on the French dataset, compared to 90.3% obtained by the smaller XLM-R model. Notably, this performance is achieved with only about 1% of the training time required for the XLM-R model. This underscores the potential of linear transformations as a pragmatic alternative to resource-intensive and slower Transformer-based models in real-world applications. The LLMs achieved impressive results that are on par with, or better by 1%-3% than, the other approaches, but with additional hardware requirements and limitations. Overall, this study contributes to the understanding of cross-lingual sentiment analysis and provides valuable insights into the strengths and limitations of cross-lingual approaches for sentiment analysis.
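The "linear transformation" approach the abstract contrasts with XLM-R is commonly realized as an orthogonal (Procrustes) mapping that aligns a source-language embedding space to a target-language one, so a classifier trained on one language can be applied zero-shot to another. The following is a minimal sketch of that idea using synthetic vectors; the data and dimensions are illustrative assumptions, not the paper's actual embeddings or setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy bilingual dictionary: n word pairs with d-dimensional embeddings.
n, d = 100, 8
X = rng.normal(size=(n, d))                    # source-language embeddings
Q_true, _ = np.linalg.qr(rng.normal(size=(d, d)))
Y = X @ Q_true                                 # target embeddings (a rotated copy, for illustration)

# Orthogonal Procrustes: W = argmin ||XW - Y||_F  subject to  W^T W = I,
# solved in closed form via the SVD of X^T Y.
U, _, Vt = np.linalg.svd(X.T @ Y)
W = U @ Vt

# After mapping, source embeddings live in the target space; a sentiment
# classifier trained on target-language features can score them directly.
alignment_error = np.linalg.norm(X @ W - Y)
```

Because the mapping is a single d×d matrix fitted in closed form, training is orders of magnitude cheaper than fine-tuning a multilingual Transformer, which is consistent with the roughly hundredfold speed advantage the abstract reports.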
Pages: 39
Related papers
50 records
  • [21] An Approach to Cross-lingual Sentiment Lexicon Construction
    Chang, Chia-Hsuan
    Wu, Ming-Lun
    Hwang, San-Yih
    2019 IEEE INTERNATIONAL CONGRESS ON BIG DATA (IEEE BIGDATA CONGRESS 2019), 2019, : 129 - 131
  • [22] Cross-lingual sentiment classification with stacked autoencoders
    Zhou, Guangyou
    Zhu, Zhiyuan
    He, Tingting
    Hu, Xiaohua Tony
    KNOWLEDGE AND INFORMATION SYSTEMS, 2016, 47 (01) : 27 - 44
  • [23] Active Learning for Cross-Lingual Sentiment Classification
    Li, Shoushan
    Wang, Rong
    Liu, Huanhuan
    Huang, Chu-Ren
    NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, NLPCC 2013, 2013, 400 : 236 - 246
  • [24] A multimodal approach to cross-lingual sentiment analysis with ensemble of transformer and LLM
    Miah, Md Saef Ullah
    Kabir, Md Mohsin
    Bin Sarwar, Talha
    Safran, Mejdl
    Alfarhood, Sultan
    Mridha, M. F.
SCIENTIFIC REPORTS, 2024, 14 (01)
  • [25] A Systematic Review of Cross-Lingual Sentiment Analysis: Tasks, Strategies, and Prospects
    Zhao, Chuanjun
    Wu, Meiling
    Yang, Xinyi
    Zhang, Wenyue
    Zhang, Shaoxia
    Wang, Suge
    Li, Deyu
    ACM COMPUTING SURVEYS, 2024, 56 (07)
  • [26] Comparative analysis of book tags: a cross-lingual perspective
    Lu, Chao
    Zhang, Chengzhi
    He, Daqing
ELECTRONIC LIBRARY, 2016, 34 (04): 666 - 682
  • [27] Data Quality Controlling for Cross-Lingual Sentiment Classification
    Li, Shoushan
    Xue, Yunxia
    Wang, Zhongqing
    Lee, Sophia Yat Mei
    Huang, Chu-Ren
    2013 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP 2013), 2013, : 125 - 128
  • [28] A Cross-Lingual Approach for Building Multilingual Sentiment Lexicons
    Naderalvojoud, Behzad
    Qasemizadeh, Behrang
    Kallmeyer, Laura
    Sezer, Ebru Akcapinar
    TEXT, SPEECH, AND DIALOGUE (TSD 2018), 2018, 11107 : 259 - 266
  • [29] Semi-supervised Learning on Cross-Lingual Sentiment Analysis with Space Transfer
    He, Xiaonan
    Zhang, Hui
    Chao, Wenhan
    Wang, Daqing
    2015 IEEE FIRST INTERNATIONAL CONFERENCE ON BIG DATA COMPUTING SERVICE AND APPLICATIONS (BIGDATASERVICE 2015), 2015, : 371 - 377
  • [30] A Knowledge-Enhanced Adversarial Model for Cross-lingual Structured Sentiment Analysis
    Zhang, Qi
    Zhou, Jie
    Chen, Qin
    Bai, Qingchun
    Xiao, Jun
    He, Liang
    2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022,