Cross lingual transfer learning for sentiment analysis of Italian TripAdvisor reviews

Cited by: 7
Authors
Catelli, Rosario [1]
Bevilacqua, Luca [5]
Mariniello, Nicola [5]
di Carlo, Vladimiro Scotto [5]
Magaldi, Massimo [5]
Fujita, Hamido [2, 3, 4]
De Pietro, Giuseppe [1]
Esposito, Massimo [1]
Affiliations
[1] Natl Res Council CNR, Inst High Performance Comp & Networking ICAR, Naples, Italy
[2] Ho Chi Minh City Univ Technol HUTECH, Fac Informat Technol, Ho Chi Minh City, Vietnam
[3] Natl Taipei Univ Technol, Taipei, Taiwan
[4] I Somet Inc Assoc, Morioka, Iwate, Japan
[5] Engn Ingn Informat SpA, Naples, Italy
Keywords
Transfer learning; Sentiment analysis; Italian dataset; BERT; TripAdvisor; Reviews;
DOI
10.1016/j.eswa.2022.118246
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Over the years, the attention of the scientific world towards sentiment analysis techniques has increased considerably, driven by industry. The arrival of the Google BERT language model has confirmed the superiority of models based on a particular artificial neural network structure called the Transformer, from which many variants have resulted. These models are generally pre-trained on large text corpora and only later specialized, on much smaller amounts of data, for the precise task at hand. For these reasons, countless versions were developed to meet the specific needs of each language, especially in the case of languages with relatively few datasets available. At the same time, models that were pre-trained for multiple languages became widespread, providing greater flexibility of use in exchange for lower performance. This study shows how the use of techniques to transfer learning from high-resource languages to low-resource languages provides an important performance increase: a multilingual BERT model fine-tuned on a mixed English/Italian dataset (using a dataset from the literature for English and, for Italian, a reviews dataset created ad hoc from the well-known platform TripAdvisor) provides much higher performance than models specific to Italian. Overall, the results obtained by comparing the different possible approaches indicate which one is the most promising to pursue in order to obtain the best results in low-resource scenarios.
Pages: 9
Related papers
50 in total
  • [31] Modeling Language Discrepancy for Cross-Lingual Sentiment Analysis
    Chen, Qiang
    Li, Chenliang
    Li, Wenjie
    CIKM'17: PROCEEDINGS OF THE 2017 ACM CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, 2017, : 117 - 126
  • [32] On the Effect of Word Order on Cross-lingual Sentiment Analysis
    Atrio, Alex R.
    Badia, Toni
    Barnes, Jeremy
    PROCESAMIENTO DEL LENGUAJE NATURAL, 2019, (63): : 23 - 30
  • [33] Experiments in Cross-Lingual Sentiment Analysis in Discussion Forums
    Ghorbel, Hatem
    SOCIAL INFORMATICS, SOCINFO 2012, 2012, 7710 : 138 - 151
  • [34] A generalizable sentiment analysis method for creating a hotel dictionary: using big data on TripAdvisor hotel reviews
    Bagherzadeh, Sayeh
    Shokouhyar, Sajjad
    Jahani, Hamed
    Sigala, Marianna
    JOURNAL OF HOSPITALITY AND TOURISM TECHNOLOGY, 2021, 12 (02) : 210 - 238
  • [35] Cross-Lingual Sentiment Quantification
    Esuli, Andrea
    Moreo, Alejandro
    Sebastiani, Fabrizio
    IEEE INTELLIGENT SYSTEMS, 2020, 35 (03) : 106 - 113
  • [36] CL-XABSA: Contrastive Learning for Cross-Lingual Aspect-Based Sentiment Analysis
    Lin, Nankai
    Fu, Yingwen
    Lin, Xiaotian
    Zhou, Dong
    Yang, Aimin
    Jiang, Shengyi
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2023, 31 : 2935 - 2946
  • [37] Choosing Transfer Languages for Cross-Lingual Learning
    Lin, Yu-Hsiang
    Chen, Chian-Yu
    Lee, Jean
    Li, Zirui
    Zhang, Yuyan
    Xia, Mengzhou
    Rijhwani, Shruti
    He, Junxian
    Zhang, Zhisong
    Ma, Xuezhe
    Anastasopoulos, Antonios
    Littell, Patrick
    Neubig, Graham
    57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, : 3125 - 3135
  • [38] Translation Artifacts in Cross-lingual Transfer Learning
    Artetxe, Mikel
    Labaka, Gorka
    Agirre, Eneko
    PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 7674 - 7684
  • [39] Sentiment Analysis of Restaurant Reviews on Yelp with Incremental Learning
    Doan, Tri
    Kalita, Jugal
    2016 15TH IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA 2016), 2016, : 697 - 700
  • [40] Sentiment Analysis of Product Reviews using Deep Learning
    Panthati, Jagadeesh
    Bhaskar, Jasmine
    Ranga, Tarun Kumar
    Challa, Manish Reddy
    2018 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI), 2018, : 2408 - 2414