Deep Ensemble Network for Sentiment Analysis in Bi-lingual Low-resource Languages

Cited by: 7
Author
Roy, Pradeep Kumar [1]
Affiliation
[1] Indian Inst Informat Technol, Dept Comp Sci & Engn, Surat 394190, Gujarat, India
Keywords
Sentiment analysis; code-mixed; transformer; BERT; Kannada; Malayalam; ensemble learning; deep learning; machine learning
DOI
10.1145/3600229
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Sentiment analysis (SA) is the systematic identification, extraction, quantification, and study of affective states and subjective information using natural language processing (NLP). It is widely used to analyze user feedback such as reviews and social media posts. SA has recently become one of the most active research areas in NLP because of its wide range of applications, including e-commerce, healthcare, and the hotel business. Many machine learning and deep learning models exist to predict the sentiment of a user's post. However, sentiment analysis in low-resource languages such as Kannada, Malayalam, Telugu, and Tamil has received less attention because of language complexity and the limited availability of the required resources. This research addresses the gap by proposing an ensemble model for predicting the sentiment of code-mixed Kannada and Malayalam text. The ensemble of transformer-based models achieved a weighted F1-score of 0.66 on the Kannada code-mixed dataset. In contrast, the ensemble model built on the deep learning framework performed best on the Malayalam dataset, achieving a weighted F1-score of 0.72 and outperforming existing research.
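The abstract reports results as a weighted F1-score. As a reading aid, the sketch below shows how that metric is commonly computed: the per-class F1 is averaged with each class weighted by its share of the true labels. The function name `weighted_f1` and the toy labels are illustrative assumptions; they are not taken from the paper, and the paper's own evaluation code may differ.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores.

    Hypothetical helper for illustration only; not the paper's code.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    support = Counter(y_true)  # number of true examples per class
    total = len(y_true)
    score = 0.0

    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp  # true examples of this class that were missed
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1  # weight by class frequency

    return score


# Toy labels (made up for this example, not from the Kannada or Malayalam data).
toy_true = ["pos", "pos", "neg", "neu"]
toy_pred = ["pos", "neg", "neg", "neu"]
toy_score = weighted_f1(toy_true, toy_pred)
```

Because each class is weighted by its support, frequent classes dominate the score, which matters when reading results on imbalanced sentiment datasets.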
Pages: 16