Zero-shot learning based cross-lingual sentiment analysis for Sanskrit text with insufficient labeled data

Authors
Puneet Kumar
Kshitij Pathania
Balasubramanian Raman
Affiliations
[1] Indian Institute of Technology Roorkee, Department of Computer Science and Engineering
[2] Indian Institute of Technology Roorkee, Department of Mathematics
Source
Applied Intelligence | 2023 / Vol. 53
Keywords
Labeled data insufficiency; Cross-lingual sentiment analysis; Sanskrit language analysis; Machine translation
DOI
Not available
Abstract
This paper proposes a novel method for analyzing the sentiment expressed in Sanskrit text. Sanskrit is one of the world’s most ancient languages; however, natural language processing tasks such as machine translation and sentiment analysis have not been explored to their full potential for it because sufficient labeled data is unavailable. We address this issue with a zero-shot learning-based cross-lingual sentiment analysis (CLSA) approach. CLSA uses the resources of a source language to improve sentiment analysis in a target language that lacks sufficient resources. The proposed work uses a transformer model to translate text from Sanskrit, a language with insufficient labeled data, into English, which has sufficient labeled data for sentiment analysis. A generative adversarial network-based strategy is proposed to evaluate the maturity of the translations. A bidirectional long short-term memory-based model then classifies the sentiments using the embeddings obtained through the translations. The proposed technique achieves 87.50% accuracy for machine translation and 92.83% accuracy for sentiment classification. The Sanskrit-English translations used in this work were collected through web scraping. Because ground-truth sentiment class labels are absent, a strategy for evaluating the sentiment scores of the proposed sentiment analysis model is also presented. A new dataset of Sanskrit text, along with English translations and sentiment scores, has been constructed.
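The translate-then-classify flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the glossary lookup merely stands in for the transformer translation model, the polarity lexicon stands in for the bidirectional LSTM classifier, and every name below (`TOY_GLOSSARY`, `TOY_LEXICON`, `toy_translate`, `toy_classify`, `cross_lingual_sentiment`) is hypothetical. It only shows how a sentiment model available for English might be reused for Sanskrit input without any Sanskrit sentiment labels.

```python
from typing import Callable, Dict

# Hypothetical toy glossary (romanized Sanskrit -> English).
# Assumption: stands in for the transformer translation model.
TOY_GLOSSARY: Dict[str, str] = {
    "sukham": "happiness",
    "duhkham": "sorrow",
}

# Hypothetical English polarity lexicon.
# Assumption: stands in for the BiLSTM sentiment classifier.
TOY_LEXICON: Dict[str, int] = {"happiness": 1, "sorrow": -1}


def toy_translate(source_text: str) -> str:
    """Word-by-word lookup; unknown tokens pass through unchanged."""
    tokens = source_text.lower().split()
    return " ".join(TOY_GLOSSARY.get(tok, tok) for tok in tokens)


def toy_classify(english_text: str) -> str:
    """Sum the lexicon polarities and map the total to a label."""
    score = sum(TOY_LEXICON.get(tok, 0) for tok in english_text.split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def cross_lingual_sentiment(
    text: str,
    translate: Callable[[str], str] = toy_translate,
    classify: Callable[[str], str] = toy_classify,
) -> str:
    """Zero-shot flow: translate into the resource-rich language,
    then apply a classifier that only knows that language."""
    return classify(translate(text))
```

Because both stages are passed in as callables, a real translation model and a trained classifier could be substituted for the toy stand-ins without changing `cross_lingual_sentiment`.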
Pages: 10096-10113
Page count: 17
Related papers
50 in total
  • [41] Multi-level multilingual semantic alignment for zero-shot cross-lingual transfer learning
    Gui, Anchun
    Xiao, Han
    NEURAL NETWORKS, 2024, 173
  • [42] Boosting Zero-shot Cross-lingual Retrieval by Training on Artificially Code-Switched Data
    Litschko, Robert
    Artemova, Ekaterina
    Plank, Barbara
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, 2023, : 3096 - 3108
  • [43] Zero-Shot Cross-Lingual Transfer in Legal Domain Using Transformer Models
    Shaheen, Zein
    Wohlgenannt, Gerhard
    Mouromtsev, Dmitry
    2021 INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND COMPUTATIONAL INTELLIGENCE (CSCI 2021), 2021, : 450 - 456
  • [44] Label modification and bootstrapping for zero-shot cross-lingual hate speech detection
    Bigoulaeva, Irina
    Hangya, Viktor
    Gurevych, Iryna
    Fraser, Alexander
    LANGUAGE RESOURCES AND EVALUATION, 2023, 57 (04) : 1515 - 1546
  • [45] Zero-Shot Cross-lingual Aphasia Detection using Automatic Speech Recognition
    Chatzoudis, Gerasimos
    Plitsis, Manos
    Stamouli, Spyridoula
    Dimou, Athanasia-Lida
    Katsamanis, Nassos
    Katsouros, Vassilis
    INTERSPEECH 2022, 2022, : 2178 - 2182
  • [46] Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond
    Artetxe, Mikel
    Schwenk, Holger
    TRANSACTIONS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, 2019, 7 : 597 - 610
  • [48] Zero-shot cross-lingual transfer language selection using linguistic similarity
    Eronen, Juuso
    Ptaszynski, Michal
    Masui, Fumito
    INFORMATION PROCESSING & MANAGEMENT, 2023, 60 (03)
  • [49] Transfer language selection for zero-shot cross-lingual abusive language detection
    Eronen, Juuso
    Ptaszynski, Michal
    Masui, Fumito
    Arata, Masaki
    Leliwa, Gniewosz
    Wroczynski, Michal
    INFORMATION PROCESSING & MANAGEMENT, 2022, 59 (04)
  • [50] Beyond the English Web: Zero-Shot Cross-Lingual and Lightweight Monolingual Classification of Registers
    Repo, Liina
    Skantsi, Valtteri
    Ronnqvist, Samuel
    Hellstrom, Saara
    Oinonen, Miika
    Salmela, Anna
    Biber, Douglas
    Egbert, Jesse
    Pyysalo, Sampo
    Laippala, Veronika
    EACL 2021: THE 16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: PROCEEDINGS OF THE STUDENT RESEARCH WORKSHOP, 2021, : 183 - 191