Sentiment Analysis of Multilingual Dataset of Bahraini Dialects, Arabic, and English

Cited by: 1
Authors
Omran, Thuraya [1]
Sharef, Baraa [2]
Grosan, Crina [3]
Li, Yongmin [1]
Affiliations
[1] Brunel Univ London, Dept Comp Sci, Uxbridge UB8 3PH, England
[2] Ahlia Univ, Coll Informat Technol, Dept Informat Technol, POB 10878, Manama, Bahrain
[3] Kings Coll London, Div Appl Technol Clin Care, London WC2R 2LS, England
Keywords
Bahraini dialects resources; Bahraini resources scarcity; deep learning; products reviews
DOI
10.3390/data8040068
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Sentiment analysis is an application of natural language processing (NLP) that requires a machine learning algorithm and a dataset. For some languages, datasets are scarce, particularly for Arabic dialects and specifically the Bahraini ones, which necessitates an approach such as translation, where a resource-rich source language is exploited to create the target-language dataset. In this study, a dataset of Amazon product reviews in Bahraini dialects is presented. This dataset was generated using two cascading stages of translation: machine translation followed by manual translation. Google Translate was used to translate English Amazon product reviews into Standard Arabic. The resulting Arabic reviews were then translated manually into Bahraini dialects by qualified native speakers using custom-built forms. The resulting parallel dataset of English, Standard Arabic, and Bahraini dialects is called English_Modern Standard Arabic_Bahraini Dialects product reviews for sentiment analysis, "E_MSA_BDs-PR-SA". The dataset is balanced, comprising 2500 positive and 2500 negative reviews. Sentiment analysis was implemented using a stacked LSTM deep learning model. The Bahraini dialect product dataset can be used in transfer learning for sentiment analysis of other datasets in Bahraini dialects. Dataset: https://doi.org/10.17632/5rhw2srzjj.1 Dataset License: CC-BY-NC
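The two-stage cascade described in the abstract can be pictured with a minimal Python sketch. It is an illustration only, not the authors' pipeline: the names `ParallelReview`, `build_parallel_dataset`, and `is_balanced` are hypothetical, and the two translator arguments are stand-in callables (the study used Google Translate for the first stage and native speakers for the second).

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class ParallelReview:
    """One row of a parallel dataset (hypothetical layout)."""
    english: str
    msa: str        # Modern Standard Arabic text from stage 1
    bahraini: str   # Bahraini dialect text from stage 2
    label: str      # "positive" or "negative"


def build_parallel_dataset(
    reviews: Iterable[Tuple[str, str]],
    machine_translate: Callable[[str], str],
    manual_translate: Callable[[str], str],
) -> List[ParallelReview]:
    """Run each (english, label) pair through the two translation stages."""
    dataset = []
    for english, label in reviews:
        msa = machine_translate(english)   # stage 1: English -> Standard Arabic
        bahraini = manual_translate(msa)   # stage 2: Standard Arabic -> Bahraini
        dataset.append(ParallelReview(english, msa, bahraini, label))
    return dataset


def is_balanced(dataset: List[ParallelReview]) -> bool:
    """True when positive and negative reviews occur equally often."""
    counts = Counter(review.label for review in dataset)
    return counts["positive"] == counts["negative"]


# Toy run with placeholder translators that only tag the text.
toy_reviews = [("Great product", "positive"), ("Stopped working", "negative")]
toy = build_parallel_dataset(
    toy_reviews,
    machine_translate=lambda text: f"<MSA:{text}>",
    manual_translate=lambda text: f"<BD:{text}>",
)
```

Keeping the three language versions in one record preserves the alignment that makes the dataset parallel, and the balance check mirrors the 2500/2500 split reported above.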
Pages: 13