Empirical study on imbalanced learning of Arabic sentiment polarity with neural word embedding

Cited by: 7
Authors
El-Alfy, El-Sayed M. [1]
Al-Azani, Sadam [1]
Affiliations
[1] King Fahd University of Petroleum and Minerals, Information and Computer Science Department, Dhahran, Saudi Arabia
Keywords
Social network; sentiment analysis; polarity detection; word embedding; machine learning; imbalanced dataset; Arabic tweets; classification; SMOTE
DOI
10.3233/JIFS-179703
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
With the proliferation of social media and mobile technology, a huge amount of unstructured data is posted online every day. Consequently, sentiment analysis has gained increasing importance as a tool for understanding the opinions of particular groups of people on contemporary political, cultural, social, or commercial issues. In contrast to Western languages, research on sentiment analysis for dialectal Arabic is still in its early stages, with several challenges yet to be addressed. The goal of this study is twofold. First, it compares the performance of core machine learning algorithms for detecting polarity in imbalanced Arabic tweet datasets, using neural word embedding as a feature extractor rather than hand-crafted or traditional features. Second, it examines the impact of various oversampling techniques for handling the highly imbalanced nature of the sentiment data. An intensive empirical analysis of nine machine learning methods and six oversampling methods was conducted, and the results are discussed in terms of a wide range of performance measures.
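The two ideas named in the abstract, embedding-based features and oversampling of the minority class, can be illustrated with a small sketch. This is not the authors' pipeline: the toy embedding table, the English stand-in tokens, the labels, and the helper names `tweet_vector` and `smote_like` are all hypothetical, and the oversampler is only a simplified SMOTE-style interpolation written with the Python standard library.

```python
import math
import random

def tweet_vector(tokens, embeddings, dim):
    """Average the embedding vectors of known tokens; zeros if none are known."""
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]

def smote_like(minority, n_new, k=1, seed=0):
    """Create n_new synthetic points, each interpolated between a random
    minority sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic

# Hypothetical 2-dimensional embeddings; real ones would be learned from tweets.
toy_embeddings = {
    "good": [0.9, 0.1],
    "great": [0.8, 0.3],
    "bad": [0.1, 0.9],
    "boring": [0.2, 0.7],
}

# Hypothetical labelled tweets: 1 = positive (majority), 0 = negative (minority).
tweets = [
    (["good", "great"], 1),
    (["great"], 1),
    (["good"], 1),
    (["good", "unknown"], 1),
    (["bad"], 0),
    (["bad", "boring"], 0),
]

positives = [tweet_vector(t, toy_embeddings, 2) for t, y in tweets if y == 1]
negatives = [tweet_vector(t, toy_embeddings, 2) for t, y in tweets if y == 0]

# Oversample the minority class until both classes have the same size.
synthetic = smote_like(negatives, n_new=len(positives) - len(negatives))
balanced_negatives = negatives + synthetic
print(len(positives), len(balanced_negatives))
```

In practice the balanced feature vectors would then be passed to a classifier, and oversampling would be applied to the training split only so that the test data keeps its natural class ratio.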
Pages: 6211-6222
Page count: 12