Translation Is Not Enough: Comparing Lexicon-based Methods for Sentiment Analysis in Persian

Cited by: 0
Authors
Basiri, Mohammad Ehsan [1]
Kabiri, Arman [1]
Affiliations
[1] Shahrekord Univ, Dept Comp Engn, Shahrekord, Iran
Keywords
component; Sentiment Analysis; Natural Language Processing; Persian Language; Lexicon-based approach; Opinion mining; Data Mining; Social Media
DOI
Not available
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Sentiment analysis is a subfield of data mining and natural language processing that aims to extract people's opinions and appraisals from their comments on the Web. In contrast to machine learning approaches, lexicon-based methods have important advantages: they are domain-independent and do not need a large annotated training corpus, and hence are faster. This makes the lexicon-based approach prevalent in the sentiment analysis community. However, for the Persian language, in contrast to English, lexicon-based sentiment analysis is still new. Few lexicons are available for sentiment analysis in Persian, and almost all of them are directly translated from English. In the current study, four lexicons are compared to show how much the choice of lexicon affects the performance of document-level sentiment analysis. Specifically, the Persian version of the NRC lexicon, SentiStrength, CNRC, and Adjectives are compared in a pure lexicon-based scenario. Experiments are carried out on the document-level edition of the SPerSent dataset. Results show that the direct translation used in NRC leads to the poorest performance, while the pre-processing and refinement of lexicons used in SentiStrength and CNRC improve performance. The results also show that using only adjectives gives better results than using NRC.
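As a rough illustration of what a pure lexicon-based, document-level scenario can look like, the sketch below sums word polarities from a lexicon and maps the sign of the total to a label. It is a minimal sketch under stated assumptions, not the procedure used in the paper: the toy lexicon, its English placeholder entries and weights, the whitespace tokenizer, and the function names are all hypothetical.

```python
import string
from typing import Dict, List

# Hypothetical toy lexicon (word -> polarity weight). The entries are
# placeholders for illustration and are not taken from NRC,
# SentiStrength, CNRC, or any other real lexicon.
TOY_LEXICON: Dict[str, int] = {
    "good": 1,
    "excellent": 2,
    "bad": -1,
    "terrible": -2,
}


def tokenize(text: str) -> List[str]:
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (tok.strip(string.punctuation) for tok in text.lower().split())
    return [tok for tok in tokens if tok]


def score_document(text: str, lexicon: Dict[str, int]) -> int:
    """Sum the polarity of every token found in the lexicon (others count 0)."""
    return sum(lexicon.get(tok, 0) for tok in tokenize(text))


def classify(text: str, lexicon: Dict[str, int]) -> str:
    """Map the document score to a label by its sign."""
    score = score_document(text, lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Because no training step is involved, swapping in a different lexicon only means passing a different dictionary to `classify`, which is what makes a side-by-side comparison of lexicons straightforward in this kind of setup.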
Pages: 36-41
Number of pages: 6