Translation Is Not Enough: Comparing Lexicon-based Methods for Sentiment Analysis in Persian

Cited by: 0
Authors
Basiri, Mohammad Ehsan [1]
Kabiri, Arman [1]
Affiliations
[1] Shahrekord Univ, Dept Comp Engn, Shahrekord, Iran
Keywords
component; Sentiment Analysis; Natural Language Processing; Persian Language; Lexicon-based approach; Opinion mining; Data Mining; Social Media
DOI
Not available
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Sentiment analysis is a subfield of data mining and natural language processing that aims to extract people's opinions and appraisals from their comments on the Web. In contrast to machine learning approaches, lexicon-based methods have important advantages, such as domain independence and no need for a large annotated training corpus, and are therefore faster. This makes the lexicon-based approach prevalent in the sentiment analysis community. For Persian, however, unlike English, the use of lexicon-based methods is new. Few lexicons are available for sentiment analysis in Persian, and almost all of them are directly translated from English. In the current study, four lexicons are compared to show the importance of the lexicon to the performance of document-level sentiment analysis. Specifically, the Persian versions of the NRC lexicon, SentiStrength, CNRC, and Adjectives are compared in a pure lexicon-based scenario. Experiments are carried out on the document-level edition of the SPerSent dataset. The results show that the direct translation used in NRC leads to the poorest performance, while the pre-processing and refinement of the lexicons used in SentiStrength and CNRC improve performance. The results also show that using only adjectives yields better results than using NRC.
Pages: 36-41
Number of pages: 6
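
The "pure lexicon-based scenario" mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy English lexicon, the whitespace tokenizer, the summation rule, and the names `TOY_LEXICON`, `score_document`, and `classify` are all assumptions made for illustration. A real experiment would load one of the Persian lexicons and apply Persian-specific pre-processing.

```python
from typing import Dict, Iterable

# Hypothetical toy lexicon (word -> polarity score), for illustration only.
# The lexicons compared in the paper are not reproduced here.
TOY_LEXICON: Dict[str, int] = {
    "good": 1,
    "excellent": 2,
    "bad": -1,
    "terrible": -2,
}


def score_document(tokens: Iterable[str], lexicon: Dict[str, int]) -> int:
    """Sum the polarity of every token found in the lexicon.

    Tokens missing from the lexicon contribute 0.
    """
    return sum(lexicon.get(token.lower(), 0) for token in tokens)


def classify(text: str, lexicon: Dict[str, int]) -> str:
    """Label a whole document by the sign of its summed lexicon score."""
    # Naive whitespace tokenization; an assumption, not the paper's method.
    score = score_document(text.split(), lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

With this sketch, `classify("The camera is excellent but the battery is bad", TOY_LEXICON)` sums 2 and -1 and returns `"positive"`. Because the label depends entirely on which words the lexicon contains and how they are scored, a poorly translated lexicon can change the outcome, which is the effect the study measures.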