Effect of Various Factors in Context of Feature Selection on Opinion Spam Detection

Cited by: 2
Authors
Rastogi, Ajay [1]
Mehrotra, Monica [1]
Ali, Syed Shafat [1]
Affiliations
[1] Jamia Millia Islamia, Dept Comp Sci, New Delhi, India
Keywords
feature selection; opinion spam; online reviews; classification; filter-based; model-based
DOI
10.1109/Confluence51648.2021.9377056
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
With the growing popularity of online reviews, spammers often target specific products or services in order to mislead consumers in their purchase decisions. This has prompted researchers to study the problem of opinion spam detection. To date, many effective and efficient solutions using various types of features have been proposed. However, most feature engineering pipelines extract thousands of features, which can degrade performance and increase the computation cost of many machine learning algorithms. Feature selection methods can greatly improve classification performance while reducing the computation cost of model training. In this paper, we investigate the effect of different feature selection techniques on opinion spam detection. To this end, various feature selection methods (filter-based and model-based) with varying numbers of features are employed to train four different classification models. In addition, three well-known review datasets from different domains (hotel, doctor and restaurant) and four different types of features, viz., unigram, bigram, part-of-speech frequency count and word embedding, are used to examine the impact of the different factors that influence performance in the opinion spam domain. Our experimental results demonstrate how these factors affect classification performance and cost; the findings are statistically validated using an Analysis of Variance (ANOVA) test.
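To make the idea of filter-based selection over unigram features concrete, here is a minimal sketch. It is not the authors' implementation: the abstract does not name the specific filters used, so the chi-square statistic is an assumed example of a filter criterion. The toy reviews, the labels, and the helper names `chi2_scores` and `select_top_k` are all hypothetical.

```python
from collections import Counter


def chi2_scores(docs, labels):
    """Chi-square score of each unigram's presence against a binary label.

    Assumption: label 1 = spam, label 0 = genuine. Tokenisation is a plain
    lowercase whitespace split, chosen only to keep the sketch short.
    """
    n = len(docs)
    n_pos = sum(labels)
    n_neg = n - n_pos
    present_pos = Counter()  # number of spam reviews containing each token
    present_neg = Counter()  # number of genuine reviews containing each token
    for text, y in zip(docs, labels):
        for tok in set(text.lower().split()):
            if y == 1:
                present_pos[tok] += 1
            else:
                present_neg[tok] += 1

    scores = {}
    for tok in set(present_pos) | set(present_neg):
        a = present_pos[tok]  # spam, token present
        b = present_neg[tok]  # genuine, token present
        c = n_pos - a         # spam, token absent
        d = n_neg - b         # genuine, token absent
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[tok] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def select_top_k(scores, k):
    """Keep the k highest-scoring tokens; ties are broken alphabetically."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tok for tok, _ in ranked[:k]]


# Hypothetical toy data, for illustration only.
docs = [
    "amazing hotel amazing stay",
    "amazing luxury experience",
    "room was small but clean",
    "clean room near the station",
]
labels = [1, 1, 0, 0]

scores = chi2_scores(docs, labels)
selected = select_top_k(scores, k=3)
print(selected)
```

Varying `k` corresponds to the "varying numbers of features" factor in the abstract: a classifier would then be trained only on the selected columns, and the performance and training cost compared across values of `k`.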
Pages: 778-783
Number of pages: 6
Related papers
50 in total
  • [31] Impact of Behavioral and Textual Features on Opinion Spam Detection
    Rastogi, Ajay
    Mehrotra, Monica
    PROCEEDINGS OF THE 2018 SECOND INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING AND CONTROL SYSTEMS (ICICCS), 2018, : 852 - 857
  • [32] Securing Behavior-based Opinion Spam Detection
    Ge, Shuaijun
    Ma, Guixiang
    Xie, Sihong
    Yu, Philip S.
    2018 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2018, : 112 - 117
  • [33] Fusion Convolutional Attention Network for Opinion Spam Detection
    Li, Jiacheng
    Ma, Qianwen
    Yuan, Chunyuan
    Zhou, Wei
    Han, Jizhong
    Hu, Songlin
    NEURAL INFORMATION PROCESSING (ICONIP 2019), PT I, 2019, 11953 : 223 - 235
  • [34] A Contextual Relationship Model for Deceptive Opinion Spam Detection
    Fahfouh, Anass
    Riffi, Jamal
    Mahraz, Mohamed Adnane
    Yahyaouy, Ali
    Tairi, Hamid
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (01) : 1228 - 1239
  • [35] Detection of opinion spam based on anomalous rating deviation
    Savage, David
    Zhang, Xiuzhen
    Yu, Xinghuo
    Chou, Pauline
    Wang, Qingmai
    EXPERT SYSTEMS WITH APPLICATIONS, 2015, 42 (22) : 8650 - 8657
  • [36] Detection of Opinion Spam with Character n-grams
    Hernandez Fusilier, Donato
    Montes-y-Gomez, Manuel
    Rosso, Paolo
    Guzman Cabrera, Rafael
    COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING (CICLING 2015), PT II, 2015, 9042 : 285 - 294
  • [37] Deceptive opinion spam detection approaches: a literature survey
    Maurya, Sushil Kumar
    Singh, Dinesh
    Maurya, Ashish Kumar
    APPLIED INTELLIGENCE, 2023, 53 (02) : 2189 - 2234
  • [38] Opinion Spam Detection Based on Heterogeneous Information Network
    Sun, Yingcheng
    Loparo, Kenneth
    2019 IEEE 31ST INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2019), 2019, : 1156 - 1163
  • [39] Opinion Spam Detection using Review and Reviewer Centric Features
    Dominic, Dilsha
    Lijo, V. P.
    2017 IEEE INTERNATIONAL CONFERENCE ON POWER, CONTROL, SIGNALS AND INSTRUMENTATION ENGINEERING (ICPCSI), 2017, : 1154 - 1159
  • [40] Learning Document Representation for Deceptive Opinion Spam Detection
    Li, Luyang
    Ren, Wenjing
    Qin, Bing
    Liu, Ting
    CHINESE COMPUTATIONAL LINGUISTICS AND NATURAL LANGUAGE PROCESSING BASED ON NATURALLY ANNOTATED BIG DATA (CCL 2015), 2015, 9427 : 393 - 404