iSRD: Spam Review Detection with Imbalanced Data Distributions

Cited by: 0
Authors
Al Najada, Hamzah [1]
Zhu, Xingquan [1]
Affiliations
[1] Florida Atlantic Univ, Dept Comp & Elect Engn & Comp Sci, Boca Raton, FL 33431 USA
Source
2014 IEEE 15TH INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION (IRI) | 2014
Keywords
Data sampling; fake reviews; imbalanced data distributions; sentiment analysis; classification
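DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract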
The Internet plays an essential role in modern information systems. Applications such as e-commerce websites are now widely available for people to purchase different types of products online. During online shopping, users often rely on reviews from previous customers to make their final decision. Because online reviews play an essential role in the sale of online products (or services), some vendors (or customers) post fake/spam reviews to mislead customers. False product reviews may result in unfair market competition and financial loss for customers or vendors. In this research, we aim to distinguish between spam and non-spam reviews by using supervised classification methods. When training a classifier to identify spam vs. non-spam reviews, a challenging issue is that spam reviews make up only a very small portion of online reviews. This naturally leads to a data imbalance issue when training classifiers for spam review detection: learning methods that do not emphasize minority samples (i.e., spam) may perform poorly at detecting spam reviews, even though their overall accuracy might be relatively high. To tackle this challenge, we employ a bagging-based approach to build a number of balanced datasets, on which we train a set of spam classifiers and use their ensemble to detect review spam. Experiments and comparisons demonstrate that our method, iSRD, outperforms baseline methods for review spam detection.
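The bagging idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the balanced datasets are drawn or which base classifier is used. The sketch assumes each bag keeps all spam samples and adds an equally sized random draw of non-spam samples, uses a toy nearest-centroid learner, and runs on synthetic two-dimensional features. All function names and the data are hypothetical.

```python
import random
from collections import Counter

def balanced_bags(samples, labels, n_bags, rng):
    """Build n_bags balanced datasets (assumed scheme): all minority
    samples plus an equally sized random draw from the majority class."""
    counts = Counter(labels)
    minority = min(counts, key=counts.get)
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    majority_idx = [i for i, y in enumerate(labels) if y != minority]
    bags = []
    for _ in range(n_bags):
        drawn = rng.sample(majority_idx, len(minority_idx))
        idx = minority_idx + drawn
        bags.append(([samples[i] for i in idx], [labels[i] for i in idx]))
    return bags

def train_centroid(samples, labels):
    """Toy base learner (an assumption): one mean vector per class."""
    sums, counts = {}, Counter()
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for j, v in enumerate(x):
            acc[j] += v
        counts[y] += 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def predict_centroid(model, x):
    """Return the class whose centroid is closest to x."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(model, key=lambda y: sq_dist(model[y]))

def ensemble_predict(models, x):
    """Majority vote over the classifiers trained on the balanced bags."""
    votes = Counter(predict_centroid(m, x) for m in models)
    return votes.most_common(1)[0][0]

# Synthetic, imbalanced data: 200 non-spam ("ham") vs. 10 spam samples.
rng = random.Random(0)
samples = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(200)]
samples += [(rng.gauss(3, 0.5), rng.gauss(3, 0.5)) for _ in range(10)]
labels = ["ham"] * 200 + ["spam"] * 10

bags = balanced_bags(samples, labels, n_bags=5, rng=rng)
models = [train_centroid(xs, ys) for xs, ys in bags]
print(ensemble_predict(models, (3.0, 3.0)))
```

An odd number of bags avoids tied votes in the two-class case. A real system would replace the synthetic features with features extracted from review text and the toy learner with a proper classifier.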
Pages: 553-560
Number of pages: 8
Related papers
50 in total
  • [1] A Heterogeneous Ensemble Learning Framework for Spam Detection in Social Networks with Imbalanced Data
    Zhao, Chensu
    Xin, Yang
    Li, Xuefeng
    Yang, Yixian
    Chen, Yuling
    APPLIED SCIENCES-BASEL, 2020, 10 (03)
  • [2] Hybrid ensemble framework with self-attention mechanism for social spam detection on imbalanced data
    Rao, Sanjeev
    Verma, Anil Kumar
    Bhatia, Tarunpreet
    EXPERT SYSTEMS WITH APPLICATIONS, 2023, 217
  • [3] Battering Review Spam Through Ensemble Learning in Imbalanced Datasets
    Khurshid, Faisal
    Zhu, Yan
    Hu, Jie
    Ahmad, Muqeet
    Ahmad, Mushtaq
    COMPUTER JOURNAL, 2022, 65 (07) : 1666 - 1678
  • [4] Fast and effective spam sender detection with granular SVM on highly imbalanced mail server behavior data
    Tang, Yuchun
    Krasser, Sven
    Judge, Paul
    2006 INTERNATIONAL CONFERENCE ON COLLABORATIVE COMPUTING: NETWORKING, APPLICATIONS AND WORKSHARING, 2006, : 18 - +
  • [5] Detection of review spam: A survey
    Heydari, Atefeh
    Tavakoli, Mohammad Ali
    Salim, Naomie
    Heydari, Zahra
    EXPERT SYSTEMS WITH APPLICATIONS, 2015, 42 (07) : 3634 - 3642
  • [6] Addressing Imbalanced Data in Network Intrusion Detection: A Review and Survey
    Al-Qarni, Elham Abdullah
    Al-Asmari, Ghadah Ahmad
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2024, 15 (02) : 136 - 143
  • [7] Irony detection in Twitter with imbalanced class distributions
    Farias, Delia Irazu Hernandez
    Prali, Ronaldo
    Herrera, Francisco
    Rosso, Paolo
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2020, 39 (02) : 2147 - 2163
  • [8] Ensemble Classifiers for Spam Review Detection
    Ibrahim, Alhassan J.
    Siraj, Maheyzah Md
    Din, Mazura Mat
    2017 IEEE CONFERENCE ON APPLICATION, INFORMATION AND NETWORK SECURITY (AINS), 2017, : 130 - 134
  • [9] Spam Detection In Social Networks: A Review
    Eshraqi, Nasim
    Jalali, Mehrdad
    Moattar, Mohammad Hossein
    SECOND INTERNATIONAL CONGRESS ON TECHNOLOGY, COMMUNICATION AND KNOWLEDGE (ICTCK 2015), 2015, : 148 - 152
  • [10] An imbalanced spam mail filtering method
    Ma, Zhiqiang
    Yan, Rui
    Yuan, Dongliong
    Liu, Limin
    International Journal of Multimedia and Ubiquitous Engineering, 2015, 10 (03) : 119 - 126