iSRD: Spam Review Detection with Imbalanced Data Distributions

Cited by: 0

Authors:

Al Najada, Hamzah [1]

Zhu, Xingquan [1]

Affiliations:

[1] Florida Atlantic Univ, Dept Comp & Elect Engn & Comp Sci, Boca Raton, FL 33431 USA

Source:

2014 IEEE 15TH INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION (IRI) | 2014

Keywords:

Data sampling; fake reviews; imbalanced data distributions; sentiment analysis; classification

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

The Internet plays an essential role in modern information systems. Applications such as e-commerce websites have made it common for people to purchase many types of products online. When shopping online, users often rely on reviews from previous customers to make their final decision. Because online reviews strongly influence the sale of online products (or services), some vendors (or customers) post fake/spam reviews to mislead customers. False product reviews may result in unfair market competition and financial loss for customers or vendors. In this research, we aim to distinguish between spam and non-spam reviews by using supervised classification methods. A challenging issue in training a classifier to identify spam vs. non-spam reviews is that spam reviews make up only a very small portion of online reviews. This naturally leads to a data imbalance problem in training classifiers for spam review detection: learning methods that do not emphasize minority samples (i.e., spam) may perform poorly at detecting spam reviews, even though their overall accuracy might be relatively high. To tackle this challenge, we employ a bagging-based approach to build a number of balanced datasets, on which we train a set of spam classifiers and use their ensemble to detect review spam. Experiments and comparisons demonstrate that our method, iSRD, outperforms baseline methods for review spam detection.
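The bagging idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' iSRD implementation: the abstract does not specify the sampling scheme, the features, or the base classifier, so everything below is an assumption made for illustration. Here each bag bootstraps the spam (minority) samples and draws an equal number of non-spam samples, the base learner is a toy nearest-centroid classifier, the data are synthetic 2-D points rather than review features, and all function names are hypothetical.

```python
import random
from collections import Counter


class NearestCentroid:
    """Toy base learner (an assumption): predict the class with the closest mean vector."""

    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_one(self, x):
        def dist2(c):
            return sum((a - b) ** 2 for a, b in zip(x, c))
        return min(self.centroids, key=lambda label: dist2(self.centroids[label]))


def balanced_bags(X, y, minority, n_bags, rng):
    """Yield n_bags training sets with equal numbers of minority and majority samples."""
    minority_idx = [i for i, t in enumerate(y) if t == minority]
    majority_idx = [i for i, t in enumerate(y) if t != minority]
    k = len(minority_idx)
    for _ in range(n_bags):
        # Sample with replacement from each class so every bag is balanced.
        idx = [rng.choice(minority_idx) for _ in range(k)]
        idx += [rng.choice(majority_idx) for _ in range(k)]
        yield [X[i] for i in idx], [y[i] for i in idx]


def train_ensemble(X, y, minority=1, n_bags=11, seed=0):
    """Train one base classifier per balanced bag."""
    rng = random.Random(seed)
    return [NearestCentroid().fit(bx, by)
            for bx, by in balanced_bags(X, y, minority, n_bags, rng)]


def predict(ensemble, x):
    """Majority vote over the ensemble members."""
    votes = Counter(model.predict_one(x) for model in ensemble)
    return votes.most_common(1)[0][0]


# Synthetic, imbalanced toy data: 200 non-spam (label 0) and 10 spam (label 1).
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 0.5), data_rng.gauss(0, 0.5)] for _ in range(200)]
X += [[data_rng.gauss(4, 0.5), data_rng.gauss(4, 0.5)] for _ in range(10)]
y = [0] * 200 + [1] * 10

ensemble = train_ensemble(X, y, minority=1, n_bags=11, seed=0)

# A classifier that always answers "non-spam" scores 200/210 accuracy here
# while detecting no spam at all, which is why accuracy alone is misleading.
majority_baseline_accuracy = y.count(0) / len(y)
```

An odd `n_bags` avoids tied votes in the two-class case. In practice the toy learner would be replaced by whatever classifier and review features the task calls for.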


Pages: 553-560

Page count: 8

50 items in total

[1] A Heterogeneous Ensemble Learning Framework for Spam Detection in Social Networks with Imbalanced Data
Zhao, Chensu
Xin, Yang
Li, Xuefeng
Yang, Yixian
Chen, Yuling
APPLIED SCIENCES-BASEL, 2020, 10 (03)
[2] Hybrid ensemble framework with self-attention mechanism for social spam detection on imbalanced data
Rao, Sanjeev
Verma, Anil Kumar
Bhatia, Tarunpreet
EXPERT SYSTEMS WITH APPLICATIONS, 2023, 217
[3] Battering Review Spam Through Ensemble Learning in Imbalanced Datasets
Khurshid, Faisal
Zhu, Yan
Hu, Jie
Ahmad, Muqeet
Ahmad, Mushtaq
COMPUTER JOURNAL, 2022, 65 (07): 1666 - 1678
[4] Fast and effective spam sender detection with granular SVM on highly imbalanced mail server behavior data
Tang, Yuchun
Krasser, Sven
Judge, Paul
2006 INTERNATIONAL CONFERENCE ON COLLABORATIVE COMPUTING: NETWORKING, APPLICATIONS AND WORKSHARING, 2006: 18 - +
[5] Detection of review spam: A survey
Heydari, Atefeh
Tavakoli, Mohammad Ali
Salim, Naomie
Heydari, Zahra
EXPERT SYSTEMS WITH APPLICATIONS, 2015, 42 (07): 3634 - 3642
[6] Addressing Imbalanced Data in Network Intrusion Detection: A Review and Survey
Al-Qarni, Elham Abdullah
Al-Asmari, Ghadah Ahmad
INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2024, 15 (02): 136 - 143
[7] Irony detection in Twitter with imbalanced class distributions
Farias, Delia Irazu Hernandez
Prati, Ronaldo
Herrera, Francisco
Rosso, Paolo
JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2020, 39 (02): 2147 - 2163
[8] Ensemble Classifiers for Spam Review Detection
Ibrahim, Alhassan J.
Siraj, Maheyzah Md
Din, Mazura Mat
2017 IEEE CONFERENCE ON APPLICATION, INFORMATION AND NETWORK SECURITY (AINS), 2017: 130 - 134
[9] Spam Detection In Social Networks: A Review
Eshraqi, Nasim
Jalali, Mehrdad
Moattar, Mohammad Hossein
SECOND INTERNATIONAL CONGRESS ON TECHNOLOGY, COMMUNICATION AND KNOWLEDGE (ICTCK 2015), 2015: 148 - 152
[10] An imbalanced spam mail filtering method
Ma, Zhiqiang
Yan, Rui
Yuan, Dongliong
Liu, Limin
International Journal of Multimedia and Ubiquitous Engineering, 2015, 10 (03): 119 - 126
