Spam Detection Using Clustering-Based SVM

Author
Pandya, Darshit [1]
Affiliation
[1] Indus Univ, Dept Comp Engn, Ahmadabad 382115, Gujarat, India
Keywords
Text Classification; SVM; Clustering;
DOI
10.1145/3366750.3366754
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Spam detection matters more than ever because of the growing use of messaging and email services. Classifying such a wide variety of messages efficiently is a comparatively onerous task. Many machine learning algorithms are used for spam detection, one of which is the Support Vector Machine (SVM). Although SVM is widely used to classify text documents, its performance on spam classification is limited by the uneven density of the training data. To improve SVM in this setting, I introduce a clustering-based SVM method: the training data is pre-processed with clustering algorithms, and the SVM classifier is then trained on the processed dataset. The method aims to improve performance by overcoming the uneven distribution of the training data. Experimental results show improved performance compared with a plain SVM.
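The abstract does not say which clustering algorithm is used or how the clusters feed the classifier, so the following is only a minimal sketch of one plausible reading, not the author's implementation. It assumes scikit-learn, TF-IDF features, per-class k-means, and a linear SVM trained on the cluster centroids instead of the raw messages, so that densely sampled regions do not dominate. The function name `cluster_then_svm`, the parameter `clusters_per_class`, and the toy messages are all made up for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC


def cluster_then_svm(texts, labels, clusters_per_class=2, seed=0):
    """Fit a linear SVM on per-class k-means centroids of TF-IDF vectors.

    Assumption: replacing each class's points by its cluster centroids is
    one way to even out training-data density; the paper may differ.
    """
    vectorizer = TfidfVectorizer()
    X = vectorizer.fit_transform(texts).toarray()
    y = np.asarray(labels)

    centroids, centroid_labels = [], []
    for cls in np.unique(y):
        X_cls = X[y == cls]
        # Never ask for more clusters than there are samples in the class.
        k = min(clusters_per_class, len(X_cls))
        km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X_cls)
        centroids.append(km.cluster_centers_)
        centroid_labels.extend([cls] * k)

    # Train the SVM on the reduced, density-balanced set of centroids.
    svm = LinearSVC(C=1.0, random_state=seed)
    svm.fit(np.vstack(centroids), np.asarray(centroid_labels))
    return vectorizer, svm


# Hypothetical toy data: 1 = spam, 0 = legitimate.
texts = [
    "win free prize now",
    "free cash win now",
    "claim free prize cash",
    "meeting at noon tomorrow",
    "lunch at noon today",
    "project meeting notes tomorrow",
]
labels = [1, 1, 1, 0, 0, 0]

vectorizer, svm = cluster_then_svm(texts, labels)
new_messages = ["free prize win", "meeting tomorrow at noon"]
preds = svm.predict(vectorizer.transform(new_messages).toarray())
print(dict(zip(new_messages, preds.tolist())))
```

Training on centroids is only one option; other clustering-based variants keep the original points and use cluster membership to reweight or resample them.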
Pages: 12 - 15
Page count: 4
Related papers
50 in total
  • [1] Clustering-based Spam Image Filtering Considering Fuzziness of the Spam Image
    Prince, Master
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2016, 7 (12) : 268 - 270
  • [2] Web Spam Detection using SVM Classifier
    Patil, Rahul C.
    Patil, D. R.
    PROCEEDINGS OF 2015 IEEE 9TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS AND CONTROL (ISCO), 2015,
  • [3] Spam Detection using Dynamic Weighted Voting based on Clustering
    Saeedian, Mehmoush Famil
    Beigy, Hamid
    2008 INTERNATIONAL SYMPOSIUM ON INTELLIGENT INFORMATION TECHNOLOGY APPLICATION, VOL II, PROCEEDINGS, 2008, : 122 - 126
  • [4] CLUSTERING-BASED NETWORK INTRUSION DETECTION
    Zhong, Shi
    Khoshgoftaar, Taghi M.
    Seliya, Naeem
    INTERNATIONAL JOURNAL OF RELIABILITY QUALITY AND SAFETY ENGINEERING, 2007, 14 (02) : 169 - 187
  • [5] Clustering-Based Trajectory Outlier Detection
    Eldawy, Eman O.
    Mokhtar, Hoda M. O.
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2020, 11 (05) : 133 - 139
  • [6] Clustering-Based Outlier Detection Method
    Jiang, Sheng-yi
    An, Qing-bo
    FIFTH INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY, VOL 2, PROCEEDINGS, 2008, : 429 - 433
  • [7] Spam query detection using stream clustering
    Shakiba, Tahere
    Zarifzadeh, Sajjad
    Derhami, Vali
    WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2018, 21 (02) : 557 - 572
  • [8] CLUSTERING-BASED SUBSPACE SVM ENSEMBLE FOR RELEVANCE FEEDBACK LEARNING
    Ji, Rongrong
    Yao, Hongxun
    Wang, Jicheng
    Xu, Pengfei
    Liu, Xianming
    2008 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOLS 1-4, 2008, : 1221 - 1224
  • [9] Spam query detection using stream clustering
    Tahere Shakiba
    Sajjad Zarifzadeh
    Vali Derhami
    World Wide Web, 2018, 21 : 557 - 572
  • [10] Addressing Class Imbalance in Customer Response Modeling Using Random and Clustering-Based Undersampling and SVM
    Kascelan, Ljiljana
    Vukovic, Suncica
    IPSI BGD TRANSACTIONS ON INTERNET RESEARCH, 2024, 20 (02) : 1 - 13