Statistical Detection of Online Drifting Twitter Spam [Invited Paper]

Cited by: 45
Authors
Liu, Shigang [1]
Zhang, Jun [1]
Xiang, Yang [1]
Affiliation
[1] Deakin Univ, Sch Informat Technol, 221 Burwood Hwy, Burwood, Vic 3125, Australia
Keywords
Twitter spam detection; social network security; security data analytics
DOI
10.1145/2897845.2897928
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Spam has become a critical problem in online social networks. This paper focuses on Twitter spam detection. Recent research applies machine learning techniques to Twitter spam detection, making use of the statistical features of tweets. We observe that existing machine-learning-based detection methods suffer from the problem of Twitter spam drift, i.e., the statistical properties of spam tweets vary over time. One effective way to address this problem is to train a new Twitter spam classifier every day. However, this approach faces the challenge of a small amount of imbalanced training data, because labelling spam samples is time-consuming. This paper proposes a new method to address this challenge. The new method employs two new techniques: fuzzy-based redistribution and asymmetric sampling. We develop a fuzzy-based information decomposition technique to redistribute the spam class and generate more spam samples. Moreover, an asymmetric sampling technique is proposed to rebalance the sizes of the spam and non-spam samples in the training data. Finally, we apply an ensemble technique to combine the spam classifiers trained over two different training sets. A number of experiments are performed on a real-world 10-day ground-truth dataset to evaluate the new method. Experimental results show that the new method can significantly improve the detection performance for drifting Twitter spam.
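The rebalance-then-ensemble idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the fuzzy-based information decomposition step, so it is left out here, and plain random over-sampling with replacement stands in for spam generation. The function names (`asymmetric_resample`, `train_centroid_classifier`, `ensemble_predict`), the toy nearest-centroid classifier, the two-feature toy data, and the choice of "original set plus rebalanced set" as the two training sets are all assumptions made for illustration.

```python
import random
from statistics import mean

def asymmetric_resample(spam, ham, target_size, seed=0):
    """Over-sample spam with replacement and under-sample non-spam
    without replacement (assumed stand-in for the paper's sampling step)."""
    rng = random.Random(seed)
    spam_out = [rng.choice(spam) for _ in range(target_size)]
    ham_out = rng.sample(ham, min(target_size, len(ham)))
    return spam_out, ham_out

def centroid(rows):
    """Column-wise mean of a list of feature vectors."""
    return [mean(col) for col in zip(*rows)]

def train_centroid_classifier(spam, ham):
    """Toy nearest-centroid classifier (hypothetical, not from the paper).
    Returns a function mapping a feature vector to a spam score in [0, 1]."""
    c_spam, c_ham = centroid(spam), centroid(ham)

    def score(x):
        d_spam = sum((a - b) ** 2 for a, b in zip(x, c_spam)) ** 0.5
        d_ham = sum((a - b) ** 2 for a, b in zip(x, c_ham)) ** 0.5
        total = d_spam + d_ham
        # Closer to the spam centroid means a higher spam score.
        return 0.5 if total == 0 else d_ham / total

    return score

def ensemble_predict(scorers, x, threshold=0.5):
    """Average the spam scores of all classifiers and threshold the result."""
    return mean(s(x) for s in scorers) > threshold

# Toy imbalanced data: 2 spam samples versus 8 non-spam samples.
spam = [[0.9, 0.8], [1.0, 0.9]]
ham = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.15], [0.05, 0.1],
       [0.1, 0.05], [0.2, 0.2], [0.0, 0.1], [0.1, 0.0]]

# Two training sets: the original one and a rebalanced one (4 vs 4).
spam_bal, ham_bal = asymmetric_resample(spam, ham, target_size=4)
scorers = [train_centroid_classifier(spam, ham),
           train_centroid_classifier(spam_bal, ham_bal)]

print(ensemble_predict(scorers, [0.95, 0.85]))
print(ensemble_predict(scorers, [0.1, 0.1]))
```

In a daily-retraining setting, a sketch like this would be rerun on each day's freshly labelled samples, so the classifiers track the drifting spam statistics rather than relying on stale training data.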
Pages: 1 / 10
Number of pages: 10