Statistical Detection of Online Drifting Twitter Spam [Invited Paper]

Cited by: 45
Authors
Liu, Shigang [1]
Zhang, Jun [1]
Xiang, Yang [1]
Affiliation
[1] Deakin Univ, Sch Informat Technol, 221 Burwood Hwy, Burwood, Vic 3125, Australia
Keywords
Twitter spam detection; social network security; security data analytics
DOI
10.1145/2897845.2897928
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Spam has become a critical problem in online social networks. This paper focuses on Twitter spam detection. Recent research applies machine learning techniques that use the statistical features of tweets. We observe that existing machine-learning-based detection methods suffer from Twitter spam drift, i.e., the statistical properties of spam tweets vary over time. An effective way to avoid this problem is to train a new Twitter spam classifier every day. However, this approach is challenged by small, imbalanced training sets, because labelling spam samples is time-consuming. This paper proposes a new method to address this challenge. The method employs two new techniques: fuzzy-based redistribution and asymmetric sampling. We develop a fuzzy-based information decomposition technique to redistribute the spam class and generate more spam samples. Moreover, an asymmetric sampling technique is proposed to rebalance the sizes of the spam and non-spam samples in the training data. Finally, we apply an ensemble technique to combine the spam classifiers trained on two different training sets. A number of experiments are performed on a real-world 10-day ground-truth dataset to evaluate the new method. Experimental results show that the new method can significantly improve detection performance for drifting Twitter spam.
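The rebalancing and ensemble steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no algorithmic detail, so plain random oversampling with replacement stands in for the fuzzy-based information decomposition step, and the function names, the `target_per_class` parameter, the feature-tuple layout, and the score-averaging rule are all assumptions made for illustration.

```python
import random
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[float, ...]  # one tweet's feature vector (hypothetical layout)


def asymmetric_resample(
    spam: Sequence[Sample],
    non_spam: Sequence[Sample],
    target_per_class: int,
    seed: int = 0,
) -> Tuple[List[Sample], List[int]]:
    """Return a class-balanced training set with target_per_class samples per class.

    Assumption: random oversampling replaces the paper's fuzzy-based
    sample generation, which the abstract does not specify.
    """
    if not spam or not non_spam:
        raise ValueError("both classes need at least one sample")
    rng = random.Random(seed)

    # Minority side: keep the labelled spam, then pad by drawing with replacement.
    extra_spam = max(0, target_per_class - len(spam))
    spam_out = list(spam)[:target_per_class]
    spam_out += [rng.choice(spam) for _ in range(extra_spam)]

    # Majority side: draw without replacement when enough samples exist.
    if len(non_spam) >= target_per_class:
        ham_out = rng.sample(list(non_spam), target_per_class)
    else:
        extra_ham = target_per_class - len(non_spam)
        ham_out = list(non_spam) + [rng.choice(non_spam) for _ in range(extra_ham)]

    features = spam_out + ham_out
    labels = [1] * len(spam_out) + [0] * len(ham_out)  # 1 = spam, 0 = non-spam
    return features, labels


def ensemble_score(scorers: Sequence[Callable[[Sample], float]], x: Sample) -> float:
    """Average the spam scores of several classifiers (one assumed combination rule)."""
    return sum(scorer(x) for scorer in scorers) / len(scorers)


# Toy data: 2 labelled spam tweets versus 6 non-spam tweets.
spam = [(0.9, 3.0), (0.8, 5.0)]
non_spam = [(0.1, 0.0), (0.2, 1.0), (0.1, 1.0), (0.3, 0.0), (0.2, 0.0), (0.0, 1.0)]
X, y = asymmetric_resample(spam, non_spam, target_per_class=4, seed=42)

# Two stand-in scorers, as if trained on two different training sets.
combined = ensemble_score([lambda x: 0.2, lambda x: 0.8], X[0])
```

The two classes are treated differently on purpose: the scarce spam samples are all kept and padded, while the abundant non-spam samples are thinned, so the resulting set has equal class sizes.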
Pages: 1-10
Number of pages: 10
Related papers
50 in total
  • [31] 6 Million Spam Tweets: A Large Ground Truth for Timely Twitter Spam Detection
    Chen, Chao
    Zhang, Jun
    Chen, Xiao
    Xiang, Yang
    Zhou, Wanlei
    2015 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS (ICC), 2015, : 7065 - 7070
  • [32] Towards Online Review Spam Detection
    Lin, Yuming
    Zhu, Tao
    Wang, Xiaoling
    Zhang, Jingwei
    Zhou, Aoying
    WWW'14 COMPANION: PROCEEDINGS OF THE 23RD INTERNATIONAL CONFERENCE ON WORLD WIDE WEB, 2014, : 341 - 342
  • [33] Spamming the Mainstream: A Survey on Trending Twitter Spam Detection Techniques
    Lalitha, L. A.
    Hulipalled, Vishwanath R.
    Venugopal, K. R.
    PROCEEDINGS OF THE 2017 INTERNATIONAL CONFERENCE ON SMART TECHNOLOGIES FOR SMART NATION (SMARTTECHCON), 2017, : 444 - 448
  • [34] Twitter spam detection: Survey of new approaches and comparative study
    Wu, Tingmin
    Wen, Sheng
    Xiang, Yang
    Zhou, Wanlei
    COMPUTERS & SECURITY, 2018, 76 : 265 - 284
  • [35] Twitter spam account detection based on clustering and classification methods
    Adewole, Kayode Sakariyah
    Han, Tao
    Wu, Wanqing
    Song, Houbing
    Sangaiah, Arun Kumar
    The Journal of Supercomputing, 2020, 76 : 4802 - 4837
  • [36] A comparative study of the class imbalance problem in Twitter spam detection
    Li, Chaoliang
    Liu, Shigang
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2018, 30 (05)
  • [37] Boosting Social Spam Detection via Attention Mechanisms on Twitter
    Shen, Hua
    Liu, Xinyue
    Zhang, Xianchao
    ELECTRONICS, 2022, 11 (07)
  • [38] Twitter spam account detection based on clustering and classification methods
    Adewole, Kayode Sakariyah
    Han, Tao
    Wu, Wanqing
    Song, Houbing
    Sangaiah, Arun Kumar
    JOURNAL OF SUPERCOMPUTING, 2020, 76 (07): : 4802 - 4837
  • [39] MACHINE LEARNING BASED TWITTER SPAM ACCOUNT DETECTION: A REVIEW
    Gheewala, Shivangi
    Patel, Rakesh
    PROCEEDINGS OF THE 2ND INTERNATIONAL CONFERENCE ON COMPUTING METHODOLOGIES AND COMMUNICATION (ICCMC 2018), 2018, : 79 - 84
  • [40] Twitter Spam Detection via Bilinear Autoencoding Reconstruction Error
    He, Qian
    Zhang, Sun
    Li, Bo
    Yin, Chunyong
    HUMAN-CENTRIC COMPUTING AND INFORMATION SCIENCES, 2022, 12