Statistical Detection of Online Drifting Twitter Spam [Invited Paper]

Cited by: 45
Authors
Liu, Shigang [1]
Zhang, Jun [1]
Xiang, Yang [1]
Affiliations
[1] Deakin Univ, Sch Informat Technol, 221 Burwood Hwy, Burwood, Vic 3125, Australia
Keywords
Twitter spam detection; social network security; security data analytics
DOI
10.1145/2897845.2897928
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Spam has become a critical problem in online social networks. This paper focuses on Twitter spam detection. Recent research applies machine learning techniques to Twitter spam detection, making use of the statistical features of tweets. We observe that existing machine-learning-based detection methods suffer from Twitter spam drift, i.e., the statistical properties of spam tweets vary over time. An effective way to counter this problem is to train a new Twitter spam classifier every day. However, this approach faces the challenge of a small, imbalanced training set, because labelling spam samples is time-consuming. This paper proposes a new method to address this challenge. The method employs two new techniques: fuzzy-based redistribution and asymmetric sampling. We develop a fuzzy-based information decomposition technique to redistribute the spam class and generate more spam samples. Moreover, an asymmetric sampling technique is proposed to rebalance the sizes of the spam and non-spam samples in the training data. Finally, we apply an ensemble technique to combine the spam classifiers trained over two different training sets. A number of experiments are performed on a real-world 10-day ground-truth dataset to evaluate the new method. Experimental results show that the new method can significantly improve detection performance for drifting Twitter spam.
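The rebalancing and ensemble steps named in the abstract can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration, not the authors' algorithm: the abstract does not specify the sampling rule, the base classifier, or how the two training sets are formed. Here `rebalance` simply under-samples non-spam and over-samples spam by duplication (a stand-in for the fuzzy-based information decomposition step, which is not reproduced), a nearest-centroid scorer stands in for the spam classifier, and `ensemble_predict` averages the scores of models trained on the original and the rebalanced sets. All function names are hypothetical.

```python
import math
import random
from statistics import fmean


def resize(rows, target, rng):
    """Shrink by sampling without replacement, or grow by duplicating rows."""
    if len(rows) >= target:
        return rng.sample(rows, target)
    return list(rows) + rng.choices(rows, k=target - len(rows))


def rebalance(spam, ham, target, seed=0):
    """Return spam and non-spam sets of equal size `target`.

    Assumption: plain duplication replaces the paper's fuzzy-based
    generation of new spam samples, whose details are not in the abstract.
    """
    rng = random.Random(seed)
    return resize(spam, target, rng), resize(ham, target, rng)


def fit_centroids(spam, ham):
    """Toy stand-in classifier: one mean feature vector per class."""
    def centroid(rows):
        return [fmean(col) for col in zip(*rows)]
    return centroid(spam), centroid(ham)


def spam_score(model, x):
    """Score in [0, 1]; higher means x is closer to the spam centroid."""
    c_spam, c_ham = model
    d_spam, d_ham = math.dist(x, c_spam), math.dist(x, c_ham)
    total = d_spam + d_ham
    return 0.5 if total == 0 else d_ham / total


def ensemble_predict(models, x, threshold=0.5):
    """Label x as spam when the mean score over all models exceeds threshold."""
    return fmean(spam_score(m, x) for m in models) > threshold


if __name__ == "__main__":
    # Hypothetical two-feature tweets: few spam samples, many non-spam samples.
    spam = [[5.0, 5.0], [6.0, 5.5]]
    ham = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5], [0.2, 0.8]]
    spam_bal, ham_bal = rebalance(spam, ham, target=4)
    models = [fit_centroids(spam, ham), fit_centroids(spam_bal, ham_bal)]
    print(ensemble_predict(models, [5.5, 5.0]), ensemble_predict(models, [0.3, 0.3]))
```

In a daily-retraining setting such as the one the abstract describes, a routine of this kind would be rerun on each day's labelled samples; a real system would swap in an actual learner and the paper's own sample-generation procedure.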
Pages: 1 - 10
Page count: 10