Predictive modeling for suspicious content identification on Twitter

Cited by: 0
Authors
Surendra Singh Gangwar
Santosh Singh Rathore
Satyendra Singh Chouhan
Sanskar Soni
Affiliations
[1] ABV-IIITM
[2] MNIT
Source
Social Network Analysis and Mining | 2022 / Vol. 12
Keywords
Suspicious content detection; User-content features; Natural language processing; Machine learning techniques; Social network
DOI
Not available
Abstract
The wide popularity of Twitter as a medium for exchanging activities, entertainment, and information has attracted spammers, who use it as a platform to spam users and spread misinformation. This challenges researchers to identify malicious content and user profiles on Twitter so that timely action can be taken. Many previous works have used different strategies to address this challenge and combat spammer activity on Twitter. In this work, we develop various models that use profile-based, content-based, and hybrid features to identify malicious content and classify it as spam or not-spam. First, we collect and label a large dataset from Twitter to create a spam detection corpus. Then, we extract a rich set of features from the collected dataset. Next, we apply different machine learning, ensemble, and deep learning techniques to build the prediction models. We comprehensively evaluate these techniques on the collected dataset and assess their performance in terms of accuracy, precision, recall, and F1-score. The results show that the learning techniques used achieve high performance for tweet spam classification: in most cases, the values are above 90% for the different performance measures. These results indicate that using profile, content, user, and hybrid features for suspicious tweet detection helps build better prediction models.
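The four evaluation measures named in the abstract (accuracy, precision, recall, and F1-score) can be illustrated with a minimal sketch for a binary spam / not-spam classifier. The labels and predictions below are made-up toy values, not data from the paper, and the names `Metrics` and `evaluate` are hypothetical; this record does not show the authors' actual evaluation code.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def evaluate(y_true, y_pred, positive="spam"):
    """Compute binary classification measures, treating `positive` as the target class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # spam correctly flagged
        elif pred == positive:
            fp += 1  # legitimate tweet wrongly flagged
        elif truth == positive:
            fn += 1  # spam that was missed
        else:
            tn += 1  # legitimate tweet correctly passed

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return Metrics(accuracy, precision, recall, f1)


# Toy labels (assumed for illustration only).
y_true = ["spam", "spam", "spam", "not-spam", "not-spam", "not-spam", "not-spam", "spam"]
y_pred = ["spam", "spam", "not-spam", "not-spam", "not-spam", "spam", "not-spam", "spam"]

m = evaluate(y_true, y_pred)
print(m)
```

With these toy values there are 3 true positives, 1 false positive, 1 false negative, and 3 true negatives, so all four measures come out to 0.75. On an imbalanced spam corpus the measures would generally diverge, which is one reason to report precision, recall, and F1-score alongside accuracy.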