Predictive modeling for suspicious content identification on Twitter

Cited by: 0

Authors:

Surendra Singh Gangwar

Santosh Singh Rathore

Satyendra Singh Chouhan

Sanskar Soni

Affiliations:

[1] ABV-IIITM

[2] MNIT

Source:

Social Network Analysis and Mining | 2022 / Vol. 12, Issue 1

Keywords:

Suspicious content detection; User-content features; Natural language processing; Machine learning techniques; Social network

DOI:

Not available

Abstract:

The wide popularity of Twitter as a medium for sharing activities, entertainment, and information has attracted spammers, who use it as a platform to spam users and spread misinformation. This challenges researchers to identify malicious content and user profiles on Twitter so that timely action can be taken. Many previous works have used different strategies to address this challenge and combat spammer activity on Twitter. In this work, we develop several models that use profile-based, content-based, and hybrid features to identify malicious content and classify it as spam or not-spam. First, we collect and label a large dataset from Twitter to create a spam detection corpus. Then, we extract a rich set of features from the collected dataset. Next, we apply different machine learning, ensemble, and deep learning techniques to build the prediction models. We perform a comprehensive evaluation of these techniques on the collected dataset and assess their performance in terms of accuracy, precision, recall, and F1-score. The results show that the learning techniques achieve high performance for tweet spam classification: in most cases, the values are above 90% for the different performance measures. These results show that using profile, content, user, and hybrid features for suspicious tweet detection helps build better prediction models.
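The abstract names hybrid (profile plus content) features and four evaluation measures without giving details. The following is a minimal Python sketch of both ideas, not the authors' implementation: the function names, the specific features (URL, hashtag, mention, and token counts), and the profile keys are all hypothetical choices made for illustration.

```python
def content_features(text):
    """Count simple surface cues in a tweet (hypothetical feature set)."""
    tokens = text.split()
    return {
        "n_urls": sum(t.startswith(("http://", "https://")) for t in tokens),
        "n_hashtags": sum(t.startswith("#") for t in tokens),
        "n_mentions": sum(t.startswith("@") for t in tokens),
        "n_tokens": len(tokens),
    }


def hybrid_features(profile, text):
    """Merge profile-based and content-based features into one dict.

    `profile` is assumed to be a dict of numeric account attributes,
    e.g. {"followers": 10, "following": 900}; the keys are illustrative.
    """
    features = dict(profile)
    features.update(content_features(text))
    return features


def evaluate(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for the positive (spam) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

In this sketch, a classifier would be trained on the dictionaries returned by `hybrid_features`, and `evaluate` would be called on its predictions for a held-out set with spam encoded as `1`.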
