Predictive modeling for suspicious content identification on Twitter

Cited by: 0

Authors:

Surendra Singh Gangwar

Santosh Singh Rathore

Satyendra Singh Chouhan

Sanskar Soni

Affiliations:

[1] ABV-IIITM

[2] MNIT

Source:

Social Network Analysis and Mining | 2022 / Vol. 12

Keywords:

Suspicious content detection; User-content features; Natural language processing; Machine learning techniques; Social network

DOI:

Not available

Chinese Library Classification number:

Subject classification number:

Abstract:

Twitter's wide popularity as a medium for exchanging activities, entertainment, and information has attracted spammers, who use it as a platform to spam users and spread misinformation. This challenges researchers to identify malicious content and user profiles on Twitter so that timely action can be taken. Many previous works have used different strategies to address this challenge and combat spammer activity on Twitter. In this work, we develop various models that use profile-based, content-based, and hybrid features to identify malicious content and classify it as spam or not-spam. First, we collect and label a large dataset from Twitter to create a spam detection corpus. Then, we extract a rich set of features from the collected dataset. Next, we apply different machine learning, ensemble, and deep learning techniques to build the prediction models. We comprehensively evaluate these techniques on the collected dataset and assess their performance in terms of accuracy, precision, recall, and F1-score. The results show that the learning techniques used achieve high performance for tweet spam classification; in most cases, the values are above 90% for the different performance measures. These results show that using profile, content, user, and hybrid features for suspicious tweet detection helps build better prediction models.
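
The four measures named in the abstract can be illustrated with a minimal sketch for the binary spam / not-spam setting. The function name `binary_metrics` and the toy labels are hypothetical and are not taken from the paper; this only shows how the measures are conventionally computed from confusion-matrix counts, not the authors' evaluation code.

```python
def binary_metrics(y_true, y_pred, positive="spam"):
    """Compute accuracy, precision, recall, and F1 for one positive class.

    Hypothetical helper for illustration; labels are plain strings.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels, invented for this example only (not data from the paper).
y_true = ["spam", "spam", "spam", "not-spam",
          "not-spam", "not-spam", "not-spam", "spam"]
y_pred = ["spam", "spam", "not-spam", "not-spam",
          "not-spam", "spam", "not-spam", "spam"]

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Reporting precision and recall alongside accuracy matters for spam detection because the classes are often imbalanced, so accuracy alone can look high even when many spam tweets are missed.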

