Predictive modeling for suspicious content identification on Twitter

Cited by: 0

Authors:

Surendra Singh Gangwar

Santosh Singh Rathore

Satyendra Singh Chouhan

Sanskar Soni

Affiliations:

[1] ABV-IIITM

[2] MNIT

Source:

Social Network Analysis and Mining | 2022 / Vol. 12

Keywords:

Suspicious content detection; User-content features; Natural language processing; Machine learning techniques; Social network

DOI:

Not available

Chinese Library Classification number:

Subject classification number:

Abstract:

The wide popularity of Twitter as a medium for exchanging activities, entertainment, and information has attracted spammers, who use it as a platform to spam users and spread misinformation. This challenges researchers to identify malicious content and user profiles on Twitter so that timely action can be taken. Many previous works have used different strategies to address this challenge and combat spammer activity on Twitter. In this work, we develop various models that use profile-based, content-based, and hybrid features to identify malicious content and classify it as spam or not-spam. First, we collect and label a large dataset from Twitter to create a spam detection corpus. Then, we extract a rich set of features from the collected dataset. Next, we apply different machine learning, ensemble, and deep learning techniques to build the prediction models. We perform a comprehensive evaluation of these techniques on the collected dataset and assess performance in terms of accuracy, precision, recall, and F1-score. The results show that the learning techniques used achieve high performance for tweet spam classification: in most cases, the values are above 90% for the different performance measures. These results show that using profile, content, user, and hybrid features for suspicious tweet detection helps build better prediction models.
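The four evaluation measures named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `classification_metrics`, the toy labels, and the choice of "spam" as the positive class are assumptions made only for illustration, and the numbers below are not results from the paper.

```python
def classification_metrics(y_true, y_pred, positive="spam"):
    """Compute accuracy, precision, recall, and F1 for a binary task.

    Hypothetical helper for illustration; `positive` names the class
    treated as the positive label (assumed here to be "spam").
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    # Guard every ratio against a zero denominator.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels (made up): 4 spam and 4 not-spam tweets.
y_true = ["spam", "spam", "spam", "not-spam",
          "not-spam", "not-spam", "not-spam", "spam"]
y_pred = ["spam", "spam", "not-spam", "not-spam",
          "not-spam", "spam", "not-spam", "spam"]

metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Precision and recall are reported alongside accuracy because spam corpora are often imbalanced, in which case accuracy alone can look high even when few spam tweets are caught.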
