Predictive modeling for suspicious content identification on Twitter

Cited by: 0
Authors
Surendra Singh Gangwar
Santosh Singh Rathore
Satyendra Singh Chouhan
Sanskar Soni
Affiliations
[1] ABV-IIITM
[2] MNIT
Source
Social Network Analysis and Mining | 2022, Vol. 12
Keywords
Suspicious content detection; User-content features; Natural language processing; Machine learning techniques; Social network
DOI
Not available
Abstract
The wide popularity of Twitter as a medium for exchanging activities, entertainment, and information has attracted spammers, who use it as a platform to spam users and spread misinformation. This challenges researchers to identify malicious content and user profiles on Twitter so that timely action can be taken. Many previous works have used different strategies to address this challenge and combat spammer activity on Twitter. In this work, we develop various models that use profile-based, content-based, and hybrid features to identify malicious content and classify it as spam or not-spam. First, we collect and label a large dataset from Twitter to create a spam detection corpus. Then, we extract a rich set of features from the collected dataset. Next, we apply different machine learning, ensemble, and deep learning techniques to build the prediction models. We perform a comprehensive evaluation of these techniques on the collected dataset and assess their performance in terms of accuracy, precision, recall, and F1-score. The results show that the learning techniques achieve high performance for tweet spam classification; in most cases, the values are above 90% across the performance measures. These results indicate that using profile, content, user, and hybrid features for suspicious tweet detection helps build better prediction models.
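The four measures named in the abstract can be illustrated with a minimal sketch. It assumes binary labels where 1 means spam and 0 means not-spam; the function name `classification_metrics` and the example labels are hypothetical and are not taken from the paper, whose own evaluation code is not shown here.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 for binary labels (1 = spam).

    Hypothetical helper for illustration; not the authors' implementation.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one example is required")

    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts, with spam (1) as the positive class.
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator is zero (no predicted or no actual spam).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0

    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels for illustration only; they are not results from the paper.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = classification_metrics(y_true, y_pred)
```

Reporting precision and recall alongside accuracy matters for spam detection because the classes are often imbalanced, so a model can reach high accuracy while still missing most spam.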