Cost-Sensitive Classifier for Spam Detection on News Media Twitter Accounts

Cited by: 0
Authors
Tur, Georvic [1]
Nabhan Homsi, Masun [1]
Affiliations
[1] Univ Simon Bolivar, Dept Comp Sci & Informat Technol, Apartado 89000, Caracas, Venezuela
Keywords
Spam Classification; Twitter; Topic Discovering; Cost-Sensitive Classifier; Random Forest
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Social media are increasingly used as sources in mainstream news coverage. However, because news updates so rapidly, it is easy to fall into the trap of accepting everything as true. Spam content usually refers to information that goes viral and skews users' views on a subject. To address this, the paper introduces a new approach for detecting spam tweets using a Cost-Sensitive Classifier that includes Random Forest. Tweets were first annotated manually, and four different sets of features were then extracted from them. Afterward, four machine learning algorithms were cross-validated to determine the best base classifier for spam detection. Finally, the class imbalance problem was addressed by resampling and by incorporating arbitrary misclassification costs into the learning process. Results showed that the proposed approach helped mitigate overfitting and reduced classification error, achieving an overall accuracy of 89.14% in training and 76.82% in testing.
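The abstract does not spell out how misclassification costs enter the decision. One common reading of cost-sensitive classification is to take the class probabilities from any base classifier and pick the label with the lowest expected cost. The sketch below illustrates that idea only; it is not the authors' implementation, and the label names, the cost values, and the function names `expected_cost` and `min_cost_label` are all assumptions made for illustration.

```python
# Minimal sketch of a minimum-expected-cost decision rule.
# Assumption: the base classifier (e.g. a random forest) already supplies
# class probabilities; the cost values below are hypothetical, not from the paper.

LABELS = ("ham", "spam")

# COSTS[true_label][predicted_label]; correct predictions cost nothing.
COSTS = {
    "ham": {"ham": 0.0, "spam": 1.0},   # legitimate tweet flagged as spam
    "spam": {"ham": 5.0, "spam": 0.0},  # missed spam, assumed 5x as costly
}


def expected_cost(probs, predicted):
    """Expected cost of predicting `predicted` given class probabilities."""
    return sum(probs[true] * COSTS[true][predicted] for true in LABELS)


def min_cost_label(probs):
    """Return the label whose expected misclassification cost is lowest."""
    return min(LABELS, key=lambda label: expected_cost(probs, label))


if __name__ == "__main__":
    # A tweet the base classifier considers 70% likely to be legitimate.
    probs = {"ham": 0.7, "spam": 0.3}
    for label in LABELS:
        print(label, expected_cost(probs, label))
    print("decision:", min_cost_label(probs))
```

With these assumed costs, a tweet is labeled spam once its spam probability exceeds 1/6 rather than 1/2, which is how unequal costs shift the decision toward the minority class without retraining the base classifier.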
Pages: 6
Related papers
50 in total
  • [1] An Adaptive Cost-sensitive Classifier
    Chen, Xiaolin
    Song, Enming
    Ma, Guangzhi
    2010 2ND INTERNATIONAL CONFERENCE ON COMPUTER AND AUTOMATION ENGINEERING (ICCAE 2010), VOL 1, 2010, : 699 - 701
  • [2] Detection of spam-posting accounts on Twitter
    Inuwa-Dutse, Isa
    Liptrott, Mark
    Korkontzelos, Ioannis
    NEUROCOMPUTING, 2018, 315 : 496 - 511
  • [3] Cost-Sensitive Spam Detection Using Parameters Optimization and Feature Selection
    Lee, Sang Min
    Kim, Dong Seong
    Park, Jong Sou
    JOURNAL OF UNIVERSAL COMPUTER SCIENCE, 2011, 17 (06) : 944 - 960
  • [4] Cost-Sensitive Boosting Pruning Trees for Depression Detection on Twitter
    Tong, Lei
    Liu, Zhihua
    Jiang, Zheheng
    Zhou, Feixiang
    Chen, Long
    Lyu, Jialin
    Zhang, Xiangrong
    Zhang, Qianni
    Sadka, Abdul
    Wang, Yinhai
    Li, Ling
    Zhou, Huiyu
    IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2023, 14 (03) : 1898 - 1911
  • [5] Enhanced Detection of Text and Image Spam Using Cost-Sensitive Deep Learning
    Mallampati, Deepika
    Hegde, Nagaratna P.
    TRAITEMENT DU SIGNAL, 2024, 41 (03) : 1283 - 1292
  • [6] Cost-sensitive classifier evaluation using cost curves
    Holte, Robert C.
    Drummond, Chris
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS, 2008, 5012 : 26 - +
  • [7] Twitter Spam Detection Using Naive Bayes Classifier
    Santoshi, K. Ushasree
    Bhavya, S. Sree
    Sri, Y. Bhavya
    Venkateswarlu, B.
    PROCEEDINGS OF THE 6TH INTERNATIONAL CONFERENCE ON INVENTIVE COMPUTATION TECHNOLOGIES (ICICT 2021), 2021, : 773 - 777
  • [8] Online classifier adaptation for cost-sensitive learning
    Junlin Zhang
    José García
    Neural Computing and Applications, 2016, 27 : 781 - 789
  • [9] Online classifier adaptation for cost-sensitive learning
    Zhang, Junlin
    Garcia, Jose
    NEURAL COMPUTING & APPLICATIONS, 2016, 27 (03) : 781 - 789
  • [10] Cost-sensitive three-way email spam filtering
    Bing Zhou
    Yiyu Yao
    Jigang Luo
    Journal of Intelligent Information Systems, 2014, 42 : 19 - 45