Rumour detection on benchmark twitter datasets using graph neural networks with data augmentation

Authors
Patel, Shaswat [1]
Bansal, Prince [1]
Kaur, Preeti [1]
Affiliations
[1] Netaji Subhas Univ Technol, Dept Comp Engn, Azad Hind Fauj Marg, New Delhi 110078, India
Keywords
Rumour detection; Oversampling; Data augmentation; Graph neural network; BERT;
DOI
10.1007/s13278-024-01328-4
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Social media has become a major source of both essential facts and alarming falsehoods, including rumours. Rumour spreading has increased significantly in the absence of an automatic rumour detection mechanism, causing widespread and severe social repercussions. To address this challenge, we present a method for building an automatic rumour detection system that focuses on the fundamental problem of class imbalance in rumour detection. Our method selectively applies oversampling to obtain a uniformly distributed dataset, using contextualised data augmentation techniques to generate synthetic samples for underrepresented classes. Furthermore, we reconstruct the non-linear conversations inside a thread using two novel graph neural networks (GNNs), which improves the system's capacity to understand complex links between posts. Our method also employs a distinctive feature selection mechanism to further enhance tweet representations based on the state-of-the-art BERTweet model. A thorough analysis of our methodology on three publicly accessible datasets yielded the following results: (1) Our GNN models outperformed most state-of-the-art classifiers in F1-score by more than 20%, emphasizing the importance of our approach to developing sophisticated rumour detection systems. (2) Our oversampling method improves the F1-score by 9%, highlighting the practical benefit of resolving class imbalance. (3) Our technique delivers further performance gains by using non-random selection criteria for data augmentation, in which relevant tweets are selected for augmentation. (4) Notably, our approach captures rumours in their early stages more effectively than previous classifiers, establishing a baseline for future work.
The innovative aspects of our proposed method lie in its ability to address class imbalance effectively, outperform existing classifiers, and drastically reduce the propagation of rumours and false information on social media platforms. Our study paves the way for advances in rumour detection by offering a comprehensive solution, ultimately helping to ensure the veracity of information circulating online. We are confident that our findings will influence the broader field of rumour detection systems and provide fresh directions for further study.
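The oversampling idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `augment` and `oversample_to_uniform` are hypothetical, the word-swap in `augment` is only a placeholder for a contextualised augmentation model, and the source tweet is picked at random here, whereas the abstract reports using non-random selection criteria.

```python
import random
from collections import Counter


def augment(text, rng):
    """Hypothetical placeholder for contextual augmentation.

    A real system would generate a paraphrase with a language model;
    this stand-in simply swaps two adjacent words.
    """
    words = text.split()
    if len(words) < 2:
        return text
    i = rng.randrange(len(words) - 1)
    words[i], words[i + 1] = words[i + 1], words[i]
    return " ".join(words)


def oversample_to_uniform(samples, augment_fn, seed=0):
    """Add synthetic samples until every class matches the largest class.

    `samples` is a list of (text, label) pairs. Original samples are kept
    unchanged at the front of the returned list.
    """
    rng = random.Random(seed)
    by_label = {}
    for text, label in samples:
        by_label.setdefault(label, []).append(text)

    target = max(len(texts) for texts in by_label.values())
    balanced = list(samples)
    for label, texts in by_label.items():
        for _ in range(target - len(texts)):
            # Assumption: random choice of the source tweet; the paper
            # describes a non-random selection of relevant tweets instead.
            source = rng.choice(texts)
            balanced.append((augment_fn(source, rng), label))
    return balanced


if __name__ == "__main__":
    data = [
        ("this claim is unverified", "rumour"),
        ("official statement released today", "non-rumour"),
        ("the agency confirmed the report", "non-rumour"),
        ("police published the full details", "non-rumour"),
    ]
    balanced = oversample_to_uniform(data, augment)
    print(Counter(label for _, label in balanced))
```

Only underrepresented classes receive synthetic samples, so the majority class is left untouched and the resulting label distribution is uniform.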
Pages: 16