Phishing Webpage Detection via Multi-Modal Integration of HTML DOM Graphs and URL Features Based on Graph Convolutional and Transformer Networks

Cited by: 0
Authors
Yoon, Jun-Ho [1]
Buu, Seok-Jun [1]
Kim, Hae-Jung [2]
Affiliations
[1] Gyeongsang Natl Univ, Dept Comp Engn, Jinju Si 52828, South Korea
[2] Kyungil Univ, Dept Comp Engn, Gyongsan 38428, South Korea
Funding
National Research Foundation of Singapore;
Keywords
phishing webpage detection; graph convolutional network; transformer network; multi-modal integration; cyberspace security; MODEL;
DOI
10.3390/electronics13163344
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Detecting phishing webpages is a critical task in the field of cybersecurity, with significant implications for online safety and data protection. Traditional methods have primarily relied on analyzing URL features, which can be limited in capturing the full context of phishing attacks. In this study, we propose an innovative approach that integrates HTML DOM graph modeling with URL feature analysis using advanced deep learning techniques. The proposed method leverages Graph Convolutional Networks (GCNs) to model the structure of HTML DOM graphs, combined with Convolutional Neural Networks (CNNs) and Transformer Networks to capture the character and word sequence features of URLs, respectively. These multi-modal features are then integrated using a Transformer network, which is adept at selectively capturing the interdependencies and complementary relationships between different feature sets. We evaluated our approach on a real-world dataset comprising URL and HTML DOM graph data collected from 2012 to 2024. This dataset includes over 80 million nodes and edges, providing a robust foundation for testing. Our method demonstrated a significant improvement in performance, achieving a 7.03 percentage point increase in classification accuracy compared to state-of-the-art techniques. Additionally, we conducted ablation tests to further validate the effectiveness of individual features in our model. The results validate the efficacy of integrating HTML DOM structure and URL features using deep learning. Our framework significantly enhances phishing detection capabilities, providing a more accurate and comprehensive solution to identifying malicious webpages.
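Note: the abstract describes modeling the HTML DOM as a graph and passing it through graph convolutional layers. The following is a minimal sketch of that idea only, not the authors' implementation. It assumes the standard GCN propagation rule (self-loops plus symmetric degree normalization, then a linear map and ReLU), one-hot tag names as node features, and arbitrary fixed weights. All function and class names here (`DomGraphBuilder`, `build_dom_graph`, `normalized_adjacency`, `gcn_layer`, `one_hot_features`) are hypothetical and chosen for illustration.

```python
from html.parser import HTMLParser
import math

# Elements that never have a closing tag, so they must not stay on the stack.
VOID_TAGS = {"br", "img", "input", "meta", "link", "hr"}


class DomGraphBuilder(HTMLParser):
    """Collects DOM elements as nodes and parent-child links as edges."""

    def __init__(self):
        super().__init__()
        self.tags = []    # node index -> tag name
        self.edges = []   # (parent_index, child_index)
        self._stack = []  # indices of currently open elements

    def handle_starttag(self, tag, attrs):
        idx = len(self.tags)
        self.tags.append(tag)
        if self._stack:
            self.edges.append((self._stack[-1], idx))
        if tag not in VOID_TAGS:
            self._stack.append(idx)

    def handle_endtag(self, tag):
        if self._stack and self.tags[self._stack[-1]] == tag:
            self._stack.pop()


def build_dom_graph(html):
    builder = DomGraphBuilder()
    builder.feed(html)
    builder.close()
    return builder.tags, builder.edges


def normalized_adjacency(n, edges):
    """Return D^-1/2 (A + I) D^-1/2 for an undirected graph with n nodes."""
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for u, v in edges:
        a[u][v] = a[v][u] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(x, y):
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)]
            for row in x]


def gcn_layer(a_hat, h, w):
    """One graph convolution: ReLU(a_hat @ h @ w)."""
    z = matmul(matmul(a_hat, h), w)
    return [[max(0.0, v) for v in row] for row in z]


def one_hot_features(tags):
    """Assumed node features: one-hot encoding of each node's tag name."""
    vocab = sorted(set(tags))
    return [[1.0 if t == v else 0.0 for v in vocab] for t in tags]


page = "<html><body><form><input></form><a href='#'>x</a></body></html>"
tags, edges = build_dom_graph(page)
a_hat = normalized_adjacency(len(tags), edges)
h = one_hot_features(tags)
# Arbitrary fixed weights projecting each node to 2 dimensions (illustrative only).
w = [[0.1 * (i + j + 1) for j in range(2)] for i in range(len(h[0]))]
embeddings = gcn_layer(a_hat, h, w)
```

In a trained model the weight matrix would be learned and the per-node embeddings pooled into a single page-level vector before being combined with URL features; both steps are omitted here.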
Pages: 21
Related papers
21 in total
  • [1] A stacking model using URL and HTML features for phishing webpage detection
    Li, Yukun
    Yang, Zhenguo
    Chen, Xu
    Yuan, Huaping
    Liu, Wenyin
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2019, 94 : 27 - 39
  • [2] Combining Long-Term Recurrent Convolutional and Graph Convolutional Networks to Detect Phishing Sites Using URL and HTML
    Ariyadasa, Subhash
    Fernando, Shantha
    Fernando, Subha
    IEEE ACCESS, 2022, 10 : 82355 - 82375
  • [3] Comprehensive phishing detection: A multi-channel approach with variants TCN fusion leveraging URL and HTML features
    Aljofey, Ali
    Bello, Saifullahi Aminu
    Lu, Jian
    Xu, Chen
    JOURNAL OF NETWORK AND COMPUTER APPLICATIONS, 2025, 238
  • [4] MULTIPHISH: MULTI-MODAL FEATURES FUSION NETWORKS FOR PHISHING DETECTION
    Zhang, Lei
    Zhang, Peng
    Liu, Luchen
    Tan, Jianlong
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 3520 - 3524
  • [5] Graph Convolutional Networks Based Multi-modal Data Integration for Breast Cancer Survival Prediction
    Hu, Hongbin
    Liang, Wenbin
    Zou, Xitao
    Zou, Xianchun
    ADVANCED INTELLIGENT COMPUTING IN BIOINFORMATICS, PT I, ICIC 2024, 2024, 14881 : 85 - 98
  • [6] Multi-Modal Sarcasm Detection via Cross-Modal Graph Convolutional Network
    Liang, Bin
    Lou, Chenwei
    Li, Xiang
    Yang, Min
    Gui, Lin
    He, Yulan
    Pei, Wenjie
    Xu, Ruifeng
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 1767 - 1777
  • [7] Knowledge-aware Multi-modal Adaptive Graph Convolutional Networks for Fake News Detection
    Qian, Shengsheng
    Hu, Jun
    Fang, Quan
    Xu, Changsheng
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2021, 17 (03)
  • [8] Heterogeneous Features Integration via Semi-supervised Multi-modal Deep Networks
    Zhao, Lei
    Hu, Qinghua
    Zhou, Yucan
    NEURAL INFORMATION PROCESSING, ICONIP 2015, PT IV, 2015, 9492 : 11 - 19
  • [9] Video–text retrieval via multi-modal masked transformer and adaptive attribute-aware graph convolutional network
    Lv, Gang
    Sun, Yining
    Nian, Fudong
    Multimedia Systems, 2024, 30
  • [10] Sound event detection in traffic scenes based on graph convolutional network to obtain multi-modal information
    Jiang, Yanji
    Guo, Dingxu
    Wang, Lan
    Zhang, Haitao
    Dong, Hao
    Qiu, Youli
    Zou, Huiwen
    COMPLEX & INTELLIGENT SYSTEMS, 2024, 10 (04) : 5653 - 5668