CNN Based Malicious Website Detection by Invalidating Multiple Web Spams

Cited by: 18
Authors
Liu, Dongjie [1 ,2 ]
Lee, Jong-Hyouk [3 ]
Affiliations
[1] Chinese Acad Sci, Comp Network Informat Ctr, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Sch Comp Sci & Technol, Beijing 100190, Peoples R China
[3] Sejong Univ, Dept Comp & Informat Secur, Seoul 13557, South Korea
Keywords
Machine learning; Internet; Browsers; Uniform resource locators; Support vector machines; Feature extraction; Crawlers; Convolutional neural network; machine learning; malicious website detection; NEURAL-NETWORK; DEEP CNN;
DOI
10.1109/ACCESS.2020.2995157
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Although a variety of techniques to detect malicious websites have been proposed, it has become increasingly difficult for those methods to provide satisfactory results. Many malicious websites can still escape detection by using various Web spam techniques. In this paper, we first summarize three types of Web spam techniques used by malicious websites: redirection spam, hidden IFrame spam, and content hiding spam. We then present a new detection method that adopts the perspective of users and takes screenshots of malicious webpages to invalidate Web spam. The proposed detection method uses a Convolutional Neural Network, a class of deep neural networks, as the classification algorithm. To verify the effectiveness of the method, two different experiments were conducted. First, the proposed method was tested on a constructed complex dataset, and we present comparison results between the proposed method and representative machine learning-based detection algorithms. Second, the proposed method was used to detect malicious websites in a real-world Web environment for three months. These experimental results illustrate that the proposed method performs better and is applicable to a practical Web environment.
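The abstract names the building blocks (a webpage screenshot as input, a Convolutional Neural Network as classifier) but gives no architecture. The following is a minimal sketch of what a single convolution, ReLU, pooling, and sigmoid pass over an image looks like, written in pure Python. Everything concrete here is an assumption for illustration: the 6x6 "screenshot", the edge-detecting kernel, the weights, the bias, and the function names are hypothetical and are not taken from the paper.

```python
# Illustrative only: a toy CNN-style forward pass in pure Python.
# The input size, kernel, weights, and bias are made-up assumptions,
# not the model described in the paper.
import math


def conv2d(image, kernel):
    """Valid 2D cross-correlation of a 2D list with a square kernel."""
    k = len(kernel)
    rows = len(image) - k + 1
    cols = len(image[0]) - k + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(k) for j in range(k))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def relu(fmap):
    """Element-wise max(0, x) over a 2D feature map."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2):
    """Non-overlapping max pooling; incomplete trailing windows are dropped."""
    rows = len(fmap) // size
    cols = len(fmap[0]) // size
    return [
        [
            max(fmap[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def classify(screenshot, kernel, weights, bias):
    """Return a score in (0, 1); higher means 'malicious' in this toy setup."""
    features = max_pool(relu(conv2d(screenshot, kernel)))
    flat = [v for row in features for v in row]
    logit = sum(w * v for w, v in zip(weights, flat)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical 6x6 grayscale "screenshot" containing one vertical edge.
screenshot = [[0.0, 0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(6)]
kernel = [[-1.0, 0.0, 1.0]] * 3      # simple vertical-edge detector
weights = [0.5, 0.5, 0.5, 0.5]       # one weight per pooled feature
bias = -1.0

score = classify(screenshot, kernel, weights, bias)
print(f"toy maliciousness score: {score:.4f}")
```

In a real system the kernel and weights would be learned from labeled screenshots, and the network would have many filters and layers; the sketch only shows how pixel data, rather than HTML or URL features, flows through the classifier.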
Pages: 97258 - 97266
Page count: 9
Related papers
50 in total
  • [1] An Improved Ensemble Deep Learning Model Based on CNN for Malicious Website Detection
    Do, Nguyet Quang
    Selamat, Ali
    Lim, Kok Cheng
    Krejcar, Ondrej
    ADVANCES AND TRENDS IN ARTIFICIAL INTELLIGENCE: THEORY AND PRACTICES IN ARTIFICIAL INTELLIGENCE, 2022, 13343 : 497 - 504
  • [2] Malicious Website Detection Based on Honeypot Systems
    Koo, Tung-Ming
    Chang, Hung-Chang
    Hsu, Ya-Ting
    Lin, Huey-Yeh
    PROCEEDINGS OF THE 2ND INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTER SCIENCE AND ENGINEERING (CSE 2013), 2013, 42 : 76 - 82
  • [3] A Malicious URL Detection Method Based on CNN
    Chen, Yu
    Zhou, Yajian
    Dong, Qingqing
    Li, Qi
    2020 IEEE CONFERENCE ON TELECOMMUNICATIONS, OPTICS AND COMPUTER SCIENCE (TOCS), 2020, : 23 - 28
  • [4] Adaptive segmented webpage text based malicious website detection
    Sun, Guoying
    Zhang, Zhaoxin
    Cheng, Yanan
    Chai, Tingting
    COMPUTER NETWORKS, 2022, 216
  • [5] CNN-Webshell: Malicious Web Shell Detection with Convolutional Neural Network
    Tian, Yifan
    Wang, Jiabao
    Zhou, Zhenji
    Zhou, Shengli
    PROCEEDINGS OF 2017 VI INTERNATIONAL CONFERENCE ON NETWORK, COMMUNICATION AND COMPUTING (ICNCC 2017), 2017, : 75 - 79
  • [6] Malicious Domain Name Detection Model Based on CNN and LSTM
    Zhang Bin
    Liao Renjie
    JOURNAL OF ELECTRONICS & INFORMATION TECHNOLOGY, 2021, 43 (10) : 2944 - 2951
  • [7] Machine Learning & Concept Drift based Approach for Malicious Website Detection
    Singhal, Siddharth
    Chawla, Utkarsh
    Shorey, Rajeev
    2020 INTERNATIONAL CONFERENCE ON COMMUNICATION SYSTEMS & NETWORKS (COMSNETS), 2020,
  • [8] CNN-based malicious user detection in social networks
    Hong, Taekeun
    Choi, Chang
    Shin, Juhyun
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2018, 30 (02):
  • [9] MALICIOUS WEBSITE DETECTION UNDER THE EXPLORATORY ATTACK
    Wang, Manlin
    Zhang, Fei
    Chan, Patrick P. K.
    PROCEEDINGS OF 2013 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS (ICMLC), VOLS 1-4, 2013, : 565 - 570
  • [10] Learning URL Embedding for Malicious Website Detection
    Yan, Xiaodan
    Xu, Yang
    Cui, Baojiang
    Zhang, Shuhan
    Guo, Taibiao
    Li, Chaoliang
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2020, 16 (10) : 6673 - 6681