CNN Based Malicious Website Detection by Invalidating Multiple Web Spams

Cited by: 18
Authors
Liu, Dongjie [1, 2]
Lee, Jong-Hyouk [3]
Affiliations
[1] Chinese Acad Sci, Comp Network Informat Ctr, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Sch Comp Sci & Technol, Beijing 100190, Peoples R China
[3] Sejong Univ, Dept Comp & Informat Secur, Seoul 13557, South Korea
Keywords
Machine learning; Internet; Browsers; Uniform resource locators; Support vector machines; Feature extraction; Crawlers; Convolutional neural network; Malicious website detection; Neural network; Deep CNN
DOI
10.1109/ACCESS.2020.2995157
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Although a variety of techniques for detecting malicious websites have been proposed, it is becoming increasingly difficult for those methods to deliver satisfactory results. Many malicious websites can still escape detection by using various Web spam techniques. In this paper, we first summarize three types of Web spam techniques used by malicious websites: redirection spam, hidden IFrame spam, and content hiding spam. We then present a new detection method that adopts the perspective of users and takes screenshots of malicious webpages to invalidate Web spam. The proposed detection method uses a Convolutional Neural Network, a class of deep neural networks, as its classification algorithm. To verify the effectiveness of the method, two experiments were conducted. First, the proposed method was tested on a constructed complex dataset, and we present comparison results between the proposed method and representative machine learning-based detection algorithms. Second, the proposed method was used to detect malicious websites in a real-world Web environment for three months. The experimental results show that the proposed method performs better than the compared algorithms and is applicable to a practical Web environment.
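The abstract does not give the network architecture, so the following is only a minimal sketch of the kind of computation a CNN classifier applies to a screenshot: one convolution, a ReLU, 2x2 max pooling, and a sigmoid output. The function names, the toy 6x6 "screenshot", the edge-detecting kernel, and the dense weights are all hypothetical values chosen for illustration; they are not taken from the paper, and a real model would learn its weights from labeled screenshots.

```python
import math
from typing import List

Matrix = List[List[float]]


def conv2d(image: Matrix, kernel: Matrix) -> Matrix:
    """Valid (no padding), stride-1 2D cross-correlation, as used in CNN layers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(fmap: Matrix) -> Matrix:
    """Element-wise max(0, x)."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool2x2(fmap: Matrix) -> Matrix:
    """Non-overlapping 2x2 max pooling; an odd trailing row or column is dropped."""
    return [
        [max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
         for j in range(0, len(fmap[0]) - 1, 2)]
        for i in range(0, len(fmap) - 1, 2)
    ]


def classify(screenshot: Matrix, kernel: Matrix,
             weights: List[float], bias: float) -> float:
    """Return a score in (0, 1); higher means 'more likely malicious' in this toy setup."""
    pooled = max_pool2x2(relu(conv2d(screenshot, kernel)))
    flat = [v for row in pooled for v in row]
    z = sum(w * x for w, x in zip(weights, flat)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical inputs: a 6x6 grayscale "screenshot" with a vertical edge,
# a 3x3 vertical-edge kernel, and hand-picked dense-layer weights.
TOY_SCREENSHOT: Matrix = [[0.0, 0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(6)]
BLANK_SCREENSHOT: Matrix = [[0.0] * 6 for _ in range(6)]
EDGE_KERNEL: Matrix = [[-1.0, 0.0, 1.0] for _ in range(3)]
DENSE_WEIGHTS = [0.5, 0.5, 0.5, 0.5]
DENSE_BIAS = -3.0

if __name__ == "__main__":
    print(classify(TOY_SCREENSHOT, EDGE_KERNEL, DENSE_WEIGHTS, DENSE_BIAS))
    print(classify(BLANK_SCREENSHOT, EDGE_KERNEL, DENSE_WEIGHTS, DENSE_BIAS))
```

Because the input is the rendered page rather than its markup, content that a page hides from the user (for example a hidden IFrame) does not appear in the pixels the classifier sees, which is the intuition behind the user-perspective approach described above.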
Pages: 97258-97266
Page count: 9
Related papers
50 in total
  • [41] Analysis and Detection of Malicious Data Exfiltration in Web Traffic
    Al-Bataineh, Areej
    White, Gregory
    PROCEEDINGS OF THE 2012 7TH INTERNATIONAL CONFERENCE ON MALICIOUS AND UNWANTED SOFTWARE, 2012, : 26 - 31
  • [42] Effective Analysis, Characterization, and Detection of Malicious Web Pages
    Eshete, Birhanu
    PROCEEDINGS OF THE 22ND INTERNATIONAL CONFERENCE ON WORLD WIDE WEB (WWW'13 COMPANION), 2013, : 355 - 359
  • [43] Multi-Modal Features Representation-Based Convolutional Neural Network Model for Malicious Website Detection
    Alsaedi, Mohammed
    Ghaleb, Fuad A.
    Saeed, Faisal
    Ahmad, Jawad
    Alasli, Mohammed
    IEEE ACCESS, 2024, 12 : 7271 - 7284
  • [44] Malicious Web traffic detection for Internet of Things environments
    Yong, Binbin
    Liu, Xin
    Yu, Qingchen
    Huang, Liang
    Zhou, Qingguo
    COMPUTERS & ELECTRICAL ENGINEERING, 2019, 77 : 260 - 272
  • [45] The CNN and DPM based approach for multiple object detection in images
    Dange, Amruta D.
    Momin, B. F.
    PROCEEDINGS OF THE 2019 INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING AND CONTROL SYSTEMS (ICCS), 2019, : 1106 - 1109
  • [46] Detection of Malicious Software on Based on Multiple Equations of API-calls Sequences
    Hachinyan, Olga
    PROCEEDINGS OF THE 2017 IEEE RUSSIA SECTION YOUNG RESEARCHERS IN ELECTRICAL AND ELECTRONIC ENGINEERING CONFERENCE (2017 ELCONRUS), 2017, : 415 - 418
  • [47] Visualization Feature and CNN Based Homology Classification of Malicious Code
    CHU Qianfeng
    LIU Gongshen
    ZHU Xinyu
    Chinese Journal of Electronics, 2020, 29 (01) : 154 - 160
  • [48] Visualization Feature and CNN Based Homology Classification of Malicious Code
    Chu, Qianfeng
    Liu, Gongshen
    Zhu, Xinyu
    CHINESE JOURNAL OF ELECTRONICS, 2020, 29 (01) : 154 - 160
  • [49] Classification of Malicious URLs by CNN Model Based on Genetic Algorithm
    Wu, Tiefeng
    Xi, Yunfang
    Wang, Miao
    Zhao, Zhichao
    APPLIED SCIENCES-BASEL, 2022, 12 (23):
  • [50] Evaluating CNN and LSTM for Web Attack Detection
    Wang, Jiabao
    Zhou, Zhenji
    Chen, Jun
    PROCEEDINGS OF 2018 10TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND COMPUTING (ICMLC 2018), 2018, : 283 - 287