Deep Hashing for Malware Family Classification and New Malware Identification

Cited by: 2
Authors
Zhang, Yunchun [1]
Liao, Zikun [1]
Zhang, Ning [1]
Min, Shaohui [1]
Wang, Qi [1]
Quek, Tony Q. S. [2]
Zhao, Mingxiong [1]
Affiliations
[1] Yunnan Univ, Engn Res Ctr Cyberspace, Natl Pilot Sch Software, Kunming 650500, Peoples R China
[2] Singapore Univ Technol & Design, Informat Syst Technol & Design, Singapore 487372, Singapore
Source
IEEE INTERNET OF THINGS JOURNAL | 2024 / Vol. 11 / Issue 16
Funding
National Natural Science Foundation of China;
Keywords
Malware; Feature extraction; Image retrieval; Image classification; Artificial neural networks; Internet of Things; Semantics; Deep hashing; deep neural networks (DNNs); image retrieval; malware classification; malware images; SEMANTICS; NETWORK;
DOI
10.1109/JIOT.2024.3353250
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Subject classification code
0812 ;
Abstract
Although many state-of-the-art deep neural networks have recently been proposed for malware classification, effectively detecting malware in large-scale sample sets and identifying zero-day or new malware variants remain significant challenges. To address this, a deep hashing-based malware classification model is designed for malware identification. It has two parts: 1) ResNet50-based deep hashing for malware retrieval and 2) voting-based malware classification. In the first part, multiple deep hashing models are built by extracting the high-layer outputs (feature maps) of a ResNet50 trained on malware gray-scale images. To maximize the Hamming distance, or dissimilarity, between hash values computed from malware samples of different families, a ResNet50-based deep polarized network (RNDPN) is designed to return the Top K similar samples. In the second part, we propose majority voting and Hamming-distance-based voting for malware identification based on the retrieved results. The experimental results show that RNDPN outperforms the other six deep hashing models, reaching 97.54% mean average precision (mAP) for malware retrieval when only 40 similar examples are retrieved; all deep hashing models achieve their best results with a 48-bit hash code length. Furthermore, the Hamming-distance-based voting method implemented with RNDPN outperforms the other models in malware classification on two counts: 1) a malware classification accuracy of 96.5% and 2) an accuracy of 85.7% in identifying new or zero-day malware.
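The retrieve-then-vote pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the hash codes are assumed to be already computed (here as plain integers rather than outputs of a trained network), the function names are hypothetical, and the distance-weighted rule is only one plausible reading of "Hamming-distance-based voting", since the abstract does not specify the weighting.

```python
from collections import Counter, defaultdict


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hash codes stored as integers."""
    return bin(a ^ b).count("1")


def retrieve_top_k(query: int, database: list, k: int) -> list:
    """Return the k nearest database entries as (distance, family) pairs.

    `database` is a list of (hash_code, family_label) tuples (assumed layout).
    """
    ranked = sorted(database, key=lambda item: hamming(query, item[0]))
    return [(hamming(query, code), family) for code, family in ranked[:k]]


def majority_vote(neighbors: list) -> str:
    """Pick the family that appears most often among the retrieved samples."""
    counts = Counter(family for _, family in neighbors)
    return counts.most_common(1)[0][0]


def distance_weighted_vote(neighbors: list, code_bits: int) -> str:
    """Weight each retrieved sample by its similarity, 1 - distance / code_bits.

    Assumed weighting for illustration; the paper's exact rule may differ.
    """
    scores = defaultdict(float)
    for dist, family in neighbors:
        scores[family] += 1.0 - dist / code_bits
    return max(scores, key=scores.get)


# Toy 8-bit codes; the abstract reports the best results with 48-bit codes.
database = [
    (0b11110000, "A"),
    (0b11110001, "A"),
    (0b00001111, "B"),
    (0b00001110, "B"),
    (0b00011111, "B"),
]
query = 0b11110011
neighbors = retrieve_top_k(query, database, k=5)
print(majority_vote(neighbors))
print(distance_weighted_vote(neighbors, code_bits=8))
```

In this toy data the two rules disagree: three distant samples of family "B" outnumber two close samples of family "A", so plain majority voting picks "B" while the distance-weighted rule picks "A". This shows why taking Hamming distance into account can change the outcome when the retrieved set is dominated by weak matches.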
Pages: 26837-26851
Page count: 15