Deep Hashing for Malware Family Classification and New Malware Identification

Cited by: 2

Authors:

Zhang, Yunchun [1]

Liao, Zikun [1]

Zhang, Ning [1]

Min, Shaohui [1]

Wang, Qi [1]

Quek, Tony Q. S. [2]

Zhao, Mingxiong [1]

Affiliations:

[1] Yunnan Univ, Engn Res Ctr Cyberspace, Natl Pilot Sch Software, Kunming 650500, Peoples R China

[2] Singapore Univ Technol & Design, Informat Syst Technol & Design, Singapore 487372, Singapore

Source:

IEEE INTERNET OF THINGS JOURNAL | 2024 / Vol. 11 / Issue 16

Funding:

National Natural Science Foundation of China;

Keywords:

Malware; Feature extraction; Image retrieval; Image classification; Artificial neural networks; Internet of Things; Semantics; Deep hashing; deep neural networks (DNNs); image retrieval; malware classification; malware images; SEMANTICS; NETWORK;

DOI:

10.1109/JIOT.2024.3353250

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Although numerous state-of-the-art deep neural networks have recently been proposed for malware classification, effectively detecting malware in large-scale sample sets and identifying zero-day or new malware variants remain significant challenges. To address these challenges, a deep hashing-based malware classification model is designed for malware identification. It consists of two parts: 1) ResNet50-based deep hashing for malware retrieval and 2) voting-based malware classification. In the first part, multiple deep hashing models are developed by extracting the high-layer outputs (feature maps) from a ResNet50 trained on malware gray-scale images. To maximize the Hamming distance (dissimilarity) between hash values computed for malware samples from different families, a ResNet50-based deep polarized network (RNDPN) is designed to return the top-K most similar samples. In the second part, we propose majority voting and Hamming-distance-based voting for malware identification according to the retrieved results. The experimental results show that RNDPN outperforms the other six deep hashing models, reaching 97.54% mean average precision (mAP) for malware retrieval when only 40 similar examples are retrieved; all deep hashing models achieve their best results with a 48-bit hash code length. Furthermore, Hamming-distance-based voting implemented with RNDPN outperforms the other models in malware classification in two key respects: 1) a malware classification accuracy of 96.5% and 2) an accuracy of 85.7% in identifying new or zero-day malware.
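
The retrieval-then-vote pipeline described in the abstract can be illustrated with a minimal sketch. The function names, the toy 8-bit codes, and in particular the weighting rule (1 - distance / code length) are assumptions made for illustration; the abstract does not specify how the Hamming-distance-based vote is computed, so this is not the authors' implementation.

```python
from collections import Counter, defaultdict

def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hash codes stored as ints."""
    return bin(a ^ b).count("1")

def top_k(query: int, database, k: int):
    """Return (distance, family) pairs for the k codes nearest to the query."""
    scored = [(hamming(query, code), family) for code, family in database]
    return sorted(scored, key=lambda pair: pair[0])[:k]

def majority_vote(neighbours) -> str:
    """Predict the family that occurs most often among the retrieved samples."""
    return Counter(family for _, family in neighbours).most_common(1)[0][0]

def distance_weighted_vote(neighbours, code_bits: int) -> str:
    """Predict the family with the largest summed similarity.

    Assumed weighting: each neighbour contributes 1 - distance / code_bits.
    """
    weights = defaultdict(float)
    for dist, family in neighbours:
        weights[family] += 1.0 - dist / code_bits
    return max(weights, key=weights.get)

# Toy database of (hash code, family) pairs; 8-bit codes keep the example
# readable, whereas the paper reports its best results with 48-bit codes.
database = [
    (0b11110000, "A"),
    (0b11110001, "A"),
    (0b00001111, "B"),
    (0b00001110, "B"),
    (0b00001101, "B"),
]
query = 0b11110011
neighbours = top_k(query, database, k=5)

print(neighbours)
print(majority_vote(neighbours))
print(distance_weighted_vote(neighbours, code_bits=8))
```

In this toy case the two rules disagree: family "B" has more retrieved samples, but the two "A" samples are far closer to the query in Hamming space, so the distance-weighted rule picks "A". This is one plausible reason a distance-aware vote could help when K is large relative to the size of the true family.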

Pages: 26837-26851

Page count: 15
