Deep Hashing for Malware Family Classification and New Malware Identification

Cited by: 2

Authors:

Zhang, Yunchun [1]

Liao, Zikun [1]

Zhang, Ning [1]

Min, Shaohui [1]

Wang, Qi [1]

Quek, Tony Q. S. [2]

Zhao, Mingxiong [1]

Affiliations:

[1] Yunnan Univ, Engn Res Ctr Cyberspace, Natl Pilot Sch Software, Kunming 650500, Peoples R China

[2] Singapore Univ Technol & Design, Informat Syst Technol & Design, Singapore 487372, Singapore

Source:

IEEE INTERNET OF THINGS JOURNAL | 2024 / Vol. 11 / Issue 16

Funding:

National Natural Science Foundation of China;

Keywords:

Malware; Feature extraction; Image retrieval; Image classification; Artificial neural networks; Internet of Things; Semantics; Deep hashing; deep neural networks (DNNs); malware classification; malware images; NETWORK;

DOI:

10.1109/JIOT.2024.3353250

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Although numerous state-of-the-art deep neural networks have recently been proposed for malware classification, detecting malware effectively on large-scale sample sets and identifying zero-day or new malware variants remain significant challenges. To address this, a deep hashing-based malware classification model is designed for malware identification. It consists of two parts: 1) ResNet50-based deep hashing for malware retrieval and 2) voting-based malware classification. In the first part, multiple deep hashing models are developed by extracting the high-layer outputs (feature maps) of a ResNet50 trained on malware gray-scale images. To maximize the Hamming distance, or dissimilarity, among hash values computed from malware samples of different families, a ResNet50-based deep polarized network (RNDPN) is designed to return the Top K similar samples. In the second part, we propose majority voting and Hamming-distance-based voting for malware identification according to the retrieved results. The experimental results show that RNDPN outperforms the other six deep hashing models, reaching 97.54% mean average precision (mAP) for malware retrieval when only 40 similar examples are retrieved; all deep hashing models achieve their best results with a 48-bit hash code length. Furthermore, Hamming-distance-based voting implemented with RNDPN outperforms the other models in malware classification, achieving 1) a malware classification accuracy of 96.5% and 2) an accuracy of 85.7% in identifying new or zero-day malware.
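
The retrieve-then-vote idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the hash codes have already been produced by some trained network (not shown) and are stored as Python integers, and all function names are hypothetical. The abstract does not spell out the Hamming-distance-based voting rule, so the weight `1 - distance / n_bits` used here is an assumption chosen only to show how closer neighbors can count for more than distant ones.

```python
from collections import Counter, defaultdict


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hash codes differ."""
    return bin(a ^ b).count("1")


def retrieve_top_k(query: int, database, k: int):
    """Return the k nearest (distance, family) pairs by Hamming distance.

    `database` is a list of (hash_code, family_label) tuples.
    """
    ranked = sorted(database, key=lambda item: hamming(query, item[0]))
    return [(hamming(query, code), family) for code, family in ranked[:k]]


def majority_vote(neighbors) -> str:
    """Pick the family that occurs most often among the retrieved samples."""
    counts = Counter(family for _, family in neighbors)
    return counts.most_common(1)[0][0]


def distance_weighted_vote(neighbors, n_bits: int) -> str:
    """Pick the family with the largest summed weight.

    Assumed weighting (not taken from the paper): each neighbor
    contributes 1 - distance / n_bits, so closer hash codes count more.
    """
    scores = defaultdict(float)
    for dist, family in neighbors:
        scores[family] += 1.0 - dist / n_bits
    return max(scores, key=scores.get)


# Toy 8-bit example: one exact match from family "A" and two distant
# samples from family "B".
toy_database = [
    (0b00001101, "A"),
    (0b11110001, "B"),
    (0b11100000, "B"),
]
toy_neighbors = retrieve_top_k(0b00001101, toy_database, k=3)
print(toy_neighbors)
print(majority_vote(toy_neighbors))
print(distance_weighted_vote(toy_neighbors, n_bits=8))
```

In this toy case the two voting rules disagree: plain majority voting follows the two distant "B" samples, while the distance-weighted variant follows the single exact match from "A".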


Pages: 26837-26851

Number of pages: 15

