Deep Hashing for Malware Family Classification and New Malware Identification

Cited by: 2
Authors
Zhang, Yunchun [1]
Liao, Zikun [1]
Zhang, Ning [1]
Min, Shaohui [1]
Wang, Qi [1]
Quek, Tony Q. S. [2]
Zhao, Mingxiong [1]
Affiliations
[1] Yunnan Univ, Engn Res Ctr Cyberspace, Natl Pilot Sch Software, Kunming 650500, Peoples R China
[2] Singapore Univ Technol & Design, Informat Syst Technol & Design, Singapore 487372, Singapore
Source
IEEE INTERNET OF THINGS JOURNAL | 2024, Vol. 11, No. 16
Funding
National Natural Science Foundation of China
Keywords
Malware; Feature extraction; Image retrieval; Image classification; Artificial neural networks; Internet of Things; Semantics; Deep hashing; deep neural networks (DNNs); image retrieval; malware classification; malware images; SEMANTICS; NETWORK;
DOI
10.1109/JIOT.2024.3353250
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Although numerous state-of-the-art deep neural networks have recently been proposed for malware classification, detecting malware effectively on large-scale sample sets and identifying zero-day or new malware variants remain significant challenges. To address these challenges, a deep hashing-based malware classification model is designed for malware identification. It consists of two parts: 1) ResNet50-based deep hashing for malware retrieval and 2) voting-based malware classification. In the first part, multiple deep hashing models are developed by extracting the high-layer outputs (feature maps) of a ResNet50 trained on malware gray-scale images. To maximize the Hamming distance (that is, the dissimilarity) between hash values computed from malware samples of different families, a ResNet50-based deep polarized network (RNDPN) is designed to return the Top K similar samples. In the second part, two voting schemes, majority voting and Hamming-distance-based voting, are proposed for malware identification based on the retrieved results. The experimental results show that RNDPN outperforms the other six deep hashing models, reaching 97.54% mean average precision (mAP) for malware retrieval when only 40 similar examples are retrieved; all deep hashing models achieve their best results with a 48-bit hash code length. Furthermore, the Hamming-distance-based voting method implemented with RNDPN outperforms the other models in malware classification, achieving 1) a malware classification accuracy of 96.5% and 2) an accuracy of 85.7% in identifying new or zero-day malware.
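The retrieve-then-vote pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the hash codes are toy 8-bit integers rather than 48-bit RNDPN outputs, the function names (`hamming`, `top_k`, `majority_vote`, `distance_vote`) are hypothetical, and the weight `1 / (1 + distance)` is an assumed stand-in for the Hamming-distance-based voting rule, whose exact form the abstract does not give.

```python
from collections import Counter, defaultdict


def hamming(a: int, b: int) -> int:
    """Count the differing bits between two hash codes stored as ints."""
    return bin(a ^ b).count("1")


def top_k(query: int, database: list, k: int) -> list:
    """Return the k (distance, family) pairs nearest to the query code."""
    scored = sorted((hamming(query, code), family) for code, family in database)
    return scored[:k]


def majority_vote(neighbors: list) -> str:
    """Pick the family that occurs most often among the retrieved samples."""
    return Counter(family for _, family in neighbors).most_common(1)[0][0]


def distance_vote(neighbors: list) -> str:
    """Pick the family with the largest distance-weighted score.

    Assumption: each neighbor contributes 1 / (1 + distance), so closer
    samples count more. The paper's actual weighting may differ.
    """
    scores = defaultdict(float)
    for distance, family in neighbors:
        scores[family] += 1.0 / (1 + distance)
    return max(scores, key=scores.get)


# Toy database of (hash code, family label) pairs; labels are made up.
database = [
    (0b11110000, "A"),
    (0b11110001, "A"),
    (0b00001111, "B"),
    (0b00001110, "B"),
    (0b00011111, "B"),
]
query = 0b11110011

neighbors = top_k(query, database, k=5)
print(majority_vote(neighbors))  # B: three of the five neighbors are family B
print(distance_vote(neighbors))  # A: the two nearest neighbors are family A
```

In this toy case the two schemes disagree: majority voting follows the more numerous but distant family B, while the distance-weighted vote follows the two nearby family A samples. This shows why weighting by Hamming distance can help, but it is not evidence about the reported accuracies.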
Pages: 26837-26851
Number of pages: 15