Static PE Malware Detection Using Gradient Boosting Decision Trees Algorithm

Cited by: 16
Authors
Huu-Danh Pham [1]
Tuan Dinh Le [2]
Thanh Nguyen Vu [1]
Affiliations
[1] Vietnam Natl Univ Ho Chi Minh City, Univ Informat Technol, Ho Chi Minh City, Vietnam
[2] Long An Univ Econ & Ind, Tan An, Long An Province, Vietnam
Keywords
Malware detection; Machine learning; PE file format; Gradient boosting decision trees; EMBER dataset
DOI
10.1007/978-3-030-03192-3_17
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Static malware detection is an essential layer in a security suite; it attempts to classify samples as malicious or benign before execution. However, most related work has practical shortcomings: for example, methods based on neural networks usually require long training times [13], while other methods use imbalanced datasets [17, 20], which makes their validation metrics misleading in practice. In this study, we apply a static malware detection method based on Portable Executable (PE) analysis and the Gradient Boosting Decision Tree algorithm. We reduce the training time by appropriately reducing the feature dimension. The experimental results show that our proposed method achieves a detection rate of up to 99.394% at a 1% false alarm rate, and a detection rate of 97.572% at a false alarm rate below 0.1%, based on more than 600,000 training and 200,000 testing samples from the Endgame Malware BEnchmark for Research (EMBER) dataset [1].
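The abstract reports detection rates at fixed false alarm rates. As a minimal sketch of how such a metric can be computed from classifier scores, the hypothetical helper below (`detection_rate_at_fpr` is an illustrative name, not taken from the paper) scans candidate thresholds from high to low and keeps the lowest one whose false alarm rate stays within the budget. It assumes labels are 0 for benign and 1 for malicious, and that a sample is flagged when its score is at or above the threshold; the toy scores are made up for illustration.

```python
def detection_rate_at_fpr(labels, scores, max_fpr):
    """Return (detection_rate, threshold) for the lowest score threshold
    whose false alarm rate does not exceed max_fpr.

    Assumptions (illustrative only): label 0 = benign, label 1 = malicious,
    and a sample is flagged as malicious when score >= threshold.
    """
    benign = [s for y, s in zip(labels, scores) if y == 0]
    malicious = [s for y, s in zip(labels, scores) if y == 1]
    if not benign or not malicious:
        raise ValueError("need at least one benign and one malicious sample")

    best_rate, best_threshold = 0.0, float("inf")
    # Lowering the threshold can only raise the false alarm rate,
    # so the scan can stop at the first threshold that exceeds the budget.
    for threshold in sorted(set(scores), reverse=True):
        fpr = sum(s >= threshold for s in benign) / len(benign)
        if fpr > max_fpr:
            break
        best_rate = sum(s >= threshold for s in malicious) / len(malicious)
        best_threshold = threshold
    return best_rate, best_threshold


# Toy data: four benign and four malicious samples with made-up scores.
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
toy_scores = [0.1, 0.2, 0.3, 0.8, 0.4, 0.7, 0.9, 0.95]

strict = detection_rate_at_fpr(toy_labels, toy_scores, max_fpr=0.0)
relaxed = detection_rate_at_fpr(toy_labels, toy_scores, max_fpr=0.25)
print(strict, relaxed)
```

On the toy data, allowing no false alarms catches only half of the malicious samples, while tolerating one false alarm in four catches all of them. This is the same trade-off the abstract describes between the 1% and 0.1% false alarm operating points.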
Pages: 228-236
Number of pages: 9