Large-Scale Malicious Software Classification With Fuzzified Features and Boosted Fuzzy Random Forest

Cited by: 7
Authors
Li, Fang-Qi [1]
Wang, Shi-Lin [1]
Liew, Alan Wee-Chung [2]
Ding, Weiping [3]
Liu, Gong-Shen [1]
Affiliations
[1] Shanghai Jiao Tong Univ, Sch Elect Informat & Elect Engn, Shanghai 200240, Peoples R China
[2] Griffith Univ, Sch Informat & Commun Technol, Gold Coast, Qld 4222, Australia
[3] Nantong Univ, Sch Informat Sci & Technol, Nantong 226019, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Malware; Feature extraction; Machine learning; Decision trees; Forestry; Support vector machines; Boosted random forest; computer security; fuzzy decision tree; malware classification; MACHINE; SYSTEM;
DOI
10.1109/TFUZZ.2020.3016023
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Classification of malicious software, especially in a very large dataset, is a challenging task for machine intelligence. Malware can have highly diversified features, each of which can follow a highly heterogeneous distribution. These factors make it harder for traditional data-analytic approaches to handle such data. Although deep-learning-based methods have reported good classification performance, deep models usually lack interpretability and are fragile under adversarial attacks. To address these problems, fuzzy systems have become a competitive candidate in malware analysis. In this article, a new fuzzy-based approach is proposed for malware classification. We focused on portable executable files on the Windows platform and analyzed the distributions of static features and content-oriented features. Fuzzification was used to reduce the pervasive impact of noise and outliers in a very large dataset. Finally, a novel boosted classifier consisting of fuzzy decision trees and a support vector machine is proposed to perform the malware classification. By using fuzzy decision trees, the inner structure of the classifier can be readily interpreted as discriminative rules, whereas the novel boosting strategy provides state-of-the-art classification performance. Extensive experimental results showed that our method significantly outperformed several state-of-the-art classifiers.
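The abstract does not say how the fuzzification is carried out. As a rough illustration of the general idea only, the sketch below maps a raw numeric feature to three membership degrees (low, medium, high) using piecewise-linear membership functions anchored at the quartiles of the observed values. The quartile anchors, the three-set partition, and the names `fit_breakpoints` and `fuzzify` are assumptions made for this example and are not taken from the paper.

```python
from statistics import quantiles


def fit_breakpoints(values):
    """Pick three anchor points (the quartiles) from observed feature values.

    Assumption for this sketch: quartiles are used because they are barely
    moved by a few extreme values. The paper may use a different scheme.
    """
    q1, q2, q3 = quantiles(values, n=4)
    return q1, q2, q3


def fuzzify(x, breakpoints):
    """Map a raw value to (low, medium, high) membership degrees in [0, 1].

    Values at or below the first anchor are fully "low", values at or above
    the last anchor are fully "high", and values in between are linearly
    interpolated so that the three degrees always sum to 1.
    """
    a, b, c = breakpoints
    if x <= a:
        return (1.0, 0.0, 0.0)
    if x >= c:
        return (0.0, 0.0, 1.0)
    if x <= b:
        t = (x - a) / (b - a)
        return (1.0 - t, t, 0.0)
    t = (x - b) / (c - b)
    return (0.0, 1.0 - t, t)


if __name__ == "__main__":
    # Hypothetical feature values with one extreme outlier.
    sizes = [1, 2, 3, 4, 5, 6, 7, 1000]
    bp = fit_breakpoints(sizes)
    for s in sizes:
        print(s, fuzzify(s, bp))
```

With this kind of mapping, an extreme value saturates at full membership in the "high" set instead of stretching the feature's scale, which is one way fuzzification can dampen the effect of outliers before a fuzzy decision tree is trained.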
Pages: 3205-3218
Number of pages: 14