Research and Implementation of Kernel Malicious Code Detection Based on Machine Learning

Authors
Tian D.-H. [1, 2]
Wei H. [1 ]
Zhang B. [1 ]
Yu Y.-L. [1 ]
Li J.-S. [1 ]
Ma R. [1 ]
Affiliations
[1] Beijing Key Laboratory of Software Security Engineering Technology, School of Computer Science and Technology, Beijing Institute of Technology, Beijing
[2] Shanxi Military and Civilian Integration Software Engineering Technology Research Center, Taiyuan
Keywords
API; Decision tree; Malicious code classification; Opcode; Random forest
DOI
10.15918/j.tbit1001-0645.2019.261
Abstract
As the world grows more dependent on computers, computer security becomes increasingly important, and malicious code is its greatest threat. This paper proposes a method based on machine learning and new classification features to identify malicious programs and assign them a preliminary family classification. It also points out shortcomings of earlier machine-learning approaches to malicious code detection and classification, and selects features that distinguish samples better. First, an n-gram algorithm is used to optimize the opcode features extracted from the disassembled malicious code. Then a bag-of-words model and the TF-IDF algorithm are used to optimize the API call features. Finally, a model is implemented, then trained and tested on the data set. In the experiments, classification accuracy reaches 87.41% with the decision tree algorithm and 90.06% with the random forest algorithm. The results show that the proposed features perform better than others presented for malicious code detection and classification. © 2020, Editorial Department of Transaction of Beijing Institute of Technology. All rights reserved.
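The two feature-extraction steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `opcode_ngrams` and `tfidf`, the toy opcode sequence, the toy API-call lists, and the particular TF-IDF variant (term frequency divided by document length, `idf = log(N / df)`) are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import math
from collections import Counter


def opcode_ngrams(opcodes, n=2):
    """Count contiguous n-grams in one program's opcode sequence."""
    return Counter(
        tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)
    )


def tfidf(docs):
    """Weight each program's bag of API-call names by TF-IDF.

    docs: list of lists of API names, one list per program.
    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    n_docs = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # document frequency: one count per program
    weighted = []
    for doc in docs:
        counts = Counter(doc)  # bag-of-words counts for this program
        total = len(doc)
        weighted.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weighted


# Toy inputs (hypothetical, not taken from the paper's data set).
sample_opcodes = ["push", "mov", "call", "mov", "call", "ret"]
sample_api_calls = [
    ["CreateFileW", "WriteFile", "CloseHandle"],
    ["CreateFileW", "RegSetValueExW", "RegSetValueExW"],
]

bigrams = opcode_ngrams(sample_opcodes, n=2)
api_weights = tfidf(sample_api_calls)
```

In this toy run, an API call that appears in every program receives weight zero, while a call concentrated in one program is weighted up, which is the property that makes TF-IDF useful for separating families. The resulting vectors could then be passed to a decision tree or random forest classifier, for example scikit-learn's `DecisionTreeClassifier` or `RandomForestClassifier`; the abstract does not say which library the authors used.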
Pages: 1295-1301
Number of pages: 6