Deep neural-based vulnerability discovery demystified: data, model and performance

Cited by: 14
Authors
Lin, Guanjun [1]
Xiao, Wei [2]
Zhang, Leo Yu [3]
Gao, Shang [3]
Tai, Yonghang [4]
Zhang, Jun [5]
Affiliations
[1] Sanming Univ, Sch Informat Engn, Sanming, Fujian, Peoples R China
[2] Changchun Univ Technol, Sch Comp Sci & Engn, Changchun, Jilin, Peoples R China
[3] Deakin Univ, Sch Informat Technol, Geelong, Vic 3216, Australia
[4] Yunnan Normal Univ, Yunnan Key Lab Optoelect Informat Technol, Kunming, Yunnan, Peoples R China
[5] Swinburne Univ Technol, Sch Software & Elect Engn, Melbourne, Vic 3122, Australia
Source
NEURAL COMPUTING & APPLICATIONS | 2021 / Vol. 33 / Issue 20
Keywords
Vulnerability discovery; Deep learning; Function-level; Baseline dataset; Performance evaluation;
DOI
10.1007/s00521-021-05954-3
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Detecting source-code-level vulnerabilities during the development phase is a cost-effective way to prevent potential attacks at the software deployment stage. Many machine learning-based solutions, including deep learning-based ones, have been proposed to aid vulnerability discovery. However, these approaches were mainly evaluated on self-constructed or self-collected datasets, and the lack of a unified baseline dataset makes it difficult to evaluate their effectiveness. To bridge this gap, we construct a function-level vulnerability dataset from scratch, provided as source code-label pairs. To evaluate the constructed dataset, we build a function-level vulnerability detection framework that incorporates six mainstream neural network models as vulnerability detectors. We perform experiments to investigate the performance of the neural model-based detectors using source code as raw input with continuous Bag-of-Words neural embeddings. Empirical results reveal that the variants of recurrent neural networks and the convolutional neural network perform well on our dataset, as the former are capable of handling contextual information and the latter learns features from small context windows. In terms of generalization ability, the fully connected network outperforms the other network architectures. The performance evaluation can serve as a reference benchmark for neural model-based vulnerability detection at function-level granularity. Our dataset can serve as ground truth for ML-based function-level vulnerability detection and as a baseline for evaluating relevant approaches.
Pages: 13287-13300
Number of pages: 14