CodeGraphSMOTE - Data Augmentation for Vulnerability Discovery

Authors
Ganz, Tom [1]
Imgrund, Erik [1]
Haerterich, Martin [1]
Rieck, Konrad [2]
Affiliations
[1] SAP Security Research, Walldorf, Germany
[2] Technische Universität Berlin, Berlin, Germany
Keywords
Vulnerability Discovery; Data Augmentation; Graph Neural Networks
DOI
10.1007/978-3-031-37586-6_17
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The automated discovery of vulnerabilities at scale is a crucial area of research in software security. While numerous machine learning models for detecting vulnerabilities are known, recent studies show that their generalizability and transferability depend heavily on the quality of the training data. Because real vulnerabilities are scarce, available datasets are highly imbalanced, which makes it difficult for deep learning models to learn and generalize effectively. Since programs can inherently be represented as graphs, and to leverage recent advances in graph neural networks, we propose a novel method that generates synthetic code graphs for data augmentation to enhance vulnerability discovery. Our method makes two significant contributions: a novel approach for generating synthetic code graphs and a graph-to-code transformer that converts code graphs into their code representation. Applying our augmentation strategy to vulnerability discovery models achieves the originally reported F1-score with less than 20% of the original dataset, and it outperforms the F1-score of prior augmentation strategies by up to 25.6%.
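The abstract does not spell out how the synthetic code graphs are produced. The name CodeGraphSMOTE suggests a SMOTE-style oversampling step, so the following is only a minimal sketch of classic SMOTE interpolation, not the authors' method. It assumes that each vulnerable code graph has already been embedded as a fixed-length vector (an assumption not stated in the abstract); the function names `squared_distance` and `smote_oversample` are hypothetical.

```python
import random
from typing import List, Sequence

Vector = List[float]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def smote_oversample(
    minority: Sequence[Sequence[float]],
    n_new: int,
    k: int = 2,
    seed: int = 0,
) -> List[Vector]:
    """Create n_new synthetic minority samples by SMOTE-style interpolation.

    Assumption: `minority` holds embeddings of the rare (vulnerable) class.
    Each synthetic point lies on the line segment between a randomly chosen
    sample and one of its k nearest minority-class neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic: List[Vector] = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank all other minority samples by distance to the chosen base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: squared_distance(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        # Interpolate with a random factor in [0, 1).
        lam = rng.random()
        synthetic.append([b + lam * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


if __name__ == "__main__":
    # Toy 2-D "embeddings" standing in for vulnerable code graphs.
    vulnerable = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for point in smote_oversample(vulnerable, n_new=3):
        print(point)
```

A sketch like this only balances the class distribution in embedding space. Turning an interpolated embedding back into a code graph or source code would need a separate decoding step, which this example does not attempt.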
Pages: 282-301
Number of pages: 20