Code Aggregate Graph: Effective Representation for Graph Neural Networks to Detect Vulnerable Code

Authors
Nguyen, Hoang Viet [1]
Zheng, Junjun [2]
Inomata, Atsuo [2]
Uehara, Tetsutaro [1]
Affiliations
[1] Ritsumeikan University, College of Information Science and Engineering, Kusatsu, 5258577, Japan
[2] Osaka University, Graduate School of Information Science and Technology, Osaka, 5650871, Japan
Abstract
Deep learning, especially graph neural networks (GNNs), provides efficient, fast, and automated methods to detect vulnerable code. However, there is room to improve accuracy because previous studies were limited by existing code representations. Additionally, the diversity of embedding techniques and GNN models can make selecting the appropriate method challenging. Herein, we propose the Code Aggregate Graph (CAG) to improve vulnerability detection efficiency. CAG combines the principles of different code analyses, such as the abstract syntax tree, control flow graph, and program dependence graph, with dominator and post-dominator trees. This extensive representation empowers deep graph networks for enhanced classification. We also implement different data encoding methods and neural networks to provide a multidimensional view of the system performance. Specifically, three word embedding approaches and three deep GNNs are utilized to build classifiers. CAG is then evaluated using two datasets: a real-world open-source dataset and the software assurance reference dataset. CAG is also compared with seven state-of-the-art methods and six classic representations, and it shows the best performance. Compared with previous studies, CAG increases accuracy by 5.4% and F1-score by 5.1%. Additionally, experiments confirm that the choice of encoding has a positive impact on accuracy (4-6%), whereas the network type does not. The study should contribute to a meaningful benchmark for future research on code representations, data encoding, and GNNs. © 2013 IEEE.
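The abstract does not describe how CAG is actually built, so the following is only a minimal sketch of the general idea of aggregating several code views into one typed edge set. It assumes that AST and PDG edges are already available, derives dominator and post-dominator tree edges from the control flow graph with a simple iterative data-flow algorithm, and assumes every node is reachable from the entry and reaches the exit. The function names (`dominators`, `tree_edges`, `reverse`, `aggregate`), the edge labels, and the toy graph are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def dominators(succ, entry):
    """Iteratively compute the dominator set of every node in a graph."""
    nodes = set(succ) | {m for targets in succ.values() for m in targets}
    pred = defaultdict(set)
    for n, targets in succ.items():
        for m in targets:
            pred[m].add(n)
    # Start from "everything dominates everything" and shrink to a fixed point.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            preds = [dom[p] for p in pred[n]]
            new = ({n} | set.intersection(*preds)) if preds else {n}
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def tree_edges(dom, entry):
    """Return (immediate dominator, node) pairs, i.e. the dominator tree."""
    edges = []
    for n, ds in dom.items():
        strict = ds - {n}
        if n == entry or not strict:
            continue
        # The immediate dominator is the strict dominator closest to n,
        # which is the one with the largest dominator set of its own.
        idom = max(strict, key=lambda d: len(dom[d]))
        edges.append((idom, n))
    return edges


def reverse(succ):
    """Reverse every edge of a successor map."""
    rev = defaultdict(set)
    for n, targets in succ.items():
        for m in targets:
            rev[m].add(n)
    return dict(rev)


def aggregate(ast_edges, cfg_succ, pdg_edges, entry, exit_node):
    """Merge all views into one set of (source, target, edge_type) triples."""
    edges = {(u, v, "AST") for u, v in ast_edges}
    edges |= {(u, v, "CFG") for u, vs in cfg_succ.items() for v in vs}
    edges |= {(u, v, "PDG") for u, v in pdg_edges}
    dom = dominators(cfg_succ, entry)
    edges |= {(u, v, "DOM") for u, v in tree_edges(dom, entry)}
    # Post-dominators are dominators of the reversed CFG rooted at the exit.
    pdom = dominators(reverse(cfg_succ), exit_node)
    edges |= {(u, v, "PDOM") for u, v in tree_edges(pdom, exit_node)}
    return edges


# Toy diamond-shaped CFG: 1 -> {2, 3} -> 4. AST and PDG edges are made up.
cfg = {1: [2, 3], 2: [4], 3: [4]}
cag = aggregate(ast_edges=[(1, 2)], cfg_succ=cfg, pdg_edges=[(1, 4)],
                entry=1, exit_node=4)
```

In this toy graph, node 1 immediately dominates node 4 and node 4 immediately post-dominates node 1, so the merged set contains both a `DOM` and a `PDOM` edge between them alongside the original views. A typed edge set of this kind could then be handed to a GNN library that supports multiple edge types, although how the paper encodes its graph is not stated here.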
Source: IEEE Access, 2022, vol. 10, pp. 123786-123800