Few-VulD: A Few-shot learning framework for software vulnerability detection

Cited by: 0
Authors
Zheng, Tianming [1]
Liu, Haojun [2]
Xu, Hang [1]
Chen, Xiang [1]
Yi, Ping [1]
Wu, Yue [1]
Affiliations
[1] Shanghai Jiao Tong Univ, Sch Elect Informat & Elect Engn, Shanghai, Peoples R China
[2] Carnegie Mellon Univ, Sch Comp Sci, Pittsburgh, PA USA
Funding
National Key Research and Development Program of China
Keywords
Vulnerability detection; Few-shot learning; Meta-learning; BiLSTM; Artificial intelligence; Deep learning; NEURAL-NETWORKS;
DOI
10.1016/j.cose.2024.103992
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The rapid development of artificial intelligence (AI) has led to the introduction of numerous software vulnerability detection methods based on deep learning algorithms. However, a significant challenge is their dependency on large volumes of code samples for effective training. This requirement poses a considerable hurdle, particularly when adapting to diverse software application scenarios and various vulnerability types, where gathering sufficient and relevant training data for different classification tasks is often arduous. To address this challenge, this paper introduces Few-VulD, a novel framework for software vulnerability detection based on few-shot learning. This framework is designed to be efficiently trained with a minimal number of samples from a variety of existing classification tasks. Its key advantage lies in its ability to rapidly adapt to new vulnerability detection tasks, such as identifying new types of vulnerabilities, with only a small set of learning samples. This capability is particularly beneficial in scenarios where available vulnerability samples are limited. We compare Few-VulD with five state-of-the-art methods on the SySeVR and Big-Vul datasets. On the SySeVR dataset, Few-VulD outperforms all other methods, achieving a recall rate of 87.9% and showing an improvement of 11.7% to 57.8%. On the Big-Vul dataset, Few-VulD outperforms three of the methods, including one that utilizes a pretrained large language model (LLM), with recall improvements ranging from 8.5% to 40.1%. The other two methods employ pretrained LLMs from Microsoft CodeXGLUE (Lu et al., 2021). Few-VulD reaches 78.7% and 95.5% of their recall rates without the need for extensive data pretraining. These results demonstrate the effectiveness of Few-VulD in vulnerability detection tasks with limited samples.
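The abstract describes training on small sample sets drawn from many classification tasks. As a rough illustration of that general idea only, the sketch below shows how an N-way K-shot episode (a support set to adapt on and a query set to evaluate on) is commonly sampled in few-shot learning. It is not taken from the paper: the function name `sample_episode`, the toy labels, and the placeholder snippets are all hypothetical, and Few-VulD's actual task construction may differ.

```python
import random
from collections import defaultdict


def sample_episode(samples, n_way, k_shot, n_query, rng):
    """Draw one N-way K-shot episode from (code_snippet, label) pairs.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    by_label = defaultdict(list)
    for snippet, label in samples:
        by_label[label].append(snippet)

    # Only classes with enough examples can fill both support and query sets.
    eligible = sorted(
        label for label, items in by_label.items()
        if len(items) >= k_shot + n_query
    )
    if len(eligible) < n_way:
        raise ValueError("not enough classes with k_shot + n_query samples")

    support, query = [], []
    for label in rng.sample(eligible, n_way):
        picked = rng.sample(by_label[label], k_shot + n_query)
        # First k_shot items are used for adaptation, the rest for evaluation.
        support += [(s, label) for s in picked[:k_shot]]
        query += [(s, label) for s in picked[k_shot:]]
    return support, query


# Toy data: placeholder snippets with made-up vulnerability-type labels.
toy = [
    (f"snippet_{label}_{i}", label)
    for label in ("type_a", "type_b", "type_c")
    for i in range(6)
]
support, query = sample_episode(
    toy, n_way=2, k_shot=3, n_query=2, rng=random.Random(0)
)
```

Keeping the support and query sets disjoint within each class is what lets the query set measure how well a model adapts from only `k_shot` examples per class.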
Pages: 13