Are Latent Vulnerabilities Hidden Gems for Software Vulnerability Prediction? An Empirical Study

Authors
Triet Huynh Minh Le [1,2]
Du, Xiaoning [3]
Babar, M. Ali [1,2]
Affiliations
[1] Univ Adelaide, CREST Ctr Res Engn Software Technol, Adelaide, SA, Australia
[2] Cyber Secur Cooperat Res Ctr, Joondalup, WA, Australia
[3] Monash Univ, Melbourne, Vic, Australia
Keywords
Software vulnerability; Software security; Deep learning; Data quality; SZZ algorithm
DOI
10.1145/3643991.3644919
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Collecting relevant and high-quality data is integral to the development of effective Software Vulnerability (SV) prediction models. Most of the current SV datasets rely on SV-fixing commits to extract vulnerable functions and lines. However, none of these datasets have considered latent SVs existing between the introduction and fix of the collected SVs. There is also little known about the usefulness of these latent SVs for SV prediction. To bridge these gaps, we conduct a large-scale study on the latent vulnerable functions in two commonly used SV datasets and their utilization for function-level and line-level SV predictions. Leveraging the state-of-the-art SZZ algorithm, we identify more than 100k latent vulnerable functions in the studied datasets. We find that these latent functions can increase the number of SVs by 4x on average and correct up to 5k mislabeled functions, yet they have a noise level of around 6%. Despite the noise, we show that the state-of-the-art SV prediction model can significantly benefit from such latent SVs. The improvements are up to 24.5% in the performance (F1-Score) of function-level SV predictions and up to 67% in the effectiveness of localizing vulnerable lines. Overall, our study presents the first promising step toward the use of latent SVs to improve the quality of SV datasets and enhance the performance of SV prediction tasks.
Pages: 716-727
Page count: 12