Machine Learning to Combine Static Analysis Alerts with Software Metrics to Detect Security Vulnerabilities: An Empirical Study

Cited by: 3
Authors
Pereira, Jose D'Abruzzo [1]
Campos, Joao R. [1]
Vieira, Marco [1]
Affiliations
[1] Univ Coimbra, CISUC, DEI, Coimbra, Portugal
Keywords
Security; Vulnerability Detection; Static Code Analysis; Software Metrics; Analysis Tools
DOI
10.1109/EDCC53658.2021.00008
Chinese Library Classification (CLC) number
TP3 [Computing Technology, Computer Technology]
Discipline classification code
0812
Abstract
Software developers can use diverse techniques and tools to reduce the number of vulnerabilities, but the effectiveness of existing solutions in real projects is questionable. For example, Static Analysis Tools (SATs) report potential vulnerabilities by analyzing code patterns, and Software Metrics (SMs) can be used to predict vulnerabilities based on high-level characteristics of the code. In theory, both approaches can be applied from the early stages of the development process, but it is well known that they fail to detect critical vulnerabilities and raise a large number of false alarms. This paper studies the hypothesis of using Machine Learning (ML) to combine alerts from SATs with SMs to predict vulnerabilities in a large software project (under development for many years). In practice, we use four ML algorithms, alerts from two SATs, and a large number of SMs to predict whether a source code file is vulnerable or not (binary classification) and to predict the vulnerability category (multiclass classification). Results show that one can achieve either high precision or high recall, but not both at the same time. To understand the reason, we analyze and compare snippets of source code, demonstrating that vulnerable and non-vulnerable files share similar characteristics, making it hard to distinguish vulnerable from non-vulnerable code based on SAT alerts and SMs.
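The precision/recall trade-off described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the per-file records are invented, the single SAT feature (alert count) and single software metric (cyclomatic complexity) stand in for the much larger feature set the study uses, and the hand-weighted `risk_score` function is a toy substitute for the four ML algorithms, which the abstract does not name.

```python
# Hypothetical per-file records: (SAT alert count, cyclomatic complexity, is_vulnerable).
# The values are invented for illustration; they are not taken from the paper's dataset.
FILES = [
    (0, 3, False),
    (1, 12, False),
    (2, 8, True),
    (4, 20, False),
    (5, 25, True),
    (7, 30, True),
    (3, 15, False),
    (1, 5, True),
]


def risk_score(alerts, complexity):
    """Toy linear combination of one SAT feature and one software metric.

    The weights are arbitrary assumptions, not learned parameters.
    """
    return 1.0 * alerts + 0.1 * complexity


def precision_recall(labels, predictions):
    """Return (precision, recall) for boolean labels and predictions."""
    tp = sum(1 for y, p in zip(labels, predictions) if y and p)
    fp = sum(1 for y, p in zip(labels, predictions) if not y and p)
    fn = sum(1 for y, p in zip(labels, predictions) if y and not p)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def evaluate(threshold):
    """Flag a file as vulnerable when its score reaches the threshold."""
    labels = [vulnerable for _, _, vulnerable in FILES]
    predictions = [risk_score(a, c) >= threshold for a, c, _ in FILES]
    return precision_recall(labels, predictions)


if __name__ == "__main__":
    for threshold in (7.0, 1.0):
        precision, recall = evaluate(threshold)
        print(f"threshold={threshold}: precision={precision:.2f}, recall={recall:.2f}")
```

On this toy data, a strict threshold flags only files that are truly vulnerable but misses half of them, while a lenient threshold catches every vulnerable file at the cost of several false alarms. The overlap between the two classes' scores mirrors the abstract's observation that vulnerable and non-vulnerable files share similar characteristics.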
Pages: 1 - 8
Page count: 8
Related papers
50 in total
  • [21] Software reuse cuts both ways: An empirical analysis of its relationship with security vulnerabilities
    Gkortzis, Antonios
    Feitosa, Daniel
    Spinellis, Diomidis
    JOURNAL OF SYSTEMS AND SOFTWARE, 2021, 172
  • [22] LAPSE plus Static Analysis Security Software: Vulnerabilities Detection in Java EE Applications
    Martin Perez, Pablo
    Filipiak, Joanna
    Maria Sierra, Jose
    FUTURE INFORMATION TECHNOLOGY, PT 1, 2011, 184 : 148 - 156
  • [23] Detecting Android Security Vulnerabilities Using Machine Learning and System Calls Analysis
    Campos, Carlos Renato Salim
    Jaafar, Fehmi
    Malik, Yasir
    2019 COMPANION OF THE 19TH IEEE INTERNATIONAL CONFERENCE ON SOFTWARE QUALITY, RELIABILITY AND SECURITY (QRS-C 2019), 2019 : 109 - 113
  • [24] Discovering software vulnerabilities using data-flow analysis and machine learning
    Kronjee, Jorrit
    Hommersom, Arjen
    Vranken, Harald
    13TH INTERNATIONAL CONFERENCE ON AVAILABILITY, RELIABILITY AND SECURITY (ARES 2018), 2019
  • [25] Software defect prediction: A study on software metrics using statistical and machine learning methods
    Canaparo, Marco
    Ronchieri, Elisabetta
    Bertaccini, Gianluca
    INTERNATIONAL SYMPOSIUM ON GRIDS & CLOUDS 2022, 2022
  • [26] How Do Developers Act on Static Analysis Alerts? An Empirical Study of Coverity Usage
    Imtiaz, Nasif
    Murphy, Brendan
    Williams, Laurie
    2019 IEEE 30TH INTERNATIONAL SYMPOSIUM ON SOFTWARE RELIABILITY ENGINEERING (ISSRE), 2019 : 323 - 333
  • [27] Applying machine learning to predict software fault proneness using change metrics, static code metrics, and a combination of them
    Alshehri, Yasser Ali
    Goseva-Popstojanova, Katerina
    Dzielski, Dale G.
    Devine, Thomas
    IEEE SOUTHEASTCON 2018, 2018
  • [28] Architectural Security Weaknesses in Industrial Control Systems (ICS): An Empirical Study based on Disclosed Software Vulnerabilities
    Gonzalez, Danielle
    Alhenaki, Fawaz
    Mirakhorli, Mehdi
    2019 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE ARCHITECTURE (ICSA), 2019 : 31 - 40
  • [29] Applying Software Design Metrics to Developer Story: A Supervised Machine Learning Analysis
    Algarni, Asaad
    Magel, Kenneth
    2019 IEEE FIRST INTERNATIONAL CONFERENCE ON COGNITIVE MACHINE INTELLIGENCE (COGMI 2019), 2019 : 156 - 159
  • [30] Empirical Analysis of Hidden Technical Debt Patterns in Machine Learning Software
    Alahdab, Mohannad
    Calikli, Gul
    PRODUCT-FOCUSED SOFTWARE PROCESS IMPROVEMENT, PROFES 2019, 2019, 11915 : 195 - 202