Automated Extraction of Software Names from Vulnerability Reports using LSTM and Expert System

Authors
Khokhlov, Igor [1]
Okutan, Ahmet [2]
Bryla, Ryan [2]
Simmons, Steven [2]
Mirakhorli, Mehdi [2]
Affiliations
[1] Sacred Heart Univ, Fairfield, CT 06825 USA
[2] Rochester Inst Technol, Rochester, NY USA
Keywords
Common Product Enumeration; Common Vulnerability and Exposures; Natural Language Processing; Software Product Name Extraction; Software Vulnerability
DOI
10.1109/STC55697.2022.00024
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Software vulnerabilities are closely monitored by the security community so that security and privacy issues in software systems can be addressed promptly. Before a vulnerability is published by a vulnerability management system, it needs to be characterized to highlight its unique attributes, including the affected software products and versions, to help security professionals prioritize their patches. Associating product names and versions with disclosed vulnerabilities can be a labor-intensive process that delays their publication and fix, and thereby gives attackers more time to exploit them. This work proposes a machine learning method to automatically extract software product names and versions from unstructured CVE descriptions. It uses Word2Vec and Char2Vec models to create context-aware features from CVE descriptions and uses these features to train a Named Entity Recognition (NER) model based on bidirectional Long Short-Term Memory (LSTM) networks. Based on the attributes of the product names and versions in previously published CVE descriptions, we created a set of Expert System (ES) rules to refine the predictions of the NER model and improve the performance of the developed method. Experimental results on real-life CVE examples indicate that, using the trained NER model and the set of ES rules, software names and versions in unstructured CVE descriptions can be identified with F-measure values above 0.95.
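The rule-based refinement step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not list the authors' actual ES rules, so everything below is an assumption made for illustration: the BIO tag names (SN for software name, SV for software version), the version pattern, and the two rules themselves. The NER model's output is represented as a plain list of tags.

```python
import re

# Hypothetical pattern for version-like tokens such as "9.0.1" or "1.0.2k".
VERSION_RE = re.compile(r"^\d+(\.\d+)+[a-z]?$")


def refine_tags(tokens, tags):
    """Apply two illustrative post-processing rules to BIO tags from an NER model.

    tokens: list of word strings from one CVE description.
    tags:   list of BIO tags of the same length (assumed labels: SN, SV, O).
    Returns a new list of tags; the input is not modified.
    """
    refined = list(tags)
    for i, token in enumerate(tokens):
        prev = refined[i - 1] if i > 0 else "O"

        # Rule 1 (assumed): an untagged, version-like token that directly
        # follows a software name is relabeled as a software version.
        if refined[i] == "O" and VERSION_RE.match(token) and prev in ("B-SN", "I-SN"):
            refined[i] = "B-SV"

        # Rule 2 (assumed): an I- tag that does not continue an entity of the
        # same type is invalid in BIO, so it is promoted to a B- tag.
        if refined[i].startswith("I-") and prev[2:] != refined[i][2:]:
            refined[i] = "B-" + refined[i][2:]
    return refined


example_tokens = ["Apache", "Tomcat", "9.0.1", "allows", "remote", "attackers"]
example_tags = ["B-SN", "I-SN", "O", "O", "O", "O"]
print(refine_tags(example_tokens, example_tags))
```

In this sketch the rules only ever correct tags the model left inconsistent or empty; a real rule set derived from published CVE descriptions would likely be larger and tuned against labeled data.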
Pages: 125-134
Page count: 10