Automated Extraction of Software Names from Vulnerability Reports using LSTM and Expert System

Authors
Khokhlov, Igor [1]
Okutan, Ahmet [2]
Bryla, Ryan [2]
Simmons, Steven [2]
Mirakhorli, Mehdi [2]
Affiliations
[1] Sacred Heart Univ, Fairfield, CT 06825 USA
[2] Rochester Inst Technol, Rochester, NY USA
Keywords
Common Product Enumeration; Common Vulnerability and Exposures; Natural Language Processing; Software Product Name Extraction; Software Vulnerability
DOI
10.1109/STC55697.2022.00024
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Software vulnerabilities are closely monitored by the security community so that security and privacy issues in software systems can be addressed in a timely manner. Before a vulnerability is published by vulnerability management systems, it needs to be characterized to highlight its unique attributes, including affected software products and versions, to help security professionals prioritize their patches. Associating product names and versions with disclosed vulnerabilities may require a labor-intensive process that may delay their publication and fix, and thereby give attackers more time to exploit them. This work proposes a machine learning method to automatically extract software product names and versions from unstructured CVE descriptions. It uses Word2Vec and Char2Vec models to create context-aware features from CVE descriptions and uses these features to train a Named Entity Recognition (NER) model based on bidirectional Long Short-Term Memory (LSTM) networks. Based on the attributes of the product names and versions in previously published CVE descriptions, we created a set of Expert System (ES) rules to refine the predictions of the NER model and improve the performance of the developed method. Experimental results on real-life CVE examples indicate that, using the trained NER model and the set of ES rules, software names and versions in unstructured CVE descriptions can be identified with F-measure values above 0.95.
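The refinement step described in the abstract, where rules adjust the token labels produced by the NER model, might look roughly like the sketch below. It is not the authors' implementation: the tag names (`SN`, `SV`, `O`), the version regex, the two rules, the function names `refine_tags` and `f_measure`, and the example sentence with its made-up product "Foo Server" are all assumptions for illustration. The paper's actual rule set is not given in this record.

```python
import re

# Hypothetical pattern for version-like tokens such as "2.4.1", "10.0", or "1.x".
VERSION_RE = re.compile(r"^\d+(\.(\d+|x))+$")


def refine_tags(tokens, tags):
    """Apply two illustrative expert-system rules to NER output.

    Assumed tag scheme: "SN" = software name, "SV" = software version,
    "O" = any other token.
    """
    refined = list(tags)
    for i, token in enumerate(tokens):
        prev_tag = refined[i - 1] if i > 0 else "O"
        # Rule 1 (assumed): a version-like token directly after a software
        # name or another version is a version, even if the model missed it.
        if VERSION_RE.match(token) and prev_tag in ("SN", "SV"):
            refined[i] = "SV"
        # Rule 2 (assumed): a token tagged as a version that contains no
        # digit is unlikely to be a version, so reset it to "O".
        elif refined[i] == "SV" and not any(ch.isdigit() for ch in token):
            refined[i] = "O"
    return refined


def f_measure(gold, predicted):
    """Token-level F-measure (F1) over the non-"O" tags."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == p and g != "O")
    fp = sum(1 for g, p in pairs if p != "O" and g != p)
    fn = sum(1 for g, p in pairs if g != "O" and g != p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up CVE-style sentence; "Foo Server" is not a real product.
tokens = ["Buffer", "overflow", "in", "Foo", "Server", "2.4.1", "and", "earlier"]
gold = ["O", "O", "O", "SN", "SN", "SV", "O", "O"]
# Simulated model output: the version is missed and "earlier" is mislabeled.
model_tags = ["O", "O", "O", "SN", "SN", "O", "O", "SV"]

refined_tags = refine_tags(tokens, model_tags)
print(f_measure(gold, model_tags), f_measure(gold, refined_tags))
```

In this toy case the rules repair both model errors, so the token-level F-measure rises from 2/3 to 1.0. Real CVE descriptions would need a much larger rule set and a tokenizer that handles forms such as "1.0.2k" or "before 3.x".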
Pages: 125-134
Page count: 10