Automated Extraction of Software Names from Vulnerability Reports using LSTM and Expert System

Authors
Khokhlov, Igor [1]
Okutan, Ahmet [2]
Bryla, Ryan [2]
Simmons, Steven [2]
Mirakhorli, Mehdi [2]
Affiliations
[1] Sacred Heart Univ, Fairfield, CT 06825 USA
[2] Rochester Inst Technol, Rochester, NY USA
Keywords
Common Product Enumeration; Common Vulnerabilities and Exposures; Natural Language Processing; Software Product Name Extraction; Software Vulnerability
DOI
10.1109/STC55697.2022.00024
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
Software vulnerabilities are closely monitored by the security community so that security and privacy issues in software systems can be addressed promptly. Before a vulnerability is published by a vulnerability management system, it must be characterized to highlight its unique attributes, including the affected software products and versions, which helps security professionals prioritize patches. Associating product names and versions with disclosed vulnerabilities can be a labor-intensive process that may delay their publication and fix, giving attackers more time to exploit them. This work proposes a machine learning method that automatically extracts software product names and versions from unstructured CVE descriptions. It uses Word2Vec and Char2Vec models to create context-aware features from CVE descriptions and uses these features to train a Named Entity Recognition (NER) model based on bidirectional Long Short-Term Memory (LSTM) networks. Based on the attributes of product names and versions in previously published CVE descriptions, we created a set of Expert System (ES) rules to refine the predictions of the NER model and improve the performance of the method. Experimental results on real-life CVE examples indicate that, using the trained NER model together with the ES rules, software names and versions in unstructured CVE descriptions can be identified with F-measure values above 0.95.
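The abstract does not list the actual ES rules, so the following is only a minimal sketch of how rule-based refinement of NER output might look. The tag names (`SN`, `SV`, `O`), the version pattern, and both rules are hypothetical and chosen for illustration; they are not taken from the paper.

```python
import re

# Hypothetical tag set: software name, software version, other.
# The paper's real label scheme is not given in the abstract.
PRODUCT, VERSION, OTHER = "SN", "SV", "O"

# Assumed shape of a version token, e.g. "1.0.2", "v2.4", "10".
VERSION_RE = re.compile(r"^v?\d+(\.\d+)*[a-z]?$", re.IGNORECASE)


def refine(tokens, tags):
    """Apply two illustrative post-processing rules to NER predictions."""
    refined = list(tags)
    for i, tok in enumerate(tokens):
        prev = refined[i - 1] if i > 0 else OTHER
        # Rule 1 (assumed): a version-like token that directly follows a
        # product name or another version is itself a version.
        if refined[i] == OTHER and VERSION_RE.match(tok) and prev in (PRODUCT, VERSION):
            refined[i] = VERSION
        # Rule 2 (assumed): a token predicted as a version but containing
        # no digit is reset to "other".
        elif refined[i] == VERSION and not any(c.isdigit() for c in tok):
            refined[i] = OTHER
    return refined


tokens = ["Buffer", "overflow", "in", "OpenSSL", "1.0.2", "before", "allows"]
predicted = ["O", "O", "O", "SN", "O", "SV", "O"]
print(refine(tokens, predicted))
```

In this toy input the model misses the version `1.0.2` and mislabels `before`; the two rules correct both tags without retraining the model, which is the general role the abstract assigns to the ES stage.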
Pages: 125-134
Number of pages: 10