A Supervised Machine Learning Based Approach for Automatically Extracting High-Level Threat Intelligence from Unstructured Sources

Cited by: 31
Authors
Ghazi, Yumna [1]
Anwar, Zahid [1]
Mumtaz, Rafia [1]
Saleem, Shahzad [1]
Tahir, Ali [1]
Affiliations
[1] NUST, SEECS, Dept Comp, Islamabad, Pakistan
Keywords
Cyber Threat Intelligence; Natural Language Processing; Tactics, Techniques and Procedures (TTPs); STIX; Indicators of Compromise
DOI
10.1109/FIT.2018.00030
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The last few years have seen a radical shift in the cyber defense paradigm from reactive to proactive, and this change is marked by the steadily increasing trend of Cyber Threat Intelligence (CTI) sharing. Currently, there are numerous Open Source Intelligence (OSINT) sources providing periodically updated threat feeds that are fed into various analytical solutions. At this point, there is an excessive amount of data being produced from such sources, both structured (STIX, OpenIOC, etc.) as well as unstructured (blacklists, etc.). However, more often than not, the level of detail required for making informed security decisions is missing from threat feeds, since most indicators are atomic in nature, like IPs and hashes, which are usually rather volatile. These feeds distinctly lack strategic threat information, like attack patterns and techniques that truly represent the behavior of an attacker or an exploit. Moreover, there is a lot of duplication in threat information and no single place where one could explore the entirety of a threat, hence requiring hundreds of man hours for sifting through numerous sources - trying to discern signal from noise - to find all the credible information on a threat. We have made use of natural language processing to extract threat feeds from unstructured cyber threat information sources with approximately 70% precision, providing comprehensive threat reports in standards like STIX, which is a widely accepted industry standard that represents CTI. The automation of an otherwise tedious manual task would ensure the timely gathering and sharing of relevant CTI that would give organizations the edge to be able to proactively defend against known as well as unknown threats.
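To make the abstract's distinction concrete, the following is a minimal sketch of what an atomic indicator looks like once it is pulled out of unstructured text and wrapped in a STIX-shaped object. It is not the paper's supervised pipeline: plain regular expressions stand in for the extraction step, the function names `extract_patterns` and `to_stix_indicator` are hypothetical, and the output follows the STIX 2.1 indicator layout as an assumption (the record does not say which STIX version the authors targeted). The objects are not validated against the specification.

```python
import json
import re
import uuid
from datetime import datetime, timezone

# Assumed patterns for two atomic indicator types named in the abstract.
# Regexes are a stand-in here, not the supervised approach the paper describes.
IPV4_RE = re.compile(
    r"\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b"
)
MD5_RE = re.compile(r"\b[a-fA-F0-9]{32}\b")


def extract_patterns(text):
    """Return STIX-style pattern strings for IPv4 addresses and MD5 hashes."""
    patterns = []
    for ip in sorted(set(IPV4_RE.findall(text))):
        patterns.append(f"[ipv4-addr:value = '{ip}']")
    for digest in sorted(set(MD5_RE.findall(text))):
        patterns.append(f"[file:hashes.MD5 = '{digest.lower()}']")
    return patterns


def to_stix_indicator(pattern, now=None):
    """Wrap one pattern in a dict shaped like a STIX 2.1 indicator (unvalidated)."""
    moment = now or datetime.now(timezone.utc)
    stamp = moment.strftime("%Y-%m-%dT%H:%M:%S.000Z")
    return {
        "type": "indicator",
        "spec_version": "2.1",
        "id": f"indicator--{uuid.uuid4()}",
        "created": stamp,
        "modified": stamp,
        "pattern": pattern,
        "pattern_type": "stix",
        "valid_from": stamp,
    }


if __name__ == "__main__":
    # Made-up report sentence; the IP is from a documentation-only range.
    report = (
        "The dropper beacons to 198.51.100.7 and has MD5 "
        "d41d8cd98f00b204e9800998ecf8427e."
    )
    indicators = [to_stix_indicator(p) for p in extract_patterns(report)]
    print(json.dumps(indicators, indent=2))
```

Indicators of this kind are exactly the volatile, atomic data the abstract says dominates existing feeds; the behavioral information it argues for (attack patterns and techniques) would need a classifier or similar model rather than fixed patterns.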
Pages: 129-134
Page count: 6