A Supervised Machine Learning Based Approach for Automatically Extracting High-Level Threat Intelligence from Unstructured Sources

Cited by: 31
Authors
Ghazi, Yumna [1]
Anwar, Zahid [1]
Mumtaz, Rafia [1]
Saleem, Shahzad [1]
Tahir, Ali [1]
Affiliations
[1] NUST, SEECS, Dept Comp, Islamabad, Pakistan
Keywords
Cyber Threat Intelligence; Natural Language Processing; Tactics, Techniques and Procedures (TTPs); STIX; Indicators of Compromise
DOI
10.1109/FIT.2018.00030
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The last few years have seen a radical shift in the cyber defense paradigm from reactive to proactive, and this change is marked by the steadily increasing trend of Cyber Threat Intelligence (CTI) sharing. Currently, there are numerous Open Source Intelligence (OSINT) sources providing periodically updated threat feeds that are fed into various analytical solutions. At this point, there is an excessive amount of data being produced from such sources, both structured (STIX, OpenIOC, etc.) as well as unstructured (blacklists, etc.). However, more often than not, the level of detail required for making informed security decisions is missing from threat feeds, since most indicators are atomic in nature, like IPs and hashes, which are usually rather volatile. These feeds distinctly lack strategic threat information, like attack patterns and techniques that truly represent the behavior of an attacker or an exploit. Moreover, there is a lot of duplication in threat information and no single place where one could explore the entirety of a threat, hence requiring hundreds of man hours for sifting through numerous sources - trying to discern signal from noise - to find all the credible information on a threat. We have made use of natural language processing to extract threat feeds from unstructured cyber threat information sources with approximately 70% precision, providing comprehensive threat reports in standards like STIX, which is a widely accepted industry standard that represents CTI. The automation of an otherwise tedious manual task would ensure the timely gathering and sharing of relevant CTI that would give organizations the edge to be able to proactively defend against known as well as unknown threats.
Pages: 129 - 134
Number of pages: 6
Related papers
50 in total
  • [21] Confidence-Level-Based Semi-Supervised Machine Learning Approach for Partial Discharge Signal Classification
    Niazi, M. Tahir Khan
    Hussain, Md Rashid
    Park, Chanyeop
    2022 IEEE/AIAA TRANSPORTATION ELECTRIFICATION CONFERENCE AND ELECTRIC AIRCRAFT TECHNOLOGIES SYMPOSIUM (ITEC+EATS 2022), 2022, : 912 - 916
  • [22] High-Level Synthesis from C vs. a DSL-based Approach
    de Oliveira, Cristiano B.
    Marques, Eduardo
    Cardoso, Joao M. P.
    PROCEEDINGS OF 2014 IEEE INTERNATIONAL PARALLEL & DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW), 2014, : 257 - 262
  • [23] Precise prognostics of biochar yield from various biomass sources by Bayesian approach with supervised machine learning and ensemble methods
    Nguyen, Van Giao
    Sharma, Prabhakar
    Agbulut, Uemit
    Le, Huu Son
    Tran, Viet Dung
    Cao, Dao Nam
    INTERNATIONAL JOURNAL OF GREEN ENERGY, 2024, 21 (09) : 2180 - 2204
  • [24] Efficient Approach for Extracting High-Level B-Spline Features from LIDAR Data for Light-Weight Mapping
    Usman, Muhammad
    Ali, Ahmad
    Tahir, Abdullah
    Rahman, Muhammad Zia Ur
    Khan, Abdul Manan
    SENSORS, 2022, 22 (23)
  • [25] An Empirical Semi-Supervised Machine Learning Approach on Extracting and Ranking Document Level Multi-Word Product Names Using Improved C-value Approach
    Sivashankari, R.
    Valarmathi, B.
    2016 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI), 2016, : 770 - 775
  • [26] Inducing high-level behaviors from problem-solving traces using machine-learning tools
    Robinet, Vivien
    Bisson, Gilles
    Gordon, Mirta B.
    Lemaire, Benoit
    IEEE INTELLIGENT SYSTEMS, 2007, 22 (04) : 22 - 30
  • [27] ARKAIV: Predicting Data Exfiltration Using Supervised Machine Learning Based on Tactics Mapping From Threat Reports and Event Logs
    Hakim, Arif Rahman
    Ramli, Kalamullah
    Salman, Muhammad
    Pranggono, Bernardi
    Agustina, Esti Rahmawati
    IEEE ACCESS, 2025, 13 : 28381 - 28397
  • [28] Predicting non-verbal intelligence level from resting-state connectivity: Machine learning approach
    Feklicheva, Inna
    Chipeeva, Nadezda
    Zakharov, Ilia
    Ivanov, Sergey
    INTERNATIONAL JOURNAL OF PSYCHOLOGY, 2023, 58 : 785 - 785
  • [29] Towards Goal Based Architecture Design for Learning High-Level Representation of Behaviors from Demonstration
    Fonooni, Benjamin
    Hellstrom, Thomas
    Janlert, Lars-Erik
    2013 IEEE INTERNATIONAL MULTI-DISCIPLINARY CONFERENCE ON COGNITIVE METHODS IN SITUATION AWARENESS AND DECISION SUPPORT (COGSIMA), 2013, : 67 - 74
  • [30] Conceptualization of Rule- and Machine Learning-based High-Level Building Blocks for Design Task Complexity Assessment
    Yinanc, Kutay Can
    Konkol, Kathrin
    Cencic, Maiara Rosa
    ZWF Zeitschrift fuer Wirtschaftlichen Fabrikbetrieb, 2024, 119 (11) : 817 - 821