A Supervised Machine Learning Based Approach for Automatically Extracting High-Level Threat Intelligence from Unstructured Sources

Cited by: 31
Authors
Ghazi, Yumna [1]
Anwar, Zahid [1]
Mumtaz, Rafia [1]
Saleem, Shahzad [1]
Tahir, Ali [1]
Affiliations
[1] NUST, SEECS, Dept Comp, Islamabad, Pakistan
Keywords
Cyber Threat Intelligence; Natural Language Processing; Tactics, Techniques and Procedures (TTPs); STIX; Indicators of Compromise
DOI
10.1109/FIT.2018.00030
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The last few years have seen a radical shift in the cyber defense paradigm from reactive to proactive, and this change is marked by the steadily increasing trend of Cyber Threat Intelligence (CTI) sharing. Currently, there are numerous Open Source Intelligence (OSINT) sources providing periodically updated threat feeds that are fed into various analytical solutions. At this point, there is an excessive amount of data being produced from such sources, both structured (STIX, OpenIOC, etc.) as well as unstructured (blacklists, etc.). However, more often than not, the level of detail required for making informed security decisions is missing from threat feeds, since most indicators are atomic in nature, like IPs and hashes, which are usually rather volatile. These feeds distinctly lack strategic threat information, like attack patterns and techniques that truly represent the behavior of an attacker or an exploit. Moreover, there is a lot of duplication in threat information and no single place where one could explore the entirety of a threat, hence requiring hundreds of man hours for sifting through numerous sources - trying to discern signal from noise - to find all the credible information on a threat. We have made use of natural language processing to extract threat feeds from unstructured cyber threat information sources with approximately 70% precision, providing comprehensive threat reports in standards like STIX, which is a widely accepted industry standard that represents CTI. The automation of an otherwise tedious manual task would ensure the timely gathering and sharing of relevant CTI that would give organizations the edge to be able to proactively defend against known as well as unknown threats.
Pages: 129 - 134
Page count: 6