Data-driven Curation, Learning and Analysis for Inferring Evolving IoT Botnets in the Wild

Cited by: 3
Authors
Pour, Morteza Safaei [1]
Mangino, Antonio [1]
Friday, Kurt [1]
Rathbun, Matthias [1]
Bou-Harb, Elias [1]
Iqbal, Farkhund [2]
Shaban, Khaled [3]
Erradi, Abdelkarim [3]
Affiliations
[1] Florida Atlantic Univ, Cyber Threat Intelligence Lab, Boca Raton, FL 33431 USA
[2] Zayed Univ, Coll Technol Innovat, Dubai, U Arab Emirates
[3] Qatar Univ, Dept Comp Sci & Engn, Doha, Qatar
Funding
U.S. National Science Foundation
Keywords
Internet-of-Things; IoT botnets; network security; network telescopes; Internet measurements; deep learning; INTERNET;
DOI
10.1145/3339252.3339272
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The insecurity of the Internet-of-Things (IoT) paradigm continues to wreak havoc in consumer and critical infrastructure realms. Several challenges impede addressing IoT security at large, including the lack of IoT-centric data that can be collected, analyzed and correlated, due to the highly heterogeneous nature of such devices and their widespread deployments in Internet-wide environments. To this end, this paper explores macroscopic, passive empirical data to shed light on this evolving threat phenomenon. This not only aims at classifying and inferring Internet-scale compromised IoT devices by solely observing such one-way network traffic, but also endeavors to uncover, track and report on orchestrated "in the wild" IoT botnets. Initially, to prepare for the effective utilization of such data, a novel probabilistic model is designed and developed to cleanse such traffic from noise samples (i.e., misconfiguration traffic). Subsequently, several shallow and deep learning models are evaluated to ultimately design and develop a multi-window convolutional neural network trained on active and passive measurements to accurately identify compromised IoT devices. Consequently, to infer orchestrated and unsolicited activities that have been generated by well-coordinated IoT botnets, hierarchical agglomerative clustering is deployed by scrutinizing a set of innovative and efficient network feature sets. By analyzing 3.6 TB of recent darknet traffic, the proposed approach uncovers a momentous 440,000 compromised IoT devices and generates evidence-based artifacts related to 350 IoT botnets. While some of these detected botnets refer to previously documented campaigns such as Hide and Seek, Hajime and Fbot, other events illustrate evolving threats such as those with cryptojacking capabilities and those that are targeting industrial control system communication and control services.
Pages: 10