Data-driven Curation, Learning and Analysis for Inferring Evolving IoT Botnets in the Wild

Cited by: 3
Authors
Pour, Morteza Safaei [1]
Mangino, Antonio [1]
Friday, Kurt [1]
Rathbun, Matthias [1]
Bou-Harb, Elias [1]
Iqbal, Farkhund [2]
Shaban, Khaled [3]
Erradi, Abdelkarim [3]
Affiliations
[1] Florida Atlantic Univ, Cyber Threat Intelligence Lab, Boca Raton, FL 33431 USA
[2] Zayed Univ, Coll Technol Innovat, Dubai, U Arab Emirates
[3] Qatar Univ, Dept Comp Sci & Engn, Doha, Qatar
Funding
National Science Foundation (USA)
Keywords
Internet-of-Things; IoT botnets; network security; network telescopes; Internet measurements; deep learning; INTERNET;
DOI
10.1145/3339252.3339272
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The insecurity of the Internet-of-Things (IoT) paradigm continues to wreak havoc in consumer and critical infrastructure realms. Several challenges impede addressing IoT security at large, including the lack of IoT-centric data that can be collected, analyzed and correlated, due to the highly heterogeneous nature of such devices and their widespread deployments in Internet-wide environments. To this end, this paper explores macroscopic, passive empirical data to shed light on this evolving threat phenomenon. This not only aims at classifying and inferring Internet-scale compromised IoT devices by solely observing such one-way network traffic, but also endeavors to uncover, track and report on orchestrated "in the wild" IoT botnets. Initially, to prepare for the effective utilization of such data, a novel probabilistic model is designed and developed to cleanse such traffic of noise samples (i.e., misconfiguration traffic). Subsequently, several shallow and deep learning models are evaluated to ultimately design and develop a multi-window convolutional neural network trained on active and passive measurements to accurately identify compromised IoT devices. Consequently, to infer orchestrated and unsolicited activities that have been generated by well-coordinated IoT botnets, hierarchical agglomerative clustering is deployed by scrutinizing a set of innovative and efficient network feature sets. By analyzing 3.6 TB of recent darknet traffic, the proposed approach uncovers a momentous 440,000 compromised IoT devices and generates evidence-based artifacts related to 350 IoT botnets. While some of these detected botnets refer to previously documented campaigns such as Hide and Seek, Hajime and Fbot, other events illustrate evolving threats such as those with cryptojacking capabilities and those that are targeting industrial control system communication and control services.
Pages: 10