Data-driven Curation, Learning and Analysis for Inferring Evolving IoT Botnets in the Wild

Cited by: 3
Authors
Pour, Morteza Safaei [1]
Mangino, Antonio [1]
Friday, Kurt [1]
Rathbun, Matthias [1]
Bou-Harb, Elias [1]
Iqbal, Farkhund [2]
Shaban, Khaled [3]
Erradi, Abdelkarim [3]
Affiliations
[1] Florida Atlantic Univ, Cyber Threat Intelligence Lab, Boca Raton, FL 33431 USA
[2] Zayed Univ, Coll Technol Innovat, Dubai, U Arab Emirates
[3] Qatar Univ, Dept Comp Sci & Engn, Doha, Qatar
Funding
U.S. National Science Foundation
Keywords
Internet-of-Things; IoT botnets; network security; network telescopes; Internet measurements; deep learning; INTERNET;
DOI
10.1145/3339252.3339272
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
The insecurity of the Internet-of-Things (IoT) paradigm continues to wreak havoc in consumer and critical infrastructure realms. Several challenges impede addressing IoT security at large, including the lack of IoT-centric data that can be collected, analyzed and correlated, due to the highly heterogeneous nature of such devices and their widespread deployments in Internet-wide environments. To this end, this paper explores macroscopic, passive empirical data to shed light on this evolving threat phenomenon. This not only aims at classifying and inferring Internet-scale compromised IoT devices by solely observing such one-way network traffic, but also endeavors to uncover, track and report on orchestrated "in the wild" IoT botnets. Initially, to prepare for the effective utilization of such data, a novel probabilistic model is designed and developed to cleanse such traffic from noise samples (i.e., misconfiguration traffic). Subsequently, several shallow and deep learning models are evaluated to ultimately design and develop a multi-window convolutional neural network trained on active and passive measurements to accurately identify compromised IoT devices. Consequently, to infer orchestrated and unsolicited activities that have been generated by well-coordinated IoT botnets, hierarchical agglomerative clustering is deployed by scrutinizing a set of innovative and efficient network feature sets. By analyzing 3.6 TB of recent darknet traffic, the proposed approach uncovers a momentous 440,000 compromised IoT devices and generates evidence-based artifacts related to 350 IoT botnets. While some of these detected botnets refer to previously documented campaigns such as Hide and Seek, Hajime and Fbot, other events illustrate evolving threats such as those with cryptojacking capabilities and those that are targeting industrial control system communication and control services.
Pages: 10
Related papers
50 in total
  • [31] Data-Driven ESP Vocabulary Learning
    Liu, Ping
    2016 2ND INTERNATIONAL CONFERENCE ON MODERN EDUCATION AND SOCIAL SCIENCE (MESS 2016), 2016, : 219 - 225
  • [32] DATA-DRIVEN LEARNING OF NONAUTONOMOUS SYSTEMS
    Qin, Tong
    Chen, Zhen
    Jakeman, John D.
    Xiu, Dongbin
    SIAM JOURNAL ON SCIENTIFIC COMPUTING, 2021, 43 (03): : A1607 - A1624
  • [33] Data-driven approach for ontology learning
    Ocampo-Guzman, Isidra
    Lopez-Arevalo, Ivan
    Sosa-Sosa, Victor
    2009 6TH INTERNATIONAL CONFERENCE ON ELECTRICAL ENGINEERING, COMPUTING SCIENCE AND AUTOMATION CONTROL (CCE 2009), 2009, : 463 - 468
  • [34] A novel data-driven approach on inferring loop invariants for C programs
    Lu, Hong
    Wang, Huitao
    Gui, Jiacheng
    Chen, Panfeng
    Huang, Hao
    JOURNAL OF COMPUTER LANGUAGES, 2022, 71
  • [35] Data-Driven Detection of Phase Changes in Evolving Distribution Systems
    Pena, Bethany D.
    Blakely, Logan
    Reno, Matthew J.
    2022 IEEE TEXAS POWER AND ENERGY CONFERENCE (TPEC), 2021, : 213 - 218
  • [36] Data-driven resolvent analysis
    Herrmann, Benjamin
    Baddoo, Peter J.
    Semaan, Richard
    Brunton, Steven L.
    McKeon, Beverley J.
    JOURNAL OF FLUID MECHANICS, 2021, 918
  • [37] Data-driven analysis of speech
    Hermansky, H
    TEXT, SPEECH AND DIALOGUE, 1999, 1692 : 10 - 18
  • [38] Multiple sensor fault diagnosis by evolving data-driven approach
    El-Koujok, M.
    Benammar, M.
    Meskin, N.
    Al-Naemi, M.
    Langari, R.
    INFORMATION SCIENCES, 2014, 259 : 346 - 358
  • [39] A METHOD FOR DATA-DRIVEN SIMULATIONS OF EVOLVING SOLAR ACTIVE REGIONS
    Cheung, Mark C. M.
    DeRosa, Marc L.
    ASTROPHYSICAL JOURNAL, 2012, 757 (02):
  • [40] Dynamic Data-Driven Avionics Systems: Inferring Failure Modes from Data Streams
    Imai, Shigeru
    Galli, Alessandro
    Varela, Carlos A.
    INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE, ICCS 2015 COMPUTATIONAL SCIENCE AT THE GATES OF NATURE, 2015, 51 : 1665 - 1674