Evaluation of Load Prediction Techniques for Distributed Stream Processing

Cited by: 4
Authors
Gontarska, Kordian [1, 2]
Geldenhuys, Morgan [2]
Scheinert, Dominik [2]
Wiesner, Philipp [2]
Polze, Andreas [1]
Thamsen, Lauritz [2]
Affiliations
[1] Univ Potsdam, Hasso Plattner Inst, Potsdam, Germany
[2] Tech Univ Berlin, Berlin, Germany
Source
2021 IEEE International Conference on Cloud Engineering (IC2E 2021), 2021
Keywords
Distributed Stream Processing; Resource Management and Optimization; Load Prediction; Time Series Forecasting; Machine Learning
DOI
10.1109/IC2E52221.2021.00023
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Distributed Stream Processing (DSP) systems enable processing large streams of continuous data to produce results in near real time. They are an essential part of many data-intensive applications and analytics platforms. The rate at which events arrive at DSP systems can vary considerably over time, which may be due to trends as well as cyclic and seasonal patterns within the data streams. A priori knowledge of incoming workloads enables proactive approaches to resource management and optimization tasks such as dynamic scaling, live migration of resources, and the tuning of configuration parameters at runtime, potentially leading to a better Quality of Service. In this paper, we conduct a comprehensive evaluation of different load prediction techniques for DSP jobs. We identify three use cases and formulate requirements for making load predictions specific to DSP jobs. Automatically optimized classical and Deep Learning methods are evaluated on nine different datasets from typical DSP domains, i.e., the IoT, Web 2.0, and cluster monitoring. We compare model performance with respect to overall accuracy and training duration. Our results show that the Deep Learning methods provide the most accurate load predictions for the majority of the evaluated datasets.
Pages: 91-98
Page count: 8