Data Distribution and Scheduling for Distributed Analytics Tasks

Authors
Pasteris, Stephen [1]
Wang, Shiqiang [2]
Makaya, Christian [2]
Chan, Kevin [3]
Herbster, Mark [1]
Affiliations
[1] UCL, Dept Comp Sci, London, England
[2] IBM TJ Watson Res Ctr, Yorktown Hts, NY USA
[3] US Army, Res Lab, Adelphi, MD USA
Keywords
Data placement; Internet of Things (IoT); maximum flow problem; mobile edge computing; optimization; FLOW; ALGORITHM
DOI
Not available
Chinese Library Classification number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
We consider a distributed edge computing system where we have a number of interconnected machines with limited communication bandwidth and storage capacity. Analytics tasks run on the machines, where each task runs on a single machine but may require data from multiple other machines. Every task requires a given amount of data to run, and it needs to receive all its data within a specific deadline. The application scenario is that each machine has limited storage, thus we usually cannot place the entire amount of data for a specific task on a single machine that executes the task. We assume that the task execution is sparse in time, so that at most one task is executed in the system at any time. The problem we study in this paper is how to distribute the data on machines in the system, without violating the bandwidth and storage constraints, while ensuring that the data transfer deadlines are met. We prove that the optimal solution to this problem is equivalent to that of a max-flow problem on a specifically constructed graph. We present how to construct this graph so that the problem can be solved using standard algorithms for max-flow problems, and also provide some numerical results and further discussions.
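As a rough illustration of the max-flow idea in the abstract, the sketch below checks whether a single task can gather its data in time on a toy three-machine network. The graph construction here is an assumption made for illustration and is not the construction proved in the paper: a source feeds each machine with capacity equal to its storage, each link carries at most bandwidth times deadline, and the executing machine drains to a sink with capacity equal to the task's data requirement. The machine names, capacities, and the helpers `max_flow` and `feasible` are all hypothetical.

```python
from collections import defaultdict, deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp: augment along shortest residual paths until none remain."""
    residual = defaultdict(lambda: defaultdict(int))
    for (u, v), c in capacity.items():
        residual[u][v] += c
        residual[v][u] += 0  # ensure the reverse edge exists
    total = 0
    while True:
        # Breadth-first search for an augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total
        # Find the bottleneck capacity along the path.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Push the bottleneck amount along the path.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


def feasible(demand):
    """Toy instance (assumed values): task runs on machine C, deadline 2 s."""
    deadline = 2
    storage = {"A": 4, "B": 3, "C": 2}            # data units per machine
    bandwidth = {("A", "C"): 1, ("B", "C"): 2,    # data units per second
                 ("A", "B"): 1, ("B", "A"): 1}
    capacity = {("s", m): cap for m, cap in storage.items()}
    for link, bw in bandwidth.items():
        capacity[link] = bw * deadline            # data movable before deadline
    capacity[("C", "t")] = demand                 # data the task must receive
    return max_flow(capacity, "s", "t") == demand


if __name__ == "__main__":
    print(feasible(8), feasible(9))
```

In this toy instance the cut around machine C (its own storage plus the two incoming links) limits how much data can reach the task by the deadline, so a placement exists only while the demand stays within that cut. The sketch treats transfers as a fluid flow and ignores store-and-forward delays, which is a simplification.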
Pages: 6