Stream Distributed Coded Computing

Cited by: 4
Authors
Cohen A. [1]
Thiran G. [2]
Esfahanizadeh H. [1]
Medard M. [1]
Affiliations
[1] Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, MA 02139
[2] ICTEAM, Université Catholique de Louvain, Louvain-la-Neuve
Keywords
Distributed coded computation; in-order execution delay; large matrix-matrix multiplication; large matrix-vector multiplication; queuing theory; stragglers; ultra-reliable low-latency
DOI
10.1109/JSAIT.2021.3102279
Abstract
Emerging large-scale, data-hungry algorithms require computations to be delegated from a central server to several worker nodes. A major challenge in distributed computation is tackling the delays and failures caused by stragglers. To address this challenge, introducing an efficient amount of redundant computation via distributed coded computation has received significant attention. Recent approaches in this area have mainly focused on introducing minimal computational redundancy to tolerate a certain number of stragglers. To the best of our knowledge, the current literature lacks a unified end-to-end design for a heterogeneous setting in which the workers vary in their computation and communication capabilities. The contribution of this paper is a novel framework for joint scheduling-coding in a setting where the workers and the arrival of stream computational jobs are based on stochastic models. In our initial joint scheme, we propose a systematic framework that shows how to select a set of workers and how to split the computational load among the selected workers, based on their differences, in order to minimize the average in-order job execution delay. Through simulations, we demonstrate that our framework performs dramatically better than a naive method that splits the computational load uniformly among the workers, and that its performance is close to the ideal. © 2020 IEEE.
Pages: 1025-1040
Number of pages: 15
Related papers
50 in total
  • [1] Stream Iterative Distributed Coded Computing for Learning Applications in Heterogeneous Systems
    Esfahanizadeh, Homa
    Cohen, Alejandro
    Medard, Muriel
    IEEE CONFERENCE ON COMPUTER COMMUNICATIONS (IEEE INFOCOM 2022), 2022, : 230 - 239
  • [2] Distributed Decoding for Coded Distributed Computing
    Yazdanialahabadi, Arash
    Ardakani, Masoud
    IEEE INTERNET OF THINGS JOURNAL, 2021, 9 (14) : 12555 - 12562
  • [3] Compressed Coded Distributed Computing
    Li, Songze
    Maddah-Ali, Mohammad Ali
    Avestimehr, A. Salman
    2018 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY (ISIT), 2018, : 2032 - 2036
  • [4] Secure Coded Distributed Computing
    Sasi, Shanuja
    Günlü, Onur
    2024 IEEE 25TH INTERNATIONAL WORKSHOP ON SIGNAL PROCESSING ADVANCES IN WIRELESS COMMUNICATIONS, SPAWC 2024, 2024, : 811 - 815
  • [5] Compressed Coded Distributed Computing
    Elkordy, Ahmed Roushdy
    Li, Songze
    Maddah-Ali, Mohammad Ali
    Avestimehr, A. Salman
    IEEE TRANSACTIONS ON COMMUNICATIONS, 2021, 69 (05) : 2773 - 2783
  • [6] Topological Coded Distributed Computing
    Wan, Kai
    Ji, Mingyue
    Caire, Giuseppe
    2020 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM), 2020,
  • [7] On Heterogeneous Coded Distributed Computing
    Kiamari, Mehrdad
    Wang, Chenwei
    Avestimehr, A. Salman
    GLOBECOM 2017 - 2017 IEEE GLOBAL COMMUNICATIONS CONFERENCE, 2017,
  • [8] Weakly Secure Coded Distributed Computing
    Zhao, Ruimin
    Wang, Jin
    Lu, Kejie
    Wang, Jianping
    Wang, Xiumin
    Zhou, Jingya
    Cao, Chunming
    2018 IEEE SMARTWORLD, UBIQUITOUS INTELLIGENCE & COMPUTING, ADVANCED & TRUSTED COMPUTING, SCALABLE COMPUTING & COMMUNICATIONS, CLOUD & BIG DATA COMPUTING, INTERNET OF PEOPLE AND SMART CITY INNOVATION (SMARTWORLD/SCALCOM/UIC/ATC/CBDCOM/IOP/SCI), 2018, : 603 - 610
  • [9] Coded Distributed Computing With Partial Recovery
    Ozfatura, Emre
    Ulukus, Sennur
    Gunduz, Deniz
    IEEE TRANSACTIONS ON INFORMATION THEORY, 2022, 68 (03) : 1945 - 1959
  • [10] Coded Computing for Distributed Graph Analytics
    Prakash, Saurav
    Reisizadeh, Amirhossein
    Pedarsani, Ramtin
    Avestimehr, Salman
    2018 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY (ISIT), 2018, : 1221 - 1225