Stream Iterative Distributed Coded Computing for Learning Applications in Heterogeneous Systems

Cited by: 5
Authors
Esfahanizadeh, Homa [1]
Cohen, Alejandro [2]
Medard, Muriel [1]
Affiliations
[1] MIT, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[2] Technion Israel Inst Technol, Haifa, Israel
Keywords
distributed systems; coded computation; heterogeneous; straggler; scheduling
DOI
10.1109/INFOCOM48880.2022.9796977
CLC number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
To improve the utility of learning applications and render machine learning solutions feasible for complex applications, a substantial amount of heavy computation is needed. Thus, it is essential to delegate the computations among several workers, which brings up the major challenge of coping with delays and failures caused by the system's heterogeneity and uncertainties. In particular, minimizing the end-to-end in-order execution delay of a job, from arrival to delivery, is of great importance for real-world delay-sensitive applications. In this paper, for the computation of each job iteration in a stochastic heterogeneous distributed system where the workers vary in their computing and communicating powers, we present a novel joint scheduling-coding framework that optimally splits the coded computational load among the workers. This closes the gap between the workers' response times and is critical to maximizing resource utilization. To further reduce the in-order execution delay, we also incorporate redundant computations in each iteration of a distributed computational job. Our simulation results demonstrate that the delay obtained using the proposed solution is dramatically lower than that of the uniform split, which is oblivious to the system's heterogeneity, and, in fact, is very close to an ideal lower bound just by introducing a small percentage of redundant computations.
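As a rough illustration of why a heterogeneity-aware split can close the gap between workers' response times, the Python sketch below compares a uniform split with a split that equalizes finish times. It rests on a simplified deterministic model that is an assumption made here, not the paper's stochastic framework: each worker has a fixed processing rate and a fixed communication delay, and no coding or redundancy is modeled. The function names `uniform_split`, `balanced_split`, and `finish_time` are hypothetical.

```python
from typing import List, Tuple


def uniform_split(total_load: float, n_workers: int) -> List[float]:
    """Give every worker the same share, ignoring heterogeneity."""
    return [total_load / n_workers] * n_workers


def balanced_split(
    total_load: float, rates: List[float], delays: List[float]
) -> Tuple[List[float], float]:
    """Split the load so that all workers that receive work finish together.

    Assumed model (illustrative only): worker i finishes a load l_i at time
    delays[i] + l_i / rates[i]. Workers whose fixed delay already exceeds
    the common finish time receive no load.
    """
    order = sorted(range(len(rates)), key=lambda i: delays[i])
    rate_sum = 0.0
    weighted_delay = 0.0
    finish = 0.0
    for pos, i in enumerate(order):
        rate_sum += rates[i]
        weighted_delay += rates[i] * delays[i]
        # Common finish time if only the workers seen so far are used.
        finish = (total_load + weighted_delay) / rate_sum
        is_last = pos + 1 == len(order)
        if is_last or finish <= delays[order[pos + 1]]:
            break
    loads = [max(0.0, r * (finish - d)) for r, d in zip(rates, delays)]
    return loads, finish


def finish_time(loads: List[float], rates: List[float], delays: List[float]) -> float:
    """Time at which the slowest worker with nonzero load completes."""
    return max(d + l / r for l, r, d in zip(loads, rates, delays) if l > 0)


if __name__ == "__main__":
    rates = [4.0, 2.0, 1.0]    # hypothetical units of work per second
    delays = [0.5, 0.5, 1.0]   # hypothetical communication delays in seconds
    total = 70.0
    loads, t_balanced = balanced_split(total, rates, delays)
    t_uniform = finish_time(uniform_split(total, len(rates)), rates, delays)
    print("balanced loads:", loads)
    print("balanced finish:", t_balanced, "uniform finish:", t_uniform)
```

In this toy model the uniform split is limited by the slowest worker, while the balanced split has every active worker finish at the same instant. The redundancy described in the abstract targets random delays and failures, which a deterministic sketch like this one cannot capture.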
Pages: 230-239
Page count: 10
Related papers
50 in total
  • [1] Stream Distributed Coded Computing
    Cohen A.
    Thiran G.
    Esfahanizadeh H.
    Medard M.
    IEEE Journal on Selected Areas in Information Theory, 2021, 2(3): 1025-1040
  • [2] On Heterogeneous Coded Distributed Computing
    Kiamari, Mehrdad
    Wang, Chenwei
    Avestimehr, A. Salman
    GLOBECOM 2017 - 2017 IEEE GLOBAL COMMUNICATIONS CONFERENCE, 2017
  • [3] Learning Auction in Coded Distributed Computing with Heterogeneous User Demands
    Liang, Jiawei
    Li, Juan
    Zhu, Kun
    Yi, Changyan
    2022 27TH IEEE SYMPOSIUM ON COMPUTERS AND COMMUNICATIONS (IEEE ISCC 2022), 2022
  • [4] On Batch-Processing Based Coded Computing for Heterogeneous Distributed Computing Systems
    Wang, Baoqian
    Xie, Junfei
    Lu, Kejie
    Wan, Yan
    Fu, Shengli
    IEEE TRANSACTIONS ON NETWORK SCIENCE AND ENGINEERING, 2021, 8(3): 2438-2454
  • [5] Computing Resource Allocation for Heterogeneous Coded Distributed Computing
    Dai, Mingjun
    Yuan, Jialong
    Tong, Yanli
    Wang, Lan
    Lin, Xiaohui
    2022 31ST WIRELESS AND OPTICAL COMMUNICATIONS CONFERENCE (WOCC), 2022: 18-23
  • [6] Coded Distributed Computing with Heterogeneous Function Assignments
    Woolsey, Nicholas
    Chen, Rong-Rong
    Ji, Mingyue
    ICC 2020 - 2020 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS (ICC), 2020
  • [7] Cascaded Coded Distributed Computing on Heterogeneous Networks
    Woolsey, Nicholas
    Chen, Rong-Rong
    Ji, Mingyue
    2019 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY (ISIT), 2019: 2644-2648
  • [8] DEVELOPMENT OF THE DISTRIBUTED COMPUTING SYSTEMS AND RUNNING APPLICATIONS IN THE HETEROGENEOUS COMPUTING ENVIRONMENT
    Bogdanov, A. V.
    Lazarev, A.
    Tun, Myo Tun
    Htut, La Min
    DISTRIBUTED COMPUTING AND GRID-TECHNOLOGIES IN SCIENCE AND EDUCATION, 2010: 69-74
  • [9] Coded Distributed Computing With Predictive Heterogeneous User Demands: A Learning Auction Approach
    Zhu, Kun
    Liang, Jiawei
    Li, Juan
    Yi, Changyan
    IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, 2022, 40(8): 2426-2439
  • [10] A New Combinatorial Coded Design for Heterogeneous Distributed Computing
    Woolsey, Nicholas
    Chen, Rong-Rong
    Ji, Mingyue
    IEEE TRANSACTIONS ON COMMUNICATIONS, 2021, 69(9): 5672-5685