Dynamic Straggler Mitigation for Large-Scale Spatial Simulations

Authors
Bin Khunayn, Eman [1]
Xie, Hairuo [2]
Karunasekera, Shanika [2]
Ramamohanarao, Kotagiri [3]
Affiliations
[1] King Abdulaziz City Sci & Technol KACST, Riyadh, Saudi Arabia
[2] Univ Melbourne, Melbourne, Australia
[3] Australian Acad Sci, Canberra, Australia
Keywords
Spatial simulation; stragglers; BSP; load balancing; traffic simulation
DOI
10.1145/3578933
Chinese Library Classification (CLC) number
TP7 [Remote sensing technology]
Discipline classification codes
081102; 0816; 081602; 083002; 1404
Abstract
Spatial simulations have been widely used to study real-world environments, such as transportation systems. Applications like prediction and analysis of transportation require the simulation to handle millions of objects while running faster than real time. Running such large-scale simulations requires high computational power, which can be provided through parallel distributed computing. Implementations of parallel distributed spatial simulations usually follow a bulk synchronous parallel (BSP) model to ensure the correctness of the simulation. The processing in BSP is divided into iterations of computation and communication, running on multiple workers, followed by a global barrier synchronisation to ensure that all communications are concluded. Unfortunately, the BSP model is plagued by the straggler problem, where a delay in any worker slows down the entire simulation. Stragglers may occur for many reasons, including imbalanced workload distribution or communication and synchronisation delays. The straggler problem can become more severe with increasing parallelism and continuous changes in workload distribution among workers. This article proposes methods to dynamically mitigate stragglers and tackle communication delays. The proposed strategies can rebalance the workload distribution during simulation. These methods employ the spatial properties of the simulated environments to combine a flexible synchronisation model with decentralised dynamic load balancing and on-demand resource allocation. All proposed methods are implemented and evaluated using a microscopic traffic simulator as an example of large-scale spatial simulations. We run traffic simulations for Melbourne, Beijing and New York with different straggler scenarios. Our methods significantly improve simulation performance compared to advanced methods such as global dynamic load balancing.
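The straggler effect described in the abstract can be illustrated with a toy cost model. The sketch below is not the article's method: it only assumes that one BSP iteration costs the slowest worker's compute time plus a fixed communication cost, and that an idealised rebalancer can spread load perfectly evenly. The function names, load values and communication cost are all hypothetical.

```python
from typing import List


def bsp_step_time(compute_times: List[float], comm_time: float) -> float:
    """Wall-clock time of one BSP iteration under the toy model.

    The global barrier releases only after the slowest worker finishes,
    so the step costs max(compute) plus the communication time.
    """
    return max(compute_times) + comm_time


def rebalance(loads: List[float]) -> List[float]:
    """Idealised rebalancing (an assumption): spread total load evenly."""
    mean = sum(loads) / len(loads)
    return [mean] * len(loads)


comm = 0.5                            # hypothetical communication cost per step
balanced = [1.0, 1.0, 1.0, 1.0]       # hypothetical per-worker compute times
straggler = [1.0, 1.0, 1.0, 3.0]      # one worker carries three times the load

t_balanced = bsp_step_time(balanced, comm)
t_straggler = bsp_step_time(straggler, comm)
t_rebalanced = bsp_step_time(rebalance(straggler), comm)

print(t_balanced, t_straggler, t_rebalanced)
```

Under these assumptions a single overloaded worker stretches every iteration, while redistributing the same total load shortens the step again. Real rebalancing also incurs migration and communication costs that this sketch ignores.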
Pages: 34