Speeding up Distributed Request-Response Workflows

Cited by: 74
Authors
Jalaparti, Virajith
Bodik, Peter [1]
Kandula, Srikanth [1]
Menache, Ishai [1]
Rybalkin, Mikhail
Yan, Chenyu [1]
Affiliations
[1] Microsoft Corp, Redmond, WA 98052 USA
Keywords
Interactive services; Tail latency; Optimization; Reissues; Partial results;
DOI
10.1145/2534169.2486028
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
We found that interactive services at Bing have highly variable datacenter-side processing latencies because their processing consists of many sequential stages, parallelization across 10s-1000s of servers and aggregation of responses across the network. To improve the tail latency of such services, we use a few building blocks: reissuing laggards elsewhere in the cluster, new policies to return incomplete results and speeding up laggards by giving them more resources. Combining these building blocks to reduce the overall latency is non-trivial because for the same amount of resource (e.g., number of reissues), different stages improve their latency by different amounts. We present Kwiken, a framework that takes an end-to-end view of latency improvements and costs. It decomposes the problem of minimizing latency over a general processing DAG into a manageable optimization over individual stages. Through simulations with production traces, we show sizable gains; the 99th percentile of latency improves by over 50% when just 0.1% of the responses are allowed to have partial results and by over 40% for 25% of the services when just 5% extra resources are used for reissues.
Pages: 219 - 230
Page count: 12
Related papers
50 in total
  • [31] Speeding up Scientific Imaging Workflows: Design of Automated Image Annotation Tool
    Colbry, Dirk
    Dyer, Fred
    Dworkin, Ian
    Wang, Yang
    Wang, Lifeng
    2013 1ST IEEE WORKSHOP ON USER-CENTERED COMPUTER VISION (UCCV), 2013, : 13 - 18
  • [32] Speeding Up Distributed Machine Learning Using Codes
    Lee, Kangwook
    Lam, Maximilian
    Pedarsani, Ramtin
    Papailiopoulos, Dimitris
    Ramchandran, Kannan
    2016 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY, 2016, : 1143 - 1147
  • [33] Leveraging Coding Techniques for Speeding up Distributed Computing
    Konstantinidis, Konstantinos
    Ramamoorthy, Aditya
    2018 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM), 2018
  • [34] Speeding Up Distributed Machine Learning Using Codes
    Lee, Kangwook
    Lam, Maximilian
    Pedarsani, Ramtin
    Papailiopoulos, Dimitris
    Ramchandran, Kannan
    IEEE TRANSACTIONS ON INFORMATION THEORY, 2018, 64 (03) : 1514 - 1529
  • [35] Speeding up Distributed Low-rank Matrix Factorization
    Qin, Chengjie
    Rusu, Florin
    2013 INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND BIG DATA (CLOUDCOM-ASIA), 2013, : 521 - 528
  • [36] REPLICATION TECHNIQUES FOR SPEEDING UP PARALLEL APPLICATIONS ON DISTRIBUTED SYSTEMS
    BAL, HE
    KAASHOEK, MF
    TANENBAUM, AS
    JANSEN, J
    CONCURRENCY-PRACTICE AND EXPERIENCE, 1992, 4 (05) : 337 - 355
  • [37] A Just-in-Time Networking Framework for Minimizing Request-Response Latency of Wireless Time-Sensitive Applications
    Zhang, Lihao
    Liew, Soung Chang
    Chen, He
    IEEE INTERNET OF THINGS JOURNAL, 2023, 10 (08) : 7126 - 7142
  • [38] Time-discretization for speeding-up scheduling of deadline-constrained workflows in clouds
    Genez, Thiago A. L.
    Bittencourt, Luiz F.
    Madeira, Edmundo R. M.
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2020, 107 : 1116 - 1129
  • [39] Leveraging Coding Techniques and Redundancy for Speeding Up Distributed Computing and Robustifying Distributed Learning
    Konstantinidis, Konstantinos
    ProQuest Dissertations and Theses Global, 2022
  • [40] Prophet: Speeding up Distributed DNN Training with Predictable Communication Scheduling
    Zhang, Zhenwei
    Qi, Qiang
    Shang, Ruitao
    Chen, Li
    Xu, Fei
    50TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING, 2021