Speeding up Distributed Request-Response Workflows

Cited by: 74
Authors
Jalaparti, Virajith
Bodik, Peter [1]
Kandula, Srikanth [1]
Menache, Ishai [1]
Rybalkin, Mikhail
Yan, Chenyu [1]
Affiliations
[1] Microsoft Corp, Redmond, WA 98052 USA
Keywords
Interactive services; Tail latency; Optimization; Reissues; Partial results
DOI
10.1145/2534169.2486028
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We found that interactive services at Bing have highly variable datacenter-side processing latencies because their processing consists of many sequential stages, parallelization across tens to thousands of servers, and aggregation of responses across the network. To improve the tail latency of such services, we use a few building blocks: reissuing laggards elsewhere in the cluster, new policies for returning incomplete results, and speeding up laggards by giving them more resources. Combining these building blocks to reduce the overall latency is non-trivial because, for the same amount of resource (e.g., number of reissues), different stages improve their latency by different amounts. We present Kwiken, a framework that takes an end-to-end view of latency improvements and costs. It decomposes the problem of minimizing latency over a general processing DAG into a manageable optimization over individual stages. Through simulations with production traces, we show sizable gains: the 99th percentile of latency improves by over 50% when just 0.1% of the responses are allowed to have partial results, and by over 40% for 25% of the services when just 5% extra resources are used for reissues.
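The reissue building block mentioned in the abstract can be illustrated with a toy simulation. The sketch below is not Kwiken's algorithm and uses no production data: it models a single stage with a made-up latency distribution (`sample_latency`), sends a second copy of any request still outstanding after a hypothetical timeout (`reissue_after`), and takes whichever copy finishes first. All names, the distribution, and the timeout value are assumptions chosen for illustration; deciding how to split a reissue budget across the stages of a DAG, which is the paper's actual contribution, is not shown.

```python
import math
import random


def percentile(values, p):
    """Nearest-rank p-th percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(len(ordered) * p / 100))
    return ordered[rank - 1]


def sample_latency(rng):
    """Hypothetical stage latency in ms: mostly fast, occasionally a laggard."""
    if rng.random() < 0.03:
        return rng.uniform(100.0, 500.0)  # assumed laggard range
    return rng.uniform(5.0, 20.0)         # assumed normal range


def simulate_stage(n, reissue_after, rng):
    """Simulate n requests with and without a timeout-based reissue.

    Returns (baseline latencies, latencies with reissue, fraction reissued).
    """
    baseline, with_reissue, reissued = [], [], 0
    for _ in range(n):
        primary = sample_latency(rng)
        baseline.append(primary)
        if primary > reissue_after:
            # The request is still outstanding at the timeout: send a copy
            # elsewhere and use whichever response arrives first.
            reissued += 1
            second = reissue_after + sample_latency(rng)
            with_reissue.append(min(primary, second))
        else:
            with_reissue.append(primary)
    return baseline, with_reissue, reissued / n


rng = random.Random(0)  # fixed seed so the run is repeatable
baseline, improved, reissue_fraction = simulate_stage(20_000, 20.0, rng)
print("p99 without reissues (ms):", round(percentile(baseline, 99), 1))
print("p99 with reissues (ms):   ", round(percentile(improved, 99), 1))
print("fraction of requests reissued:", round(reissue_fraction, 4))
```

In this toy model the extra load equals the fraction of requests that exceed the timeout, so lowering `reissue_after` trades more duplicate work for a shorter tail. The abstract's point is that this trade-off differs from stage to stage, which is why a fixed per-stage budget is generally not the best use of resources.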
Pages: 219-230
Page count: 12