A Case for Adaptive Resource Management in Alibaba Datacenter Using Neural Networks

Cited by: 0
Authors
Sa Wang
Yan-Hai Zhu
Shan-Pei Chen
Tian-Ze Wu
Wen-Jie Li
Xu-Sheng Zhan
Hai-Yang Ding
Wei-Song Shi
Yun-Gang Bao
Affiliations
[1] Chinese Academy of Sciences, State Key Laboratory of Computer Architecture, Institute of Computing Technology
[2] University of Chinese Academy of Sciences, Department of Computer Science
[3] Peng Cheng Laboratory
[4] Alibaba Inc.
[5] Wayne State University
Keywords
resource management; neural network; resource efficiency; tail latency
DOI
Not available
Abstract
Resource efficiency and application QoS have long been major concerns for datacenter operators, yet the two goals remain difficult to reconcile. High resource utilization increases the risk of resource contention between co-located workloads, which causes latency-critical (LC) applications to suffer unpredictable and even unacceptable performance. Much prior work has been devoted to effective mechanisms that protect the QoS of LC applications while improving resource efficiency. In this paper, we propose MAGI, a resource management runtime that leverages neural networks to monitor performance interference and pinpoint its root cause, and then adjusts the resource shares of the corresponding applications to ensure the QoS of LC applications. MAGI has been put into practice in the Alibaba datacenter to provide on-demand resource adjustment for applications using neural networks. The experimental results show that MAGI can reduce the performance degradation of an LC application by up to 87.3% when it is co-located with antagonist applications.
Pages: 209-220
Page count: 11