A Case for Adaptive Resource Management in Alibaba Datacenter Using Neural Networks

Cited by: 0
Authors
Sa Wang
Yan-Hai Zhu
Shan-Pei Chen
Tian-Ze Wu
Wen-Jie Li
Xu-Sheng Zhan
Hai-Yang Ding
Wei-Song Shi
Yun-Gang Bao
Affiliations
[1] Chinese Academy of Sciences, State Key Laboratory of Computer Architecture, Institute of Computing Technology
[2] University of Chinese Academy of Sciences, Department of Computer Science
[3] Peng Cheng Laboratory
[4] Alibaba Inc.
[5] Wayne State University
Keywords
resource management; neural network; resource efficiency; tail latency
DOI
Not available
Abstract
Resource efficiency and application QoS have long been major concerns of datacenter operators, yet they remain irreconcilable. High resource utilization increases the risk of resource contention between co-located workloads, which causes latency-critical (LC) applications to suffer unpredictable, and even unacceptable, performance. Much prior work has been devoted to finding effective mechanisms that protect the QoS of LC applications while improving resource efficiency. In this paper, we propose MAGI, a resource management runtime that leverages neural networks to monitor performance interference and pinpoint its root cause, and then adjusts the resource shares of the corresponding applications to ensure the QoS of LC applications. MAGI is a practical effort in the Alibaba datacenter to provide on-demand resource adjustment for applications using neural networks. The experimental results show that MAGI can reduce the performance degradation of an LC application by up to 87.3% when it is co-located with antagonist applications.
Pages: 209 - 220
Page count: 11