Online Resource Management in Thermal and Energy Constrained Heterogeneous High Performance Computing

Cited by: 6
Authors
Oxley, Mark A. [1]
Pasricha, Sudeep [2]
Maciejewski, Anthony A. [1]
Siegel, Howard Jay [1, 2]
Burns, Patrick J. [3]
Affiliations
[1] Colorado State Univ, Dept Elect & Comp Engn, Ft Collins, CO 80523 USA
[2] Colorado State Univ, Dept Comp Sci, Ft Collins, CO 80523 USA
[3] Colorado State Univ, Informat Technol, Ft Collins, CO 80523 USA
Keywords
heterogeneous computing; resource management; thermal-aware computing; energy-aware computing; HPC; DVFS; data centers; power
DOI
10.1109/DASC-PICom-DataCom-CyberSciTec.2016.111
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Operators of high-performance computing (HPC) facilities face conflicting trade-offs between the operating temperature of the facility, reliability of compute nodes, energy costs, and computing performance. Intelligent management of the HPC facility typically involves taking a proactive approach by predicting the thermal implications of allocating tasks to different cores around the facility. This offers the benefit of operating the HPC facility at a hotter CRAC temperature while avoiding hotspots. However, such an approach can be a time-consuming process that requires complicated air flow models to be calculated for every mapping decision. We propose a framework in which offline analysis is used to assist an online resource manager by predicting the thermal implications of mapping a given workload. The goal is to maximize the reward earned from completing tasks by their individual deadlines throughout the day, while adhering to a daily energy budget and temperature threshold constraints. We show that our proposed techniques can earn significantly greater reward than traditional load balancing and thermal management schemes.
Pages: 604-611
Page count: 8