Strategies for maximizing utilization on multi-CPU and multi-GPU heterogeneous architectures

Authors
Angeles Navarro
Antonio Vilches
Francisco Corbera
Rafael Asenjo
Affiliations
[1] University of Malaga, Department of Computer Architecture
[2] Universidad de Málaga, Andalucía Tech, Department of Computer Architecture
Source
The Journal of Supercomputing, 2014, 70 (2)
Keywords
Heterogeneous computing; Dynamic scheduling; Adaptive partitioning; Task parallelism; Oversubscription; Synchronization
Abstract
This paper explores the possibility of efficiently executing a single application on multicore CPUs and multiple GPU accelerators simultaneously under a parallel task programming paradigm. In particular, we address the challenge of extending a parallel_for template so that it can be exploited on heterogeneous architectures. Because the computing resources are asymmetric, we propose a dynamic scheduling strategy coupled with an adaptive partitioning scheme that resizes chunks to prevent underutilization and load imbalance of CPUs and GPUs. We also address the underutilization of the CPU core on which a host thread runs. To solve it, we propose two approaches: (1) a collaborative host thread strategy, in which the host thread carries out useful chunk processing instead of busy-waiting for the GPU to complete; and (2) a host thread blocking strategy combined with oversubscription, which delegates to the OS the duty of scheduling threads onto available CPU cores to guarantee that all cores are doing useful work. Using two benchmarks, we evaluate the overhead introduced by our scheduling and partitioning algorithms and find that it is negligible. We also evaluate the efficiency of the proposed strategies and find that allowing oversubscription controlled by the OS can be beneficial in certain scenarios.
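The abstract does not give the scheduling algorithm itself, so the following is only an illustrative sketch of the general idea of dynamic scheduling with device-proportional chunk sizes, not the authors' method. It is a deterministic simulation rather than real CPU/GPU code. The function name `simulate`, the `base_chunk` parameter, and the fixed per-device throughputs are all assumptions made for illustration; a real scheduler would measure throughput at run time.

```python
import heapq


def simulate(total_iters, throughputs, base_chunk=1024):
    """Simulate dynamic scheduling of a parallel loop over unequal devices.

    throughputs: dict of device name -> iterations per second (hypothetical,
    assumed known in advance here; a real runtime would measure them).
    The fastest device receives chunks of `base_chunk` iterations; slower
    devices receive proportionally smaller chunks, so that every chunk takes
    roughly the same wall-clock time and no device waits long at the end.

    Returns (assignments, finish), where assignments is a list of
    (device, start, end) half-open ranges and finish maps each device to the
    simulated time at which it completed its last chunk.
    """
    fastest = max(throughputs.values())
    next_iter = 0
    # Priority queue of (time at which the device becomes idle, device name).
    idle = [(0.0, name) for name in sorted(throughputs)]
    heapq.heapify(idle)
    assignments = []
    finish = {name: 0.0 for name in throughputs}

    while next_iter < total_iters:
        # The device that becomes idle first grabs the next chunk.
        now, name = heapq.heappop(idle)
        rate = throughputs[name]
        # Chunk size scaled by this device's speed relative to the fastest.
        size = max(1, round(base_chunk * rate / fastest))
        size = min(size, total_iters - next_iter)
        assignments.append((name, next_iter, next_iter + size))
        next_iter += size
        done = now + size / rate
        finish[name] = done
        heapq.heappush(idle, (done, name))

    return assignments, finish


if __name__ == "__main__":
    devices = {"cpu0": 100.0, "cpu1": 100.0, "gpu0": 800.0}
    chunks, finish = simulate(10_000, devices, base_chunk=800)
    for name in sorted(finish):
        done_iters = sum(end - start for dev, start, end in chunks if dev == name)
        print(name, done_iters, finish[name])
```

In this sketch the work is pulled by whichever device becomes idle first, which is what makes the scheduling dynamic, while the per-device chunk size keeps the slow devices from holding a large chunk when the loop is nearly finished.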
Pages: 756 - 771
Number of pages: 15
Related papers
  • [1] Strategies for maximizing utilization on multi-CPU and multi-GPU heterogeneous architectures
    Navarro, Angeles
    Vilches, Antonio
    Corbera, Francisco
    Asenjo, Rafael
    JOURNAL OF SUPERCOMPUTING, 2014, 70 (02) : 756 - 771
  • [2] Design and analysis of scheduling strategies for multi-CPU and multi-GPU architectures
    Lima, Joao V. F.
    Gautier, Thierry
    Danjean, Vincent
    Raffin, Bruno
    Maillard, Nicolas
    PARALLEL COMPUTING, 2015, 44 : 37 - 52
  • [3] Financial applications on multi-CPU and multi-GPU architectures
    Department of Computer Science and Electronics, Universidad de Cantabria, Santander, Spain
    Unknown
    J Supercomput, 2 (729-739):
  • [4] Financial applications on multi-CPU and multi-GPU architectures
    Castillo, Emilio
    Camarero, Cristobal
    Borrego, Ana
    Luis Bosque, Jose
    JOURNAL OF SUPERCOMPUTING, 2015, 71 (02) : 729 - 739
  • [5] Financial applications on multi-CPU and multi-GPU architectures
    Emilio Castillo
    Cristóbal Camarero
    Ana Borrego
    Jose Luis Bosque
    The Journal of Supercomputing, 2015, 71 : 729 - 739
  • [6] Shot boundary detection using Zernike moments in multi-GPU multi-CPU architectures
    Toharia, Pablo
    Robles, Oscar D.
    Suarez, Ricardo
    Luis Bosque, Jose
    Pastor, Luis
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2012, 72 (09) : 1127 - 1133
  • [7] Multi-CPU/Multi-GPU Based Framework for Multimedia Processing
    Mahmoudi, Sidi Ahmed
    Manneback, Pierre
    COMPUTER SCIENCE AND ITS APPLICATIONS, CIIA 2015, 2015, 456 : 54 - 65
  • [8] HPSM: A Programming Framework for Multi-CPU and Multi-GPU Systems
    Lima, Joao V. F.
    Di Domenico, Daniel
    2017 INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING WORKSHOPS (SBAC-PADW), 2017 : 31 - 36
  • [9] Multi-GPU and Multi-CPU Parallelization for Interactive Physics Simulations
    Hermann, Everton
    Raffin, Bruno
    Faure, Francois
    Gautier, Thierry
    Allard, Jeremie
    EURO-PAR 2010 - PARALLEL PROCESSING, PART II, 2010, 6272 : 235 - 246
  • [10] Parallel Branch-and-Bound in multi-core multi-CPU multi-GPU heterogeneous environments
    Trong-Tuan Vu
    Derbel, Bilel
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2016, 56 : 95 - 109