Data-Oriented Runtime Scheduling Framework on Multi-GPUs

Cited: 0
Authors
Li, Tao [1 ,2 ]
Zhao, Kezhao [1 ]
Dong, Qiankun [1 ]
Leng, Jiabing [1 ]
Yang, Yulu [1 ]
Ma, Wenjing [3 ]
Affiliations
[1] Nankai Univ, Coll Comp & Control Engn, Tianjin, Peoples R China
[2] Chinese Acad Sci, Inst Comp Technol, State Key Lab Comp Architecture, Beijing, Peoples R China
[3] Chinese Acad Sci, Inst Software, Lab Parallel Software & Comp Sci, State Key Lab Comp Sci, Beijing, Peoples R China
Source
2016 IEEE TRUSTCOM/BIGDATASE/ISPA, 2016
Funding
National Natural Science Foundation of China (NSFC)
Keywords
GPU; Heterogeneous system; Data-oriented DAG; task scheduling; TASK; FACTORIZATION; SYSTEM;
DOI
10.1109/TrustCom.2016.207
CLC Classification
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
GPU has been generally accepted as an efficient accelerator in the field of high performance computing (HPC). On some heterogeneous systems, multiple GPUs are installed on each computing node, and to complicate matters further, these GPUs may have different architectures. Efficiently scheduling tasks and data on such heterogeneous systems is therefore a challenge. In this paper, we present DoSFoG, a data-oriented runtime scheduling framework for heterogeneous systems equipped with multiple GPUs. In DoSFoG, data blocks, instead of tasks, are taken as the scheduling units. It uses a data-oriented directed acyclic graph (DoDAG) as the representation of an application, which is proved to be equivalent to a task DAG. A runtime scheduling framework is designed on top of the DoDAG. In addition, a hierarchical storage structure is carefully designed around the various levels of memory in the system: page-locked host memory and a soft cache in GPU device memory are used to speed up data transfers. DoSFoG is evaluated with different applications on a system equipped with different GPUs. The results show that DoSFoG achieves high data locality, scalability, load balance, and performance improvement for large data sizes.
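To make the abstract's central idea concrete, the toy Python sketch below takes data blocks, rather than tasks, as the scheduling units and runs a greedy, locality-aware dispatch loop in the spirit of a DoDAG. All names here (DataBlock, Task, build_dodag, schedule) are invented for illustration and are not the paper's API; the paper's actual scheduler, memory hierarchy, and soft cache are not reproduced.

    # Illustrative sketch only: invented names, not the paper's implementation.
    from dataclasses import dataclass, field

    @dataclass(eq=False)                     # identity hash, so blocks can key dicts/sets
    class DataBlock:
        name: str
        producer: object = None              # task that writes this block, if any
        consumers: list = field(default_factory=list)

    @dataclass(eq=False)
    class Task:
        name: str
        inputs: list
        outputs: list

    def build_dodag(tasks):
        # In a DoDAG the data blocks are the nodes; each task induces edges
        # from its input blocks to its output blocks.
        for t in tasks:
            for b in t.outputs:
                b.producer = t
            for b in t.inputs:
                b.consumers.append(t)

    def schedule(tasks, devices, location):
        # Greedy runtime loop: a task is ready once all of its input blocks
        # exist; it runs on the device already holding the most of its
        # inputs (data locality), with ties broken by device order.
        produced = {b for t in tasks for b in t.inputs if b.producer is None}
        pending = list(tasks)
        while pending:
            ready = [t for t in pending if all(b in produced for b in t.inputs)]
            if not ready:
                raise RuntimeError("cycle or missing input block")
            for t in ready:
                dev = max(devices,
                          key=lambda d: sum(location.get(b) == d for b in t.inputs))
                for b in t.inputs + t.outputs:
                    location[b] = dev        # inputs migrate; outputs are created here
                produced.update(t.outputs)
                pending.remove(t)
                print(f"{t.name} -> {dev}")

    # Toy example: two tasks sharing block A.
    A, B, C, D = (DataBlock(n) for n in "ABCD")
    t1 = Task("t1", inputs=[A, B], outputs=[C])
    t2 = Task("t2", inputs=[A, C], outputs=[D])
    build_dodag([t1, t2])
    schedule([t1, t2], devices=["gpu0", "gpu1"], location={A: "gpu0", B: "gpu0"})

In a real multi-GPU runtime, the migration step would be an asynchronous copy staged through page-locked host buffers, which is where the soft cache in device memory described in the abstract would come into play.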
Pages: 1311 - 1318
Page count: 8
Related Papers
50 records in total
  • [21] Data-oriented parsing
    Klein, D
    COMPUTATIONAL LINGUISTICS, 2004, 30 (02) : 240 - 244
  • [22] A Theoretical Approach to the Data-Oriented Scheduling Strategies across Multiple Clouds
    Ma, Yongzheng
    Nan, Kai
    2013 IEEE SIXTH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING (CLOUD 2013), 2013, : 942 - 943
  • [23] Training Deep Nets with Progressive Batch Normalization on Multi-GPUs
    Qin, Lianke
    Gong, Yifan
    Tang, Tianqi
    Wang, Yutian
    Jin, Jiangming
    INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING, 2019, 47 (03) : 373 - 387
  • [24] Adaptive optimization modeling of preconditioned conjugate gradient on Multi-GPUs
    Gao, J.
    Wang, Y.
    Wang, J.
    Liang, R.
    ACM TRANSACTIONS ON PARALLEL COMPUTING, 2016, 3 (03) : 1 - 33
  • [25] ZMCintegral: A package for multi-dimensional Monte Carlo integration on multi-GPUs
    Wu, Hong-Zhong
    Zhang, Jun-Jie
    Pang, Long-Gang
    Wang, Qun
    COMPUTER PHYSICS COMMUNICATIONS, 2020, 248
  • [26] Design of a Data-Oriented GPC
    Guan, Zhe
    Wakitani, Shin
    Yamamoto, Toru
    2013 INTERNATIONAL CONFERENCE ON ADVANCED MECHATRONIC SYSTEMS (ICAMECHS), 2013, : 555 - 558
  • [27] A Study of Graph Analytics for Massive Datasets on Distributed Multi-GPUs
    Jatala, Vishwesh
    Dathathri, Roshan
    Gill, Gurbinder
    Hoang, Loc
    Nandivada, V. Krishna
    Pingali, Keshav
    2020 IEEE 34TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM IPDPS 2020, 2020, : 84 - 94
  • [29] A versatile tomographic forward- and backprojection approach on Multi-GPUs
    Fehringer, Andreas
    Lasser, Tobias
    Zanette, Irene
    Noel, Peter B.
    Pfeiffer, Franz
    MEDICAL IMAGING 2014: IMAGE PROCESSING, 2014, 9034
  • [30] cuFastTucker: A Novel Sparse FastTucker Decomposition For HHLST on Multi-GPUs
    Li, Zixuan
    Hu, Yikun
    Li, Mengquan
    Yang, Wangdong
    Li, Kenli
    ACM TRANSACTIONS ON PARALLEL COMPUTING, 2024, 11 (02)