Parallel multiprocessing and scheduling on the heterogeneous Xeon+FPGA platform

Authors
Andrés Rodríguez
Angeles Navarro
Rafael Asenjo
Francisco Corbera
Rubén Gran
Darío Suárez
Jose Nunez-Yanez
Affiliations
[1] Universidad de Málaga, Department of Computer Architecture
[2] Andalucía Tech, Computer Architecture Group
[3] Universidad de Zaragoza, Department of Electrical and Electronic Engineering
[4] University of Bristol
Keywords
Heterogeneous architecture; FPGA; Parallel_for template; Heterogeneous scheduling; Hybrid algorithm; Adaptive chunk size;
DOI
Not available
Abstract
Heterogeneous computing that exploits simultaneous co-processing with different device types has been shown to be effective at both increasing performance and reducing energy consumption. In this paper, we extend a scheduling framework, encapsulated in a high-level C++ template and previously developed for heterogeneous chips comprising CPU and GPU cores, to new high-performance platforms for the data center that include a cache-coherent FPGA fabric and many-core CPU resources. Our goal is to evaluate the suitability of our framework for these new FPGA-based platforms, identifying performance benefits and limitations. We target the state-of-the-art HARP processor, which includes 14 high-end Xeon-class cores tightly coupled to an FPGA device located in the same package. We select eight benchmarks from the high-performance computing domain that have been ported and optimized for this heterogeneous platform. The results show that a dynamic and adaptive scheduler that exploits simultaneous processing among the devices can improve performance by up to a factor of 8× compared to the best alternative solutions that use only the CPU cores or only the FPGA fabric. Moreover, our proposal achieves improvements of up to 15% and 37% compared to the best heterogeneous solutions found with dynamic and static schedulers, respectively.
Pages: 4645-4665
Number of pages: 20