IMORC: An infrastructure and architecture template for implementing high-performance reconfigurable FPGA accelerators

Cited by: 2
Authors
Schumacher, Tobias [1]
Plessl, Christian [1]
Platzner, Marco [1]
Affiliations
[1] Univ Gesamthsch Paderborn, Paderborn Ctr Parallel Comp, D-33098 Paderborn, Germany
Keywords
Reconfigurable computing; kth nearest neighbor technique; FPGA; Framework
DOI
10.1016/j.micpro.2011.04.002
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The design, implementation and optimization of FPGA accelerators is a challenging task, especially when the accelerator comprises multiple compute cores distributed across CPU and FPGA resources and memories and exhibits data-dependent runtime behavior. To simplify the development of FPGA accelerators, we propose IMORC, an infrastructure and architecture template that helps raise the level of abstraction. The IMORC development flow is based on a modeling technique for visualizing an application's communication demand and an architecture template that aids the developer in implementing the design. The architecture template consists of a versatile on-chip interconnect with asynchronous FIFOs and bitwidth conversion placed in the communication links, a performance monitoring infrastructure for collecting performance information at runtime, and a set of generic infrastructure cores that are frequently needed in accelerator designs. We demonstrate the usefulness of the IMORC development flow with a case study of accelerating the kth nearest neighbor thinning problem, where IMORC greatly helps us understand the communication demand and implement the application. With the integrated performance monitoring infrastructure, we gain insights into the data-dependent behavior of the accelerator that help us identify bottlenecks and optimize the accelerator to achieve a speedup of 10x to 40x over an optimized CPU implementation. © 2011 Elsevier B.V. All rights reserved.
Pages: 110-126
Page count: 17