OSCAR API for Real-Time Low-Power Multicores and Its Performance on Multicores and SMP Servers

Cited by: 16
Authors
Kimura, Keiji [1]
Mase, Masayoshi [1]
Mikami, Hiroki [1]
Miyamoto, Takamichi [1]
Shirako, Jun [1]
Kasahara, Hironori [1]
Affiliations
[1] Waseda Univ, Dept Comp Sci & Engn, Shinjuku Ku, Tokyo, Japan
Keywords
Multicore API; Parallelizing Compiler; Power Reduction;
DOI
10.1007/978-3-642-13374-9_13
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
The OSCAR (Optimally Scheduled Advanced Multiprocessor) API has been designed for real-time embedded low-power multicores so that the OSCAR parallelizing compiler can generate parallel programs for multicores from different vendors. The OSCAR API was developed by Waseda University in collaboration with Fujitsu Laboratory, Hitachi, NEC, Panasonic, Renesas Technology, and Toshiba in a METI/NEDO project entitled "Multicore Technology for Realtime Consumer Electronics." Using the OSCAR API as an interface between the OSCAR compiler and backend compilers, the OSCAR compiler enables hierarchical multigrain parallel processing with memory optimization under capacity restrictions for cache memory, local memory, distributed shared memory, and on-chip/off-chip shared memory; data transfer using a DMA controller; and power reduction control using DVFS (Dynamic Voltage and Frequency Scaling), clock gating, and power gating for various embedded multicores. In addition, a parallelized program automatically generated by the OSCAR compiler with the OSCAR API can be compiled by ordinary OpenMP compilers, since the OSCAR API is designed as a subset of OpenMP. This paper describes the OSCAR API and its compatibility with the OSCAR compiler by showing code examples. Performance evaluations of the OSCAR compiler and the OSCAR API are carried out using an IBM Power5+ workstation, an IBM Power6 high-end SMP server, and RP2, a newly developed consumer electronics multicore chip by Renesas, Hitachi, and Waseda. The scalability evaluation shows that, on average, the OSCAR compiler with the OSCAR API achieves a 5.8 times speedup over sequential execution on the Power5+ workstation with eight cores and a 2.9 times speedup on RP2 with four cores. In addition, the OSCAR compiler can accelerate an IBM XL Fortran compiler by up to 3.3 times on the Power6 SMP server. Owing to low-power optimization on RP2, the OSCAR compiler with the OSCAR API achieves a maximum power reduction of 84% in the real-time execution mode.
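The average speedups reported in the abstract can be put on a common scale by dividing each by its core count. The sketch below does only that arithmetic; the helper name `parallel_efficiency` and the "speedup divided by cores" metric are illustrative assumptions and are not taken from the paper.

```python
def parallel_efficiency(speedup: float, cores: int) -> float:
    """Return speedup divided by core count (1.0 would be ideal linear scaling).

    Hypothetical helper for illustration; not part of the OSCAR API.
    """
    if cores <= 0:
        raise ValueError("cores must be positive")
    return speedup / cores


# Average speedups and core counts as stated in the abstract.
reported = {
    "Power5+ workstation": (5.8, 8),
    "RP2": (2.9, 4),
}

for platform, (speedup, cores) in reported.items():
    eff = parallel_efficiency(speedup, cores)
    print(f"{platform}: {speedup}x on {cores} cores, efficiency {eff:.3f}")
```

Under this metric both platforms come out at roughly 0.725. The 3.3 times figure for the Power6 server is left out because the abstract states it as a maximum relative to the IBM XL Fortran compiler, not as an average speedup over sequential execution.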
Pages: 188-202
Number of pages: 15
Related papers
50 in total
  • [1] Parallelizing Compiler Framework and API for Power Reduction and Software Productivity of Real-Time Heterogeneous Multicores
    Hayashi, Akihiro
    Wada, Yasutaka
    Watanabe, Takeshi
    Sekiguchi, Takeshi
    Mase, Masayoshi
    Shirako, Jun
    Kimura, Keiji
    Kasahara, Hironori
    LANGUAGES AND COMPILERS FOR PARALLEL COMPUTING, 2011, 6548 : 184 - 198
  • [2] On the Benefits of Multicores for Real-Time Systems
    Saidi, Selma
    2017 EUROMICRO CONFERENCE ON DIGITAL SYSTEM DESIGN (DSD), 2017, : 383 - 389
  • [3] OSCAR Parallelizing and Power Reducing Compiler and API for Heterogeneous Multicores (Invited Paper)
    Kasahara, Hironori
    Kimura, Keiji
    Kitamura, Toshiaki
    Mikami, Hiroki
    Morita, Kazutaka
    Fujita, Kazuki
    Yamamoto, Kazuki
    Kawasumi, Tohma
    PROCEEDINGS OF PEHC 2021: WORKSHOP ON PROGRAMMING ENVIRONMENTS FOR HETEROGENEOUS COMPUTING, 2021, : 10 - 19
  • [4] FOS: a low-power cache organization for multicores
    José Puche
    Salvador Petit
    Julio Sahuquillo
    María Engracia Gómez
    The Journal of Supercomputing, 2019, 75 : 6542 - 6573
  • [5] FOS: a low-power cache organization for multicores
    Puche, Jose
    Petit, Salvador
    Sahuquillo, Julio
    Engracia Gomez, Maria
    JOURNAL OF SUPERCOMPUTING, 2019, 75 (10): : 6542 - 6573
  • [6] A real-time capable coherent data cache for multicores
    Pyka, Arthur
    Rohde, Mathias
    Uhrig, Sascha
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2014, 26 (06): : 1342 - 1354
  • [7] The Shift to Multicores in Real-Time and Safety-Critical Systems
    Saidi, Selma
    Ernst, Rolf
    Uhrig, Sascha
    Theiling, Henrik
    de Dinechin, Benoit Dupont
    2015 INTERNATIONAL CONFERENCE ON HARDWARE/SOFTWARE CODESIGN AND SYSTEM SYNTHESIS (CODES+ISSS), 2015, : 220 - 229
  • [8] Deconstructing Bus Access Control Policies for Real-Time Multicores
    Jalle, Javier
    Abella, Jaume
    Quinones, Eduardo
    Fossati, Luca
    Zulianello, Marco
    Cazorla, Francisco J.
    2013 8TH IEEE INTERNATIONAL SYMPOSIUM ON INDUSTRIAL EMBEDDED SYSTEMS (SIES), 2013, : 31 - 38
  • [9] Predictability and Performance Aware Replacement Policy PVISAM for Unified Shared Caches in Real-time Multicores
    Haque, Mohammad Shihabul
    Easwaran, Arvind
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2018, 37 (11) : 2720 - 2731
  • [10] Decomposition-Based Real-Time Scheduling of Parallel Tasks on Multicores Platforms
    Jiang, Xu
    Guan, Nan
    Long, Xiang
    Wan, Han
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2020, 39 (10) : 2319 - 2332