Use of computation-unit integrated memories in high-level synthesis

Authors
Huang, Chao [1]
Ravi, Srivaths
Raghunathan, Anand
Jha, Niraj K.
Affiliations
[1] Virginia Polytech Inst & State Univ, Bradley Dept Elect & Comp Engn, Blacksburg, VA 24061 USA
[2] Nippon Elect Co, Labs Amer, Princeton, NJ 08540 USA
[3] Princeton Univ, Dept Elect Engn, Princeton, NJ 08544 USA
Keywords
application-specific integrated circuits; controller/datapath; high-level synthesis; integrated memory
DOI
10.1109/TCAD.2005.862749
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
High-level synthesis (HLS) of memory-intensive applications has featured several innovations in terms of enhancements made to the basic memory organization and data layout. However, increasing performance and energy demands faced by application-specific integrated circuits (ASICs) are forcing designers to alter the fundamental architectural template of the HLS output, namely, a controller-datapath circuit associated with a memory subsystem (monolithic, partitioned, etc.). An architectural template for the HLS output is proposed that consists of a controller-datapath circuit associated with a memory subsystem into which computation units have been integrated. The enhanced memory subsystem is called computation-unit integrated memory (CIM). A CIM offers higher memory bandwidth (relative to what is offered through the system bus) to the computation units present locally within it and reduces the overall communication between the memory subsystem and the controller-datapath circuit, thus providing a template highly suitable for deriving efficient implementations of memory-intensive applications. This paper addresses the challenge of providing a systematic synthesis framework for a CIM-based architecture. This framework can analyze the various tradeoffs involved in selecting suitable operations in a behavior for execution using a CIM and generate a high-performance, low-overhead implementation. Efficient data reuse in register files has also been fully exploited to further improve system performance. Experiments with several behaviors indicate that an average performance improvement of 2.02x (a maximum of 2.70x) is possible with very low area overheads. The energy-delay product improves by an average of 2.5x (a maximum of 3.8x).
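To relate the two headline figures, the short sketch below works through the arithmetic of the energy-delay product (EDP), taken here in its usual sense of energy multiplied by execution time. The function names are hypothetical and not from the paper. The derived energy figure is only an illustration: the abstract reports averages over several behaviors, and a ratio of averages need not equal the average of per-behavior ratios.

```python
def edp(energy: float, delay: float) -> float:
    """Energy-delay product: energy multiplied by execution time."""
    return energy * delay


def implied_energy_gain(edp_gain: float, speedup: float) -> float:
    """Energy improvement implied by an EDP improvement and a speedup.

    Since EDP = E * D, an EDP gain factors into (energy gain) * (delay gain),
    so energy gain = EDP gain / speedup. Assumes both figures describe the
    same design point, which the abstract does not guarantee.
    """
    return edp_gain / speedup


# Average figures quoted in the abstract: 2.02x speedup, 2.5x EDP gain.
avg_energy_gain = implied_energy_gain(2.5, 2.02)

# Consistency check with a normalized baseline (energy = 1, delay = 1).
baseline = edp(1.0, 1.0)
with_cim = edp(1.0 / avg_energy_gain, 1.0 / 2.02)
print(f"implied energy gain: {avg_energy_gain:.2f}x")
print(f"recovered EDP gain:  {baseline / with_cim:.2f}x")
```

Under these assumptions, roughly a quarter of the reported EDP improvement beyond the speedup would come from reduced energy, with the remainder coming from the shorter execution time.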
Pages: 1969-1989
Number of pages: 21