Monolithic 3D stacked multiply-accumulate units

Cited by: 4
Authors
Lee, Young Seo [1]
Kim, Kyung Min [5]
Lee, Ji Heon [2]
Gong, Young-Ho [4]
Kim, Seon Wook [3]
Chung, Sung Woo [1]
Affiliations
[1] Korea Univ, Dept Comp Sci, Seoul 02841, South Korea
[2] Korea Univ, Dept Semicond Syst Engn, Seoul 02841, South Korea
[3] Korea Univ, Sch Elect Engn, Seoul 02841, South Korea
[4] Kwangwoon Univ, Dept Comp Engn, Seoul 01897, South Korea
[5] SK Hynix, Icheon 17336, South Korea
Funding
National Research Foundation, Singapore
Keywords
Multiply-accumulate; M3D stacking; TSV-based 3D stacking; ASIC
DOI
10.1016/j.vlsi.2020.10.006
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Monolithic 3D (M3D) stacking reduces critical path delay by leveraging 1) the short latency of a monolithic inter-tier via (MIV) and 2) shorter 2D interconnect and cell delays resulting from a smaller footprint. In this paper, we propose M3D stacked multiply-accumulate (MAC) units, since MAC units have a relatively large number of long wires. With the Samsung 28 nm ASIC library, the M3D stacked MAC units reduce the critical path delay by up to 28.9% compared to the conventional 2D structure. In addition, the M3D stacked MAC units reduce dynamic energy and leakage power by up to 9.6% and 21.7%, respectively. Compared to the TSV stacked MAC units, the M3D stacked MAC units consume up to 37.1% less dynamic energy and up to 73.6% less leakage power. Though 3D stacking inevitably causes a higher peak temperature than the 2D structure, our thermal results show that the peak temperature of M3D stacking is always lower than that of TSV-based 3D stacking. Furthermore, when the size of the MAC unit is optimized for convolutional neural network (CNN) applications, the peak temperature of M3D stacking is at most 88.3 °C, which is still under the threshold temperature.
Pages: 183-189
Page count: 7