Monolithic 3D stacked multiply-accumulate units

Cited by: 4
Authors
Lee, Young Seo [1]
Kim, Kyung Min [5]
Lee, Ji Heon [2]
Gong, Young-Ho [4]
Kim, Seon Wook [3]
Chung, Sung Woo [1]
Affiliations
[1] Korea Univ, Dept Comp Sci, Seoul 02841, South Korea
[2] Korea Univ, Dept Semicond Syst Engn, Seoul 02841, South Korea
[3] Korea Univ, Sch Elect Engn, Seoul 02841, South Korea
[4] Kwangwoon Univ, Dept Comp Engn, Seoul 01897, South Korea
[5] SK Hynix, Icheon 17336, South Korea
Funding
National Research Foundation of Singapore;
Keywords
Multiply-accumulate; M3D stacking; TSV-Based 3D stacking; ASIC;
DOI
10.1016/j.vlsi.2020.10.006
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Monolithic 3D (M3D) stacking reduces the critical path delay by leveraging 1) the short latency of a monolithic inter-tier via (MIV) and 2) shorter 2D interconnect and cell delays resulting from a smaller footprint. In this paper, we propose M3D stacked multiply-accumulate (MAC) units, as MAC units have a relatively large number of long wires. With the Samsung 28 nm ASIC library, the M3D stacked MAC units reduce the critical path delay by up to 28.9% compared to the conventional 2D structure. In addition, the M3D stacked MAC units reduce dynamic energy and leakage power by up to 9.6% and 21.7%, respectively. Compared to the TSV stacked MAC units, the M3D stacked MAC units consume up to 37.1% less dynamic energy and up to 73.6% less leakage power. Though 3D stacking technology inevitably causes a higher peak temperature than the 2D structure, our thermal results show that the peak temperature of M3D stacking is always lower than that of TSV-based 3D stacking. Furthermore, when the size of the MAC unit is optimized for convolutional neural network (CNN) applications, the peak temperature of M3D stacking is at most 88.3 °C, which is still under the threshold temperature.
Pages: 183-189
Number of pages: 7
Related papers
50 in total
  • [41] Silicon-Based Metastructure Optical Scattering Multiply-Accumulate Computation Chip
    Liu, Xu
    Zhu, Xudong
    Wang, Chunqing
    Cao, Yifan
    Wang, Baihang
    Ou, Hanwen
    Wu, Yizheng
    Mei, Qixun
    Zhang, Jialong
    Cong, Zhe
    Liu, Rentao
    NANOMATERIALS, 2022, 12 (13)
  • [42] An Evolutionary Normalization Algorithm for Signed Floating-Point Multiply-Accumulate Operation
    Sarma, Rajkumar
    Bhargava, Cherry
    Kotecha, Ketan
    CMC-COMPUTERS MATERIALS & CONTINUA, 2022, 72 (01) : 481 - 495
  • [43] MAXelerator: FPGA Accelerator for Privacy Preserving Multiply-Accumulate (MAC) on Cloud Servers
    Hussain, Siam U.
    Rouhani, Bita Darvish
    Ghasemzadeh, Mohammad
    Koushanfar, Farinaz
    2018 55TH ACM/ESDA/IEEE DESIGN AUTOMATION CONFERENCE (DAC), 2018,
  • [44] Limited Carry-Propagate Multiply-Accumulate Unit Design for Reconfigurable Systems
    Cini, Ugur
    Kocyigit, Gokhan
    ELEKTRONIKA IR ELEKTROTECHNIKA, 2017, 23 (02) : 36 - 39
  • [45] Multiply-Accumulate Instruction Set Extension in a Soft-core RISC Processor
    Salim, Ahmad Jamal
    Samsudin, Nur Raihana
    Salim, Sani Irwan Md
    Soo, Yewguan
    2012 10TH IEEE INTERNATIONAL CONFERENCE ON SEMICONDUCTOR ELECTRONICS (ICSE), 2012, : 517 - 521
  • [46] Implementation of Low-Power Multiply-Accumulate (MAC) Unit for IoT Processors
    Mansour, Kareem
    Saeed, Ahmed
    2018 2ND EUROPEAN CONFERENCE ON ELECTRICAL ENGINEERING AND COMPUTER SCIENCE (EECS 2018), 2018, : 356 - 360
  • [47] SME: A Systolic Multiply-accumulate Engine for MLP-based Neural Network
    Wan, Haochuan
    Rao, Chaolin
    Zheng, Yueyang
    Zhou, Pingqiang
    Lou, Xin
    2022 IEEE ASIA PACIFIC CONFERENCE ON CIRCUITS AND SYSTEMS, APCCAS, 2022, : 270 - 274
  • [48] Memory System Designed for Multiply-Accumulate (MAC) Engine Based on Stochastic Computing
    Zhang, Xinyue
    Wang, Yuan
    Zhang, Yawen
    Song, Jiahao
    Zhang, Zuodong
    Cheng, Kaili
    Wang, Runsheng
    Huang, Ru
    17TH IEEE INTERNATIONAL CONFERENCE ON IC DESIGN AND TECHNOLOGY (ICICDT 2019), 2019,
  • [49] High-Accuracy Multiply-Accumulate (MAC) Technique for Unary Stochastic Computing
    Schober, Peter
    Najafi, M. Hassan
    TaheriNejad, Nima
    IEEE TRANSACTIONS ON COMPUTERS, 2022, 71 (06) : 1425 - 1439
  • [50] Understanding Timing Error Characteristics from Overclocked Systolic Multiply-Accumulate Arrays in FPGAs
    Chamberlin, Andrew
    Gerber, Andrew
    Palmer, Mason
    Goodale, Tim
    Gundi, Noel Daniel
    Chakraborty, Koushik
    Roy, Sanghamitra
    JOURNAL OF LOW POWER ELECTRONICS AND APPLICATIONS, 2024, 14 (01)