Monolithic 3D stacked multiply-accumulate units

Cited by: 4
Authors
Lee, Young Seo [1]
Kim, Kyung Min [5]
Lee, Ji Heon [2]
Gong, Young-Ho [4]
Kim, Seon Wook [3]
Chung, Sung Woo [1]
Affiliations
[1] Korea Univ, Dept Comp Sci, Seoul 02841, South Korea
[2] Korea Univ, Dept Semicond Syst Engn, Seoul 02841, South Korea
[3] Korea Univ, Sch Elect Engn, Seoul 02841, South Korea
[4] Kwangwoon Univ, Dept Comp Engn, Seoul 01897, South Korea
[5] SK Hynix, Icheon 17336, South Korea
Funding
National Research Foundation, Singapore
Keywords
Multiply-accumulate; M3D stacking; TSV-based 3D stacking; ASIC
DOI
10.1016/j.vlsi.2020.10.006
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Monolithic 3D (M3D) stacking reduces the critical path delay by leveraging 1) the short latency of a monolithic inter-tier via (MIV) and 2) shorter 2D interconnects and lower cell delay resulting from a smaller footprint. In this paper, we propose M3D stacked multiply-accumulate (MAC) units, since MAC units have a relatively large number of long wires. With the Samsung 28 nm ASIC library, the M3D stacked MAC units reduce the critical path delay by up to 28.9% compared to the conventional 2D structure. In addition, the M3D stacked MAC units reduce dynamic energy and leakage power by up to 9.6% and 21.7%, respectively. Compared to the through-silicon via (TSV) stacked MAC units, the M3D stacked MAC units consume up to 37.1% less dynamic energy and up to 73.6% less leakage power. Although 3D stacking technology inevitably causes a higher peak temperature than the 2D structure, our thermal results show that the peak temperature of M3D stacking is always lower than that of TSV-based 3D stacking. Furthermore, when the size of the MAC unit is optimized for convolutional neural network (CNN) applications, the peak temperature of M3D stacking is at most 88.3 °C, which is still under the threshold temperature.
Pages: 183-189
Number of pages: 7