Monolithic 3D stacked multiply-accumulate units

Cited by: 4

Authors:

Lee, Young Seo [1]

Kim, Kyung Min [5]

Lee, Ji Heon [2]

Gong, Young-Ho [4]

Kim, Seon Wook [3]

Chung, Sung Woo [1]

Affiliations:

[1] Korea Univ, Dept Comp Sci, Seoul 02841, South Korea

[2] Korea Univ, Dept Semicond Syst Engn, Seoul 02841, South Korea

[3] Korea Univ, Sch Elect Engn, Seoul 02841, South Korea

[4] Kwangwoon Univ, Dept Comp Engn, Seoul 01897, South Korea

[5] SK Hynix, Icheon 17336, South Korea

Source:

INTEGRATION-THE VLSI JOURNAL | 2021 / Volume 76

Funding:

National Research Foundation, Singapore;

Keywords:

Multiply-accumulate; M3D stacking; TSV-Based 3D stacking; ASIC;

DOI:

10.1016/j.vlsi.2020.10.006

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Subject classification code:

0812;

Abstract:

Monolithic 3D (M3D) stacking reduces the critical path delay by leveraging 1) the short latency of a monolithic inter-tier via (MIV) and 2) shorter 2D interconnect and cell delays resulting from a smaller footprint. In this paper, we propose M3D stacked multiply-accumulate (MAC) units, since MAC units have a relatively large number of long wires. With the Samsung 28 nm ASIC library, the M3D stacked MAC units reduce the critical path delay by up to 28.9% compared to the conventional 2D structure. In addition, the M3D stacked MAC units reduce dynamic energy and leakage power by up to 9.6% and 21.7%, respectively. Compared to the TSV stacked MAC units, the M3D stacked MAC units consume up to 37.1% less dynamic energy and up to 73.6% less leakage power. Though 3D stacking technology inevitably causes a higher peak temperature than the 2D structure, our thermal results show that the peak temperature of M3D stacking is always lower than that of TSV-based 3D stacking. Furthermore, when the size of the MAC unit is optimized for convolutional neural network (CNN) applications, the peak temperature of M3D stacking is at most 88.3 °C, which is still under the threshold temperature.

Pages: 183-189

Number of pages: 7
