Effective Zero Compression on ReRAM-based Sparse DNN Accelerators

Cited by: 5

Authors:

Shin, Hoon [1]

Park, Rihae [1]

Lee, Seung Yul [1]

Park, Yeonhong [1]

Lee, Hyunseung [1]

Lee, Jae W. [1]

Affiliation:

[1] Seoul Natl Univ, Seoul, South Korea

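Source:

PROCEEDINGS OF THE 59TH ACM/IEEE DESIGN AUTOMATION CONFERENCE, DAC 2022 | 2022

Keywords:

DOI:

10.1145/3489517.3530564

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract: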

For efficient DNN inference, Resistive RAM (ReRAM) crossbars have emerged as a promising building block to compute matrix multiplication in an area- and power-efficient manner. To improve inference throughput, sparse models can be deployed on the ReRAM-based DNN accelerator. While unstructured pruning maintains both high accuracy and high sparsity, it performs poorly on the crossbar architecture due to the irregular locations of pruned weights. Meanwhile, due to the non-ideality of ReRAM cells and the high cost of ADCs, matrix multiplication is usually performed at a fine granularity, called an Operation Unit (OU), along both the wordline and bitline dimensions. While fine-grained, OU-based row compression (ORC) has recently been proposed to increase the weight compression ratio, significant performance potential is still left on the table due to sub-optimal weight mappings. Thus, we propose a novel weight mapping scheme that effectively clusters zero weights via OU-level filter reordering, hence improving the effective weight compression ratio. We also introduce a weight recovery scheme to further improve accuracy or compression ratio, or both. Our evaluation with three popular DNNs demonstrates that the proposed scheme effectively eliminates redundant weights in the crossbar array, and hence ineffectual computation, to achieve an array compression ratio of 3.27-4.26x with negligible accuracy loss over the baseline ReRAM-based DNN accelerator.
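
The idea of clustering zero weights so that row compression removes more of them can be illustrated with a toy sketch. This is not the paper's algorithm: the function names (`orc_ou_count`, `reorder_columns`), the 2x2 OU size, the example matrix, and the hand-picked global column permutation are all assumptions made for illustration. Rows stand for wordlines and columns for bitlines (filters). Within each group of OU-width columns, a row is dropped if all its weights in that group are zero, and the remaining rows are packed into OUs.

```python
from math import ceil

def orc_ou_count(weights, ou_rows, ou_cols):
    """Count OUs needed after a simplified OU-based row compression.

    weights: list of rows (wordlines), each a list of column (bitline) values.
    A row is kept in a column group only if it has a nonzero weight there.
    """
    n_cols = len(weights[0])
    total = 0
    for c0 in range(0, n_cols, ou_cols):
        kept = sum(
            1 for row in weights
            if any(w != 0 for w in row[c0:c0 + ou_cols])
        )
        total += ceil(kept / ou_rows)
    return total

def reorder_columns(weights, order):
    """Apply one column (filter) permutation to every row (hypothetical helper)."""
    return [[row[c] for c in order] for row in weights]

# Toy 4x4 sparse weight matrix: zeros are interleaved, so no row can be dropped.
W = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
]

before = orc_ou_count(W, ou_rows=2, ou_cols=2)
# Hand-picked permutation that groups columns sharing the same zero pattern.
after = orc_ou_count(reorder_columns(W, [0, 2, 1, 3]), ou_rows=2, ou_cols=2)
print(before, after)
```

In this toy case the reordering halves the number of OUs, because each column group now contains rows that are entirely zero and can be removed. The abstract describes reordering at the OU level rather than one global permutation, so the sketch only conveys the intuition.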


Pages: 949-954

Page count: 6
