Effective Zero Compression on ReRAM-based Sparse DNN Accelerators

Cited by: 5

Authors:

Shin, Hoon [1]

Park, Rihae [1]

Lee, Seung Yul [1]

Park, Yeonhong [1]

Lee, Hyunseung [1]

Lee, Jae W. [1]

Affiliations:

[1] Seoul Natl Univ, Seoul, South Korea

Source:

PROCEEDINGS OF THE 59TH ACM/IEEE DESIGN AUTOMATION CONFERENCE, DAC 2022 | 2022

DOI:

10.1145/3489517.3530564

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

For efficient DNN inference, Resistive RAM (ReRAM) crossbars have emerged as a promising building block to compute matrix multiplication in an area- and power-efficient manner. To improve inference throughput, sparse models can be deployed on the ReRAM-based DNN accelerator. While unstructured pruning maintains both high accuracy and high sparsity, it performs poorly on the crossbar architecture due to the irregular locations of pruned weights. Meanwhile, due to the non-ideality of ReRAM cells and the high cost of ADCs, matrix multiplication is usually performed at a fine granularity, called the Operation Unit (OU), along both wordline and bitline dimensions. While fine-grained, OU-based row compression (ORC) has recently been proposed to increase the weight compression ratio, significant performance potential is still left on the table due to sub-optimal weight mappings. Thus, we propose a novel weight mapping scheme that effectively clusters zero weights via OU-level filter reordering, hence improving the effective weight compression ratio. We also introduce a weight recovery scheme to further improve accuracy, compression ratio, or both. Our evaluation with three popular DNNs demonstrates that the proposed scheme effectively eliminates redundant weights in the crossbar array, and hence ineffectual computation, to achieve an array compression ratio of 3.27-4.26x with negligible accuracy loss over the baseline ReRAM-based DNN accelerator.

Pages: 949-954

Page count: 6
