Effective Zero Compression on ReRAM-based Sparse DNN Accelerators

Cited by: 5
Authors
Shin, Hoon [1]
Park, Rihae [1]
Lee, Seung Yul [1]
Park, Yeonhong [1]
Lee, Hyunseung [1]
Lee, Jae W. [1]
Affiliations
[1] Seoul Natl Univ, Seoul, South Korea
DOI
10.1145/3489517.3530564
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
For efficient DNN inference, Resistive RAM (ReRAM) crossbars have emerged as a promising building block to compute matrix multiplication in an area- and power-efficient manner. To improve inference throughput, sparse models can be deployed on the ReRAM-based DNN accelerator. While unstructured pruning maintains both high accuracy and high sparsity, it performs poorly on the crossbar architecture due to the irregular locations of pruned weights. Meanwhile, due to the non-ideality of ReRAM cells and the high cost of ADCs, matrix multiplication is usually performed at a fine granularity, called an Operation Unit (OU), along both the wordline and bitline dimensions. While fine-grained, OU-based row compression (ORC) has recently been proposed to increase the weight compression ratio, significant performance potential is still left on the table due to sub-optimal weight mappings. Thus, we propose a novel weight mapping scheme that effectively clusters zero weights via OU-level filter reordering, thereby improving the effective weight compression ratio. We also introduce a weight recovery scheme to further improve accuracy, compression ratio, or both. Our evaluation with three popular DNNs demonstrates that the proposed scheme effectively eliminates redundant weights in the crossbar array, and hence ineffectual computation, achieving an array compression ratio of 3.27-4.26x with negligible accuracy loss over the baseline ReRAM-based DNN accelerator.
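The interaction between OU-based row compression and filter reordering can be illustrated with a toy model. The sketch below is not the authors' algorithm: it assumes a simplified cost model in which columns (filters) are split into groups of `ou_cols` bitlines, any row whose weights are all zero within a group is skipped, and the surviving rows are packed into OUs of `ou_rows` wordlines. The function names `ou_count` and `greedy_reorder`, and the greedy heuristic itself, are hypothetical and serve only to show why clustering zero weights reduces the number of OUs.

```python
from math import ceil


def ou_count(weights, order, ou_rows, ou_cols):
    """Count OUs needed under the assumed model: within each group of
    ou_cols columns, rows that are entirely zero are skipped and the
    remaining rows are packed ou_rows at a time."""
    total = 0
    for start in range(0, len(order), ou_cols):
        group = order[start:start + ou_cols]
        live_rows = sum(1 for row in weights if any(row[c] != 0 for c in group))
        total += ceil(live_rows / ou_rows)
    return total


def greedy_reorder(weights, ou_cols):
    """Hypothetical heuristic (an assumption, not the paper's method):
    seed each group with the densest remaining column, then add the
    column that introduces the fewest new non-zero rows."""
    n_cols = len(weights[0])
    # For every column, the set of row indices holding a non-zero weight.
    patterns = [
        frozenset(r for r, row in enumerate(weights) if row[c] != 0)
        for c in range(n_cols)
    ]
    remaining = list(range(n_cols))
    order = []
    while remaining:
        seed = max(remaining, key=lambda c: len(patterns[c]))
        remaining.remove(seed)
        group, live = [seed], set(patterns[seed])
        while len(group) < ou_cols and remaining:
            best = min(remaining, key=lambda c: len(patterns[c] - live))
            remaining.remove(best)
            group.append(best)
            live |= patterns[best]
        order.extend(group)
    return order


# Toy 4x4 weight matrix: rows are wordlines, columns are filters.
weights = [
    [1, 0, 2, 0],
    [3, 0, 4, 0],
    [0, 5, 0, 6],
    [0, 7, 0, 8],
]
original = list(range(4))
reordered = greedy_reorder(weights, ou_cols=2)
print(ou_count(weights, original, 2, 2), ou_count(weights, reordered, 2, 2))
```

In this toy case the original column order leaves no row skippable in either group, while grouping columns with matching zero patterns halves the number of OUs. Real mappings also have to track the permutation so that outputs can be restored to their original filter order, which this sketch omits.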
Pages: 949-954
Number of pages: 6