Effective Zero Compression on ReRAM-based Sparse DNN Accelerators

Cited by: 5
Authors
Shin, Hoon [1]
Park, Rihae [1]
Lee, Seung Yul [1]
Park, Yeonhong [1]
Lee, Hyunseung [1]
Lee, Jae W. [1]
Affiliations
[1] Seoul National University, Seoul, South Korea
DOI
10.1145/3489517.3530564
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
For efficient DNN inference, Resistive RAM (ReRAM) crossbars have emerged as a promising building block to compute matrix multiplication in an area- and power-efficient manner. To improve inference throughput, sparse models can be deployed on a ReRAM-based DNN accelerator. While unstructured pruning maintains both high accuracy and high sparsity, it performs poorly on the crossbar architecture because the pruned weights sit at irregular locations. Meanwhile, due to the non-ideality of ReRAM cells and the high cost of ADCs, matrix multiplication is usually performed at a fine granularity, called an Operation Unit (OU), along both the wordline and bitline dimensions. Although fine-grained, OU-based row compression (ORC) has recently been proposed to increase the weight compression ratio, significant performance potential is still left on the table due to sub-optimal weight mappings. We therefore propose a novel weight mapping scheme that effectively clusters zero weights via OU-level filter reordering, improving the effective weight compression ratio. We also introduce a weight recovery scheme to further improve accuracy, compression ratio, or both. Our evaluation with three popular DNNs demonstrates that the proposed scheme effectively eliminates redundant weights in the crossbar array, and hence ineffectual computation, achieving an array compression ratio of 3.27-4.26x with negligible accuracy loss over the baseline ReRAM-based DNN accelerator.
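To make the idea of OU-based row compression and filter reordering concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the authors' algorithm: the weight matrix is a plain list of lists with rows as wordlines and columns as filters (bitlines), the function names are hypothetical, and the reordering is a simple whole-column sort by nonzero pattern rather than the OU-level reordering the paper proposes.

```python
from math import ceil


def baseline_ou_count(weights, ou_rows, ou_cols):
    """OUs needed when the dense matrix is mapped without any compression."""
    return ceil(len(weights) / ou_rows) * ceil(len(weights[0]) / ou_cols)


def compressed_ou_count(weights, ou_rows, ou_cols, col_order=None):
    """OUs needed after dropping row segments that are all zero within a column group.

    Columns are split into groups of `ou_cols` in the given order; within each
    group, only rows holding at least one nonzero weight are kept and packed.
    """
    n_rows, n_cols = len(weights), len(weights[0])
    order = list(range(n_cols)) if col_order is None else list(col_order)
    total = 0
    for start in range(0, n_cols, ou_cols):
        group = order[start:start + ou_cols]
        kept_rows = sum(
            1 for r in range(n_rows) if any(weights[r][c] != 0 for c in group)
        )
        total += ceil(kept_rows / ou_rows)
    return total


def reorder_columns_by_zero_pattern(weights):
    """Assumed heuristic: sort columns by nonzero mask so similar patterns are adjacent."""
    n_rows, n_cols = len(weights), len(weights[0])

    def mask(c):
        return tuple(weights[r][c] != 0 for r in range(n_rows))

    return sorted(range(n_cols), key=mask)


# Toy 4x4 pruned weight matrix, 2x2 OUs (values are made up for illustration).
W = [
    [1, 0, 2, 0],
    [3, 0, 4, 0],
    [0, 5, 0, 6],
    [0, 7, 0, 8],
]
order = reorder_columns_by_zero_pattern(W)
dense = baseline_ou_count(W, 2, 2)
before = compressed_ou_count(W, 2, 2)
after = compressed_ou_count(W, 2, 2, order)
print(order, dense, before, after, dense / after)
```

In this toy case, row compression alone removes nothing because every column group mixes the two sparsity patterns, while grouping filters with matching zero rows halves the OU count. A real mapping would also have to track the column permutation so that outputs can be restored to their original filter order.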
Pages: 949-954
Number of pages: 6