Effective Zero Compression on ReRAM-based Sparse DNN Accelerators

Cited by: 5
Authors
Shin, Hoon [1]
Park, Rihae [1]
Lee, Seung Yul [1]
Park, Yeonhong [1]
Lee, Hyunseung [1]
Lee, Jae W. [1]
Affiliations
[1] Seoul Natl Univ, Seoul, South Korea
DOI
10.1145/3489517.3530564
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
For efficient DNN inference, Resistive RAM (ReRAM) crossbars have emerged as a promising building block to compute matrix multiplication in an area- and power-efficient manner. To improve inference throughput, sparse models can be deployed on a ReRAM-based DNN accelerator. While unstructured pruning maintains both high accuracy and high sparsity, it performs poorly on the crossbar architecture due to the irregular locations of pruned weights. Meanwhile, due to the non-ideality of ReRAM cells and the high cost of ADCs, matrix multiplication is usually performed at a fine granularity, called an Operation Unit (OU), along both the wordline and bitline dimensions. Although fine-grained, OU-based row compression (ORC) has recently been proposed to increase the weight compression ratio, significant performance potential is still left on the table due to sub-optimal weight mappings. Thus, we propose a novel weight mapping scheme that effectively clusters zero weights via OU-level filter reordering, thereby improving the effective weight compression ratio. We also introduce a weight recovery scheme to further improve accuracy, compression ratio, or both. Our evaluation with three popular DNNs demonstrates that the proposed scheme effectively eliminates redundant weights in the crossbar array, and hence ineffectual computation, achieving an array compression ratio of 3.27-4.26x with negligible accuracy loss over the baseline ReRAM-based DNN accelerator.
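
To make the idea of OU-based row compression and filter reordering concrete, the following is a minimal Python sketch. It is not the authors' algorithm: the function names (`ou_count`, `greedy_reorder`), the greedy grouping heuristic, and the toy 4x4 weight matrix are all assumptions made for illustration. It assumes that columns correspond to filters, that columns are grouped `ou_width` at a time, and that a row segment can be dropped from a group when all of its weights in that group are zero.

```python
from math import ceil


def kept_rows_per_group(weights, cols, ou_width):
    """For each group of ou_width columns, count rows with at least one nonzero weight."""
    kept = []
    for start in range(0, len(cols), ou_width):
        group = cols[start:start + ou_width]
        kept.append(sum(1 for row in weights if any(row[c] != 0 for c in group)))
    return kept


def ou_count(weights, cols, ou_width, ou_height):
    """Number of OUs needed once all-zero row segments are compressed away."""
    return sum(ceil(k / ou_height) for k in kept_rows_per_group(weights, cols, ou_width))


def greedy_reorder(weights, ou_width):
    """Hypothetical heuristic: group columns whose zero rows overlap the most."""
    n_cols = len(weights[0])
    zero_rows = [
        frozenset(r for r, row in enumerate(weights) if row[c] == 0)
        for c in range(n_cols)
    ]
    remaining = list(range(n_cols))
    order = []
    while remaining:
        # Seed each group with the sparsest column still unassigned.
        seed = max(remaining, key=lambda c: len(zero_rows[c]))
        remaining.remove(seed)
        group, shared = [seed], set(zero_rows[seed])
        while remaining and len(group) < ou_width:
            # Add the column that keeps the most rows all-zero across the group.
            best = max(remaining, key=lambda c: len(shared & zero_rows[c]))
            remaining.remove(best)
            group.append(best)
            shared &= zero_rows[best]
        order.extend(group)
    return order


# Toy example (assumed data): 4 input rows, 4 filters, 2x2 OUs.
W = [
    [1, 0, 2, 0],
    [3, 0, 4, 0],
    [0, 5, 0, 6],
    [0, 7, 0, 8],
]
baseline = ou_count(W, [0, 1, 2, 3], ou_width=2, ou_height=2)
order = greedy_reorder(W, ou_width=2)
reordered = ou_count(W, order, ou_width=2, ou_height=2)
print(baseline, order, reordered)
```

In this toy case the original column order leaves no all-zero row segment in either column group, while grouping columns with matching zero patterns lets half of the row segments be dropped, which is the kind of gain that zero clustering targets.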
Pages: 949-954
Page count: 6
Related papers
50 items in total
  • [41] Model Inference Optimization on ReRAM-Based Accelerators with Intra- and Inter-OU Similarity
    Li, Tao
    Zhao, Qinghang
    2024 INTERNATIONAL SYMPOSIUM OF ELECTRONICS DESIGN AUTOMATION, ISEDA 2024, 2024: 666-671
  • [42] MaxTracker: Continuously Tracking the Maximum Computation Progress for Energy Harvesting ReRAM-based CNN Accelerators
    Qiu, Keni
    Jao, Nicholas
    Zhou, Kunyu
    Liu, Yongpan
    Sampson, Jack
    Kandemir, Mahmut Taylan
    Narayanan, Vijaykrishnan
    ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS, 2021, 20 (05)
  • [43] A Framework for Area-efficient Multi-task BERT Execution on ReRAM-based Accelerators
    Kang, Myeonggu
    Shin, Hyein
    Shin, Jaekang
    Kim, Lee-Sup
    2021 IEEE/ACM INTERNATIONAL CONFERENCE ON COMPUTER AIDED DESIGN (ICCAD), 2021
  • [44] Spara: An Energy-Efficient ReRAM-Based Accelerator for Sparse Graph Analytics Applications
    Zheng, Long
    Zhao, Jieshan
    Huang, Yu
    Wang, Qinggang
    Zeng, Zhen
    Xue, Jingling
    Liao, Xiaofei
    Jin, Hai
    2020 IEEE 34TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM IPDPS 2020, 2020: 696-707
  • [45] ReSpar: Reordering Algorithm for ReRAM-based Sparse Matrix-Vector Multiplication Accelerator
    Hsiao, Yi-Jou
    Nien, Chin-Fu
    Cheng, Hsiang-Yun
    2021 IEEE 39TH INTERNATIONAL CONFERENCE ON COMPUTER DESIGN (ICCD 2021), 2021: 260-268
  • [46] ReRAM-based Accelerator for Deep Learning
    Li, Bing
    Song, Linghao
    Chen, Fan
    Qian, Xuehai
    Chen, Yiran
    Li, Hai
    PROCEEDINGS OF THE 2018 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION (DATE), 2018: 815-820
  • [47] VECOM: Variation-Resilient Encoding and Offset Compensation Schemes for Reliable ReRAM-Based DNN Accelerator
    Jang, Je-Woo
    Nguyen, Thai-Hoang
    Yang, Joon-Sung
    2023 IEEE/ACM INTERNATIONAL CONFERENCE ON COMPUTER AIDED DESIGN, ICCAD, 2023
  • [48] Improving DNN Fault Tolerance using Weight Pruning and Differential Crossbar Mapping for ReRAM-based Edge AI
    Yuan, Geng
    Liao, Zhiheng
    Ma, Xiaolong
    Cai, Yuxuan
    Kong, Zhenglun
    Shen, Xuan
    Fu, Jingyan
    Li, Zhengang
    Zhang, Chengming
    Peng, Hongwu
    Liu, Ning
    Ren, Ao
    Wang, Jinhui
    Wang, Yanzhi
    PROCEEDINGS OF THE 2021 TWENTY SECOND INTERNATIONAL SYMPOSIUM ON QUALITY ELECTRONIC DESIGN (ISQED 2021), 2021: 135-141
  • [49] Learning to Predict IR Drop with Effective Training for ReRAM-based Neural Network Hardware
    Lee, Sugil
    Jung, Giju
    Fouda, Mohammed E.
    Lee, Jongeun
    Eltawil, Ahmed
    Kurdahi, Fadi
    PROCEEDINGS OF THE 2020 57TH ACM/EDAC/IEEE DESIGN AUTOMATION CONFERENCE (DAC), 2020
  • [50] ReRAM-based Synaptic Device for Neuromorphic Computing
    Jang, Jun-Woo
    Park, Sangsu
    Jeong, Yoon-Ha
    Hwang, Hyunsang
    2014 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2014: 1054-1057