Partial Sum Quantization for Reducing ADC Size in ReRAM-Based Neural Network Accelerators

Cited by: 0
Authors
Azamat, Azat [1 ]
Asim, Faaiz [2 ]
Kim, Jintae [3 ]
Lee, Jongeun [2 ]
Affiliations
[1] Ulsan Natl Inst Sci & Technol, Dept Comp Sci & Engn, Ulsan 44919, South Korea
[2] Ulsan Natl Inst Sci & Technol, Dept Elect Engn, Ulsan 44919, South Korea
[3] Konkuk Univ, Dept Elect & Elect Engn, Seoul 143701, South Korea
Keywords
Quantization (signal); Hardware; Artificial neural networks; Convolutional neural networks; Training; Throughput; Costs; AC-DC power converters; Memristors; Analog-to-digital conversion (ADC); convolutional neural network (CNN); in-memory computing accelerator; memristor; quantization
DOI
10.1109/TCAD.2023.3294461
CLC Classification Number
TP3 [Computing Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
While resistive random-access memory (ReRAM) crossbar arrays have the potential to significantly accelerate deep neural network (DNN) training through fast and low-cost matrix-vector multiplication, peripheral circuits such as analog-to-digital converters (ADCs) impose a high overhead, consuming more than half of the chip power and a considerable portion of the chip cost. To address this challenge, we propose advanced quantization techniques that can significantly reduce the ADC overhead of ReRAM crossbar arrays (RCAs). Our methodology interprets the ADC as a quantization mechanism, allowing us to scale the ADC input range optimally along with the weight parameters of a DNN, which yields a reduction of multiple bits in ADC precision. This approach reduces ADC size and power consumption severalfold, and it is applicable to any DNN type (binarized or multibit) and any RCA size. Additionally, we propose ways to minimize the overhead of the digital scaler, a component of our scheme that is sometimes required. Our experimental results using ResNet-18 on the ImageNet dataset demonstrate that our method can reduce the ADC size by 32 times compared to ISAAC with only a minimal accuracy degradation of 0.24%. We also present evaluation results in the presence of ReRAM nonidealities such as stuck-at faults.
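To make the idea stated in the abstract concrete, below is a minimal Python/NumPy sketch of partial-sum quantization: the analog partial sum of one crossbar column is digitized by a low-precision ADC modeled as a uniform quantizer, once over the worst-case input range and once over a deliberately shrunken (clipped) range whose scale would be restored by a digital scaler. All names and parameter values here (ROWS, ADC_BITS, clip_range, adc_quantize, column_partial_sum) are illustrative assumptions for the sketch, not the paper's actual calibration or training procedure.

```python
import numpy as np

# Hypothetical toy parameters (not from the paper).
ROWS = 128        # wordlines feeding one crossbar column (partial-sum fan-in)
ADC_BITS = 4      # reduced ADC precision explored in this sketch

def adc_quantize(x, bits, in_range):
    """Model an ADC as a uniform quantizer over [0, in_range]."""
    levels = 2 ** bits - 1
    step = in_range / levels
    code = np.clip(np.round(x / step), 0, levels)  # digital output code
    return code * step                             # value after digital rescaling

def column_partial_sum(weights, inputs):
    """Analog MAC of one crossbar column: sum of weight*input contributions."""
    return np.dot(weights, inputs)

rng = np.random.default_rng(0)
w = rng.integers(0, 2, ROWS)   # 1-bit conductance states (0/1)
x = rng.integers(0, 2, ROWS)   # 1-bit input slice on the wordlines

ps = column_partial_sum(w, x)  # true partial sum (an integer in this toy)

# Naive choice: ADC range covers the worst-case partial sum (= ROWS).
readout_wide = adc_quantize(ps, ADC_BITS, in_range=ROWS)

# Scaled choice (the abstract's idea, hedged): shrink the ADC input range to a
# clipping point chosen together with the weight scale, so the same number of
# ADC bits resolves the partial sums that actually occur more finely; the
# constant scale factor is undone afterwards by a digital scaler.
clip_range = ROWS / 2          # hypothetical clipping point for this toy
readout_scaled = adc_quantize(ps, ADC_BITS, in_range=clip_range)

print("true partial sum        :", ps)
print("4-bit ADC, full range   :", readout_wide)
print("4-bit ADC, scaled range :", readout_scaled)
```

In this toy, 1-bit weights and inputs make the partial sums concentrate well below the worst case of 128, so halving the ADC range doubles the resolution at negligible clipping risk; the abstract's claim is that choosing this range jointly with the DNN's weight parameters lets the ADC precision drop by multiple bits, with the residual scale absorbed by the digital scaler whose overhead the paper also addresses.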
Pages: 4897-4908
Number of pages: 12