Partial Sum Quantization for Reducing ADC Size in ReRAM-Based Neural Network Accelerators

Cited by: 0
Authors
Azamat, Azat [1 ]
Asim, Faaiz [2 ]
Kim, Jintae [3 ]
Lee, Jongeun [2 ]
Affiliations
[1] Ulsan Natl Inst Sci & Technol, Dept Comp Sci & Engn, Ulsan 44919, South Korea
[2] Ulsan Natl Inst Sci & Technol, Dept Elect Engn, Ulsan 44919, South Korea
[3] Konkuk Univ, Dept Elect & Elect Engn, Seoul 143701, South Korea
Keywords
Quantization (signal); Hardware; Artificial neural networks; Convolutional neural networks; Training; Throughput; Costs; AC-DC power converters; Memristors; Analog-to-digital conversion (ADC); convolutional neural network (CNN); in-memory computing accelerator; memristor; quantization;
DOI
10.1109/TCAD.2023.3294461
Chinese Library Classification
TP3 [Computing Technology, Computer Technology]
Discipline Code
0812
Abstract
While resistive random-access memory (ReRAM) crossbar arrays have the potential to significantly accelerate deep neural network (DNN) training through fast and low-cost matrix-vector multiplication, peripheral circuits like analog-to-digital converters (ADCs) impose a high overhead. These ADCs consume over half of the chip power and a considerable portion of the chip cost. To address this challenge, we propose advanced quantization techniques that can significantly reduce the ADC overhead of ReRAM crossbar arrays (RCAs). Our methodology interprets the ADC as a quantization mechanism, allowing us to scale the ADC input range optimally along with the weight parameters of a DNN, yielding a multiple-bit reduction in ADC precision. This approach reduces ADC size and power consumption severalfold, and it is applicable to any DNN type (binarized or multibit) and any RCA size. Additionally, we propose ways to minimize the overhead of the digital scaler, a component our scheme sometimes requires. Our experimental results using ResNet-18 on the ImageNet dataset demonstrate that our method can reduce the size of the ADC by 32 times compared to ISAAC with only a minimal accuracy degradation of 0.24%. We also present evaluation results in the presence of ReRAM nonideality (such as stuck-at faults).
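The abstract's core idea, viewing the ADC as a uniform quantizer whose input range is scaled jointly with the network weights, can be illustrated with a minimal NumPy sketch. This is not the paper's implementation; the column count, the binary bit-sliced inputs, and the percentile-based choice of clipping range are illustrative assumptions.

```python
import numpy as np

def adc_quantize(partial_sums, bits, clip):
    """Model an ADC as a uniform quantizer: clip the analog partial sums
    to [0, clip], then round to one of 2**bits levels (a simplification)."""
    levels = 2 ** bits - 1
    x = np.clip(partial_sums, 0.0, clip)
    return np.round(x / clip * levels) / levels * clip

rng = np.random.default_rng(0)
weights = rng.random((128, 64))          # hypothetical crossbar: 128 rows, 64 columns
inputs = rng.integers(0, 2, (1, 128))    # one bit-sliced binary input vector
ps = (inputs @ weights).ravel()          # analog partial sum per column

# Scaling the clipping range to the observed partial-sum distribution
# (rather than the worst case) lets a low-bit ADC cover the useful range.
clip = np.percentile(ps, 99.9)
out_4bit = adc_quantize(ps, bits=4, clip=clip)

# Quantization error is bounded by half an ADC step.
err = np.abs(out_4bit - np.clip(ps, 0.0, clip)).max()
```

The point of the sketch is that shrinking `clip` to match the realistic partial-sum range trades a small clipping error at the tail for much finer resolution over the bulk of the distribution, which is what permits the multiple-bit ADC-precision reduction the abstract claims.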
Pages: 4897 - 4908 (12 pages)
Related Papers
50 total
  • [1] Quarry: Quantization-based ADC Reduction for ReRAM-based Deep Neural Network Accelerators
    Azamat, Azat
    Asim, Faaiz
    Lee, Jongeun
    2021 IEEE/ACM INTERNATIONAL CONFERENCE ON COMPUTER AIDED DESIGN (ICCAD), 2021,
  • [2] Mixed Precision Quantization for ReRAM-based DNN Inference Accelerators
    Huang, Sitao
    Ankit, Aayush
    Silveira, Plinio
    Antunes, Rodrigo
    Chalamalasetti, Sai Rahul
    El Hajj, Izzat
    Kim, Dong Eun
    Aguiar, Glaucimar
    Bruel, Pedro
    Serebryakov, Sergey
    Xu, Cong
    Li, Can
    Faraboschi, Paolo
    Strachan, John Paul
    Chen, Deming
    Roy, Kaushik
    Hwu, Wen-mei
    Milojicic, Dejan
    2021 26TH ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE (ASP-DAC), 2021, : 372 - 377
  • [3] APQ: Automated DNN Pruning and Quantization for ReRAM-Based Accelerators
    Yang, Siling
    He, Shuibing
    Duan, Hexiao
    Chen, Weijian
    Zhang, Xuechen
    Wu, Tong
    Yin, Yanlong
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2023, 34 (09) : 2498 - 2511
  • [4] Trained Biased Number Representation for ReRAM-Based Neural Network Accelerators
    Wang, Weijia
    Lin, Bill
    ACM JOURNAL ON EMERGING TECHNOLOGIES IN COMPUTING SYSTEMS, 2019, 15 (02)
  • [5] A Quantized Training Framework for Robust and Accurate ReRAM-based Neural Network Accelerators
    Zhang, Chenguang
    Zhou, Pingqiang
    2021 26TH ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE (ASP-DAC), 2021, : 43 - 48
  • [6] ReHarvest: An ADC Resource-Harvesting Crossbar Architecture for ReRAM-Based DNN Accelerators
    Xu, Jiahong
    Li, Haikun
    Duan, Zhuohui
    Liao, Xiaofei
    Jin, Hai
    Yang, Xiaokang
    Li, Huize
    Liu, Cong
    Mao, Fubing
    Zhang, Yu
    ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, 2024, 21 (03)
  • [7] Offline Training-Based Mitigation of IR Drop for ReRAM-Based Deep Neural Network Accelerators
    Lee, Sugil
    Fouda, Mohammed E.
    Lee, Jongeun
    Eltawil, Ahmed M.
    Kurdahi, Fadi
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2023, 42 (02) : 521 - 532
  • [8] Hardware attacks on ReRAM-based AI accelerators
    Heidary, Masoud
    Joardar, Biresh Kumar
    17TH IEEE DALLAS CIRCUITS AND SYSTEMS CONFERENCE, DCAS 2024, 2024,
  • [9] Extreme Partial-Sum Quantization for Analog Computing-In-Memory Neural Network Accelerators
    Kim, Yulhwa
    Kim, Hyungjun
    Kim, Jae-Joon
    ACM JOURNAL ON EMERGING TECHNOLOGIES IN COMPUTING SYSTEMS, 2022, 18 (04)
  • [10] Device Modeling Bias in ReRAM-Based Neural Network Simulations
    Yousuf, Osama
    Hossen, Imtiaz
    Daniels, Matthew W.
    Lueker-Boden, Martin
    Dienstfrey, Andrew
    Adam, Gina C.
    IEEE JOURNAL ON EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS, 2023, 13 (01) : 382 - 394