Partial Sum Quantization for Reducing ADC Size in ReRAM-Based Neural Network Accelerators

Cited by: 0
Authors
Azamat, Azat [1 ]
Asim, Faaiz [2 ]
Kim, Jintae [3 ]
Lee, Jongeun [2 ]
Affiliations
[1] Ulsan Natl Inst Sci & Technol, Dept Comp Sci & Engn, Ulsan 44919, South Korea
[2] Ulsan Natl Inst Sci & Technol, Dept Elect Engn, Ulsan 44919, South Korea
[3] Konkuk Univ, Dept Elect & Elect Engn, Seoul 143701, South Korea
Keywords
Quantization (signal); Hardware; Artificial neural networks; Convolutional neural networks; Training; Throughput; Costs; AC-DC power converters; Memristors; Analog-to-digital conversion (ADC); convolutional neural network (CNN); in-memory computing accelerator; memristor; quantization
DOI
10.1109/TCAD.2023.3294461
CLC Classification Number
TP3 [Computing Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
While resistive random-access memory (ReRAM) crossbar arrays have the potential to significantly accelerate deep neural network (DNN) training through fast and low-cost matrix-vector multiplication, peripheral circuits such as analog-to-digital converters (ADCs) impose a high overhead, consuming more than half of the chip power and a considerable portion of the chip cost. To address this challenge, we propose advanced quantization techniques that can significantly reduce the ADC overhead of ReRAM crossbar arrays (RCAs). Our methodology interprets the ADC as a quantization mechanism, allowing us to scale the ADC input range optimally along with the weight parameters of a DNN, which yields a reduction of multiple bits in ADC precision. This approach reduces ADC size and power consumption severalfold, and it is applicable to any DNN type (binarized or multibit) and any RCA size. Additionally, we propose ways to minimize the overhead of the digital scaler, a component of our scheme that is sometimes required. Our experimental results using ResNet-18 on the ImageNet dataset demonstrate that our method can reduce the ADC size by 32 times compared to ISAAC with only a minimal accuracy degradation of 0.24%. We also present evaluation results in the presence of ReRAM nonidealities such as stuck-at faults.
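To make the idea stated in the abstract concrete, below is a minimal Python/NumPy sketch of partial-sum quantization: the analog partial sum of one crossbar column is digitized by a low-precision ADC modeled as a uniform quantizer, once over the worst-case input range and once over a deliberately shrunken (clipped) range whose scale would be restored by a digital scaler. All names and parameter values here (ROWS, ADC_BITS, clip_range, adc_quantize, column_partial_sum) are illustrative assumptions for the sketch, not the paper's actual calibration or training procedure.

```python
import numpy as np

# Hypothetical toy parameters (not from the paper).
ROWS = 128        # wordlines feeding one crossbar column (partial-sum fan-in)
ADC_BITS = 4      # reduced ADC precision explored in this sketch

def adc_quantize(x, bits, in_range):
    """Model an ADC as a uniform quantizer over [0, in_range]."""
    levels = 2 ** bits - 1
    step = in_range / levels
    code = np.clip(np.round(x / step), 0, levels)  # digital output code
    return code * step                             # value after digital rescaling

def column_partial_sum(weights, inputs):
    """Analog MAC of one crossbar column: sum of weight*input contributions."""
    return np.dot(weights, inputs)

rng = np.random.default_rng(0)
w = rng.integers(0, 2, ROWS)   # 1-bit conductance states (0/1)
x = rng.integers(0, 2, ROWS)   # 1-bit input slice on the wordlines

ps = column_partial_sum(w, x)  # true partial sum (an integer in this toy)

# Naive choice: ADC range covers the worst-case partial sum (= ROWS).
readout_wide = adc_quantize(ps, ADC_BITS, in_range=ROWS)

# Scaled choice (the abstract's idea, hedged): shrink the ADC input range to a
# clipping point chosen together with the weight scale, so the same number of
# ADC bits resolves the partial sums that actually occur more finely; the
# constant scale factor is undone afterwards by a digital scaler.
clip_range = ROWS / 2          # hypothetical clipping point for this toy
readout_scaled = adc_quantize(ps, ADC_BITS, in_range=clip_range)

print("true partial sum        :", ps)
print("4-bit ADC, full range   :", readout_wide)
print("4-bit ADC, scaled range :", readout_scaled)
```

In this toy, 1-bit weights and inputs make the partial sums concentrate well below the worst case of 128, so halving the ADC range doubles the resolution at negligible clipping risk; the abstract's claim is that choosing this range jointly with the DNN's weight parameters lets the ADC precision drop by multiple bits, with the residual scale absorbed by the digital scaler whose overhead the paper also addresses.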
Pages: 4897-4908
Number of pages: 12