Partial Sum Quantization for Reducing ADC Size in ReRAM-Based Neural Network Accelerators

Cited by: 0
Authors
Azamat, Azat [1 ]
Asim, Faaiz [2 ]
Kim, Jintae [3 ]
Lee, Jongeun [2 ]
Affiliations
[1] Ulsan Natl Inst Sci & Technol, Dept Comp Sci & Engn, Ulsan 44919, South Korea
[2] Ulsan Natl Inst Sci & Technol, Dept Elect Engn, Ulsan 44919, South Korea
[3] Konkuk Univ, Dept Elect & Elect Engn, Seoul 143701, South Korea
Keywords
Quantization (signal); Hardware; Artificial neural networks; Convolutional neural networks; Training; Throughput; Costs; AC-DC power converters; Memristors; Analog-to-digital conversion (ADC); convolutional neural network (CNN); in-memory computing accelerator; memristor; quantization;
DOI
10.1109/TCAD.2023.3294461
Chinese Library Classification
TP3 [Computing Technology, Computer Technology]
Discipline Code
0812
Abstract
While resistive random-access memory (ReRAM) crossbar arrays have the potential to significantly accelerate deep neural network (DNN) training through fast and low-cost matrix-vector multiplication, peripheral circuits like analog-to-digital converters (ADCs) impose a high overhead. These ADCs consume over half of the chip power and a considerable portion of the chip cost. To address this challenge, we propose advanced quantization techniques that can significantly reduce the ADC overhead of ReRAM crossbar arrays (RCAs). Our methodology interprets the ADC as a quantization mechanism, allowing us to scale the ADC input range optimally along with the weight parameters of a DNN, yielding a multiple-bit reduction in ADC precision. This approach reduces ADC size and power consumption severalfold, and it is applicable to any DNN type (binarized or multibit) and any RCA size. Additionally, we propose ways to minimize the overhead of the digital scaler, a component our scheme sometimes requires. Our experimental results using ResNet-18 on the ImageNet dataset demonstrate that our method can reduce the size of the ADC by 32 times compared to ISAAC with only a minimal accuracy degradation of 0.24%. We also present evaluation results in the presence of ReRAM nonideality (such as stuck-at faults).
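The abstract's core idea, viewing the ADC as a uniform quantizer whose input range is scaled jointly with the network weights, can be illustrated with a minimal NumPy sketch. This is not the paper's implementation; the column count, the binary bit-sliced inputs, and the percentile-based choice of clipping range are illustrative assumptions.

```python
import numpy as np

def adc_quantize(partial_sums, bits, clip):
    """Model an ADC as a uniform quantizer: clip the analog partial sums
    to [0, clip], then round to one of 2**bits levels (a simplification)."""
    levels = 2 ** bits - 1
    x = np.clip(partial_sums, 0.0, clip)
    return np.round(x / clip * levels) / levels * clip

rng = np.random.default_rng(0)
weights = rng.random((128, 64))          # hypothetical crossbar: 128 rows, 64 columns
inputs = rng.integers(0, 2, (1, 128))    # one bit-sliced binary input vector
ps = (inputs @ weights).ravel()          # analog partial sum per column

# Scaling the clipping range to the observed partial-sum distribution
# (rather than the worst case) lets a low-bit ADC cover the useful range.
clip = np.percentile(ps, 99.9)
out_4bit = adc_quantize(ps, bits=4, clip=clip)

# Quantization error is bounded by half an ADC step.
err = np.abs(out_4bit - np.clip(ps, 0.0, clip)).max()
```

The point of the sketch is that shrinking `clip` to match the realistic partial-sum range trades a small clipping error at the tail for much finer resolution over the bulk of the distribution, which is what permits the multiple-bit ADC-precision reduction the abstract claims.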
Pages: 4897 - 4908 (12 pages)
Related Papers
50 total
  • [1] Quarry: Quantization-based ADC Reduction for ReRAM-based Deep Neural Network Accelerators
    Azamat, Azat
    Asim, Faaiz
    Lee, Jongeun
    2021 IEEE/ACM INTERNATIONAL CONFERENCE ON COMPUTER AIDED DESIGN (ICCAD), 2021,
  • [2] Mixed Precision Quantization for ReRAM-based DNN Inference Accelerators
    Huang, Sitao
    Ankit, Aayush
    Silveira, Plinio
    Antunes, Rodrigo
    Chalamalasetti, Sai Rahul
    El Hajj, Izzat
    Kim, Dong Eun
    Aguiar, Glaucimar
    Bruel, Pedro
    Serebryakov, Sergey
    Xu, Cong
    Li, Can
    Faraboschi, Paolo
    Strachan, John Paul
    Chen, Deming
    Roy, Kaushik
    Hwu, Wen-mei
    Milojicic, Dejan
    2021 26TH ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE (ASP-DAC), 2021, : 372 - 377
  • [3] APQ: Automated DNN Pruning and Quantization for ReRAM-Based Accelerators
    Yang, Siling
    He, Shuibing
    Duan, Hexiao
    Chen, Weijian
    Zhang, Xuechen
    Wu, Tong
    Yin, Yanlong
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2023, 34 (09) : 2498 - 2511
  • [4] Trained Biased Number Representation for ReRAM-Based Neural Network Accelerators
    Wang, Weijia
    Lin, Bill
    ACM JOURNAL ON EMERGING TECHNOLOGIES IN COMPUTING SYSTEMS, 2019, 15 (02)
  • [5] A Quantized Training Framework for Robust and Accurate ReRAM-based Neural Network Accelerators
    Zhang, Chenguang
    Zhou, Pingqiang
    2021 26TH ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE (ASP-DAC), 2021, : 43 - 48
  • [6] ReHarvest: An ADC Resource-Harvesting Crossbar Architecture for ReRAM-Based DNN Accelerators
    Xu, Jiahong
    Li, Haikun
    Duan, Zhuohui
    Liao, Xiaofei
    Jin, Hai
    Yang, Xiaokang
    Li, Huize
    Liu, Cong
    Mao, Fubing
    Zhang, Yu
    ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, 2024, 21 (03)
  • [7] Offline Training-Based Mitigation of IR Drop for ReRAM-Based Deep Neural Network Accelerators
    Lee, Sugil
    Fouda, Mohammed E.
    Lee, Jongeun
    Eltawil, Ahmed M.
    Kurdahi, Fadi
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2023, 42 (02) : 521 - 532
  • [8] Hardware attacks on ReRAM-based AI accelerators
    Heidary, Masoud
    Joardar, Biresh Kumar
    17TH IEEE DALLAS CIRCUITS AND SYSTEMS CONFERENCE, DCAS 2024, 2024,
  • [9] Extreme Partial-Sum Quantization for Analog Computing-In-Memory Neural Network Accelerators
    Kim, Yulhwa
    Kim, Hyungjun
    Kim, Jae-Joon
    ACM JOURNAL ON EMERGING TECHNOLOGIES IN COMPUTING SYSTEMS, 2022, 18 (04)
  • [10] Device Modeling Bias in ReRAM-Based Neural Network Simulations
    Yousuf, Osama
    Hossen, Imtiaz
    Daniels, Matthew W.
    Lueker-Boden, Martin
    Dienstfrey, Andrew
    Adam, Gina C.
    IEEE JOURNAL ON EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS, 2023, 13 (01) : 382 - 394