Extreme Partial-Sum Quantization for Analog Computing-In-Memory Neural Network Accelerators

Cited by: 3

Authors:

Kim, Yulhwa [1]

Kim, Hyungjun [1]

Kim, Jae-Joon [2]

Affiliations:

[1] Pohang Univ Sci & Technol, 77 Cheongam Ro, Pohang 37673, Gyeongsangbuk D, South Korea

[2] Seoul Natl Univ, 1 Gwanak Ro, Seoul 08826, South Korea

Source:

ACM JOURNAL ON EMERGING TECHNOLOGIES IN COMPUTING SYSTEMS | 2022 / Vol. 18 / Issue 04

Funding:

National Research Foundation, Singapore;

Keywords:

Computing-in-memory; processing-in-memory; neural networks; analog computing;

DOI:

10.1145/3528104

Chinese Library Classification:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

In analog Computing-in-Memory (CIM) neural network accelerators, analog-to-digital converters (ADCs) are required to convert the analog partial sums generated by a CIM array into digital values. The ADC overhead substantially degrades the energy efficiency of CIM accelerators, so previous works have attempted to lower the ADC resolution by considering the distribution of the partial sums. Despite these efforts, the required ADC resolution remains relatively high. In this article, we propose a data-driven partial-sum quantization scheme that exhaustively searches for the optimal quantization range with little computational burden. We also report that analyzing the characteristics of the partial-sum distribution at each layer provides additional information that further reduces the ADC resolution, whereas previous works mostly used the characteristics of the partial-sum distribution of the entire network. Based on this finer-grained data-driven approach combined with retraining, we present a methodology for extreme partial-sum quantization. Experimental results show that the proposed method can reduce the ADC resolution to 2 to 3 bits for the CIFAR-10 dataset, a lower ADC resolution than that of any previous CIM-based NN accelerator.
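
The range search described in the abstract can be illustrated with a minimal sketch. The abstract does not state the search objective, the candidate grid, or the quantizer, so everything below is an assumption made for illustration: a uniform quantizer, mean squared error as the selection criterion, a 64-point candidate grid, and synthetic Gaussian data standing in for one layer's partial sums. The names `quantize`, `search_range`, and `layer_sums` are hypothetical and do not come from the paper.

```python
import numpy as np


def quantize(x, lo, hi, bits):
    """Uniformly quantize x to 2**bits levels over the clipping range [lo, hi]."""
    levels = 2 ** bits - 1
    step = (hi - lo) / levels
    clipped = np.clip(x, lo, hi)
    return np.round((clipped - lo) / step) * step + lo


def search_range(partial_sums, bits, num_candidates=64):
    """Try every (lo, hi) pair on a candidate grid and keep the lowest-MSE pair.

    Assumption: MSE is the selection criterion; the paper may use another one.
    """
    grid = np.linspace(partial_sums.min(), partial_sums.max(), num_candidates)
    best = (None, None, np.inf)
    for i, lo in enumerate(grid[:-1]):
        for hi in grid[i + 1:]:
            quantized = quantize(partial_sums, lo, hi, bits)
            err = np.mean((partial_sums - quantized) ** 2)
            if err < best[2]:
                best = (lo, hi, err)
    return best


# Synthetic stand-in for the partial sums observed at one layer (assumption).
rng = np.random.default_rng(0)
layer_sums = rng.normal(0.0, 4.0, size=10_000)

lo, hi, mse = search_range(layer_sums, bits=3)
print(f"3-bit clipping range: [{lo:.2f}, {hi:.2f}], MSE = {mse:.4f}")
```

Running the search once per layer, on that layer's own partial sums, would correspond to the layer-level analysis the abstract contrasts with network-wide statistics; the retraining step is not shown here.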


Pages: 19
