Extreme Partial-Sum Quantization for Analog Computing-In-Memory Neural Network Accelerators

Cited by: 3
Authors
Kim, Yulhwa [1 ]
Kim, Hyungjun [1 ]
Kim, Jae-Joon [2 ]
Affiliations
[1] Pohang Univ Sci & Technol, 77 Cheongam Ro, Pohang 37673, Gyeongsangbuk-do, South Korea
[2] Seoul Natl Univ, 1 Gwanak Ro, Seoul 08826, South Korea
Funding
National Research Foundation of Singapore;
Keywords
Computing-in-memory; processing-in-memory; neural networks; analog computing;
DOI
10.1145/3528104
CLC Classification
TP3 [Computing Technology, Computer Technology];
Discipline Code
0812;
Abstract
In Analog Computing-in-Memory (CIM) neural network accelerators, analog-to-digital converters (ADCs) are required to convert the analog partial sums generated by a CIM array into digital values. The overhead of the ADCs substantially degrades the energy efficiency of CIM accelerators, so previous works have attempted to lower the ADC resolution by considering the distribution of the partial sums. Despite these efforts, the required ADC resolution still remains relatively high. In this article, we propose a data-driven partial-sum quantization scheme that exhaustively searches for the optimal quantization range with little computational burden. We also show that analyzing the characteristics of the partial-sum distribution at each layer provides additional information that further reduces the ADC resolution compared to previous works, which mostly used the partial-sum distribution of the entire network. Based on this finer-grained data-driven approach combined with retraining, we present a methodology for extreme partial-sum quantization. Experimental results show that the proposed method can reduce the ADC resolution to 2 to 3 bits on the CIFAR-10 dataset, which is a smaller ADC bit resolution than that of any previous CIM-based NN accelerator.
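The core idea of the abstract, a per-layer exhaustive search for the partial-sum quantization range, can be illustrated with a minimal sketch. This is not the authors' implementation; it assumes partial sums have already been sampled from a CIM array (here stand-in random data), and names such as `quantize_partial_sums` and `search_clip_range` are illustrative.

```python
# Minimal sketch of a data-driven, per-layer search for a partial-sum
# quantization (clipping) range at a given ADC bit resolution.
import numpy as np

def quantize_partial_sums(psums, clip_max, n_bits):
    """Uniformly quantize partial sums to 2**n_bits ADC levels within [-clip_max, clip_max]."""
    n_levels = 2 ** n_bits
    step = 2.0 * clip_max / (n_levels - 1)
    clipped = np.clip(psums, -clip_max, clip_max)
    return np.round(clipped / step) * step

def search_clip_range(psums, n_bits, n_candidates=64):
    """Exhaustively try candidate clipping ranges for one layer and return
    the one minimizing mean-squared quantization error on the sampled data."""
    p_max = np.abs(psums).max()
    candidates = np.linspace(p_max / n_candidates, p_max, n_candidates)
    errors = [np.mean((psums - quantize_partial_sums(psums, c, n_bits)) ** 2)
              for c in candidates]
    return candidates[int(np.argmin(errors))]

# Example: layer-wise search at a 3-bit ADC resolution on sampled partial sums.
rng = np.random.default_rng(0)
layer_psums = rng.normal(0.0, 4.0, size=10_000)  # stand-in for measured partial sums
best_clip = search_clip_range(layer_psums, n_bits=3)
print(f"best clipping range: +/-{best_clip:.2f}")
```

In practice, the paper pairs such a layer-wise range selection with retraining so the network adapts to the aggressively quantized partial sums.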
Pages: 19