Extreme Partial-Sum Quantization for Analog Computing-In-Memory Neural Network Accelerators

Cited by: 3
Authors
Kim, Yulhwa [1]
Kim, Hyungjun [1]
Kim, Jae-Joon [2]
Affiliations
[1] Pohang Univ Sci & Technol, 77 Cheongam Ro, Pohang 37673, Gyeongsangbuk D, South Korea
[2] Seoul Natl Univ, 1 Gwanak Ro, Seoul 08826, South Korea
Funding
National Research Foundation, Singapore;
Keywords
Computing-in-memory; processing-in-memory; neural networks; analog computing;
DOI
10.1145/3528104
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812 ;
Abstract
In analog Computing-in-Memory (CIM) neural network accelerators, analog-to-digital converters (ADCs) are required to convert the analog partial sums generated by a CIM array into digital values. The ADC overhead substantially degrades the energy efficiency of CIM accelerators, so previous works have attempted to lower the ADC resolution by considering the distribution of the partial sums. Despite these efforts, the required ADC resolution remains relatively high. In this article, we propose a data-driven partial-sum quantization scheme that exhaustively searches for the optimal quantization range with little computational burden. We also report that analyzing the characteristics of the partial-sum distribution at each layer provides additional information that further reduces the ADC resolution, compared to previous works that mostly used the characteristics of the partial-sum distribution of the entire network. Based on this finer-grained data-driven approach combined with retraining, we present a methodology for extreme partial-sum quantization. Experimental results show that the proposed method can reduce the ADC resolution to 2 to 3 bits for the CIFAR-10 dataset, which is lower than the ADC resolution of any previous CIM-based NN accelerator.
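The idea of a per-layer, data-driven search for a quantization range can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not spell out; it assumes a uniform quantizer, mean squared error as the selection criterion, and a plain grid of candidate clipping ranges. The names `quantize`, `search_range`, and `num_candidates` are hypothetical.

```python
import numpy as np


def quantize(x, lo, hi, bits):
    """Uniformly quantize x to 2**bits levels over the clipping range [lo, hi]."""
    levels = 2 ** bits - 1
    step = (hi - lo) / levels
    q = np.round((np.clip(x, lo, hi) - lo) / step)
    return lo + q * step


def search_range(partial_sums, bits, num_candidates=32):
    """Try every (lo, hi) pair on a grid and keep the pair with the lowest MSE.

    Assumption: MSE between the original and quantized partial sums is the
    selection criterion; the paper may use a different one.
    """
    x = np.asarray(partial_sums, dtype=np.float64)
    x_min, x_max = x.min(), x.max()
    if x_min == x_max:
        # Degenerate case: all partial sums are identical, nothing to search.
        return x_min, x_max, 0.0
    grid = np.linspace(x_min, x_max, num_candidates)
    best_lo, best_hi, best_err = None, None, np.inf
    for i, lo in enumerate(grid[:-1]):
        for hi in grid[i + 1:]:
            err = float(np.mean((x - quantize(x, lo, hi, bits)) ** 2))
            if err < best_err:
                best_lo, best_hi, best_err = lo, hi, err
    return best_lo, best_hi, best_err


if __name__ == "__main__":
    # Synthetic stand-in for the partial sums collected from one layer.
    rng = np.random.default_rng(0)
    layer_sums = rng.normal(0.0, 4.0, size=2000)
    lo, hi, err = search_range(layer_sums, bits=3)
    print(f"3-bit range: [{lo:.3f}, {hi:.3f}], MSE = {err:.4f}")
```

Running the search once per layer, on partial sums collected from that layer, mirrors the layer-wise analysis described in the abstract; a single network-wide range would correspond to calling it once on the pooled partial sums.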
Pages: 19
Related papers
50 in total
  • [21] Memristor-based Deep Spiking Neural Network with a Computing-In-Memory Architecture
    Nowshin, Fabiha
    Yi, Yang
    PROCEEDINGS OF THE TWENTY THIRD INTERNATIONAL SYMPOSIUM ON QUALITY ELECTRONIC DESIGN (ISQED 2022), 2022, : 163 - 168
  • [22] A Quantization Model Based on a Floating-point Computing-in-Memory Architecture
    Chen, Xi
    Guo, An
    Xu, Xinbing
    Si, Xin
    Yang, Jun
    2022 IEEE ASIA PACIFIC CONFERENCE ON CIRCUITS AND SYSTEMS, APCCAS, 2022, : 493 - 496
  • [23] Computing-in-Memory with SRAM and RRAM for Binary Neural Networks
    Sun, Xiaoyu
    Liu, Rui
    Peng, Xiaochen
    Yu, Shimeng
    2018 14TH IEEE INTERNATIONAL CONFERENCE ON SOLID-STATE AND INTEGRATED CIRCUIT TECHNOLOGY (ICSICT), 2018, : 1245 - 1248
  • [24] Design Framework for SRAM-Based Computing-In-Memory Edge CNN Accelerators
    Wang, Yimin
    Zou, Zhuo
    Zheng, Lirong
    2021 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2021,
  • [25] CIM2PQ: An Arraywise and Hardware-Friendly Mixed Precision Quantization Method for Analog Computing-In-Memory
    Sun, Sifan
    Bai, Jinyu
    Shi, Zhaoyu
    Zhao, Weisheng
    Kang, Wang
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2024, 43 (07) : 2084 - 2097
  • [26] Impacts and solutions of nonvolatile-memory-induced weight error in the computing-in-memory neural network system
    Lin, Yu-Hsuan
    Lee, Dai-Ying
    Wang, Chao-Hung
    Wei, Ming-Liang
    Lee, Ming-Hsiu
    Lung, Hsiang-Lan
    Hsieh, Kuang-Yeu
    Wang, Keh-Chung
    Lu, Chih-Yuan
    JAPANESE JOURNAL OF APPLIED PHYSICS, 2020, 59 (SG)
  • [27] On the Accuracy of Analog Neural Network Inference Accelerators
    Xiao, T. Patrick
    Feinberg, Ben
    Bennett, Christopher H.
    Prabhakar, Venkatraman
    Saxena, Prashant
    Agrawal, Vineet
    Agarwal, Sapan
    Marinella, Matthew J.
    IEEE CIRCUITS AND SYSTEMS MAGAZINE, 2022, 22 (04) : 26 - 48
  • [28] NavCim: Comprehensive Design Space Exploration for Analog Computing-in-Memory Architectures
    Park, Juseong
    Kim, Boseok
    Sung, Hyojin
    PROCEEDINGS OF THE 2024 THE INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES, PACT 2024, 2024, : 168 - 182
  • [29] Flash memory based computing-in-memory system to solve partial differential equations
    Feng, Yang
    Wang, Fei
    Zhan, Xuepeng
    Li, Yuan
    Chen, Jiezhi
    SCIENCE CHINA-INFORMATION SCIENCES, 2021, 64 (06)
  • [30] Flash memory based computing-in-memory system to solve partial differential equations
    Yang FENG
    Fei WANG
    Xuepeng ZHAN
    Yuan LI
    Jiezhi CHEN
    Science China(Information Sciences), 2021, 64 (06) : 257 - 258