Extreme Partial-Sum Quantization for Analog Computing-In-Memory Neural Network Accelerators

Cited by: 3
Authors
Kim, Yulhwa [1 ]
Kim, Hyungjun [1 ]
Kim, Jae-Joon [2 ]
Affiliations
[1] Pohang Univ Sci & Technol, 77 Cheongam Ro, Pohang 37673, Gyeongsangbuk-do, South Korea
[2] Seoul Natl Univ, 1 Gwanak Ro, Seoul 08826, South Korea
Funding
National Research Foundation of Singapore;
Keywords
Computing-in-memory; processing-in-memory; neural networks; analog computing;
DOI
10.1145/3528104
CLC Classification
TP3 [Computing Technology, Computer Technology];
Discipline Code
0812;
Abstract
In Analog Computing-in-Memory (CIM) neural network accelerators, analog-to-digital converters (ADCs) are required to convert the analog partial sums generated by a CIM array into digital values. The overhead of the ADCs substantially degrades the energy efficiency of CIM accelerators, so previous works have attempted to lower the ADC resolution by considering the distribution of the partial sums. Despite these efforts, the required ADC resolution still remains relatively high. In this article, we propose a data-driven partial-sum quantization scheme that exhaustively searches for the optimal quantization range with little computational burden. We also show that analyzing the characteristics of the partial-sum distribution at each layer provides additional information that further reduces the ADC resolution compared to previous works, which mostly used the partial-sum distribution of the entire network. Based on this finer-grained data-driven approach combined with retraining, we present a methodology for extreme partial-sum quantization. Experimental results show that the proposed method can reduce the ADC resolution to 2 to 3 bits on the CIFAR-10 dataset, which is a smaller ADC bit resolution than that of any previous CIM-based NN accelerator.
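The core idea of the abstract, a per-layer exhaustive search for the partial-sum quantization range, can be illustrated with a minimal sketch. This is not the authors' implementation; it assumes partial sums have already been sampled from a CIM array (here stand-in random data), and names such as `quantize_partial_sums` and `search_clip_range` are illustrative.

```python
# Minimal sketch of a data-driven, per-layer search for a partial-sum
# quantization (clipping) range at a given ADC bit resolution.
import numpy as np

def quantize_partial_sums(psums, clip_max, n_bits):
    """Uniformly quantize partial sums to 2**n_bits ADC levels within [-clip_max, clip_max]."""
    n_levels = 2 ** n_bits
    step = 2.0 * clip_max / (n_levels - 1)
    clipped = np.clip(psums, -clip_max, clip_max)
    return np.round(clipped / step) * step

def search_clip_range(psums, n_bits, n_candidates=64):
    """Exhaustively try candidate clipping ranges for one layer and return
    the one minimizing mean-squared quantization error on the sampled data."""
    p_max = np.abs(psums).max()
    candidates = np.linspace(p_max / n_candidates, p_max, n_candidates)
    errors = [np.mean((psums - quantize_partial_sums(psums, c, n_bits)) ** 2)
              for c in candidates]
    return candidates[int(np.argmin(errors))]

# Example: layer-wise search at a 3-bit ADC resolution on sampled partial sums.
rng = np.random.default_rng(0)
layer_psums = rng.normal(0.0, 4.0, size=10_000)  # stand-in for measured partial sums
best_clip = search_clip_range(layer_psums, n_bits=3)
print(f"best clipping range: +/-{best_clip:.2f}")
```

In practice, the paper pairs such a layer-wise range selection with retraining so the network adapts to the aggressively quantized partial sums.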
Pages: 19