RNSiM: Efficient Deep Neural Network Accelerator Using Residue Number Systems

Cited by: 7
Authors
Roohi, Arman [1 ]
Taheri, MohammadReza
Angizi, Shaahin [2 ]
Fan, Deliang [3 ]
Affiliations
[1] Univ Nebraska, Dept Comp Sci & Engn, Lincoln, NE 68588 USA
[2] New Jersey Inst Technol, Dept Elect & Comp Engn, Newark, NJ 07102 USA
[3] Arizona State Univ, Sch Elect Comp & Energy Engn, Tempe, AZ USA
Keywords
residue number system; processing-in-memory; convolutional neural network; accelerator
DOI
10.1109/ICCAD51958.2021.9643531
Chinese Library Classification (CLC)
TP3 [Computing Technology, Computer Technology];
Discipline Code
0812;
Abstract
In this paper, we propose an efficient convolutional neural network (CNN) accelerator design, entitled RNSiM, based on the Residue Number System (RNS) as an alternative to the conventional binary number representation. Unlike traditional arithmetic implementations, which suffer from an inevitably long carry-propagation chain, the novelty of RNSiM lies in keeping all data in the RNS domain: weights are stored as residues, and all communication and computation are performed on residues. Due to the inherent parallelism of RNS arithmetic, power and latency are significantly reduced. Moreover, an enhanced integrated intermodulo operation core is developed to decrease the overhead imposed by non-modular operations. Further improvement in system performance and efficiency is achieved by developing efficient Processing-in-Memory (PIM) designs using various volatile CMOS and non-volatile post-CMOS technologies to accelerate RNS-based multiply-and-accumulate (MAC) operations. The RNSiM accelerator's performance is evaluated on several datasets, including MNIST, SVHN, and CIFAR-10. With almost the same accuracy as the baseline CNN, the RNSiM accelerator significantly improves both energy efficiency and speed compared with state-of-the-art FPGA, GPU, and PIM designs. RNSiM and other RNS-PIMs based on our method reduce energy consumption by 28-77x and 331-897x compared with the FPGA and GPU platforms, respectively.
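For readers unfamiliar with RNS arithmetic, the sketch below illustrates the principle the abstract relies on: an integer is encoded as its residues with respect to a set of pairwise-coprime moduli, MAC operations then proceed independently per residue channel with no carry propagation between channels, and the result is recovered via the Chinese Remainder Theorem. This is a minimal illustrative example only; the moduli set and the helper names (to_rns, rns_mac, from_rns) are generic choices for exposition and are not the RNSiM design or its moduli.

    # Minimal RNS-domain MAC sketch (illustrative; not the paper's implementation).
    from math import prod

    MODULI = (7, 11, 13, 15)  # pairwise-coprime moduli; dynamic range = 7*11*13*15

    def to_rns(x, moduli=MODULI):
        """Encode an integer as one residue per modulus (channel)."""
        return tuple(x % m for m in moduli)

    def rns_mac(acc, a, b, moduli=MODULI):
        """Multiply-accumulate performed independently per channel,
        with no carries crossing channel boundaries."""
        return tuple((r + x * y) % m for r, x, y, m in zip(acc, a, b, moduli))

    def from_rns(residues, moduli=MODULI):
        """Reconstruct the integer with the Chinese Remainder Theorem."""
        M = prod(moduli)
        x = 0
        for r, m in zip(residues, moduli):
            Mi = M // m
            x += r * Mi * pow(Mi, -1, m)  # modular inverse of Mi mod m
        return x % M

    # Dot product of two small vectors, computed entirely in the RNS domain.
    w, a = [3, 5, 2], [4, 1, 6]
    acc = to_rns(0)
    for wi, ai in zip(w, a):
        acc = rns_mac(acc, to_rns(wi), to_rns(ai))
    assert from_rns(acc) == sum(x * y for x, y in zip(w, a))  # 29

Because each channel's arithmetic stays within a small modulus, every residue lane can be mapped to a narrow, independent compute unit, which is what makes PIM-style parallel acceleration of RNS MACs attractive.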
Pages: 9