RNSiM: Efficient Deep Neural Network Accelerator Using Residue Number Systems

Cited by: 7
Authors
Roohi, Arman [1 ]
Taheri, MohammadReza
Angizi, Shaahin [2 ]
Fan, Deliang [3 ]
Affiliations
[1] Univ Nebraska, Dept Comp Sci & Engn, Lincoln, NE 68588 USA
[2] New Jersey Inst Technol, Dept Elect & Comp Engn, Newark, NJ 07102 USA
[3] Arizona State Univ, Sch Elect Comp & Energy Engn, Tempe, AZ USA
Keywords
residue number system; processing-in-memory; convolutional neural network; accelerator
DOI
10.1109/ICCAD51958.2021.9643531
Chinese Library Classification (CLC)
TP3 [Computing Technology, Computer Technology];
Discipline Code
0812;
Abstract
In this paper, we propose an efficient convolutional neural network (CNN) accelerator design, entitled RNSiM, based on the Residue Number System (RNS) as an alternative to the conventional binary number representation. Unlike traditional arithmetic implementations, which suffer from an inevitably long carry-propagation chain, the novelty of RNSiM lies in keeping all data in the RNS domain: weights are stored as residues, and all communication and computation are performed on residues. Due to the inherent parallelism of RNS arithmetic, power and latency are significantly reduced. Moreover, an enhanced integrated intermodulo operation core is developed to decrease the overhead imposed by non-modular operations. Further improvement in system performance and efficiency is achieved by developing efficient Processing-in-Memory (PIM) designs using various volatile CMOS and non-volatile post-CMOS technologies to accelerate RNS-based multiply-and-accumulate (MAC) operations. The RNSiM accelerator's performance is evaluated on several datasets, including MNIST, SVHN, and CIFAR-10. With almost the same accuracy as the baseline CNN, the RNSiM accelerator significantly improves both energy efficiency and speed compared with state-of-the-art FPGA, GPU, and PIM designs. RNSiM and other RNS-PIMs based on our method reduce energy consumption by 28-77x and 331-897x compared with the FPGA and GPU platforms, respectively.
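For readers unfamiliar with RNS arithmetic, the sketch below illustrates the principle the abstract relies on: an integer is encoded as its residues with respect to a set of pairwise-coprime moduli, MAC operations then proceed independently per residue channel with no carry propagation between channels, and the result is recovered via the Chinese Remainder Theorem. This is a minimal illustrative example only; the moduli set and the helper names (to_rns, rns_mac, from_rns) are generic choices for exposition and are not the RNSiM design or its moduli.

    # Minimal RNS-domain MAC sketch (illustrative; not the paper's implementation).
    from math import prod

    MODULI = (7, 11, 13, 15)  # pairwise-coprime moduli; dynamic range = 7*11*13*15

    def to_rns(x, moduli=MODULI):
        """Encode an integer as one residue per modulus (channel)."""
        return tuple(x % m for m in moduli)

    def rns_mac(acc, a, b, moduli=MODULI):
        """Multiply-accumulate performed independently per channel,
        with no carries crossing channel boundaries."""
        return tuple((r + x * y) % m for r, x, y, m in zip(acc, a, b, moduli))

    def from_rns(residues, moduli=MODULI):
        """Reconstruct the integer with the Chinese Remainder Theorem."""
        M = prod(moduli)
        x = 0
        for r, m in zip(residues, moduli):
            Mi = M // m
            x += r * Mi * pow(Mi, -1, m)  # modular inverse of Mi mod m
        return x % M

    # Dot product of two small vectors, computed entirely in the RNS domain.
    w, a = [3, 5, 2], [4, 1, 6]
    acc = to_rns(0)
    for wi, ai in zip(w, a):
        acc = rns_mac(acc, to_rns(wi), to_rns(ai))
    assert from_rns(acc) == sum(x * y for x, y in zip(w, a))  # 29

Because each channel's arithmetic stays within a small modulus, every residue lane can be mapped to a narrow, independent compute unit, which is what makes PIM-style parallel acceleration of RNS MACs attractive.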
Pages: 9