Processing-in-Memory Accelerator for Dynamic Neural Network with Run-Time Tuning of Accuracy, Power and Latency

Cited by: 2
Authors
Yang, Li [1 ]
He, Zhezhi [1 ]
Angizi, Shaahin [1 ]
Fan, Deliang [1 ]
Affiliation
[1] Arizona State Univ, Sch Elect Comp & Energy Engn, Tempe, AZ 85281 USA
Funding
US National Science Foundation;
Keywords
Processing-in-Memory; Dynamic neural network;
DOI
10.1109/SOCC49529.2020.9524770
Chinese Library Classification (CLC)
TM [Electrical technology]; TN [Electronic technology, communication technology];
Subject Classification Code
0808; 0809;
Abstract
With the wide deployment of powerful deep neural networks (DNNs) on smart but resource-limited IoT devices, many prior works have proposed compressing DNNs in a hardware-aware manner to reduce computing complexity while maintaining accuracy, e.g., via weight quantization, pruning, and convolution decomposition. In typical DNN compression methods, however, a smaller but fixed network structure is generated from a relatively large background model for deployment on a resource-limited hardware accelerator. Such optimization lacks the ability to tune the network structure on-the-fly to best fit a dynamic allocation of computing hardware resources and workloads. In this paper, we mainly review two of our prior works [1], [2] that address this issue, discussing how to construct a dynamic DNN structure through either uniform or non-uniform channel-selection-based sub-network sampling. The constructed dynamic DNN can tune its computing path to involve a different number of channels, thus providing the ability to trade off speed, power, and accuracy on-the-fly after model deployment. Correspondingly, an emerging Spin-Orbit Torque Magnetic Random-Access-Memory (SOT-MRAM) based Processing-In-Memory (PIM) accelerator is also discussed for such a dynamic neural network structure.
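To illustrate the uniform channel-selection idea the abstract describes, the following is a minimal NumPy sketch (not the authors' implementation): a convolutional layer's sub-network is sampled at run time by keeping only a leading fraction of output channels, which directly scales the multiply-accumulate (MAC) count and hence speed and power at some accuracy cost. The function names, the width ratios, and the layer dimensions are all hypothetical choices for illustration.

```python
import numpy as np

def sample_subnetwork(weight, width_ratio):
    """Uniform channel selection: keep the first fraction of output
    channels of a conv weight tensor shaped (C_out, C_in, kH, kW)."""
    c_out = max(1, int(weight.shape[0] * width_ratio))
    return weight[:c_out]

def conv2d_macs(weight, h, w):
    """MAC count for a stride-1 'same' convolution on an h x w feature map."""
    c_out, c_in, kh, kw = weight.shape
    return c_out * c_in * kh * kw * h * w

# Hypothetical full-model layer: 64 output channels, 32 input, 3x3 kernel.
full = np.random.randn(64, 32, 3, 3)

# Three run-time width settings trade compute for accuracy after deployment.
for ratio in (1.0, 0.5, 0.25):
    sub = sample_subnetwork(full, ratio)
    print(ratio, sub.shape[0], conv2d_macs(sub, 28, 28))
```

Because the sub-network reuses a prefix of the full weight tensor rather than an arbitrary subset, all width settings can share one stored copy of the weights, which is what makes such switching attractive for a PIM accelerator; the non-uniform variant in the reviewed works instead selects per-layer channel counts.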
Pages: 117-122
Page count: 6
Related Papers
50 items
  • [41] Low-power architecture with scratch-pad memory for accelerating embedded applications with run-time reuse
    Milidonis, A.
    Porpodas, V.
    Alachiotis, N.
    Kakarountas, A. P.
    Michail, H.
    Panagiotakopoulos, G.
    Goutis, C. E.
    IET COMPUTERS AND DIGITAL TECHNIQUES, 2009, 3 (01): : 109 - 123
  • [42] Run-time reconfigurable adaptive signal processing system with asynchronous dynamic pipelining: A case study of DLMS ADFE
    Chen, SZ
    Zhang, T
    2004 IEEE WORKSHOP ON SIGNAL PROCESSING SYSTEMS DESIGN AND IMPLEMENTATION, PROCEEDINGS, 2004, : 158 - 163
  • [43] Power and Area Optimization for Run-Time Reconfiguration System On Programmable Chip Based on Magnetic Random Access Memory
    Zhao, Weisheng
    Belhaire, Eric
    Chappert, Claude
    Mazoyer, Pascale
    IEEE TRANSACTIONS ON MAGNETICS, 2009, 45 (02) : 776 - 780
  • [44] A Flexible Processing-in-Memory Accelerator for Dynamic Channel-Adaptive Deep Neural Networks
    Yang, Li
    Angizi, Shaahin
    Fan, Deliang
    2020 25TH ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE, ASP-DAC 2020, 2020, : 313 - 318
  • [45] In-Memory Neural Network Accelerator based on eDRAM Cell with Enhanced Retention Time
    Lee, Inhwan
    Kim, Eunhwan
    Kang, Nameun
    Oh, Hyunmyung
    Kim, Jae-Joon
    2023 60TH ACM/IEEE DESIGN AUTOMATION CONFERENCE, DAC, 2023,
  • [46] Inter-Hierarchical Power Analysis Methodology to Reduce Multiple Orders of Magnitude Run-Time without Compromising Accuracy
    Nan, Haiqing
    Choi, Ken
    2009 INTERNATIONAL SOC DESIGN CONFERENCE (ISOCC 2009), 2009, : 556 - 559
  • [47] Parasitic-Aware Modeling and Neural Network Training Scheme for Energy-Efficient Processing-in-Memory With Resistive Crossbar Array
    Cao, Tiancheng
    Liu, Chen
    Gao, Yuan
    Goh, Wang Ling
    IEEE JOURNAL ON EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS, 2022, 12 (02) : 436 - 444
  • [48] Millipede: A user-level NT-based distributed shared memory system with thread migration and dynamic run-time optimization of memory references
    Itzkovitz, A
    Schuster, A
    Shalev, L
    PROCEEDINGS OF THE USENIX WINDOWS NT WORKSHOP, 1997, : 148 - 148
  • [49] Data Pruning-enabled High Performance and Reliable Graph Neural Network Training on ReRAM-based Processing-in-Memory Accelerators
    Ogbogu, Chukwufumnanya
    Joardar, Biresh
    Chakrabarty, Krishnendu
    Doppa, Jana
    Pande, Partha Pratim
    ACM TRANSACTIONS ON DESIGN AUTOMATION OF ELECTRONIC SYSTEMS, 2024, 29 (05)
  • [50] An Area- and Energy-Efficient Spiking Neural Network With Spike-Time-Dependent Plasticity Realized With SRAM Processing-in-Memory Macro and On-Chip Unsupervised Learning
    Liu, Shuang
    Wang, J. J.
    Zhou, J. T.
    Hu, S. G.
    Yu, Q.
    Chen, T. P.
    Liu, Y.
    IEEE TRANSACTIONS ON BIOMEDICAL CIRCUITS AND SYSTEMS, 2023, 17 (01) : 92 - 104