Processing-in-Memory Accelerator for Dynamic Neural Network with Run-Time Tuning of Accuracy, Power and Latency

Cited by: 2
Authors
Yang, Li [1 ]
He, Zhezhi [1 ]
Angizi, Shaahin [1 ]
Fan, Deliang [1 ]
Affiliations
[1] Arizona State Univ, Sch Elect Comp & Energy Engn, Tempe, AZ 85281 USA
Funding
US National Science Foundation;
Keywords
Processing-in-Memory; Dynamic neural network;
DOI
10.1109/SOCC49529.2020.9524770
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Subject Classification Codes
0808 ; 0809 ;
Abstract
With the widespread deployment of powerful deep neural networks (DNNs) on smart but resource-limited IoT devices, many prior works have proposed compressing DNNs in a hardware-aware manner, e.g., via weight quantization, pruning, and convolution decomposition, to reduce computing complexity while maintaining accuracy. In typical DNN compression methods, however, a smaller but fixed network structure is generated from a relatively large background model for deployment on a resource-limited hardware accelerator. Such optimization lacks the ability to tune the network structure on-the-fly to best fit dynamic hardware resource allocation and workloads. In this paper, we mainly review two of our prior works [1], [2] that address this issue, discussing how to construct a dynamic DNN structure through either uniform or non-uniform channel-selection-based sub-network sampling. The constructed dynamic DNN can tune its computing path to involve different numbers of channels, providing the ability to trade off speed, power, and accuracy on-the-fly after model deployment. Correspondingly, an emerging Spin-Orbit Torque Magnetic Random-Access-Memory (SOT-MRAM) based Processing-In-Memory (PIM) accelerator for such dynamic neural network structures is also discussed.
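The abstract's core idea, sampling a narrower sub-network from a full-width model at run time so that fewer channels are computed, can be sketched as follows. This is a minimal illustration of uniform channel selection using NumPy, not the paper's actual implementation; the class and parameter names (`SlimmableLinear`, `alpha`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

class SlimmableLinear:
    """A layer storing full-width weights from which a narrower
    sub-network can be sampled at inference time (illustrative only)."""

    def __init__(self, in_features, out_features):
        self.weight = rng.standard_normal((out_features, in_features))
        self.bias = np.zeros(out_features)

    def forward(self, x, alpha=1.0):
        # Uniform channel selection: keep only the first ceil(alpha * C_out)
        # output channels. Smaller alpha means fewer multiply-accumulates,
        # i.e., lower latency and power at some cost in accuracy.
        k = max(1, int(np.ceil(alpha * self.weight.shape[0])))
        return x @ self.weight[:k].T + self.bias[:k]

layer = SlimmableLinear(8, 16)
x = rng.standard_normal((1, 8))
full = layer.forward(x, alpha=1.0)   # all 16 channels
slim = layer.forward(x, alpha=0.25)  # only 4 channels computed
```

Because the slim path reuses a prefix of the full weight tensor, the sampled sub-network's outputs coincide with the corresponding slice of the full network's outputs, which is what lets one deployed model serve multiple accuracy/latency operating points.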
Pages: 117 - 122
Page count: 6