Processing-in-Memory Accelerator for Dynamic Neural Network with Run-Time Tuning of Accuracy, Power and Latency

Cited by: 2

Authors:

Yang, Li [1]

He, Zhezhi [1]

Angizi, Shaahin [1]

Fan, Deliang [1]

Affiliation:

[1] Arizona State Univ, Sch Elect Comp & Energy Engn, Tempe, AZ 85281 USA

Source:

2020 IEEE 33RD INTERNATIONAL SYSTEM-ON-CHIP CONFERENCE (SOCC) | 2020

Funding:

U.S. National Science Foundation;

Keywords:

Processing-in-Memory; Dynamic neural network;

DOI:

10.1109/SOCC49529.2020.9524770

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification code:

0808 ; 0809 ;

Abstract:

With the wide deployment of powerful deep neural networks (DNNs) on smart but resource-limited IoT devices, many prior works have proposed compressing DNNs in a hardware-aware manner, using techniques such as weight quantization, pruning, and convolution decomposition, to reduce computing complexity while maintaining accuracy. However, typical DNN compression methods generate a smaller but fixed network structure from a relatively large background model for deployment on a resource-limited hardware accelerator. Such optimization lacks the ability to tune the network structure on the fly to best fit dynamically changing hardware resource allocation and workloads. In this paper, we mainly review two of our prior works [1], [2] that address this issue, discussing how to construct a dynamic DNN structure through sub-network sampling based on either uniform or non-uniform channel selection. The constructed dynamic DNN can tune its computing path to involve different numbers of channels, thus providing the ability to trade off speed, power, and accuracy on the fly after model deployment. Correspondingly, an emerging Spin-Orbit Torque Magnetic Random-Access Memory (SOT-MRAM) based Processing-In-Memory (PIM) accelerator for such dynamic neural network structures is also discussed.
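To make the idea of uniform channel selection concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy two-layer fully connected network in which a single width factor keeps the first fraction of hidden channels, and it counts multiply-accumulate (MAC) operations as a rough proxy for latency and power. All names here (`make_weights`, `active_channels`, `run_subnetwork`) and the layer sizes are hypothetical.

```python
import random


def make_weights(out_ch, in_ch, rng):
    """Random weight matrix with out_ch rows and in_ch columns (illustrative only)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(in_ch)] for _ in range(out_ch)]


def active_channels(full_channels, width):
    """Number of channels kept when the same width factor is applied uniformly."""
    return max(1, round(full_channels * width))


def run_subnetwork(w1, w2, x, width):
    """Run the shared weights using only the first `width` fraction of hidden channels.

    Returns the output vector and the number of MACs that were executed.
    """
    hidden = active_channels(len(w1), width)
    macs = 0

    # Layer 1: compute only the selected hidden channels, followed by ReLU.
    h = []
    for row in w1[:hidden]:
        h.append(max(0.0, sum(w * v for w, v in zip(row, x))))
        macs += len(x)

    # Layer 2: every output is kept, but it reads only the active hidden channels.
    y = []
    for row in w2:
        y.append(sum(w * v for w, v in zip(row[:hidden], h)))
        macs += hidden

    return y, macs


rng = random.Random(0)
W1 = make_weights(16, 8, rng)   # 8 inputs -> 16 hidden channels (assumed sizes)
W2 = make_weights(4, 16, rng)   # 16 hidden channels -> 4 outputs
x = [rng.uniform(-1.0, 1.0) for _ in range(8)]

# One set of stored weights serves several run-time operating points.
mac_counts = {width: run_subnetwork(W1, W2, x, width)[1] for width in (0.25, 0.5, 1.0)}
print(mac_counts)
```

In this sketch the same stored weights serve every width setting, so switching the operating point at run time only changes how many channels are read and computed. A non-uniform variant would choose a separate channel count per layer instead of one shared width factor.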


Pages: 117-122

Number of pages: 6

50 records in total

[21] Hardware spiking neural network with run-time reconfigurable connectivity in an autonomous robot
Roggen, D
Hofmann, S
Thoma, Y
Floreano, D
2003 NASA/DOD CONFERENCE ON EVOLVABLE HARDWARE, 2003, : 189 - 198
[22] Quant-PIM: An Energy-Efficient Processing-in-Memory Accelerator for Layerwise Quantized Neural Networks
Lee, Young Seo
Chung, Eui-Young
Gong, Young-Ho
Chung, Sung Woo
IEEE EMBEDDED SYSTEMS LETTERS, 2021, 13 (04) : 162 - 165
[23] Run-Time Power-Down Strategies for Real-Time SDRAM Memory Controllers
Chandrasekar, Karthik
Akesson, Benny
Goossens, Kees
2012 49TH ACM/EDAC/IEEE DESIGN AUTOMATION CONFERENCE (DAC), 2012, : 988 - 993
[24] Energy Harvesting-assisted Ultra-Low-Power Processing-in-Memory Accelerator for ML Applications
Shukla, Sanket
Bavikadi, Sathwika
Dinakarrao, Sai Manoj Pudukotai
PROCEEDING OF THE GREAT LAKES SYMPOSIUM ON VLSI 2024, GLSVLSI 2024, 2024, : 633 - 638
[25] Run-time software monitor of the power consumption of wireless network interface cards
Lattanzi, E
Acquaviva, A
Bogliolo, A
INTEGRATED CIRCUIT AND SYSTEM DESIGN: POWER AND TIMING MODELING, OPTIMIZATION AND SIMULATION, 2004, 3254 : 352 - 361
[26] PRIME: A Novel Processing-in-memory Architecture for Neural Network Computation in ReRAM-based Main Memory
Chi, Ping
Li, Shuangchen
Xu, Cong
Zhang, Tao
Zhao, Jishen
Liu, Yongpan
Wang, Yu
Xie, Yuan
2016 ACM/IEEE 43RD ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA), 2016, : 27 - 39
[27] Circuit switched run-time adaptive network-on-chip for image processing applications
Braun, Lars
Huebner, Michael
Becker, Juergen
Perschke, Thomas
Schatz, Volker
Bach, Stefan
2007 INTERNATIONAL CONFERENCE ON FIELD PROGRAMMABLE LOGIC AND APPLICATIONS, PROCEEDINGS, VOLS 1 AND 2, 2007, : 688 - 691
[28] Run-time Non-uniform Quantization for Dynamic Neural Networks in Wireless Communication
Allwin, Priscilla Sharon
Gomony, Manil Dev
Geilen, Marc
29TH ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE, ASP-DAC 2024, 2024, : 915 - 920
[29] Dynamic Performance and Power Optimization with Heterogeneous Processing-in-Memory for AI Applications on Edge Devices
Jeon, Sangmin
Lee, Kangju
Lee, Kyeongwon
Lee, Woojoo
MICROMACHINES, 2024, 15 (10)
[30] An analytical, dynamic, power-performance router model for run-time NoC optimizations
Zoni, Davide
Terraneo, Federico
Fornaciari, William
2013 IEEE 26TH INTERNATIONAL SOC CONFERENCE (SOCC), 2013, : 290 - 295
