Processing-in-Memory Accelerator for Dynamic Neural Network with Run-Time Tuning of Accuracy, Power and Latency

Cited by: 2
Authors
Yang, Li [1 ]
He, Zhezhi [1 ]
Angizi, Shaahin [1 ]
Fan, Deliang [1 ]
Affiliations
[1] Arizona State Univ, Sch Elect Comp & Energy Engn, Tempe, AZ 85281 USA
Funding
US National Science Foundation;
Keywords
Processing-in-Memory; Dynamic neural network;
DOI
10.1109/SOCC49529.2020.9524770
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Subject Classification Codes
0808 ; 0809 ;
Abstract
With the widespread deployment of powerful deep neural networks (DNNs) on smart but resource-limited IoT devices, many prior works have proposed compressing DNNs in a hardware-aware manner, e.g., via weight quantization, pruning, and convolution decomposition, to reduce computing complexity while maintaining accuracy. In typical DNN compression methods, however, a smaller but fixed network structure is generated from a relatively large background model for deployment on a resource-limited hardware accelerator. Such optimization lacks the ability to tune the network structure on-the-fly to best fit dynamic hardware resource allocation and workloads. In this paper, we mainly review two of our prior works [1], [2] that address this issue, discussing how to construct a dynamic DNN structure through either uniform or non-uniform channel-selection-based sub-network sampling. The constructed dynamic DNN can tune its computing path to involve different numbers of channels, providing the ability to trade off speed, power, and accuracy on-the-fly after model deployment. Correspondingly, an emerging Spin-Orbit Torque Magnetic Random-Access-Memory (SOT-MRAM) based Processing-In-Memory (PIM) accelerator for such dynamic neural network structures is also discussed.
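The abstract's core idea, sampling a narrower sub-network from a full-width model at run time so that fewer channels are computed, can be sketched as follows. This is a minimal illustration of uniform channel selection using NumPy, not the paper's actual implementation; the class and parameter names (`SlimmableLinear`, `alpha`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

class SlimmableLinear:
    """A layer storing full-width weights from which a narrower
    sub-network can be sampled at inference time (illustrative only)."""

    def __init__(self, in_features, out_features):
        self.weight = rng.standard_normal((out_features, in_features))
        self.bias = np.zeros(out_features)

    def forward(self, x, alpha=1.0):
        # Uniform channel selection: keep only the first ceil(alpha * C_out)
        # output channels. Smaller alpha means fewer multiply-accumulates,
        # i.e., lower latency and power at some cost in accuracy.
        k = max(1, int(np.ceil(alpha * self.weight.shape[0])))
        return x @ self.weight[:k].T + self.bias[:k]

layer = SlimmableLinear(8, 16)
x = rng.standard_normal((1, 8))
full = layer.forward(x, alpha=1.0)   # all 16 channels
slim = layer.forward(x, alpha=0.25)  # only 4 channels computed
```

Because the slim path reuses a prefix of the full weight tensor, the sampled sub-network's outputs coincide with the corresponding slice of the full network's outputs, which is what lets one deployed model serve multiple accuracy/latency operating points.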
Pages: 117 - 122
Page count: 6