Processing-in-Memory Accelerator for Dynamic Neural Network with Run-Time Tuning of Accuracy, Power and Latency

Cited by: 2
Authors
Yang, Li [1 ]
He, Zhezhi [1 ]
Angizi, Shaahin [1 ]
Fan, Deliang [1 ]
Affiliation
[1] Arizona State Univ, Sch Elect Comp & Energy Engn, Tempe, AZ 85281 USA
Funding
US National Science Foundation;
Keywords
Processing-in-Memory; Dynamic neural network;
DOI
10.1109/SOCC49529.2020.9524770
Chinese Library Classification (CLC)
TM [Electrical technology]; TN [Electronic technology, communication technology];
Subject Classification Code
0808; 0809;
Abstract
With the wide deployment of powerful deep neural networks (DNNs) on smart but resource-limited IoT devices, many prior works have proposed compressing DNNs in a hardware-aware manner to reduce computing complexity while maintaining accuracy, e.g., via weight quantization, pruning, and convolution decomposition. In typical DNN compression methods, however, a smaller but fixed network structure is generated from a relatively large background model for deployment on a resource-limited hardware accelerator. Such optimization lacks the ability to tune the network structure on-the-fly to best fit a dynamic allocation of computing hardware resources and workloads. In this paper, we mainly review two of our prior works [1], [2] that address this issue, discussing how to construct a dynamic DNN structure through either uniform or non-uniform channel-selection-based sub-network sampling. The constructed dynamic DNN can tune its computing path to involve a different number of channels, thus providing the ability to trade off speed, power, and accuracy on-the-fly after model deployment. Correspondingly, an emerging Spin-Orbit Torque Magnetic Random-Access-Memory (SOT-MRAM) based Processing-In-Memory (PIM) accelerator is also discussed for such a dynamic neural network structure.
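To illustrate the uniform channel-selection idea the abstract describes, the following is a minimal NumPy sketch (not the authors' implementation): a convolutional layer's sub-network is sampled at run time by keeping only a leading fraction of output channels, which directly scales the multiply-accumulate (MAC) count and hence speed and power at some accuracy cost. The function names, the width ratios, and the layer dimensions are all hypothetical choices for illustration.

```python
import numpy as np

def sample_subnetwork(weight, width_ratio):
    """Uniform channel selection: keep the first fraction of output
    channels of a conv weight tensor shaped (C_out, C_in, kH, kW)."""
    c_out = max(1, int(weight.shape[0] * width_ratio))
    return weight[:c_out]

def conv2d_macs(weight, h, w):
    """MAC count for a stride-1 'same' convolution on an h x w feature map."""
    c_out, c_in, kh, kw = weight.shape
    return c_out * c_in * kh * kw * h * w

# Hypothetical full-model layer: 64 output channels, 32 input, 3x3 kernel.
full = np.random.randn(64, 32, 3, 3)

# Three run-time width settings trade compute for accuracy after deployment.
for ratio in (1.0, 0.5, 0.25):
    sub = sample_subnetwork(full, ratio)
    print(ratio, sub.shape[0], conv2d_macs(sub, 28, 28))
```

Because the sub-network reuses a prefix of the full weight tensor rather than an arbitrary subset, all width settings can share one stored copy of the weights, which is what makes such switching attractive for a PIM accelerator; the non-uniform variant in the reviewed works instead selects per-layer channel counts.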
Pages: 117-122
Page count: 6
Related Papers
50 items
  • [41] Low-power architecture with scratch-pad memory for accelerating embedded applications with run-time reuse
    Milidonis, A.
    Porpodas, V.
    Alachiotis, N.
    Kakarountas, A. P.
    Michail, H.
    Panagiotakopoulos, G.
    Goutis, C. E.
    IET COMPUTERS AND DIGITAL TECHNIQUES, 2009, 3 (01): : 109 - 123
  • [42] Run-time reconfigurable adaptive signal processing system with asynchronous dynamic pipelining: A case study of DLMS ADFE
    Chen, SZ
    Zhang, T
    2004 IEEE WORKSHOP ON SIGNAL PROCESSING SYSTEMS DESIGN AND IMPLEMENTATION, PROCEEDINGS, 2004, : 158 - 163
  • [43] Power and Area Optimization for Run-Time Reconfiguration System On Programmable Chip Based on Magnetic Random Access Memory
    Zhao, Weisheng
    Belhaire, Eric
    Chappert, Claude
    Mazoyer, Pascale
    IEEE TRANSACTIONS ON MAGNETICS, 2009, 45 (02) : 776 - 780
  • [44] A Flexible Processing-in-Memory Accelerator for Dynamic Channel-Adaptive Deep Neural Networks
    Yang, Li
    Angizi, Shaahin
    Fan, Deliang
    2020 25TH ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE, ASP-DAC 2020, 2020, : 313 - 318
  • [45] In-Memory Neural Network Accelerator based on eDRAM Cell with Enhanced Retention Time
    Lee, Inhwan
    Kim, Eunhwan
    Kang, Nameun
    Oh, Hyunmyung
    Kim, Jae-Joon
    2023 60TH ACM/IEEE DESIGN AUTOMATION CONFERENCE, DAC, 2023,
  • [46] Inter-Hierarchical Power Analysis Methodology to Reduce Multiple Orders of Magnitude Run-Time without Compromising Accuracy
    Nan, Haiqing
    Choi, Ken
    2009 INTERNATIONAL SOC DESIGN CONFERENCE (ISOCC 2009), 2009, : 556 - 559
  • [47] Parasitic-Aware Modeling and Neural Network Training Scheme for Energy-Efficient Processing-in-Memory With Resistive Crossbar Array
    Cao, Tiancheng
    Liu, Chen
    Gao, Yuan
    Goh, Wang Ling
    IEEE JOURNAL ON EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS, 2022, 12 (02) : 436 - 444
  • [48] Millipede: A user-level NT-based distributed shared memory system with thread migration and dynamic run-time optimization of memory references
    Itzkovitz, A
    Schuster, A
    Shalev, L
    PROCEEDINGS OF THE USENIX WINDOWS NT WORKSHOP, 1997, : 148 - 148
  • [49] Data Pruning-enabled High Performance and Reliable Graph Neural Network Training on ReRAM-based Processing-in-Memory Accelerators
    Ogbogu, Chukwufumnanya
    Joardar, Biresh
    Chakrabarty, Krishnendu
    Doppa, Jana
    Pande, Partha Pratim
    ACM TRANSACTIONS ON DESIGN AUTOMATION OF ELECTRONIC SYSTEMS, 2024, 29 (05)
  • [50] An Area- and Energy-Efficient Spiking Neural Network With Spike-Time-Dependent Plasticity Realized With SRAM Processing-in-Memory Macro and On-Chip Unsupervised Learning
    Liu, Shuang
    Wang, J. J.
    Zhou, J. T.
    Hu, S. G.
    Yu, Q.
    Chen, T. P.
    Liu, Y.
    IEEE TRANSACTIONS ON BIOMEDICAL CIRCUITS AND SYSTEMS, 2023, 17 (01) : 92 - 104