Scaling for edge inference of deep neural networks

Cited by: 301
Authors
Xu, Xiaowei [1 ]
Ding, Yukun [1 ]
Hu, Sharon Xiaobo [1 ]
Niemier, Michael [1 ]
Cong, Jason [2 ]
Hu, Yu [3 ]
Shi, Yiyu [1 ]
Affiliations
[1] Univ Notre Dame, Dept Comp Sci, Notre Dame, IN 46556 USA
[2] Univ Calif Los Angeles, Dept Comp Sci, Los Angeles, CA 90024 USA
[3] Huazhong Univ Sci & Technol, Sch Opt & Elect Informat, Wuhan, Hubei, Peoples R China
Source
NATURE ELECTRONICS | 2018, Vol. 1, No. 4
Keywords
ENERGY;
DOI
10.1038/s41928-018-0059-3
Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronic and communication technology]
Subject classification codes
0808; 0809
Abstract
Deep neural networks offer considerable potential across a range of applications, from advanced manufacturing to autonomous cars. A clear trend in deep neural networks is the exponential growth of network size and the associated increases in computational complexity and memory consumption. However, the performance and energy efficiency of edge inference, in which the inference (the application of a trained network to new data) is performed locally on embedded platforms that have limited area and power budget, is bounded by technology scaling. Here we analyse recent data and show that there are increasing gaps between the computational complexity and energy efficiency required by data scientists and the hardware capacity made available by hardware architects. We then discuss various architecture and algorithm innovations that could help to bridge the gaps.
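To make the gap described above concrete, here is a minimal back-of-the-envelope sketch in Python, assuming a fixed energy cost per operation. The function name edge_inference_feasible and every numeric value (operation count, pJ/op, frame rate, power budget) are illustrative assumptions and are not figures from the paper.

    # Back-of-the-envelope estimate (illustrative assumptions only): can a
    # hypothetical edge device sustain real-time inference within its power budget?
    def edge_inference_feasible(ops_per_inference, energy_per_op_pj, fps, power_budget_w):
        """Return (required average power in W, True if it fits the budget)."""
        energy_per_inference_j = ops_per_inference * energy_per_op_pj * 1e-12  # pJ -> J
        required_power_w = energy_per_inference_j * fps
        return required_power_w, required_power_w <= power_budget_w

    if __name__ == "__main__":
        # Assumed figures: ~4e9 ops per inference (a ResNet-50-class network),
        # 5 pJ per operation on an embedded accelerator, 30 frames/s, 2 W budget.
        power, fits = edge_inference_feasible(4e9, 5.0, 30, 2.0)
        print(f"Required power: {power:.2f} W ({'within' if fits else 'exceeds'} a 2 W budget)")

Under these assumed numbers the workload fits (about 0.6 W), but a tenfold increase in operation count, in line with the exponential growth trend noted in the abstract, would exceed the same 2 W budget unless energy per operation improves accordingly.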
Pages: 216-222
Number of pages: 7
Related papers
50 items in total
  • [41] Redundant feature pruning for accelerated inference in deep neural networks
    Ayinde, Babajide O.
    Inanc, Tamer
    Zurada, Jacek M.
    NEURAL NETWORKS, 2019, 118 : 148 - 158
  • [42] ACCURATE AND EFFICIENT FIXED POINT INFERENCE FOR DEEP NEURAL NETWORKS
    Rajagopal, Vasanthakumar
    Ramasamy, Chandra Kumar
    Vishnoi, Ashok
    Gadde, Raj Narayana
    Miniskar, Narasinga Rao
    Pasupuleti, Sirish Kumar
    2018 25TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2018 : 1847 - 1851
  • [43] Edge AI: On-Demand Accelerating Deep Neural Network Inference via Edge Computing
    Li, En
    Zeng, Liekang
    Zhou, Zhi
    Chen, Xu
    IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, 2020, 19 (01) : 447 - 457
  • [44] Scaling-Based Weight Normalization for Deep Neural Networks
    Yuan, Qunyong
    Xiao, Nanfeng
    IEEE ACCESS, 2019, 7 : 7286 - 7295
  • [45] Adaptive temperature scaling for Robust calibration of deep neural networks
    Balanya, Sergio A.
    Maronas, Juan
    Ramos, Daniel
    NEURAL COMPUTING & APPLICATIONS, 2024, 36 (14) : 8073 - 8095
  • [46] Toward Energy-Quality Scaling in Deep Neural Networks
    Anderson, Jeff
    Alkabani, Yousra
    El-Ghazawi, Tarek
    IEEE DESIGN & TEST, 2021, 38 (04) : 27 - 36
  • [47] GRAPH EXPANSIONS OF DEEP NEURAL NETWORKS AND THEIR UNIVERSAL SCALING LIMITS
    Cirone, Nicola Muça
    Hamdan, Jad
    Salvi, Cristopher
    arXiv preprint
  • [48] Scaling Deep Spiking Neural Networks with Binary Stochastic Activations
    Roy, Deboleena
    Chakraborty, Indranil
    Roy, Kaushik
    2019 IEEE INTERNATIONAL CONFERENCE ON COGNITIVE COMPUTING (IEEE ICCC 2019), 2019 : 50 - 58
  • [49] Improving QoE of Deep Neural Network Inference on Edge Devices: A Bandit Approach
    Lu, Bingqian
    Yang, Jianyi
    Xu, Jie
    Ren, Shaolei
    IEEE INTERNET OF THINGS JOURNAL, 2022, 9 (21) : 21409 - 21420
  • [50] DISSEC: A distributed deep neural network inference scheduling strategy for edge clusters
    Li, Qiang
    Huang, Liang
    Tong, Zhao
    Du, Ting-Ting
    Zhang, Jin
    Wang, Sheng-Chun
    NEUROCOMPUTING, 2022, 500 : 449 - 460