Neural Cache: Bit-Serial In-Cache Acceleration of Deep Neural Networks

Cited by: 8
Authors
Eckert, Charles [1 ]
Wang, Xiaowei [1 ]
Wang, Jingcheng [2 ]
Subramaniyan, Arun [1 ]
Iyer, Ravi [3 ]
Sylvester, Dennis [4 ]
Blaauw, David [5 ]
Das, Reetuparna [1 ]
Affiliations
[1] Univ Michigan, Dept Comp Sci & Engn, Ann Arbor, MI 48109 USA
[2] Univ Michigan, Dept Elect Engn & Comp Sci, Ann Arbor, MI 48109 USA
[3] Intel Corp, Santa Clara, CA 95051 USA
[4] Univ Michigan, Elect Engn & Comp Sci, Ann Arbor, MI 48109 USA
[5] Univ Michigan, Ann Arbor, MI 48109 USA
DOI
10.1109/MM.2019.2908101
CLC classification
TP3 [computing technology, computer technology];
Subject classification
0812 ;
Abstract
This article presents the Neural Cache architecture, which repurposes cache structures into massively parallel compute units capable of running inferences for deep neural networks. We propose techniques for in situ arithmetic in SRAM arrays, efficient data mapping, and reduced data movement. Neural Cache can fully execute convolutional, fully connected, and pooling layers in cache. Our experimental results show that the proposed architecture can improve efficiency over a GPU by 128× while requiring a minimal area overhead of 2%.
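The abstract's "bit-serial in situ arithmetic" can be illustrated in software: operands are stored transposed (bit i of every word in row i), so one row-wise logic operation processes thousands of words at once, and an add takes one cycle per bit position. The sketch below is a hypothetical behavioral model, not the paper's circuit; the function name `bit_serial_add` and the plain-Python loop over columns (which hardware performs in parallel) are illustrative assumptions.

```python
def bit_serial_add(a_words, b_words, bits=8):
    """Model bit-serial, word-parallel addition over transposed SRAM data.

    Each list element is one word (one SRAM column); the outer loop walks
    bit planes LSB-first, the inner loop stands in for the columns that
    real in-cache hardware would operate on simultaneously.
    """
    n = len(a_words)
    # Transpose into bit planes: plane[i][w] = bit i of word w (LSB first).
    a_planes = [[(w >> i) & 1 for w in a_words] for i in range(bits)]
    b_planes = [[(w >> i) & 1 for w in b_words] for i in range(bits)]
    carry = [0] * n          # one carry latch per column
    out = [0] * n
    for i in range(bits):    # one "cycle" per bit position
        for w in range(n):   # all columns in parallel in hardware
            a, b, c = a_planes[i][w], b_planes[i][w], carry[w]
            out[w] |= (a ^ b ^ c) << i          # sum bit of a full adder
            carry[w] = (a & b) | (c & (a ^ b))  # carry-out, latched per column
    return out  # wraps modulo 2**bits, as a fixed-width array would

print(bit_serial_add([3, 10, 255], [5, 7, 1]))  # → [8, 17, 0] (255+1 wraps in 8 bits)
```

Latency is proportional to operand width (here 8 cycles) rather than word count, which is why mapping many words across SRAM bitlines yields the massive parallelism the abstract describes.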
Pages: 11-19
Page count: 9
Related papers
50 records in total
  • [41] Optimizing Energy Utilization of Flexible Deep Neural Network Accelerators via Cache Incorporation
    Hensley, Dalton
    Zhang, Wei
    2022 IEEE 19TH INTERNATIONAL CONFERENCE ON MOBILE AD HOC AND SMART SYSTEMS (MASS 2022), 2022, : 681 - 686
  • [42] Telepathic Headache: Mitigating Cache Side-Channel Attacks on Convolutional Neural Networks
    Chabanne, Herve
    Danger, Jean-Luc
    Guiga, Linda
    Kuhne, Ulrich
    APPLIED CRYPTOGRAPHY AND NETWORK SECURITY (ACNS 2021), PT I, 2021, 12726 : 363 - 392
  • [43] Dual Cache for Long Document Neural Coreference Resolution
    Guo, Qipeng
    Hu, Xiangkun
    Zhang, Yue
    Qiu, Xipeng
    Zhang, Zheng
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1, 2023, : 15272 - 15285
  • [44] Look-Up Table based Energy Efficient Processing in Cache Support for Neural Network Acceleration
    Ramanathan, Akshay Krishna
    Kalsi, Gurpreet S.
    Srinivasa, Srivatsa
    Chandran, Tarun Makesh
    Pillai, Kamlesh R.
    Omer, Om J.
    Narayanan, Vijaykrishnan
    Subramoney, Sreenivas
    2020 53RD ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE (MICRO 2020), 2020, : 88 - 101
  • [45] BitCluster: Fine-Grained Weight Quantization for Load-Balanced Bit-Serial Neural Network Accelerators
    Li, Ang
    Mo, Huiyu
    Zhu, Wenping
    Li, Qiang
    Yin, Shouyi
    Wei, Shaojun
    Liu, Leibo
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2022, 41 (11) : 4747 - 4757
  • [46] A zeroing neural dynamics based acceleration optimization approach for optimizers in deep neural networks
    Liao, Shan
    Li, Shubin
    Liu, Jiayong
    Huang, Haoen
    Xiao, Xiuchun
    NEURAL NETWORKS, 2022, 150 : 440 - 461
  • [47] A Scalable System-on-Chip Acceleration for Deep Neural Networks
    Shehzad, Faisal
    Rashid, Muhammad
    Sinky, Mohammed H.
    Alotaibi, Saud S.
    Zia, Muhammad Yousuf Irfan
    IEEE ACCESS, 2021, 9 : 95412 - 95426
  • [48] Training Acceleration for Deep Neural Networks: A Hybrid Parallelization Strategy
    Zeng, Zihao
    Liu, Chubo
    Tang, Zhuo
    Chang, Wanli
    Li, Kenli
    2021 58TH ACM/IEEE DESIGN AUTOMATION CONFERENCE (DAC), 2021, : 1165 - 1170
  • [49] Acceleration Strategies for Speech Recognition based on Deep Neural Networks
    Tian, Chao
    Liu, Jia
    Peng, Zhaomeng
    MECHATRONICS ENGINEERING, COMPUTING AND INFORMATION TECHNOLOGY, 2014, 556-562 : 5181 - 5185
  • [50] Fully Learnable Group Convolution for Acceleration of Deep Neural Networks
    Wang, Xijun
    Kan, Meina
    Shan, Shiguang
    Chen, Xilin
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 9041 - 9050