Neural Cache: Bit-Serial In-Cache Acceleration of Deep Neural Networks

Cited by: 8
Authors
Eckert, Charles [1]
Wang, Xiaowei [1]
Wang, Jingcheng [2]
Subramaniyan, Arun [1]
Iyer, Ravi [3]
Sylvester, Dennis [4]
Blaauw, David [5]
Das, Reetuparna [1]
Affiliations
[1] Univ Michigan, Dept Comp Sci & Engn, Ann Arbor, MI 48109 USA
[2] Univ Michigan, Dept Elect Engn & Comp Sci, Ann Arbor, MI 48109 USA
[3] Intel Corp, Santa Clara, CA 95051 USA
[4] Univ Michigan, Elect Engn & Comp Sci, Ann Arbor, MI 48109 USA
[5] Univ Michigan, Ann Arbor, MI 48109 USA
Keywords
DOI
10.1109/MM.2019.2908101
Chinese Library Classification
TP3 [Computing Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
This article presents the Neural Cache architecture, which repurposes cache structures to transform them into massively parallel compute units capable of running inferences for deep neural networks. Techniques to perform in situ arithmetic in SRAM arrays, to create efficient data mappings, and to reduce data movement are proposed. Neural Cache is capable of fully executing convolutional, fully connected, and pooling layers in cache. Our experimental results show that the proposed architecture can improve efficiency over a GPU by 128× while requiring a minimal area overhead of 2%.
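To make the bit-serial, in-SRAM computation style mentioned in the abstract concrete, the following Python sketch models one such operation: operands are stored transposed, with each operand occupying one bitline (column) and its bits spread across wordlines (rows), so a single bit position of every column is processed per step and all columns advance in parallel. The function names (bit_serial_add, to_transposed_bits), the 8-bit operand width, and the NumPy modeling are illustrative assumptions made for this record, not code or parameters from the paper.

# Illustrative software model (an assumption, not the paper's hardware): bit-serial
# addition over operands stored "transposed" in an SRAM array. Each operand
# occupies one column; row i holds bit i (row 0 = LSB). One bit position of
# every column is processed per step, so an n-bit add takes n steps regardless
# of how many columns are computed in parallel.
import numpy as np

def bit_serial_add(a_bits, b_bits):
    """a_bits, b_bits: 0/1 arrays of shape (n_bits, n_columns), row 0 = LSB.
    Returns sum bits of shape (n_bits + 1, n_columns)."""
    n_bits, n_cols = a_bits.shape
    carry = np.zeros(n_cols, dtype=np.uint8)             # one carry latch per column
    out = np.zeros((n_bits + 1, n_cols), dtype=np.uint8)
    for i in range(n_bits):                              # one "cycle" per bit position
        a, b = a_bits[i], b_bits[i]
        out[i] = a ^ b ^ carry                           # sum bit for this position
        carry = (a & b) | (carry & (a ^ b))              # carry out of this position
    out[n_bits] = carry                                  # final carry-out
    return out

def to_transposed_bits(values, n_bits=8):
    """Lay integers out column-wise with bits along rows (LSB first)."""
    v = np.asarray(values, dtype=np.uint32)
    return ((v[None, :] >> np.arange(n_bits)[:, None]) & 1).astype(np.uint8)

# Usage: four additions, one per column, performed bit-serially in parallel.
s = bit_serial_add(to_transposed_bits([3, 100, 255, 42]),
                   to_transposed_bits([5, 27, 1, 200]))
print((s * (1 << np.arange(s.shape[0]))[:, None]).sum(axis=0))   # [  8 127 256 242]

The point of the model is that the step count grows with the operand bit width, not with the number of columns, which is what allows the many bitlines of cache SRAM arrays to act as a massively parallel compute unit.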
Pages: 11 - 19
Number of pages: 9
Related papers
50 records in total
  • [31] Cache-locality Based Adaptive Warp Scheduling for Neural Network Acceleration on GPGPUs
    Hu, Weiming
    Zhou, Yi
    Quan, Ying
    Wang, Yuanfeng
    Lou, Xin
    2022 IEEE 35TH INTERNATIONAL SYSTEM-ON-CHIP CONFERENCE (IEEE SOCC 2022), 2022, : 190 - 195
  • [32] GENES-IV - A BIT-SERIAL PROCESSING ELEMENT FOR A MULTIMODEL NEURAL-NETWORK ACCELERATOR
    IENNE, P
    VIREDAZ, MA
    JOURNAL OF VLSI SIGNAL PROCESSING, 1995, 9 (03): 257 - 273
  • [33] Neural Language Modeling With Implicit Cache Pointers
    Li, Ke
    Povey, Daniel
    Khudanpur, Sanjeev
    INTERSPEECH 2020, 2020, : 3625 - 3629
  • [34] Fully-Asynchronous Cache-Efficient Simulation of Detailed Neural Networks
    Magalhaes, Bruno R. C.
    Sterling, Thomas
    Hines, Michael
    Schurmann, Felix
    COMPUTATIONAL SCIENCE - ICCS 2019, PT III, 2019, 11538 : 421 - 434
  • [35] Cache Management in Information-Centric Networks using Convolutional Neural Network
    Chiu, Kelvin H. T.
    Zhang, Jun
    Bensaou, Brahim
    2020 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM), 2020,
  • [36] Cache Compression with Golomb-Rice Code and Quantization for Convolutional Neural Networks
    Bae, Seung-Hwan
    Lee, Hyuk-Jae
    Kim, Hyun
    2021 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2021,
  • [37] Bit-serial convolution with prediction threshold for convolutional neural networks
    Hsiao, Jen-Hao
    Chin, Wen-Long
    Wu, Yu-Feng
    Chang, Deng-Kai
    Journal of the Chinese Institute of Engineers, Transactions of the Chinese Institute of Engineers, Series A, 2022, 45 (03): 266 - 272
  • [38] Bit-serial convolution with prediction threshold for convolutional neural networks
    Hsiao, Jen-Hao
    Chin, Wen-Long
    Wu, Yu-Feng
    Chang, Deng-Kai
    JOURNAL OF THE CHINESE INSTITUTE OF ENGINEERS, 2022, 45 (03) : 266 - 272
  • [39] GENES IV: a bit-serial processing element for a multi-model neural-network accelerator
    Swiss Federal Inst of Technology, Lausanne, Switzerland
    J VLSI Signal Process, 1995, 9 (03): 257 - 273
  • [40] Acceleration of Deep Recurrent Neural Networks with an FPGA cluster
    Sun, Yuxi
    Ben Ahmed, Akram
    Amano, Hideharu
    PROCEEDINGS OF THE 10TH INTERNATIONAL SYMPOSIUM ON HIGHLY EFFICIENT ACCELERATORS AND RECONFIGURABLE TECHNOLOGIES (HEART), 2019,