Pod-racing: bulk-bitwise to floating-point compute in racetrack memory for machine learning at the edge

Cited by: 3
Authors
Ollivier, Sebastien [1 ]
Zhang, Xinyi [1 ]
Tang, Yue [1 ]
Choudhuri, Chayanika [1 ]
Hu, Jingtong [1 ]
Jones, Alex K. [1 ]
Affiliations
[1] Univ Pittsburgh, Pittsburgh, PA 15260 USA
Funding
U.S. National Science Foundation;
Keywords
Floating-point arithmetic; Nanowires; Field programmable gate arrays; Random access memory; Convolutional neural networks; Compute-in-memory (CIM); Edge computing; Machine learning;
DOI
10.1109/MM.2022.3195761
Chinese Library Classification (CLC)
TP3 [Computing technology and computer technology];
Subject classification code
0812;
Abstract
Convolutional neural networks (CNNs) have become a ubiquitous algorithm with growing applications in mobile and edge settings. We describe a compute-in-memory (CIM) technique called POD-RACING that uses racetrack memory (RM) to accelerate CNNs for edge systems. Using transverse read, a technique that can determine the number of "1"s stored in multiple adjacent domains, POD-RACING can efficiently implement multi-operand bulk-bitwise and addition computations as well as two-operand multiplication. We discuss how POD-RACING can implement both variable-precision integer and floating-point arithmetic using digital CIM. This allows both CNN inference and on-device training without expensive data movement to the cloud. Based on these functions, we demonstrate the implementation of several CNNs with backpropagation using RM CIM and compare them to state-of-the-art implementations of CNN inference and training. During training, POD-RACING improves efficiency by 2x, reduces energy consumption by ≥27%, and increases throughput by ≥18% versus a state-of-the-art field-programmable gate array accelerator.
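To illustrate the style of computation the abstract describes, below is a minimal Python sketch (not the authors' implementation) of how a transverse-read primitive that reports the number of "1"s in adjacent domains could realize multi-operand addition: each bit position of the operands is striped across a nanowire, a single transverse read yields that column's popcount, and the weighted counts are combined. The names transverse_read and multi_operand_add, and the host-side combination of the counts, are illustrative assumptions; in POD-RACING the combination is performed in memory.

# Minimal sketch (assumed emulation, not the paper's circuit): a transverse
# read (TR) returns the number of '1' domains along a nanowire segment, and
# one TR per bit position turns multi-operand addition into weighted popcounts.

from typing import List

def transverse_read(domains: List[int]) -> int:
    """Emulate a TR: count the '1' domains in an adjacent segment of a nanowire."""
    return sum(domains)

def multi_operand_add(operands: List[int], width: int = 8) -> int:
    """Add several unsigned integers using only per-bit-position TR counts.

    Bit i of every operand is assumed to sit in adjacent domains of nanowire i,
    so one TR per bit position gives that column's popcount; the popcounts are
    then shifted and summed (done on the host here for clarity).
    """
    total = 0
    for bit in range(width):
        column = [(op >> bit) & 1 for op in operands]  # domains of nanowire `bit`
        ones = transverse_read(column)                 # one TR yields the column popcount
        total += ones << bit                           # weight the count by its bit position
    return total

if __name__ == "__main__":
    vals = [23, 7, 250, 96]
    assert multi_operand_add(vals) == sum(vals)
    print("multi-operand add via TR popcounts:", multi_operand_add(vals))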
Pages: 9-16
Page count: 8