DaCapo: An On-Device Learning Scheme for Memory-Constrained Embedded Systems

Cited by: 2

Authors:

Khan, Osama [1]

Park, Gwanjong [1]

Seo, Euiseong [1]

Affiliation:

[1] Sungkyunkwan Univ, Corp Collaborat Ctr, 2066 Seobu Ro, Suwon 16419, Gyeonggi Do, South Korea

Source:

ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS | 2023 / Vol. 22 / No. 05

Funding:

National Research Foundation, Singapore;

Keywords:

On-device learning; embedded systems; backpropagation; machine learning; Internet-of-Things;

DOI:

10.1145/3609121

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

The use of deep neural network (DNN) applications in microcontroller unit (MCU) embedded systems is becoming popular. However, the DNN models in such systems frequently suffer from accuracy loss due to the dataset shift problem. On-device learning resolves this problem by updating the model parameters on-site with real-world data, thus localizing the model to its surroundings. However, the backpropagation step during on-device learning requires the output of every layer computed during the forward pass to be stored in memory. This is usually infeasible in MCU devices, as they are equipped with only a few KBs of SRAM. Given their energy limitations and timeliness requirements, using flash memory to store the output of every layer is not practical either. Although a few studies have proposed ways to enable on-device learning under stringent memory conditions, they require modifying the target models or using non-conventional gradient computation strategies. This paper proposes DaCapo, a backpropagation scheme that enables on-device learning in memory-constrained embedded systems. DaCapo stores only the outputs of certain layers, known as checkpoints, in SRAM and discards the others. The discarded outputs are recomputed during backpropagation from the nearest checkpoint preceding them. To minimize the number of recomputations, DaCapo optimally plans which checkpoints to keep in SRAM at each phase of the backpropagation, and thus replaces the checkpoints stored in memory as the backpropagation progresses. We implemented the proposed scheme on an STM32F429ZI board and evaluated it with five representative DNN models. Our evaluation showed that DaCapo improved backpropagation time by up to 22% and reduced energy consumption by up to 28% in comparison to AIfES, a machine learning platform optimized for MCU devices. In addition, our proposed approach enabled the training of MobileNet, which the MCU device had previously been unable to train.
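The store-or-recompute idea in the abstract can be illustrated with a minimal Python sketch. This is not DaCapo's planner: it uses a fixed, hand-picked checkpoint set and never replaces checkpoints during the backward pass. The scalar tanh layers, the weights, and the function names (`forward_with_checkpoints`, `recompute`, `backward`) are all assumptions made for illustration.

```python
import math

# Hypothetical toy network: six scalar layers a -> tanh(w * a).
weights = [0.5, -1.2, 0.8, 1.1, -0.7, 0.9]
layers = [lambda a, w=w: math.tanh(w * a) for w in weights]
# Derivative of each layer's output with respect to its input.
grads = [lambda a, w=w: w * (1.0 - math.tanh(w * a) ** 2) for w in weights]


def forward_with_checkpoints(layers, x, keep):
    """Forward pass that stores only the activations whose index is in `keep`.

    Index 0 is the network input (always stored); index i is the output
    of layers[i - 1], i.e. the input of layers[i].
    """
    stored = {0: x}
    a = x
    for i, f in enumerate(layers, start=1):
        a = f(a)
        if i in keep:
            stored[i] = a
    return a, stored


def recompute(layers, stored, i):
    """Rebuild activation i from the nearest stored checkpoint at or before i.

    Returns the activation and the number of layer evaluations it cost.
    """
    j = max(k for k in stored if k <= i)
    a = stored[j]
    for k in range(j, i):
        a = layers[k](a)
    return a, i - j


def backward(layers, grads, stored):
    """Chain rule from the last layer to the first, recomputing on demand."""
    g, cost = 1.0, 0
    for i in reversed(range(len(layers))):
        a, c = recompute(layers, stored, i)  # input of layers[i]
        cost += c
        g *= grads[i](a)
    return g, cost


# Store every layer input versus a single checkpoint in the middle.
_, full = forward_with_checkpoints(layers, 0.3, keep={1, 2, 3, 4, 5})
_, sparse = forward_with_checkpoints(layers, 0.3, keep={3})
g_full, cost_full = backward(layers, grads, full)
g_sparse, cost_sparse = backward(layers, grads, sparse)
```

In this toy setup the sparse plan holds two activations instead of six and pays six extra layer evaluations, while both plans produce the same gradient. Choosing which checkpoints to hold, and when to swap them, is the planning problem the abstract attributes to DaCapo; the sketch leaves it out.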


Pages: 23
