DaCapo: An On-Device Learning Scheme for Memory-Constrained Embedded Systems

Cited by: 2
Authors
Khan, Osama [1]
Park, Gwanjong [1]
Seo, Euiseong [1]
Affiliations
[1] Sungkyunkwan Univ, Corp Collaborat Ctr, 2066 Seobu Ro, Suwon 16419, Gyeonggi Do, South Korea
Funding
National Research Foundation of Singapore;
Keywords
On-device learning; embedded systems; backpropagation; machine learning; Internet-of-Things;
DOI
10.1145/3609121
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Deep neural network (DNN) applications are becoming increasingly popular in microcontroller unit (MCU) embedded systems. However, the DNN models in such systems frequently suffer from accuracy loss due to the dataset shift problem. On-device learning resolves this problem by updating the model parameters on-site with real-world data, thus localizing the model to its surroundings. However, the backpropagation step of on-device learning requires the output of every layer computed during the forward pass to be stored in memory. This is usually infeasible on MCU devices, as they are equipped with only a few KBs of SRAM. Given their energy limitations and timeliness requirements, using flash memory to store the output of every layer is not practical either. Although a few approaches have been proposed to enable on-device learning under stringent memory conditions, they require modifying the target models or using non-conventional gradient computation strategies. This paper proposes DaCapo, a backpropagation scheme that enables on-device learning in memory-constrained embedded systems. DaCapo stores only the outputs of certain layers, known as checkpoints, in SRAM and discards the others. The discarded outputs are recomputed during backpropagation from the nearest checkpoint preceding them. To minimize the number of recomputations, DaCapo optimally plans which checkpoints to keep in SRAM at each phase of the backpropagation, and thus replaces the checkpoints stored in memory as the backpropagation progresses. We implemented the proposed scheme on an STM32F429ZI board and evaluated it with five representative DNN models. Our evaluation showed that DaCapo improved backpropagation time by up to 22% and reduced energy consumption by up to 28% in comparison to AIfES, a machine learning platform optimized for MCU devices. In addition, our proposed approach enabled the training of MobileNet, which the MCU device had previously been unable to train.
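The checkpoint-and-recompute idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: it places checkpoints at a fixed interval (an assumption made here for simplicity) and does not model DaCapo's optimal planning or its replacement of checkpoints during backpropagation. The toy scalar layers and all function names (`forward_with_checkpoints`, `backward_checkpointed`, `backward_full`) are hypothetical.

```python
import math

# Toy chain of scalar "layers": (forward function, derivative w.r.t. its input).
LAYERS = [
    (lambda x: 2.0 * x, lambda x: 2.0),
    (math.tanh, lambda x: 1.0 - math.tanh(x) ** 2),
    (lambda x: x * x, lambda x: 2.0 * x),
    (math.sin, math.cos),
    (lambda x: x + 1.0, lambda x: 1.0),
    (lambda x: 3.0 * x, lambda x: 3.0),
]


def forward_with_checkpoints(x, every):
    """Forward pass that keeps the input of layer i only when i % every == 0.

    The fixed interval is an illustrative assumption, not DaCapo's planner.
    """
    checkpoints = {}
    for i, (f, _) in enumerate(LAYERS):
        if i % every == 0:
            checkpoints[i] = x  # stored (would live in SRAM)
        x = f(x)                # all other activations are discarded
    return x, checkpoints


def backward_checkpointed(checkpoints, grad_out=1.0):
    """Backward pass that recomputes discarded activations on demand."""
    grad = grad_out
    recomputed = 0
    for i in reversed(range(len(LAYERS))):
        # Nearest stored checkpoint at or before layer i.
        start = max(k for k in checkpoints if k <= i)
        x = checkpoints[start]
        for j in range(start, i):  # replay the forward pass up to layer i
            x = LAYERS[j][0](x)
            recomputed += 1
        grad *= LAYERS[i][1](x)    # chain rule
    return grad, recomputed


def backward_full(x):
    """Reference: conventional backpropagation that stores every layer input."""
    inputs = []
    for f, _ in LAYERS:
        inputs.append(x)
        x = f(x)
    grad = 1.0
    for (_, df), xi in zip(reversed(LAYERS), reversed(inputs)):
        grad *= df(xi)
    return grad


if __name__ == "__main__":
    _, ckpts = forward_with_checkpoints(0.3, every=3)
    grad, extra = backward_checkpointed(ckpts)
    print(len(ckpts), "stored activations,", extra, "recomputed layer calls")
    print("gradients match:", abs(grad - backward_full(0.3)) < 1e-12)
```

The sketch trades memory for computation: fewer stored activations mean more forward replays during the backward pass. Choosing which checkpoints to hold at each phase so that this replay cost is minimized is the planning problem the abstract attributes to DaCapo.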
Pages: 23