DaCapo: An On-Device Learning Scheme for Memory-Constrained Embedded Systems

Cited by: 2
Authors
Khan, Osama [1]
Park, Gwanjong [1]
Seo, Euiseong [1]
Affiliations
[1] Sungkyunkwan Univ, Corp Collaborat Ctr, 2066 Seobu Ro, Suwon 16419, Gyeonggi Do, South Korea
Funding
National Research Foundation of Singapore;
Keywords
On-device learning; embedded systems; backpropagation; machine learning; Internet-of-Things;
DOI
10.1145/3609121
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
The use of deep neural network (DNN) applications in microcontroller unit (MCU) embedded systems is becoming popular. However, the DNN models in such systems frequently suffer from accuracy loss due to the dataset shift problem. On-device learning resolves this problem by updating the model parameters on-site with real-world data, thus localizing the model to its surroundings. However, the backpropagation step during on-device learning requires the output of every layer computed during the forward pass to be stored in memory. This is usually infeasible in MCU devices, as they are equipped with only a few KBs of SRAM. Given their energy limitations and timeliness requirements, using flash memory to store the output of every layer is not practical either. Although a few approaches have been proposed to enable on-device learning under stringent memory conditions, they require modifying the target models or using non-conventional gradient computation strategies. This paper proposes DaCapo, a backpropagation scheme that enables on-device learning in memory-constrained embedded systems. DaCapo stores only the outputs of certain layers, known as checkpoints, in SRAM and discards the others. The discarded outputs are recomputed during backpropagation from the nearest checkpoint preceding them. To minimize recomputation, DaCapo optimally plans which checkpoints to keep in SRAM at each phase of backpropagation, replacing the stored checkpoints as backpropagation progresses. We implemented the proposed scheme on an STM32F429ZI board and evaluated it with five representative DNN models. Our evaluation showed that DaCapo improved backpropagation time by up to 22% and reduced energy consumption by up to 28% in comparison to AIfES, a machine learning platform optimized for MCU devices. In addition, our proposed approach enabled the training of MobileNet, which the MCU device had previously been unable to train.
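The checkpoint-and-recompute idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy scalar `tanh` layers, the function names, and the uniform checkpoint spacing (`every`) are all assumptions made for illustration. DaCapo plans its checkpoints optimally and replaces them as backpropagation progresses, which this sketch does not attempt.

```python
import math

def forward_layer(w, x):
    # Hypothetical toy layer: y = tanh(w * x)
    return math.tanh(w * x)

def backward_layer(w, x, grad_out):
    # Given the layer input x and dL/dy, return (dL/dw, dL/dx).
    y = math.tanh(w * x)
    dz = grad_out * (1.0 - y * y)
    return dz * x, dz * w

def backprop_full(weights, x0):
    # Baseline: keep every activation from the forward pass.
    acts = [x0]
    for w in weights:
        acts.append(forward_layer(w, acts[-1]))
    grad = 1.0  # loss is taken to be the final output itself
    grads = [0.0] * len(weights)
    for i in reversed(range(len(weights))):
        grads[i], grad = backward_layer(weights[i], acts[i], grad)
    return grads, len(acts)

def backprop_checkpointed(weights, x0, every):
    # Assumption: checkpoints are placed uniformly, one per `every` layers.
    n = len(weights)
    ckpt = {}
    x = x0
    for i, w in enumerate(weights):
        if i % every == 0:
            ckpt[i] = x  # store only the input of each checkpoint layer
        x = forward_layer(w, x)
    grad = 1.0
    grads = [0.0] * n
    recomputed = 0
    peak = len(ckpt)
    # Walk the segments from the last one back to the first.
    for start in sorted(ckpt, reverse=True):
        end = min(start + every, n)
        # Recompute the discarded inputs from the segment's checkpoint.
        seg = [ckpt[start]]
        for i in range(start, end - 1):
            seg.append(forward_layer(weights[i], seg[-1]))
            recomputed += 1
        peak = max(peak, len(ckpt) + len(seg) - 1)
        for i in reversed(range(start, end)):
            grads[i], grad = backward_layer(weights[i], seg[i - start], grad)
        del ckpt[start]  # this checkpoint is no longer needed
    return grads, peak, recomputed

weights = [0.5, -1.2, 0.8, 1.1, -0.7, 0.9, 1.3, -0.4]
g_full, stored_full = backprop_full(weights, 0.3)
g_ckpt, peak_ckpt, extra_forward = backprop_checkpointed(weights, 0.3, every=4)
```

Both functions yield the same gradients. The checkpointed variant holds fewer activations at its peak and pays for that with extra forward evaluations, which is the trade-off that checkpoint planning aims to balance.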
Pages: 23