Memory efficient data-free distillation for continual learning

Cited by: 4
Authors
Li, Xiaorong [1 ]
Wang, Shipeng [1 ]
Sun, Jian [1 ]
Xu, Zongben [1 ]
Affiliations
[1] Xi'an Jiaotong University, School of Mathematics and Statistics, Xi'an, Shaanxi, People's Republic of China
Funding
National Key Research and Development Program of China;
Keywords
Continual learning; Catastrophic forgetting; Knowledge distillation;
DOI
10.1016/j.patcog.2023.109875
Chinese Library Classification
TP18 [Theory of artificial intelligence];
Discipline codes
081104; 0812; 0835; 1405;
Abstract
Deep neural networks suffer from catastrophic forgetting when trained on sequential tasks in continual learning, especially when data from previous tasks are unavailable. To mitigate forgetting, existing methods either store data from previous tasks, which may raise privacy concerns, or require large memory storage. In particular, distillation-based methods mitigate forgetting by using proxy datasets, but a proxy dataset may not match the distribution of the original data of a previous task. To address these problems in a setting where the full training data of previous tasks are unavailable and memory resources are limited, we propose a novel data-free distillation method. Our method encodes the knowledge of previous tasks into network parameter gradients via a Taylor expansion, from which we deduce a gradient-based regularizer added to the training loss. To improve memory efficiency, we further design an approach to compress the gradients used in the regularizer. Moreover, we theoretically analyze the approximation error of our method. Experimental results on multiple datasets demonstrate that the proposed method outperforms existing continual learning approaches.
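To make the mechanism in the abstract concrete, here is a minimal sketch of what such a gradient-based regularizer can look like; the symbols (theta*, g, lambda) are illustrative and the exact formulation in the paper may differ. Expanding the loss of a previous task to first order around its converged parameters theta* gives

    \ell_{\mathrm{old}}(\theta) \approx \ell_{\mathrm{old}}(\theta^{*}) + g^{\top}(\theta - \theta^{*}),
    \qquad g := \nabla_{\theta}\, \ell_{\mathrm{old}}(\theta) \big|_{\theta = \theta^{*}}

so penalizing the first-order change in the old-task loss while learning a new task yields an objective of the form

    \min_{\theta}\; \ell_{\mathrm{new}}(\theta) + \lambda \left| g^{\top}(\theta - \theta^{*}) \right|

which requires storing only g, not any data from previous tasks. Since g has one entry per network parameter, storing it dominates memory. The sketch below shows one standard way to compress such a gradient vector (top-k magnitude sparsification); this is a generic illustration, not necessarily the compression scheme designed in the paper:

    import numpy as np

    def compress_topk(grad: np.ndarray, k: int):
        # Keep only the k largest-magnitude entries of a flat gradient.
        # Generic sparsification for illustration, NOT the paper's scheme.
        flat = grad.ravel()
        idx = np.argpartition(np.abs(flat), -k)[-k:]  # indices of the top-k magnitudes
        return idx.astype(np.int64), flat[idx]        # store indices and values only

    def decompress_topk(idx: np.ndarray, vals: np.ndarray, shape) -> np.ndarray:
        # Rebuild a dense gradient: zeros everywhere except the kept entries.
        flat = np.zeros(int(np.prod(shape)), dtype=vals.dtype)
        flat[idx] = vals
        return flat.reshape(shape)

    # Example: a 1M-entry gradient stored at roughly 1% of its dense size.
    g = np.random.randn(1_000_000).astype(np.float32)
    idx, vals = compress_topk(g, k=10_000)
    g_hat = decompress_topk(idx, vals, g.shape)

Replacing the dense g with its compressed reconstruction introduces an approximation error in the regularizer, which is the kind of trade-off the abstract says the paper analyzes theoretically.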
Pages: 9
Related papers (50 records in total)
  • [1] Variational Data-Free Knowledge Distillation for Continual Learning
    Li, Xiaorong
    Wang, Shipeng
    Sun, Jian
    Xu, Zongben
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (10) : 12618 - 12634
  • [2] Effective and efficient conditional contrast for data-free knowledge distillation with low memory
    Jiang, Chenyang
    Li, Zhendong
    Yang, Jun
    Wu, Yiqiang
    Li, Shuai
    JOURNAL OF SUPERCOMPUTING, 2025, 81 (04)
  • [3] A novel data-free continual learning method with contrastive reversion
    Wu, Chu
    Xie, Runshan
    Wang, Shitong
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2024, 15 (02) : 505 - 518
  • [4] Latent Coreset Sampling based Data-Free Continual Learning
    Wang, Zhuoyi
    Li, Dingcheng
    Li, Ping
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2022, 2022, : 2078 - 2087
  • [5] Data-Free Knowledge Distillation for Heterogeneous Federated Learning
    Zhu, Zhuangdi
    Hong, Junyuan
    Zhou, Jiayu
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [6] DFRD: Data-Free Robustness Distillation for Heterogeneous Federated Learning
    Luo, Kangyang
    Wang, Shuai
    Fu, Yexuan
    Li, Xiang
    Lan, Yunshi
    Gao, Ming
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [7] ConStruct-VL: Data-Free Continual Structured VL Concepts Learning
    Smith, James Seale
    Cascante-Bonilla, Paola
    Arbelle, Assaf
    Kim, Donghyun
    Panda, Rameswar
    Cox, David
    Yang, Diyi
    Kira, Zsolt
    Feris, Rogerio
    Karlinsky, Leonid
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 14994 - 15004
  • [8] A Category-Aware Curriculum Learning for Data-Free Knowledge Distillation
    Li, Xiufang
    Jiao, Licheng
    Sun, Qigong
    Liu, Fang
    Liu, Xu
    Li, Lingling
    Chen, Puhua
    Yang, Shuyuan
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 9603 - 9618
  • [9] NAYER: Noisy Layer Data Generation for Efficient and Effective Data-free Knowledge Distillation
    Tran, Minh-Tuan
    Le, Trung
    Le, Xuan-May
    Harandi, Mehrtash
    Tran, Quan Hung
    Phung, Dinh
    2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2024, : 23860 - 23869