Data Augmented Flatness-aware Gradient Projection for Continual Learning

Cited by: 5
Authors
Yang, Enneng [1 ]
Shen, Li [2 ]
Wang, Zhenyi [3 ]
Liu, Shiwei [4 ]
Guo, Guibing [1 ]
Wang, Xingwei [1 ]
Affiliations
[1] Northeastern Univ, Shenyang, Peoples R China
[2] JD Explore Acad, Beijing, Peoples R China
[3] Univ Maryland, Baltimore, MD 21201 USA
[4] Univ Texas Austin, Austin, TX 78712 USA
Funding
National Natural Science Foundation of China;
DOI
10.1109/ICCV51070.2023.00518
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104; 0812; 0835; 1405;
Abstract
The goal of continual learning (CL) is to learn new tasks continuously without forgetting previously learned tasks. To alleviate catastrophic forgetting, gradient projection based CL methods require that the gradient updates for new tasks be orthogonal to the subspace spanned by old tasks. Because this projection constraint is too strong, it restricts the learning process and leads to poor performance on new tasks. In this paper, we first revisit the gradient projection method from the perspective of the flatness of the loss surface, and find that when the projection constraint is relaxed to improve performance on new tasks, the unflatness of the loss surface causes catastrophic forgetting of the old tasks. Based on these findings, we propose a Data Augmented Flatness-aware Gradient Projection (DFGP) method, which consists of three modules: data and weight perturbation, flatness-aware optimization, and gradient projection. Specifically, we first apply a flatness-aware perturbation to the task data and the current weights to find the worst case for the task loss. Next, flatness-aware optimization optimizes both the loss and the flatness of the loss surface on the raw and worst-case perturbed data to obtain a flatness-aware gradient. Finally, gradient projection updates the network with the flatness-aware gradient along directions orthogonal to the subspace of the old tasks. Extensive experiments on four datasets show that our method improves both the flatness of the loss surface and the performance on new tasks, and achieves state-of-the-art (SOTA) average accuracy across all tasks.
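To make the three-step procedure described in the abstract concrete, the following is a minimal PyTorch-style sketch of a single DFGP-style update, written under our own assumptions rather than taken from the authors' released code: the hyperparameter names (rho, alpha, lam, lr), the SAM-style weight perturbation, the weighting of the raw and perturbed-data losses, and the whole-network flattened projection are all illustrative choices.

```python
import torch

def dfgp_step(model, loss_fn, x, y, basis, rho=0.05, alpha=0.01, lam=0.5, lr=0.1):
    """One illustrative DFGP-style update on a batch (x, y).

    basis: (P, k) matrix whose orthonormal columns span the old-task
    subspace, where P is the total number of model parameters.
    All names and default values here are assumptions for illustration."""
    # (1) Worst-case perturbation of the data and the weights.
    x_adv = x.clone().requires_grad_(True)
    loss_fn(model(x_adv), y).backward()
    with torch.no_grad():
        # Perturb the inputs along the loss-ascent direction.
        x_pert = x + alpha * x_adv.grad / (x_adv.grad.norm() + 1e-12)
        # SAM-style weight perturbation: eps = rho * g / ||g||.
        grads = [p.grad.clone() for p in model.parameters()]
        gnorm = torch.sqrt(sum(g.pow(2).sum() for g in grads)) + 1e-12
        eps = [rho * g / gnorm for g in grads]
        for p, e in zip(model.parameters(), eps):
            p.add_(e)
    model.zero_grad()

    # (2) Flatness-aware gradient: loss at the perturbed weights, on both
    # the raw batch and the worst-case perturbed batch.
    (loss_fn(model(x), y) + lam * loss_fn(model(x_pert), y)).backward()

    with torch.no_grad():
        for p, e in zip(model.parameters(), eps):
            p.sub_(e)  # restore the unperturbed weights
        # (3) Project out the component lying in the old-task subspace:
        # g <- g - B (B^T g), then take an SGD step with the projected g.
        g = torch.cat([p.grad.reshape(-1) for p in model.parameters()])
        g = g - basis @ (basis.t() @ g)
        i = 0
        for p in model.parameters():
            n = p.numel()
            p.sub_(lr * g[i:i + n].view_as(p))
            i += n
    model.zero_grad()

# Example usage with a tiny model and a random orthonormal basis
# (the basis here is purely for illustration; in practice it would be
# extracted from old-task data).
model = torch.nn.Linear(4, 3)
P = sum(p.numel() for p in model.parameters())
B, _ = torch.linalg.qr(torch.randn(P, 2))  # 2 orthonormal basis vectors
dfgp_step(model, torch.nn.functional.cross_entropy,
          torch.randn(8, 4), torch.randint(0, 3, (8,)), B)
```

The flattened whole-network projection above is a simplification chosen to keep the sketch short; gradient-projection methods in the literature that this paper builds on (e.g., GPM) typically project each layer's gradient separately, using per-layer bases obtained from the SVD of old-task representations, which avoids forming a single basis over all P parameters.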
Pages: 5607-5616
Number of pages: 10
Related Papers
50 records in total
  • [21] Measuring Asymmetric Gradient Discrepancy in Parallel Continual Learning
    Lyu, Fan
    Sun, Qing
    Shang, Fanhua
    Wan, Liang
    Feng, Wei
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 11377 - 11386
  • [22] Gradient based sample selection for online continual learning
    Aljundi, Rahaf
    Lin, Min
    Goujaud, Baptiste
    Bengio, Yoshua
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [23] Gradient Regularized Contrastive Learning for Continual Domain Adaptation
    Tang, Shixiang
    Su, Peng
    Chen, Dapeng
    Ouyang, Wanli
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 2665 - 2673
  • [24] Preserving Linear Separability in Continual Learning by Backward Feature Projection
    Gu, Qiao
    Shim, Dongsub
    Shkurti, Florian
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 24286 - 24295
  • [25] Continual learning for seizure prediction via memory projection strategy
    Shi, Yufei
    Tang, Shishi
    Li, Yuxuan
    He, Zhipeng
    Tang, Shengsheng
    Wang, Ruixuan
    Zheng, Weishi
    Chen, Ziyi
    Zhou, Yi
COMPUTERS IN BIOLOGY AND MEDICINE, 2024, 181
  • [26] Exploring Data Geometry for Continual Learning
    Gao, Zhi
    Xu, Chen
    Li, Feng
    Jia, Yunde
    Harandi, Mehrtash
    Wu, Yuwei
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 24325 - 24334
  • [27] Task Relation-aware Continual User Representation Learning
    Kim, Sein
    Lee, Namkyeong
    Kim, Donghyun
    Yang, Minchul
    Park, Chanyoung
    PROCEEDINGS OF THE 29TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2023, 2023, : 1107 - 1119
  • [28] NPCL: Neural Processes for Uncertainty-Aware Continual Learning
    Jha, Saurav
    Gong, Dong
    Zhao, He
    Yao, Lina
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [29] Fast Context Adaptation in Cost-Aware Continual Learning
    Lahmer, Seyyidahmed
    Mason, Federico
    Chiariotti, Federico
    Zanella, Andrea
IEEE TRANSACTIONS ON MACHINE LEARNING IN COMMUNICATIONS AND NETWORKING, 2024, 2 : 479 - 494
  • [30] Hierarchical Question-Aware Context Learning with Augmented Data for Biomedical Question Answering
    Du, Yongping
    Guo, Wenyang
    Zhao, Yiliang
    2019 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2019, : 370 - 375