Data Augmented Flatness-aware Gradient Projection for Continual Learning

Cited by: 5
Authors
Yang, Enneng [1 ]
Shen, Li [2 ]
Wang, Zhenyi [3 ]
Liu, Shiwei [4 ]
Guo, Guibing [1 ]
Wang, Xingwei [1 ]
Affiliations
[1] Northeastern Univ, Shenyang, Peoples R China
[2] JD Explore Acad, Beijing, Peoples R China
[3] Univ Maryland, Baltimore, MD 21201 USA
[4] Univ Texas Austin, Austin, TX 78712 USA
Funding
National Natural Science Foundation of China
Keywords
DOI
10.1109/ICCV51070.2023.00518
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence]
Discipline codes
081104; 0812; 0835; 1405
Abstract
The goal of continual learning (CL) is to continuously learn new tasks without forgetting previously learned old tasks. To alleviate catastrophic forgetting, gradient projection based CL methods require that the gradient updates of new tasks be orthogonal to the subspace spanned by old tasks. Because this projection constraint is too strong, it limits the learning process and leads to poor performance on the new task. In this paper, we first revisit the gradient projection method from the perspective of the flatness of the loss surface, and find that unflatness of the loss surface leads to catastrophic forgetting of the old tasks when the projection constraint is relaxed to improve the performance of new tasks. Based on our findings, we propose a Data Augmented Flatness-aware Gradient Projection (DFGP) method to solve the problem, which consists of three modules: data and weight perturbation, flatness-aware optimization, and gradient projection. Specifically, we first perform a flatness-aware perturbation on the task data and current weights to find the case that makes the task loss worst. Next, flatness-aware optimization optimizes both the loss and the flatness of the loss surface on the raw and worst-case perturbed data to obtain a flatness-aware gradient. Finally, gradient projection updates the network with the flatness-aware gradient along directions orthogonal to the subspace of the old tasks. Extensive experiments on four datasets show that our method improves the flatness of the loss surface and the performance of new tasks, and achieves state-of-the-art (SOTA) performance in the average accuracy of all tasks.
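To make the three modules described in the abstract concrete, here is a minimal, hypothetical PyTorch sketch of one DFGP-style training step, not the authors' implementation. The hyperparameters rho (perturbation radius) and alpha (loss mixing weight), the dictionary old_bases of per-parameter orthonormal bases for the old-task subspace, the sign-based data perturbation, and the flattened-parameter projection are all illustrative assumptions made for this sketch.

import torch
import torch.nn.functional as F

def dfgp_step(model, x, y, optimizer, old_bases, rho=0.05, alpha=0.5):
    """One DFGP-style step: data/weight perturbation -> flatness-aware
    gradient on raw + perturbed data -> projection orthogonal to the
    old-task subspace -> parameter update."""
    # 1) Worst-case data perturbation: move inputs in the direction that
    #    increases the current task loss (sign of the input gradient).
    x_adv = x.clone().detach().requires_grad_(True)
    grad_x = torch.autograd.grad(F.cross_entropy(model(x_adv), y), x_adv)[0]
    x_pert = (x + rho * grad_x.sign()).detach()

    # 2) SAM-style weight perturbation toward the worst case on both views.
    optimizer.zero_grad()
    (F.cross_entropy(model(x), y) + F.cross_entropy(model(x_pert), y)).backward()
    grads = [p.grad for p in model.parameters() if p.grad is not None]
    grad_norm = torch.norm(torch.stack([g.norm() for g in grads])) + 1e-12
    eps = {}
    with torch.no_grad():
        for name, p in model.named_parameters():
            if p.grad is None:
                continue
            eps[name] = rho * p.grad / grad_norm
            p.add_(eps[name])               # climb to the perturbed weights

    # 3) Flatness-aware gradient: loss on raw and perturbed data, evaluated
    #    at the perturbed weights, then restore the original weights.
    optimizer.zero_grad()
    loss_flat = (alpha * F.cross_entropy(model(x), y)
                 + (1.0 - alpha) * F.cross_entropy(model(x_pert), y))
    loss_flat.backward()
    with torch.no_grad():
        for name, p in model.named_parameters():
            if name in eps:
                p.sub_(eps[name])           # back to the original weights

    # 4) Gradient projection: remove the gradient component lying in the
    #    subspace spanned by the old tasks before applying the update.
    with torch.no_grad():
        for name, p in model.named_parameters():
            if p.grad is None or name not in old_bases:
                continue
            M = old_bases[name]             # (numel, k), orthonormal columns
            g = p.grad.view(-1)
            p.grad.copy_((g - M @ (M.t() @ g)).view_as(p.grad))
    optimizer.step()
    return loss_flat.item()

In the paper the old-task subspace would be maintained by a gradient projection memory built after each task; old_bases here is simply a stand-in for that structure.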
Pages: 5607-5616
Number of pages: 10
Related papers
50 records in total
  • [1] Flatness-Aware Sequential Learning Generates Resilient Backdoors
    Pham, Hoang
    Ta, The-Anh
    Tran, Anh
    Doan, Khoa D.
    COMPUTER VISION - ECCV 2024, PT LXXXVII, 2025, 15145 : 89 - 107
  • [2] Flatness-Aware Minimization for Domain Generalization
    Zhang, Xingxuan
    Xu, Renzhe
    Yu, Han
    Dong, Yancheng
    Tian, Pengfei
    Cui, Peng
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION, ICCV, 2023, : 5166 - 5179
  • [3] Class Gradient Projection For Continual Learning
    Chen, Cheng
    Zhang, Ji
    Song, Jingkuan
    Gao, Lianli
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022, : 5575 - 5583
  • [4] Continual Learning with Scaled Gradient Projection
    Saha, Gobinda
    Roy, Kaushik
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 8, 2023, : 9677 - 9685
  • [5] Restricted orthogonal gradient projection for continual learning
    Yang, Zeyuan
    Yang, Zonghan
    Liu, Yichen
    Li, Peng
    Liu, Yang
    AI OPEN, 2023, 4 : 98 - 110
  • [6] Flatness-Aware Prompt Selection Improves Accuracy and Sample Efficiency
    Shen, Lingfeng
    Tan, Weiting
    Zheng, Boyuan
    Khashabi, Daniel
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS - EMNLP 2023, 2023, : 7795 - 7817
  • [7] Flattening Sharpness for Dynamic Gradient Projection Memory Benefits Continual Learning
    Deng, Danruo
    Chen, Guangyong
    Hao, Jianye
    Wang, Qiong
    Heng, Pheng-Ann
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [8] Rethinking Gradient Projection Continual Learning: Stability / Plasticity Feature Space Decoupling
    Zhao, Zhen
    Zhang, Zhizhong
    Tan, Xin
    Liu, Jun
    Qu, Yanyun
    Xie, Yuan
    Ma, Lizhuang
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR, 2023, : 3718 - 3727
  • [9] UniGrad-FS: Unified Gradient Projection With Flatter Sharpness for Continual Learning
    Li, Wei
    Feng, Tao
    Yuan, Hangjie
    Bian, Ang
    Du, Guodong
    Liang, Sixin
    Gan, Jianhong
    Liu, Ziwei
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2024, 20 (12) : 13873 - 13882
  • [10] Adversary Aware Continual Learning
    Umer, Muhammad
    Polikar, Robi
    IEEE ACCESS, 2024, 12 : 126108 - 126121