Gradient Regularization with Multivariate Distribution of Previous Knowledge for Continual Learning

Cited by: 3
Authors
Kim, Tae-Heon [1]
Moon, Hyung-Jun [2]
Cho, Sung-Bae [1,2]
Affiliations
[1] Yonsei Univ, Dept Comp Sci, Seoul 03722, South Korea
[2] Yonsei Univ, Dept Artificial Intelligence, Seoul 03722, South Korea
Keywords
Continual learning; Memory replay; Sample generation; Multivariate Gaussian distribution; Expectation-maximization
DOI
10.1007/978-3-031-21753-1_35
CLC classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Continual learning is a learning setup for environments where data are introduced sequentially and a model continually learns new tasks. However, the model forgets previously learned knowledge as it learns new classes. One approach keeps a small number of previous samples, but this causes other problems such as overfitting and class imbalance. In this paper, we propose a method that retrains a network with representations generated from an estimated multivariate Gaussian distribution. The representations are feature vectors produced by a CNN that is trained with gradient regularization to prevent distribution shift, so the stored means and covariances can generate realistic representations. The generated vectors cover every class seen so far, which helps prevent forgetting. Our 6-fold cross-validation experiment shows that the proposed method outperforms existing continual learning methods by 1.14%p and 4.60%p on CIFAR10 and CIFAR100, respectively. Moreover, we visualize the generated vectors with t-SNE to confirm that a multivariate Gaussian mixture is valid for estimating the distribution of the data representations.
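The following is a minimal, hypothetical sketch (not the authors' released code) of the replay-generation step described in the abstract: a multivariate Gaussian is fitted to the CNN feature vectors of each class, and pseudo-representations of all previously seen classes are sampled from the stored means and covariances for retraining. The names (FeatureMemory, sample_replay), the feature dimension, and the small diagonal regularization term are illustrative assumptions; the gradient regularization of the CNN itself is not shown.

```python
# Sketch: class-wise multivariate Gaussians over CNN feature vectors,
# sampled to produce replay data for previously seen classes.
import numpy as np


class FeatureMemory:
    """Stores a mean vector and covariance matrix per class."""

    def __init__(self):
        self.stats = {}  # class id -> (mean, covariance)

    def update(self, class_id, features):
        # features: (n_samples, feature_dim) array of CNN representations
        mean = features.mean(axis=0)
        cov = np.cov(features, rowvar=False)
        # Small diagonal term keeps the covariance positive definite (assumption).
        cov += 1e-6 * np.eye(cov.shape[0])
        self.stats[class_id] = (mean, cov)

    def sample_replay(self, n_per_class, rng=None):
        # Draw pseudo-representations for every class seen so far.
        rng = np.random.default_rng() if rng is None else rng
        feats, labels = [], []
        for class_id, (mean, cov) in self.stats.items():
            feats.append(rng.multivariate_normal(mean, cov, size=n_per_class))
            labels.append(np.full(n_per_class, class_id))
        return np.vstack(feats), np.concatenate(labels)


if __name__ == "__main__":
    # Toy usage: two "old" classes with 64-dimensional representations.
    rng = np.random.default_rng(0)
    memory = FeatureMemory()
    for c in (0, 1):
        memory.update(c, rng.normal(loc=c, scale=1.0, size=(200, 64)))
    replay_x, replay_y = memory.sample_replay(n_per_class=32, rng=rng)
    print(replay_x.shape, np.bincount(replay_y))  # (64, 64) [32 32]
```

In the paper's setting, the sampled vectors would be mixed with features of the current task when retraining the classifier head, so that every class seen so far remains represented.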
Pages: 359-368
Number of pages: 10