CONTINUAL LEARNING WITH FOUNDATION MODELS: AN EMPIRICAL STUDY OF LATENT REPLAY

Cited by: 0
Authors
Ostapenko, Oleksiy [1,2,3]
Lesort, Timothee [1,2]
Rodriguez, Pau [3]
Arefin, Md Rifat [1,2]
Douillard, Arthur [4,6]
Rish, Irina [1,2,7]
Charlin, Laurent [1,5,7]
Affiliations
[1] Mila Quebec AI Inst, Montreal, PQ, Canada
[2] Univ Montreal, Montreal, PQ, Canada
[3] ServiceNow, Santa Clara, CA 94043 USA
[4] Heuritech, Paris, France
[5] HEC Montreal, Montreal, PQ, Canada
[6] Sorbonne Univ, Paris, France
[7] Canada CIFAR AI Chair, Montreal, PQ, Canada
Source
CONFERENCE ON LIFELONG LEARNING AGENTS, 2022, Vol. 199
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
The rapid development of large-scale pre-training has resulted in foundation models that can act as effective feature extractors on a variety of downstream tasks and domains. Motivated by this, we study the efficacy of pre-trained vision models as a foundation for downstream continual learning (CL) scenarios. Our goal is twofold. First, we want to understand the compute-accuracy trade-off between CL in the raw-data space and in the latent space of pre-trained encoders. Second, we investigate how the characteristics of the encoder, of the pre-training algorithm and data, and of the resulting latent space affect CL performance. To this end, we compare various pre-trained models in large-scale benchmarking scenarios with a vanilla replay setting applied in the latent space and in the raw-data space. Notably, this study shows how transfer, forgetting, task similarity, and learning depend on the characteristics of the input data and not necessarily on the CL algorithm. First, we show that under some circumstances reasonable CL performance can readily be achieved with a non-parametric classifier at negligible compute cost. We then show how models pre-trained on broader data yield better performance across replay-buffer sizes, and we explain this via the representational similarity and transfer properties of their representations. Finally, we show the effectiveness of self-supervised (SSL) pre-training for downstream domains that are out-of-distribution relative to the pre-training domain. We point out and validate several research directions that can further increase the efficacy of latent CL, including representation ensembling. The diverse set of datasets used in this study can serve as a compute-efficient playground for further CL research. The codebase is available at https://github.com/oleksost/latent_CL.
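To make the latent-replay setting described in the abstract concrete, the following is a minimal illustrative sketch rather than the authors' codebase (linked above): a frozen pre-trained encoder produces latent features once per example, a lightweight classification head is trained continually, and a small buffer of stored latents is replayed alongside each new task's features. The choice of a torchvision ResNet-18 encoder, the buffer size, and the loop structure are assumptions made for illustration; the paper evaluates many encoders, pre-training regimes, and replay sizes.

```python
# Illustrative latent-replay sketch (assumptions noted above; see
# https://github.com/oleksost/latent_CL for the authors' actual code).
import torch
import torch.nn as nn
from torchvision.models import resnet18

device = "cuda" if torch.cuda.is_available() else "cpu"

# Frozen pre-trained encoder: latents are extracted once and are cheap to store.
encoder = resnet18(weights="IMAGENET1K_V1")
encoder.fc = nn.Identity()                 # keep the 512-d feature output
encoder = encoder.eval().requires_grad_(False).to(device)

feature_dim, num_classes = 512, 100        # assumed sizes for illustration
head = nn.Linear(feature_dim, num_classes).to(device)
optimizer = torch.optim.SGD(head.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()

buffer_feats, buffer_labels = [], []       # latent replay buffer (lists of tensors)

def train_task(loader, store_per_task=500):
    """Train the head on one task, replaying stored latents from past tasks."""
    head.train()
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        with torch.no_grad():
            feats = encoder(images)                        # current-task latents
        if buffer_feats:                                   # mix in replayed latents
            idx = torch.randint(len(buffer_feats), (images.size(0),)).tolist()
            feats = torch.cat([feats, torch.stack([buffer_feats[i] for i in idx]).to(device)])
            labels = torch.cat([labels, torch.stack([buffer_labels[i] for i in idx]).to(device)])
        loss = criterion(head(feats), labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    # After the task, store a subset of its latents for future replay.
    stored = 0
    for images, labels in loader:
        if stored >= store_per_task:
            break
        with torch.no_grad():
            feats = encoder(images.to(device)).cpu()
        buffer_feats.extend(feats)                         # extend iterates over rows
        buffer_labels.extend(labels)
        stored += images.size(0)
```

The non-parametric classifier mentioned in the abstract can be approximated in this sketch by replacing the linear head with a nearest-class-mean rule over the stored latents; again, this is an assumed instantiation for illustration, not the paper's exact setup.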
Pages: 32
Related Papers (50 in total)
  • [1] Latent Replay for Real-Time Continual Learning
    Pellegrini, Lorenzo
    Graffieti, Gabriele
    Lomonaco, Vincenzo
    Maltoni, Davide
    2020 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2020, : 10203 - 10209
  • [2] A Benchmark and Empirical Analysis for Replay Strategies in Continual Learning
    Yang, Qihan
    Feng, Fan
    Chan, Rosa H. M.
    CONTINUAL SEMI-SUPERVISED LEARNING, CSSL 2021, 2022, 13418 : 75 - 90
  • [3] BinPlay: A Binary Latent Autoencoder for Generative Replay Continual Learning
    Deja, Kamil
    Wawrzynski, Pawel
    Marczak, Daniel
    Masarczyk, Wojciech
    Trzcinski, Tomasz
    2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
  • [4] Experience Replay for Continual Learning
    Rolnick, David
    Ahuja, Arun
    Schwarz, Jonathan
    Lillicrap, Timothy P.
    Wayne, Greg
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [5] Marginal Replay vs Conditional Replay for Continual Learning
    Lesort, Timothee
    Gepperth, Alexander
    Stoian, Andrei
    Filliat, David
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2019: DEEP LEARNING, PT II, 2019, 11728 : 466 - 480
  • [6] Retrospective Adversarial Replay for Continual Learning
    Kumari, Lilly
    Wang, Shengjie
    Zhou, Tianyi
    Bilmes, Jeff
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [7] Knowledge Capture and Replay for Continual Learning
    Gopalakrishnan, Saisubramaniam
    Singh, Pranshu Ranjan
    Fayek, Haytham
    Ramasamy, Savitha
    Ambikapathi, ArulMurugan
    2022 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2022), 2022, : 337 - 345
  • [8] Continual Learning with Deep Generative Replay
    Shin, Hanul
    Lee, Jung Kwon
    Kim, Jaehong
    Kim, Jiwon
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
  • [9] Hierarchical Correlations Replay for Continual Learning
    Wang, Qiang
    Liu, Jiayi
    Ji, Zhong
    Pang, Yanwei
    Zhang, Zhongfei
    KNOWLEDGE-BASED SYSTEMS, 2022, 250
  • [10] Generative negative replay for continual learning
    Graffieti, Gabriele
    Maltoni, Davide
    Pellegrini, Lorenzo
    Lomonaco, Vincenzo
    NEURAL NETWORKS, 2023, 162 : 369 - 383