CONTINUAL LEARNING WITH FOUNDATION MODELS: AN EMPIRICAL STUDY OF LATENT REPLAY

Cited by: 0
Authors
Ostapenko, Oleksiy [1,2,3]
Lesort, Timothee [1,2]
Rodriguez, Pau [3]
Arefin, Md Rifat [1,2]
Douillard, Arthur [4,6]
Rish, Irina [1,2,7]
Charlin, Laurent [1,5,7]
Affiliations
[1] Mila Quebec AI Inst, Montreal, PQ, Canada
[2] Univ Montreal, Montreal, PQ, Canada
[3] ServiceNow, Santa Clara, CA 94043 USA
[4] Heuritech, Paris, France
[5] HEC Montreal, Montreal, PQ, Canada
[6] Sorbonne Univ, Paris, France
[7] Canada CIFAR AI Chair, Montreal, PQ, Canada
Source
CONFERENCE ON LIFELONG LEARNING AGENTS, 2022, Vol. 199
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
The rapid development of large-scale pre-training has resulted in foundation models that can act as effective feature extractors on a variety of downstream tasks and domains. Motivated by this, we study the efficacy of pre-trained vision models as a foundation for downstream continual learning (CL) scenarios. Our goal is twofold. First, we want to understand the compute-accuracy trade-off between CL in the raw-data space and in the latent space of pre-trained encoders. Second, we investigate how the characteristics of the encoder, of the pre-training algorithm and data, and of the resulting latent space affect CL performance. To this end, we compare various pre-trained models in large-scale benchmarking scenarios with a vanilla replay setting applied in the latent space and in the raw-data space. Notably, this study shows how transfer, forgetting, task similarity, and learning depend on the characteristics of the input data and not necessarily on the CL algorithm. First, we show that under some circumstances reasonable CL performance can readily be achieved with a non-parametric classifier at negligible compute cost. We then show how models pre-trained on broader data yield better performance across replay-buffer sizes, and we explain this via the representational similarity and transfer properties of their representations. Finally, we show the effectiveness of self-supervised (SSL) pre-training for downstream domains that are out-of-distribution relative to the pre-training domain. We point out and validate several research directions that can further increase the efficacy of latent CL, including representation ensembling. The diverse set of datasets used in this study can serve as a compute-efficient playground for further CL research. The codebase is available at https://github.com/oleksost/latent_CL.
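To make the latent-replay setting described in the abstract concrete, the following is a minimal illustrative sketch rather than the authors' codebase (linked above): a frozen pre-trained encoder produces latent features once per example, a lightweight classification head is trained continually, and a small buffer of stored latents is replayed alongside each new task's features. The choice of a torchvision ResNet-18 encoder, the buffer size, and the loop structure are assumptions made for illustration; the paper evaluates many encoders, pre-training regimes, and replay sizes.

```python
# Illustrative latent-replay sketch (assumptions noted above; see
# https://github.com/oleksost/latent_CL for the authors' actual code).
import torch
import torch.nn as nn
from torchvision.models import resnet18

device = "cuda" if torch.cuda.is_available() else "cpu"

# Frozen pre-trained encoder: latents are extracted once and are cheap to store.
encoder = resnet18(weights="IMAGENET1K_V1")
encoder.fc = nn.Identity()                 # keep the 512-d feature output
encoder = encoder.eval().requires_grad_(False).to(device)

feature_dim, num_classes = 512, 100        # assumed sizes for illustration
head = nn.Linear(feature_dim, num_classes).to(device)
optimizer = torch.optim.SGD(head.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()

buffer_feats, buffer_labels = [], []       # latent replay buffer (lists of tensors)

def train_task(loader, store_per_task=500):
    """Train the head on one task, replaying stored latents from past tasks."""
    head.train()
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        with torch.no_grad():
            feats = encoder(images)                        # current-task latents
        if buffer_feats:                                   # mix in replayed latents
            idx = torch.randint(len(buffer_feats), (images.size(0),)).tolist()
            feats = torch.cat([feats, torch.stack([buffer_feats[i] for i in idx]).to(device)])
            labels = torch.cat([labels, torch.stack([buffer_labels[i] for i in idx]).to(device)])
        loss = criterion(head(feats), labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    # After the task, store a subset of its latents for future replay.
    stored = 0
    for images, labels in loader:
        if stored >= store_per_task:
            break
        with torch.no_grad():
            feats = encoder(images.to(device)).cpu()
        buffer_feats.extend(feats)                         # extend iterates over rows
        buffer_labels.extend(labels)
        stored += images.size(0)
```

The non-parametric classifier mentioned in the abstract can be approximated in this sketch by replacing the linear head with a nearest-class-mean rule over the stored latents; again, this is an assumed instantiation for illustration, not the paper's exact setup.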
Pages: 32
Related Papers (50 in total)
  • [1] Latent Replay for Real-Time Continual Learning
    Pellegrini, Lorenzo
    Graffieti, Gabriele
    Lomonaco, Vincenzo
    Maltoni, Davide
    2020 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2020, : 10203 - 10209
  • [2] A Benchmark and Empirical Analysis for Replay Strategies in Continual Learning
    Yang, Qihan
    Feng, Fan
    Chan, Rosa H. M.
    CONTINUAL SEMI-SUPERVISED LEARNING, CSSL 2021, 2022, 13418 : 75 - 90
  • [3] BinPlay: A Binary Latent Autoencoder for Generative Replay Continual Learning
    Deja, Kamil
    Wawrzynski, Pawel
    Marczak, Daniel
    Masarczyk, Wojciech
    Trzcinski, Tomasz
    2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
  • [4] Experience Replay for Continual Learning
    Rolnick, David
    Ahuja, Arun
    Schwarz, Jonathan
    Lillicrap, Timothy P.
    Wayne, Greg
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [5] Marginal Replay vs Conditional Replay for Continual Learning
    Lesort, Timothee
    Gepperth, Alexander
    Stoian, Andrei
    Filliat, David
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2019: DEEP LEARNING, PT II, 2019, 11728 : 466 - 480
  • [6] Retrospective Adversarial Replay for Continual Learning
    Kumari, Lilly
    Wang, Shengjie
    Zhou, Tianyi
    Bilmes, Jeff
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [7] Knowledge Capture and Replay for Continual Learning
    Gopalakrishnan, Saisubramaniam
    Singh, Pranshu Ranjan
    Fayek, Haytham
    Ramasamy, Savitha
    Ambikapathi, ArulMurugan
    2022 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2022), 2022, : 337 - 345
  • [8] Continual Learning with Deep Generative Replay
    Shin, Hanul
    Lee, Jung Kwon
    Kim, Jaehong
    Kim, Jiwon
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
  • [9] Hierarchical Correlations Replay for Continual Learning
    Wang, Qiang
    Liu, Jiayi
    Ji, Zhong
    Pang, Yanwei
    Zhang, Zhongfei
    KNOWLEDGE-BASED SYSTEMS, 2022, 250
  • [10] Generative negative replay for continual learning
    Graffieti, Gabriele
    Maltoni, Davide
    Pellegrini, Lorenzo
    Lomonaco, Vincenzo
    NEURAL NETWORKS, 2023, 162 : 369 - 383