Continual pre-training mitigates forgetting in language and vision

Cited by: 0
Authors
Cossu, Andrea [1 ]
Carta, Antonio [1 ]
Passaro, Lucia [1 ]
Lomonaco, Vincenzo [1 ]
Tuytelaars, Tinne [2 ]
Bacciu, Davide [1 ]
Affiliations
[1] Univ Pisa, Largo B Pontecorvo 3, I-56127 Pisa, Italy
[2] Katholieke Univ Leuven, Kasteelpk Arenberg 10, B-3001 Leuven, Belgium
Funding
European Union Horizon 2020
Keywords
Continual learning; Lifelong learning; Pre-training; Self-supervised; Forgetting
DOI
10.1016/j.neunet.2024.106492
Chinese Library Classification (CLC) code
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Pre-trained models are commonly used in Continual Learning to initialize the model before training on the stream of non-stationary data. However, pre-training is rarely applied during Continual Learning. We investigate the characteristics of the Continual Pre-Training scenario, where a model is continually pre-trained on a stream of incoming data and only later fine-tuned to different downstream tasks. We introduce an evaluation protocol for Continual Pre-Training which monitors forgetting against a Forgetting Control dataset not present in the continual stream. We disentangle the impact of three main factors on forgetting: the input modality (NLP, Vision), the architecture type (Transformer, ResNet) and the pre-training protocol (supervised, self-supervised). Moreover, we propose a Sample-Efficient Pre-training method (SEP) that speeds up the pre-training phase. We show that the pre-training protocol is the most important factor accounting for forgetting. Surprisingly, we find that self-supervised continual pre-training in both NLP and Vision is sufficient to mitigate forgetting without the use of any Continual Learning strategy. Other factors, such as model depth, input modality and architecture type, are not as crucial.
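A minimal, illustrative sketch of the evaluation protocol described in the abstract is given below: a backbone is continually pre-trained on a stream of experiences, and after each experience a copy is fine-tuned and evaluated on a held-out Forgetting Control task to probe loss of downstream transfer. All names, the toy backbone, the synthetic data and the plain supervised placeholder objective are assumptions for illustration, not the authors' code; in the paper, the pre-training slot is filled by supervised or self-supervised objectives on real NLP/Vision data.

```python
# Sketch of the Continual Pre-Training evaluation loop (illustrative placeholders only).
import copy
import torch
import torch.nn.functional as F
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def make_loader(n=256, dim=32, classes=4):
    # Synthetic stand-in for a real pre-training / downstream dataset.
    x = torch.randn(n, dim)
    y = torch.randint(0, classes, (n,))
    return DataLoader(TensorDataset(x, y), batch_size=32, shuffle=True)


backbone = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 64))


def pretrain_step(model, loader, epochs=1):
    # Placeholder objective (plain supervised cross-entropy); the paper studies
    # supervised and self-supervised pre-training protocols in this slot.
    head = nn.Linear(64, 4)
    opt = torch.optim.Adam(list(model.parameters()) + list(head.parameters()), lr=1e-3)
    for _ in range(epochs):
        for x, y in loader:
            loss = F.cross_entropy(head(model(x)), y)
            opt.zero_grad()
            loss.backward()
            opt.step()


def finetune_and_eval(model, loader, epochs=1):
    # Fine-tune a copy of the backbone on the Forgetting Control task and
    # report its accuracy, leaving the continually pre-trained backbone intact.
    probe = copy.deepcopy(model)
    head = nn.Linear(64, 4)
    opt = torch.optim.Adam(list(probe.parameters()) + list(head.parameters()), lr=1e-3)
    for _ in range(epochs):
        for x, y in loader:
            loss = F.cross_entropy(head(probe(x)), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
    correct, total = 0, 0
    with torch.no_grad():
        for x, y in loader:
            correct += (head(probe(x)).argmax(dim=1) == y).sum().item()
            total += y.numel()
    return correct / total


stream = [make_loader() for _ in range(3)]  # non-stationary pre-training stream
control_loader = make_loader()              # Forgetting Control dataset, never in the stream

for i, experience in enumerate(stream):
    pretrain_step(backbone, experience)
    acc = finetune_and_eval(backbone, control_loader)
    print(f"after experience {i}: control-task accuracy = {acc:.3f}")
```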
Pages: 14