Continual pre-training mitigates forgetting in language and vision

Times Cited: 0
Authors
Cossu, Andrea [1 ]
Carta, Antonio [1 ]
Passaro, Lucia [1 ]
Lomonaco, Vincenzo [1 ]
Tuytelaars, Tinne [2 ]
Bacciu, Davide [1 ]
Affiliations
[1] Univ Pisa, Largo B Pontecorvo 3, I-56127 Pisa, Italy
[2] Katholieke Univ Leuven, Kasteelpk Arenberg 10, B-3001 Leuven, Belgium
Funding
EU Horizon 2020;
Keywords
Continual learning; Lifelong learning; Pre-training; Self-supervised; Forgetting;
DOI
10.1016/j.neunet.2024.106492
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Pre-trained models are commonly used in Continual Learning to initialize the model before training on the stream of non-stationary data. However, pre-training is rarely applied during Continual Learning. We investigate the characteristics of the Continual Pre-Training scenario, where a model is continually pre-trained on a stream of incoming data and only later fine-tuned on different downstream tasks. We introduce an evaluation protocol for Continual Pre-Training which monitors forgetting against a Forgetting Control dataset not present in the continual stream. We disentangle the impact on forgetting of three main factors: the input modality (NLP, Vision), the architecture type (Transformer, ResNet) and the pre-training protocol (supervised, self-supervised). Moreover, we propose a Sample-Efficient Pre-training method (SEP) that speeds up the pre-training phase. We show that the pre-training protocol is the most important factor accounting for forgetting. Surprisingly, we find that self-supervised continual pre-training in both NLP and Vision is sufficient to mitigate forgetting without the use of any Continual Learning strategy. Other factors, such as model depth, input modality and architecture type, are not as crucial.
Pages: 14
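Evaluation Protocol (Sketch)
The abstract outlines an evaluation loop: continually pre-train on the incoming stream, periodically fine-tune copies of the model on the downstream tasks and on the held-out Forgetting Control dataset, and track the control score over time. Below is a minimal Python sketch of that loop under stated assumptions: the pretrain_step, finetune, and evaluate callables are hypothetical placeholders rather than the authors' released code, and finetune is assumed to return a fine-tuned copy without altering the continually pre-trained weights.

from typing import Callable, List, Sequence, Tuple

def continual_pretrain_and_probe(
    model,
    stream: Sequence,            # non-stationary stream of pre-training chunks
    downstream_tasks: Sequence,  # tasks used only for later fine-tuning
    forgetting_control,          # held-out dataset absent from the stream
    pretrain_step: Callable,     # one (self-)supervised pre-training update
    finetune: Callable,          # assumed to return a fine-tuned COPY
    evaluate: Callable,          # returns a scalar score on a dataset
) -> List[Tuple[int, List[float], float]]:
    """Track downstream and Forgetting Control scores along the stream."""
    history = []
    for step, chunk in enumerate(stream):
        model = pretrain_step(model, chunk)  # continual pre-training update
        task_scores = [
            evaluate(finetune(model, task), task) for task in downstream_tasks
        ]
        control_score = evaluate(
            finetune(model, forgetting_control), forgetting_control
        )
        history.append((step, task_scores, control_score))
    return history

Because the Forgetting Control dataset never appears in the stream, a declining control_score across steps isolates forgetting of pre-trained knowledge from ordinary adaptation to the new data.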