Context-Aware Robust Fine-Tuning

Cited by: 0
Authors
Xiaofeng Mao
Yufeng Chen
Xiaojun Jia
Rong Zhang
Hui Xue
Zhao Li
Affiliations
[1] Alibaba Group
[2] Institute of Information Engineering, Chinese Academy of Sciences
[3] Zhejiang University
Source
International Journal of Computer Vision, 2024, 132(05): 1685-1700
Keywords
Pre-trained models; CLIP; Fine-tuning; Robustness
DOI
Not available
Abstract
Contrastive language-image pre-trained (CLIP) models have the zero-shot ability to classify an image as belonging to "[CLASS]" by using the similarity between the image and the prompt sentence "a [CONTEXT] of [CLASS]". Based on exhaustive text cues in "[CONTEXT]", the CLIP model is aware of different contexts, e.g. background, style, and viewpoint, and exhibits unprecedented robustness against a wide range of distribution shifts. However, recent works find that further fine-tuning of CLIP models improves accuracy but sacrifices robustness on downstream tasks. We conduct an empirical investigation showing that fine-tuning corrupts the context-aware ability of pre-trained CLIP features. To solve this problem, we propose Context-Aware Robust Fine-tuning (CAR-FT). CAR-FT regularizes the model during fine-tuning to capture the context information. Specifically, we use zero-shot prompt weights to obtain the context distribution contained in the image. By minimizing the Kullback–Leibler divergence (KLD) between the context distributions induced by the original and fine-tuned CLIP models, CAR-FT lets downstream tasks inherit the context-aware ability of CLIP and achieves both higher in-distribution (ID) and out-of-distribution (OOD) accuracy. The experimental results show that CAR-FT achieves superior robustness on five OOD test datasets of ImageNet while also bringing accuracy gains on nine downstream tasks. Additionally, CAR-FT surpasses previous domain generalization (DG) methods and reaches 78.5% average accuracy on the DomainBed benchmark, establishing a new state of the art.
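As a concrete illustration of the mechanism described in the abstract, the following is a minimal PyTorch sketch of a CAR-FT-style regularizer, assuming the open-source `clip` package. The context prompt list, the temperature, the loss weight `lam`, and all helper names are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch of a CAR-FT-style objective: a KLD regularizer between the
# context distributions induced by the original and fine-tuned CLIP models.
# Prompts, temperature, and `lam` are illustrative assumptions.
import torch
import torch.nn.functional as F
import clip  # https://github.com/openai/CLIP

device = "cuda" if torch.cuda.is_available() else "cpu"
model, _ = clip.load("ViT-B/16", device=device)    # branch being fine-tuned
frozen, _ = clip.load("ViT-B/16", device=device)   # original CLIP, kept fixed
for p in frozen.parameters():
    p.requires_grad_(False)

# Hypothetical "[CONTEXT]" prompts covering background/style/viewpoint cues.
CONTEXT_PROMPTS = ["a photo of an object", "a sketch of an object",
                   "a painting of an object", "a cartoon of an object"]
with torch.no_grad():
    tokens = clip.tokenize(CONTEXT_PROMPTS).to(device)
    # Zero-shot prompt weights from the original text encoder.
    ctx_weights = F.normalize(frozen.encode_text(tokens).float(), dim=-1)

def context_logits(image_features, temperature=0.01):
    """Similarity of normalized image features to the zero-shot prompt weights."""
    return image_features @ ctx_weights.t() / temperature

def car_ft_loss(images, labels, classifier, lam=1.0):
    feats_ft = F.normalize(model.encode_image(images).float(), dim=-1)
    ce = F.cross_entropy(classifier(feats_ft), labels)  # downstream task loss
    with torch.no_grad():
        feats_orig = F.normalize(frozen.encode_image(images).float(), dim=-1)
    # KLD between context distributions of the fine-tuned and original models.
    kld = F.kl_div(F.log_softmax(context_logits(feats_ft), dim=-1),
                   F.softmax(context_logits(feats_orig), dim=-1),
                   reduction="batchmean")
    return ce + lam * kld
```

In this sketch the KLD term pushes the fine-tuned image encoder to assign the same distribution over context prompts as the frozen original CLIP, the property the abstract credits for preserving OOD robustness, while the cross-entropy term drives in-distribution accuracy.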
Pages: 1685-1700
Number of pages: 15
Related Papers
50 records in total
  • [1] Context-Aware Robust Fine-Tuning
    Mao, Xiaofeng
    Chen, Yufeng
    Jia, Xiaojun
    Zhang, Rong
    Xue, Hui
    Li, Zhao
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2024, 132 (05) : 1685 - 1700
  • [2] Context-Aware and Semantic-Consistent Spatial Interactions for One-Shot Object Detection Without Fine-Tuning
    Yang, Hanqing
    Cai, Sijia
    Deng, Bing
    Ye, Jieping
    Lin, Guosheng
    Zhang, Yu
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (07) : 5424 - 5439
  • [3] Fine-tuning in the context of Bayesian theory testing
    Luke A. Barnes
    European Journal for Philosophy of Science, 2018, 8 : 253 - 269
  • [4] Fast Trainable Projection for Robust Fine-Tuning
    Tian, Junjiao
    Liu, Yen-Cheng
    Smith, James Seale
    Kira, Zsolt
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [5] Fine-tuning in the context of Bayesian theory testing
    Barnes, Luke A.
    EUROPEAN JOURNAL FOR PHILOSOPHY OF SCIENCE, 2018, 8 (02) : 253 - 269
  • [6] Context-aware features and robust image representations
    Martins, P.
    Carvalho, P.
    Gatta, C.
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2014, 25 (02) : 339 - 348
  • [7] A Framework for Programming Robust Context-Aware Applications
    Kulkarni, Devdatta
    Tripathi, Anand
    IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2010, 36 (02) : 184 - 197
  • [8] Robust Estimation Using Context-Aware Filtering
    Ivanov, Radoslav
    Atanasov, Nikolay
    Pajic, Miroslav
    Pappas, George
    Lee, Insup
    2015 53RD ANNUAL ALLERTON CONFERENCE ON COMMUNICATION, CONTROL, AND COMPUTING (ALLERTON), 2015, : 590 - 597
  • [9] Context-aware generative prompt tuning for relation extraction
    Liu, Xiaoyong
    Wen, Handong
    Xu, Chunlin
    Du, Zhiguo
    Li, Huihui
    Hu, Miao
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2024, 15 (12) : 5495 - 5508
  • [10] Fine-tuning
    Anon.
    AVIATION WEEK & SPACE TECHNOLOGY, 2001, 155 (02): : 21 - 21