Decomposed Soft Prompt Guided Fusion Enhancing for Compositional Zero-Shot Learning

Cited by: 12
Authors
Lu, Xiaocheng [1]
Guo, Song [1, 2]
Liu, Ziming [1]
Guo, Jingcai [1, 2]
Affiliations
[1] Hong Kong Polytech Univ, Dept Comp, Hong Kong, Peoples R China
[2] Hong Kong Polytech Univ, Shenzhen Res Inst, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China
DOI
10.1109/CVPR52729.2023.02256
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Compositional Zero-Shot Learning (CZSL) aims to recognize novel concepts formed by known states and objects during training. Existing methods either learn the combined state-object representation, challenging the generalization of unseen compositions, or design two classifiers to identify state and object separately from image features, ignoring the intrinsic relationship between them. To jointly eliminate the above issues and construct a more robust CZSL system, we propose a novel framework termed Decomposed Fusion with Soft Prompt (DFSP), by involving vision-language models (VLMs) for unseen composition recognition. Specifically, DFSP constructs a vector combination of learnable soft prompts with state and object to establish the joint representation of them. In addition, a cross-modal decomposed fusion module is designed between the language and image branches, which decomposes state and object among language features instead of image features. Notably, being fused with the decomposed features, the image features can be more expressive for learning the relationship with states and objects, respectively, to improve the response of unseen compositions in the pair space, hence narrowing the domain gap between seen and unseen sets. Experimental results on three challenging benchmarks demonstrate that our approach significantly outperforms other state-of-the-art methods by large margins.
Pages: 23560-23569
Number of pages: 10
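
The abstract describes two ideas that a toy example may help make concrete: combining learnable soft prompt vectors with state and object embeddings into a joint pair representation, and decomposing state and object on the language side rather than from image features. The following is a minimal sketch under stated assumptions, not the authors' implementation. The mean-pooling "text encoder", the averaging rule in `decompose`, and all vectors and names are hypothetical stand-ins for illustration, and the cross-modal fusion step is omitted entirely.

```python
import math


def mean(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def compose_prompt(soft_prefix, state_vec, object_vec):
    """Token sequence: learnable prefix tokens, then the state and object tokens."""
    return soft_prefix + [state_vec, object_vec]


def encode_text(tokens):
    """Assumption: mean pooling stands in for a real VLM text encoder."""
    return mean(tokens)


def decompose(pair_features):
    """Assumption: a state (object) feature is the average of the
    pair-level text features of all pairs that share that state (object)."""
    by_state, by_object = {}, {}
    for (state, obj), feature in pair_features.items():
        by_state.setdefault(state, []).append(feature)
        by_object.setdefault(obj, []).append(feature)
    state_features = {s: mean(f) for s, f in by_state.items()}
    object_features = {o: mean(f) for o, f in by_object.items()}
    return state_features, object_features


# Hypothetical toy embeddings (4 dimensions); in practice these are learned.
states = {"wet": [1.0, 0.0, 0.0, 0.0], "dry": [0.0, 1.0, 0.0, 0.0]}
objects = {"dog": [0.0, 0.0, 1.0, 0.0], "cat": [0.0, 0.0, 0.0, 1.0]}
soft_prefix = [[0.1, 0.1, 0.1, 0.1]]  # one soft prompt token, frozen here

# Joint representation for every state-object pair.
pair_features = {
    (s, o): encode_text(compose_prompt(soft_prefix, s_vec, o_vec))
    for s, s_vec in states.items()
    for o, o_vec in objects.items()
}

# Language-side decomposition into state and object features.
state_features, object_features = decompose(pair_features)

# Hypothetical image feature, imagined to come from an image encoder.
image_feature = [0.9, 0.1, 0.8, 0.2]

best_pair = max(pair_features, key=lambda p: cosine(image_feature, pair_features[p]))
best_state = max(state_features, key=lambda s: cosine(image_feature, state_features[s]))
best_object = max(object_features, key=lambda o: cosine(image_feature, object_features[o]))
```

In this sketch the image feature is scored against three sets of language features (pairs, states, objects), which mirrors the abstract's point that states and objects are separated among language features instead of image features. A real system would replace the fixed vectors with trained parameters and a pretrained vision-language encoder.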