Dynamic visual-guided selection for zero-shot learning

Cited by: 2
|
Authors
Zhou, Yuan [1 ]
Xiang, Lei [1 ]
Liu, Fan [1 ]
Duan, Haoran [2 ]
Long, Yang [2 ]
Affiliations
[1] Nanjing Univ Informat Sci & Technol, Sch Artificial Intelligence, Nanjing 210044, Jiangsu, Peoples R China
[2] Univ Durham, Dept Comp Sci, Durham, England
Source
JOURNAL OF SUPERCOMPUTING | 2024, Vol. 80, Iss. 3
Keywords
Visual-guided selection; Class prototype refinement; Task-relevant regions; Zero-shot learning;
DOI
10.1007/s11227-023-05625-1
Chinese Library Classification
TP3 [computing technology, computer technology];
Discipline code
0812 ;
Abstract
Zero-shot learning (ZSL) methods currently identify seen or unseen classes by relying on semantic attribute prototypes or class information. However, hand-annotated attributes describe a category as a whole rather than each individual image in it, and attribute information is inconsistent across different images of the same category because of varying viewpoints. We therefore propose dynamic visual-guided selection (DVGS), which dynamically focuses on different regions and refines the class prototype for each image. Instead of directly aligning an image's global feature with its semantic class vector, or its local features with all attribute vectors, the proposed method learns a vision-guided soft mask that refines the class prototype for each image, and then uses the refined prototype to discover the most task-relevant regions for fine-grained recognition. Extensive experiments on three benchmarks verify the effectiveness of DVGS and set a new state of the art. DVGS achieves the best results on the fine-grained datasets under both the conventional zero-shot learning (CZSL) and generalized zero-shot learning (GZSL) settings. In particular, on the SUN dataset it surpasses the second-best approach by 10.2% in the CZSL setting, and on CUB it outperforms the second-best method by an average of 4% across the CZSL and GZSL settings. Although it places second on the AWA2 dataset, DVGS remains closely competitive, trailing the best performance by only 3.4% in CZSL and 1.2% in GZSL.
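To illustrate the general idea of a vision-guided soft mask that reweights class attribute prototypes per image, here is a minimal NumPy sketch. It is not the authors' implementation: the function name `refine_prototype`, the shapes, and the max-pooled attribute evidence followed by a softmax mask are all illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def refine_prototype(local_feats, attr_protos, proj):
    """Sketch of image-conditioned prototype refinement.

    local_feats: (R, D) region features of one image
    attr_protos: (A, D) shared attribute prototype vectors
    proj:        (D, A) projection mapping regions to attribute evidence
    """
    evidence = local_feats @ proj            # (R, A) per-region attribute scores
    pooled = evidence.max(axis=0)            # (A,)  strongest evidence per attribute
    mask = softmax(pooled)                   # (A,)  soft mask over attributes
    refined = mask[:, None] * attr_protos    # (A, D) image-specific refined prototype
    return refined, mask
```

The soft mask downweights attributes that find no visual support in this particular image, so the same class prototype is adapted differently for each view of the category.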
Pages: 4401-4419
Page count: 19