Dynamic visual-guided selection for zero-shot learning

Cited by: 2
|
Authors
Zhou, Yuan [1 ]
Xiang, Lei [1 ]
Liu, Fan [1 ]
Duan, Haoran [2 ]
Long, Yang [2 ]
Affiliations
[1] Nanjing Univ Informat Sci & Technol, Sch Artificial Intelligence, Nanjing 210044, Jiangsu, Peoples R China
[2] Univ Durham, Dept Comp Sci, Durham, England
Source
JOURNAL OF SUPERCOMPUTING | 2024, Vol. 80, Iss. 3
Keywords
Visual-guided selection; Class prototype refinement; Task-relevant regions; Zero-shot learning;
DOI
10.1007/s11227-023-05625-1
Chinese Library Classification
TP3 [computing technology, computer technology];
Discipline code
0812 ;
Abstract
Zero-shot learning (ZSL) methods currently identify seen or unseen classes by relying on semantic attribute prototypes or class information. However, hand-annotated attributes describe a category as a whole rather than each individual image in it, and attribute information is inconsistent across different images of the same category because of varying viewpoints. We therefore propose dynamic visual-guided selection (DVGS), which dynamically focuses on different regions and refines the class prototype for each image. Instead of directly aligning an image's global feature with its semantic class vector, or its local features with all attribute vectors, the proposed method learns a vision-guided soft mask that refines the class prototype for each image, and then uses the refined prototype to discover the most task-relevant regions for fine-grained recognition. Extensive experiments on three benchmarks verify the effectiveness of DVGS and set a new state of the art. DVGS achieves the best results on the fine-grained datasets under both the conventional zero-shot learning (CZSL) and generalized zero-shot learning (GZSL) settings. In particular, on the SUN dataset it surpasses the second-best approach by 10.2% in the CZSL setting, and on CUB it outperforms the second-best method by an average of 4% across the CZSL and GZSL settings. Although it places second on the AWA2 dataset, DVGS remains closely competitive, trailing the best performance by only 3.4% in CZSL and 1.2% in GZSL.
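To illustrate the general idea of a vision-guided soft mask that reweights class attribute prototypes per image, here is a minimal NumPy sketch. It is not the authors' implementation: the function name `refine_prototype`, the shapes, and the max-pooled attribute evidence followed by a softmax mask are all illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def refine_prototype(local_feats, attr_protos, proj):
    """Sketch of image-conditioned prototype refinement.

    local_feats: (R, D) region features of one image
    attr_protos: (A, D) shared attribute prototype vectors
    proj:        (D, A) projection mapping regions to attribute evidence
    """
    evidence = local_feats @ proj            # (R, A) per-region attribute scores
    pooled = evidence.max(axis=0)            # (A,)  strongest evidence per attribute
    mask = softmax(pooled)                   # (A,)  soft mask over attributes
    refined = mask[:, None] * attr_protos    # (A, D) image-specific refined prototype
    return refined, mask
```

The soft mask downweights attributes that find no visual support in this particular image, so the same class prototype is adapted differently for each view of the category.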
Pages: 4401-4419
Page count: 19