Video Attribute Prototype Network: A New Perspective for Zero-Shot Video Classification

Cited by: 0

Authors:

Wang, Bo [1]

Zhao, Kaili [1]

Zhao, Hongyang [1]

Pu, Shi

Xiao, Bo [1]

Guo, Jun [1]

Affiliations:

[1] Beijing Univ Posts & Telecommun, Beijing, Peoples R China

Source:

2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS, ICCVW | 2023

DOI:

10.1109/ICCVW60793.2023.00039

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Video attributes, which leverage video content to instantiate class semantics, play a critical role in diversifying semantics in zero-shot video classification, thereby facilitating semantic transfer from seen to unseen classes. However, few prior works discuss video attributes, and most methods treat class names as class semantics, which tend to be loosely defined. In this paper, we propose a Video Attribute Prototype Network (VAPNet) that generates video attributes by learning in-context semantics between video captions and class semantics. Specifically, we introduce a cross-attention module in the Transformer decoder that treats video captions as queries to probe and pool semantically associated class-wise features. To alleviate noise in pre-extracted captions, we learn caption features through a stochastic representation derived from a Gaussian representation whose variance encodes uncertainty. We use a joint video-to-attribute and video-to-video contrastive loss to calibrate visual and semantic features. Experiments show that VAPNet significantly outperforms SoTA by relative improvements of 14.3% on UCF101 and 8.8% on HMDB51, and further surpasses the pre-trained vision-language SoTA by 4.1% and 17.2%. Code is available.
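The two mechanisms named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy vectors, and the use of a log-variance parameterization are all assumptions made for illustration. It draws a caption feature from a Gaussian via the reparameterization trick, then uses that feature as a single query in scaled dot-product attention to pool class-wise features.

```python
import math
import random

def sample_caption_feature(mean, log_var, rng):
    """Reparameterized draw: z = mean + sigma * eps, with eps ~ N(0, 1).

    log_var is an assumed parameterization; sigma = exp(0.5 * log_var).
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mean, log_var)]

def cross_attention_pool(query, class_feats):
    """Scaled dot-product attention with one caption query over class features."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, feat)) / scale
              for feat in class_feats]
    top = max(scores)  # subtract the max for a numerically stable softmax
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(class_feats[0])
    pooled = [sum(w * feat[d] for w, feat in zip(weights, class_feats))
              for d in range(dim)]
    return pooled, weights

# Hypothetical toy inputs: a 3-d caption embedding and two class features.
rng = random.Random(0)
caption_mean = [1.0, 0.0, 0.0]
caption_log_var = [-4.0, -4.0, -4.0]  # small variance = low uncertainty
class_feats = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

query = sample_caption_feature(caption_mean, caption_log_var, rng)
attribute, weights = cross_attention_pool(query, class_feats)
```

In this toy setting the pooled vector plays the role of a video attribute; a larger variance would make the sampled query, and therefore the attention weights, fluctuate more between draws.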


Pages: 315-324

Page count: 10

Total: 50 items

[41] Zero-Shot Video Grounding With Pseudo Query Lookup and Verification
Lu, Yu
Quan, Ruijie
Zhu, Linchao
Yang, Yi
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2024, 33 : 1643 - 1654
[42] Dual Progressive Prototype Network for Generalized Zero-Shot Learning
Wang, Chaoqun
Min, Shaobo
Chen, Xuejin
Sun, Xiaoyan
Li, Houqiang
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
[43] Attribute Distillation for Zero-Shot Recognition
Li, Houjun
Wei, Boquan
Computer Engineering and Applications, 60 (09): : 219 - 227
[44] Zero-shot Learning With Fuzzy Attribute
Liu, Chongwen
Shang, Zhaowei
Tang, Yuan Yan
2017 3RD IEEE INTERNATIONAL CONFERENCE ON CYBERNETICS (CYBCONF), 2017, : 277 - 282
[45] Language-free Training for Zero-shot Video Grounding
Kim, Dahye
Park, Jungin
Lee, Jiyoung
Park, Seongheon
Sohn, Kwanghoon
2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2023, : 2538 - 2547
[46] Isomer: Isomerous Transformer for Zero-shot Video Object Segmentation
Yuan, Yichen
Wang, Yifan
Wang, Lijun
Zhao, Xiaoqi
Lu, Huchuan
Wang, Yu
Su, Weibo
Zhang, Lei
2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION, ICCV, 2023, : 966 - 976
[47] Attribute-Based Classification for Zero-Shot Visual Object Categorization
Lampert, Christoph H.
Nickisch, Hannes
Harmeling, Stefan
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2014, 36 (03) : 453 - 465
[48] Attribute-Based Zero-Shot Learning for Encrypted Traffic Classification
Hu, Ying
Cheng, Guang
Chen, Wenchao
Jiang, Bomiao
IEEE TRANSACTIONS ON NETWORK AND SERVICE MANAGEMENT, 2022, 19 (04): : 4583 - 4599
[49] CI-GNN: Building a Category-Instance Graph for Zero-Shot Video Classification
Gao, Junyu
Xu, Changsheng
IEEE TRANSACTIONS ON MULTIMEDIA, 2020, 22 (12) : 3088 - 3100
[50] Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators
Khachatryan, Levon
Movsisyan, Andranik
Tadevosyan, Vahram
Henschel, Roberto
Wang, Zhangyang
Navasardyan, Shant
Shi, Humphrey
2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 15908 - 15918
