Audio-Visual Generalized Zero-Shot Learning Based on Variational Information Bottleneck

Cited by: 0

Authors:

Li, Yapeng

Luo, Yong [1]

Du, Bo [1]

Affiliations:

[1] Wuhan Univ, Inst Artificial Intelligence, Sch Comp Sci, Natl Engn Res Ctr Multimedia Software, Wuhan 430072, Peoples R China

Source:

2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME | 2023

Funding:

National Natural Science Foundation of China

Keywords:

Audio-visual; generalized zero-shot learning; information bottleneck; multi-modality fusion;

DOI:

10.1109/ICME55011.2023.00084

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Audio-visual generalized zero-shot learning (GZSL) aims to train a model on seen classes so that it can classify samples from both seen and unseen classes. Because no training samples are available for unseen classes, the model tends to misclassify unseen-class samples as seen classes. To mitigate this problem, we propose a method for audio-visual GZSL based on the variational information bottleneck. Specifically, we model the joint representation as a product-of-experts over the marginal representations in order to integrate audio and visual information. In addition, we apply the variational information bottleneck to the learning of both the audio-visual joint representation and the marginal representations of the audio, visual, and text label modalities. This helps the model reduce the negative impact of information that cannot be generalized to unseen classes. Experimental results on the UCF-GZSL, VGGSound-GZSL, and ActivityNet-GZSL benchmarks demonstrate the effectiveness and superiority of the proposed model for audio-visual GZSL.
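
The abstract does not give the exact formulation, so the following is only a minimal sketch of the two ingredients it names, assuming each modality produces a diagonal Gaussian over a shared latent space. Under that assumption, a product-of-experts is itself Gaussian with a precision equal to the sum of the experts' precisions, and a variational information bottleneck is commonly implemented as a KL penalty toward a standard normal prior. The function names `product_of_experts` and `kl_to_standard_normal` and the toy numbers are hypothetical and not taken from the paper.

```python
import math

def product_of_experts(mus, logvars):
    """Fuse diagonal Gaussian experts N(mu_i, exp(logvar_i)) into one Gaussian.

    Assumption: every expert is a diagonal Gaussian. The fused precision is
    the sum of the expert precisions, and the fused mean is the
    precision-weighted average of the expert means.
    """
    joint_mu, joint_logvar = [], []
    for d in range(len(mus[0])):
        precisions = [math.exp(-lv[d]) for lv in logvars]
        total = sum(precisions)
        joint_mu.append(sum(p * mu[d] for p, mu in zip(precisions, mus)) / total)
        joint_logvar.append(-math.log(total))  # log(1 / total precision)
    return joint_mu, joint_logvar

def kl_to_standard_normal(mu, logvar):
    """KL(N(mu, diag(exp(logvar))) || N(0, I)), summed over dimensions.

    This is the compression term usually used as the bottleneck penalty.
    """
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, logvar))

# Toy marginal representations for the audio and visual modalities.
audio_mu, audio_logvar = [0.0, 1.0], [0.0, 0.0]
visual_mu, visual_logvar = [2.0, 1.0], [0.0, 0.0]

joint_mu, joint_logvar = product_of_experts(
    [audio_mu, visual_mu], [audio_logvar, visual_logvar]
)
bottleneck_penalty = kl_to_standard_normal(joint_mu, joint_logvar)
print(joint_mu, joint_logvar, bottleneck_penalty)
```

With two unit-variance experts, the fused mean lies midway between the expert means and the fused variance is halved, which illustrates how agreement between modalities sharpens the joint representation. In a full model the KL term would be weighted and added to a classification or alignment loss; that weighting is not specified in the abstract.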

Pages: 450-455

Page count: 6
