Audio-Visual Generalized Zero-Shot Learning Based on Variational Information Bottleneck

Cited by: 0
Authors
Li, Yapeng
Luo, Yong [1]
Du, Bo [1]
Affiliations
[1] Wuhan Univ, Inst Artificial Intelligence, Sch Comp Sci, Natl Engn Res Ctr Multimedia Software, Wuhan 430072, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Audio-visual; generalized zero-shot learning; information bottleneck; multi-modality fusion;
DOI
10.1109/ICME55011.2023.00084
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Audio-visual generalized zero-shot learning (GZSL) aims to train a model on seen classes so that it can classify samples from both seen and unseen classes. Because no unseen-class samples are available during training, the model tends to misclassify unseen-class samples as seen classes. To mitigate this problem, we propose a method based on the variational information bottleneck for audio-visual GZSL. Specifically, we model the joint representation as a product of experts over the marginal representations to integrate audio and visual information. In addition, we apply the variational information bottleneck to the learning of the audio-visual joint representation and of the marginal representations of the audio, visual, and text-label modalities. This helps the model reduce the negative impact of information that does not generalize to unseen classes. Experiments on the UCF-GZSL, VGGSound-GZSL, and ActivityNet-GZSL benchmarks demonstrate the effectiveness and superiority of the proposed model for audio-visual GZSL.
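The record gives no implementation details, so the following is only a minimal sketch of the two ingredients the abstract names: fusing per-modality representations as a product of experts, and the KL regularizer that typically appears in a variational information bottleneck objective. It assumes diagonal Gaussian experts and a standard normal prior, neither of which is stated in the abstract. The function names, the `include_prior` flag, and the example values are hypothetical.

```python
import math

def product_of_experts(mus, logvars, include_prior=True):
    """Fuse diagonal Gaussian experts by precision weighting.

    mus, logvars: one list of floats per expert, all of the same length.
    Assumption: each expert is a diagonal Gaussian. Adding a standard
    normal prior as an extra expert is an illustrative choice, not
    something the abstract specifies.
    """
    dim = len(mus[0])
    experts = list(zip(mus, logvars))
    if include_prior:
        experts.append(([0.0] * dim, [0.0] * dim))  # N(0, I) expert
    joint_mu, joint_logvar = [], []
    for d in range(dim):
        # Precision is the inverse variance of each expert in dimension d.
        precisions = [math.exp(-lv[d]) for _, lv in experts]
        total = sum(precisions)
        mean = sum(m[d] * p for (m, _), p in zip(experts, precisions)) / total
        joint_mu.append(mean)
        joint_logvar.append(-math.log(total))  # joint variance = 1 / total
    return joint_mu, joint_logvar

def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, logvar))

# Hypothetical audio and visual posteriors for a 2-dimensional latent.
audio_mu, audio_logvar = [1.0, -0.5], [0.0, 0.0]
visual_mu, visual_logvar = [3.0, 0.5], [0.0, 0.0]
joint_mu, joint_logvar = product_of_experts(
    [audio_mu, visual_mu], [audio_logvar, visual_logvar], include_prior=False
)
kl_joint = kl_to_standard_normal(joint_mu, joint_logvar)
```

In a bottleneck-style objective, a term such as `kl_joint` would usually be scaled by a weight and added to the task loss, so that the representation is penalized for carrying information beyond what the task needs. How the paper weights or combines these terms across the joint, audio, visual, and text-label representations is not given in this record.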
Pages: 450-455
Page count: 6
Related papers
50 in total
  • [41] Progressive Semantic-Visual Mutual Adaption for Generalized Zero-Shot Learning
    Liu, Man
    Li, Feng
    Zhang, Chunjie
    Wei, Yunchao
    Bai, Huihui
    Zhao, Yao
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 15337 - 15346
  • [42] Superclass-aware visual feature disentangling for generalized zero-shot learning
    Niu, Chang
    Shang, Junyuan
    Zhou, Zhiheng
    Yang, Junmei
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 258
  • [43] EXPLORING META INFORMATION FOR AUDIO-BASED ZERO-SHOT BIRD CLASSIFICATION
    Gebhard, Alexander
    Triantafyllopoulos, Andreas
    Bez, Teresa
    Christ, Lukas
    Kathan, Alexander
    Schuller, Bjoern W.
    2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, ICASSP 2024, 2024, : 1211 - 1215
  • [44] Audio-Visual Generalized Few-Shot Learning with Prototype-Based Co-Adaptation
    Zhang, Yi-Kai
    Zhou, Da-Wei
    Ye, Han-Jia
    Zhan, De-Chuan
    INTERSPEECH 2022, 2022, : 531 - 535
  • [45] Entropy-Based Uncertainty Calibration for Generalized Zero-Shot Learning
    Chen, Zhi
    Huang, Zi
    Li, Jingjing
    Zhang, Zheng
    DATABASES THEORY AND APPLICATIONS (ADC 2021), 2021, 12610 : 139 - 151
  • [46] Zero-Shot Federated Learning with New Classes for Audio Classification
    Gudur, Gautham Krishna
    Perepu, Satheesh Kumar
    INTERSPEECH 2021, 2021, : 1579 - 1583
  • [47] Learning Multiple Criteria Calibration for Generalized Zero-shot Learning
    Lu, Ziqian
    Lu, Zhe-Ming
    Yu, Yunlong
    He, Zewei
    Luo, Hao
    Zheng, Yangming
    KNOWLEDGE-BASED SYSTEMS, 2024, 300
  • [48] Synthetic Sample Selection for Generalized Zero-Shot Learning
    Gowda, Shreyank N.
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW, 2023, : 58 - 67
  • [49] Discriminative comparison classifier for generalized zero-shot learning
    Hou, Mingzhen
    Xia, Wei
    Zhang, Xiangdong
    Gao, Quanxue
    NEUROCOMPUTING, 2020, 414 (414) : 10 - 17
  • [50] Data-Free Generalized Zero-Shot Learning
    Tang, Bowen
    Zhang, Jing
    Yan, Long
    Yu, Qian
    Sheng, Lu
    Xu, Dong
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 6, 2024, : 5108 - 5117