ADAPTIVE MULTI-SCALE SEMANTIC FUSION NETWORK FOR ZERO-SHOT LEARNING

Citations: 0
Authors
Song, Jing [1]
Peng, Peixi [2]
Zhai, Yunpeng [1]
Zhang, Chong [1]
Tian, Yonghong [2]
Affiliations
[1] Peking Univ, Shenzhen Grad Sch, Shenzhen, Peoples R China
[2] Peking Univ, Beijing, Peoples R China
Keywords
Multi-scale; attribute attention; semantic fusion; global and local semantic attributes; class-center triplet loss
DOI
10.1109/ICMEW53276.2021.9455945
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Zero-shot learning aims to recognize unseen objects accurately by learning matrices that bridge the gap between visual information and semantic attributes. Existing approaches focus predominantly on learning a proper mapping function for visual-semantic embedding while neglecting the learning of discriminative semantic features, which leads to severe semantic ambiguity. We propose a practical Adaptive Multi-scale Semantic Fusion (AMSF) framework that performs object-based multi-scale attribute attention for semantic disambiguation. Considering both the low-level visual information and the global class-level features that relate to this ambiguity, the proposed method jointly learns cooperative global and local semantic attributes from different scales. Moreover, under the joint supervision of an embedding softmax loss and a class-center triplet loss, the model is encouraged to learn highly discriminative semantic and visual features with high inter-class dispersion and intra-class compactness. The method is evaluated on the CUB, AwA2, and SUN datasets, and the experimental results indicate that it achieves state-of-the-art performance.
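The abstract names two training signals, an embedding softmax loss and a class-center triplet loss, without giving their formulas. The following is only an illustrative sketch of how such a joint objective is commonly set up, not the authors' formulation. It assumes a single linear mapping `W` from visual features to attribute space, Euclidean distances, class centers taken as per-batch means, and hypothetical `margin` and `weight` hyperparameters; none of these details come from the record.

```python
import numpy as np

def embedding_softmax_loss(visual, W, class_attrs, labels):
    """Cross-entropy over visual-semantic compatibility scores (assumed form)."""
    # Project visual features into attribute space and score every class.
    scores = visual @ W @ class_attrs.T                    # (n, num_classes)
    scores = scores - scores.max(axis=1, keepdims=True)    # numerical stability
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def class_center_triplet_loss(features, labels, margin=1.0):
    """Hinge on (distance to own center) - (distance to nearest other center)."""
    classes = np.unique(labels)
    # Class centers are estimated as batch means (an assumption of this sketch).
    centers = np.stack([features[labels == c].mean(axis=0) for c in classes])
    # Distances between every sample and every center: (n, num_present_classes).
    dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
    own = labels[:, None] == classes[None, :]
    pos = dists[own]                                       # exactly one True per row
    neg = np.where(own, np.inf, dists).min(axis=1)
    return np.maximum(0.0, pos - neg + margin).mean()

def joint_loss(visual, W, class_attrs, labels, weight=1.0, margin=1.0):
    """Weighted sum of the two terms; `weight` is a hypothetical trade-off."""
    embedded = visual @ W
    return (embedding_softmax_loss(visual, W, class_attrs, labels)
            + weight * class_center_triplet_loss(embedded, labels, margin))
```

In this sketch the triplet term falls to zero once every sample is closer to its own class center than to any other center by at least `margin`, which is one way to encourage intra-class compactness together with inter-class dispersion.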
Pages: 6
Related papers
50 in total
  • [1] Exploiting multi-scale contextual prompt learning for zero-shot semantic segmentation
    Wang, Yiqi
    Tian, Yingjie
    DISPLAYS, 2024, 81
  • [2] Multi-scale visual attention for attribute disambiguation in zero-shot learning
    Tian, Long
    Chen, Bo
    Ren, Jie
    Zhang, Hao
    Wu, Zhenhua
    Han, Ning
    Chen, Yuanwei
    Liu, Hongwei
    SIGNAL PROCESSING-IMAGE COMMUNICATION, 2022, 103
  • [3] Attentive Semantic Preservation Network for Zero-Shot Learning
    Lu, Ziqian
    Yu, Yunlong
    Lu, Zhe-Ming
    Shen, Feng-Li
    Zhang, Zhongfei
    2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2020), 2020, : 2919 - 2925
  • [4] Adaptive Fusion Learning for Compositional Zero-Shot Recognition
    Min, Lingtong
    Fan, Ziman
    Wang, Shunzhou
    Dou, Feiyang
    Li, Xin
    Wang, Binglu
    IEEE TRANSACTIONS ON MULTIMEDIA, 2025, 27 : 1193 - 1204
  • [5] Semantic Consistent Embedding for Domain Adaptive Zero-Shot Learning
    Zhang, Jianyang
    Yang, Guowu
    Hu, Ping
    Lin, Guosheng
    Lv, Fengmao
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2023, 32 : 4024 - 4035
  • [6] MSDN: Mutually Semantic Distillation Network for Zero-Shot Learning
    Chen, Shiming
    Hong, Ziming
    Xie, Guo-Sen
    Yang, Wenhan
    Peng, Qinmu
    Wang, Kai
    Zhao, Jian
    You, Xinge
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022, : 7602 - 7611
  • [7] Semantic Autoencoder for Zero-Shot Learning
    Kodirov, Elyor
    Xiang, Tao
    Gong, Shaogang
    30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 4447 - 4456
  • [8] Learning semantic ambiguities for zero-shot learning
    Hanouti, Celina
    Le Borgne, Herve
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 (26) : 40745 - 40759
  • [9] Multi-Scale Speaker Vectors for Zero-Shot Speech Synthesis
    Cory, Tristin
    Iqbal, Razib
    2022 IEEE 46TH ANNUAL COMPUTERS, SOFTWARE, AND APPLICATIONS CONFERENCE (COMPSAC 2022), 2022, : 496 - 501