GNDAN: Graph Navigated Dual Attention Network for Zero-Shot Learning

Cited by: 25
Authors
Chen, Shiming [1 ]
Hong, Ziming [1 ]
Xie, Guosen [2 ]
Peng, Qinmu [1 ]
You, Xinge [1 ]
Ding, Weiping [3 ]
Shao, Ling [4 ]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Elect Informat & Commun, Wuhan 430074, Peoples R China
[2] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing, Peoples R China
[3] Nantong Univ, Sch Informat Sci & Technol, Nantong 226019, Peoples R China
[4] Saudi Data & Artificial Intelligence Author SDAIA, Natl Ctr Artificial Intelligence NCAI, Riyadh, Saudi Arabia
Funding
National Natural Science Foundation of China;
Keywords
Semantics; Visualization; Feature extraction; Task analysis; Knowledge transfer; Navigation; Learning systems; Attribute-based region features; graph attention network (GAT); graph neural network (GNN); zero-shot learning (ZSL);
DOI
10.1109/TNNLS.2022.3155602
CLC Number
TP18 [Theory of Artificial Intelligence];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Zero-shot learning (ZSL) tackles the unseen-class recognition problem by transferring semantic knowledge from seen classes to unseen ones. Typically, to guarantee desirable knowledge transfer, a direct embedding is adopted to associate the visual and semantic domains in ZSL. However, most existing ZSL methods learn the embedding from implicit global features or image regions to the semantic space. Thus, they fail to: 1) exploit the appearance relationship priors between various local regions in a single image, which correspond to the semantic information and 2) learn cooperative global and local features jointly for discriminative feature representations. In this article, we propose the novel graph navigated dual attention network (GNDAN) for ZSL to address these drawbacks. GNDAN employs a region-guided attention network (RAN) and a region-guided graph attention network (RGAT) to jointly learn a discriminative local embedding and incorporate global context into an explicit global embedding under the guidance of a graph. Specifically, RAN uses soft spatial attention to discover discriminative regions for generating local embeddings. Meanwhile, RGAT employs attribute-based attention to obtain attribute-based region features, where each attribute focuses on the most relevant image regions. Motivated by graph neural networks (GNNs), which are well suited to representing structural relationships, RGAT further leverages a graph attention network to exploit the relationships between the attribute-based region features and form explicit global embedding representations. Based on a self-calibration mechanism, the learned joint visual embedding is matched with the semantic embedding to form the final prediction. Extensive experiments on three benchmark datasets demonstrate that the proposed GNDAN outperforms state-of-the-art methods. Our code and trained models are available at https://github.com/shiming-chen/GNDAN.
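To make the pipeline described in the abstract concrete, below is a minimal PyTorch sketch of a GNDAN-style forward pass. It assumes a ResNet-style backbone yielding a grid of R region features and classes described by A-dimensional attribute vectors; every module and variable name (GNDANSketch, spatial_att, attr_att, and so on) is hypothetical, as are the additive fusion and dot-product scoring, so consult the released code at the link above for the authors' actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GNDANSketch(nn.Module):
    """Hypothetical sketch of the GNDAN pipeline from the abstract.

    Assumed shapes (not taken from the paper): region features are a
    flattened grid of R CNN activations of dimension feat_dim; class
    semantics are num_attrs-dimensional attribute vectors.
    """
    def __init__(self, feat_dim=2048, num_attrs=312, hid_dim=1024):
        super().__init__()
        # RAN: soft spatial attention over regions -> local embedding
        self.spatial_att = nn.Linear(feat_dim, 1)
        self.local_proj = nn.Linear(feat_dim, num_attrs)
        # RGAT stage 1: attribute-based attention, one map per attribute
        self.attr_att = nn.Linear(feat_dim, num_attrs)
        # RGAT stage 2: a single graph-attention layer over the
        # attribute-based region features (fully connected graph)
        self.gat_w = nn.Linear(feat_dim, hid_dim, bias=False)
        self.gat_a = nn.Linear(2 * hid_dim, 1, bias=False)
        self.global_proj = nn.Linear(hid_dim, num_attrs)

    def forward(self, regions, class_attrs):
        # regions: (B, R, feat_dim); class_attrs: (C, num_attrs)
        # --- RAN: soft spatial attention -> local embedding ---
        alpha = F.softmax(self.spatial_att(regions), dim=1)       # (B, R, 1)
        local = self.local_proj((alpha * regions).sum(dim=1))     # (B, A)
        # --- RGAT stage 1: attribute-based region features ---
        beta = F.softmax(self.attr_att(regions), dim=1)           # (B, R, A)
        attr_feats = torch.einsum('bra,brd->bad', beta, regions)  # (B, A, D)
        # --- RGAT stage 2: graph attention over attribute features ---
        h = self.gat_w(attr_feats)                                # (B, A, H)
        A = h.size(1)
        pair = torch.cat([h.unsqueeze(2).expand(-1, -1, A, -1),
                          h.unsqueeze(1).expand(-1, A, -1, -1)], dim=-1)
        e = F.leaky_relu(self.gat_a(pair).squeeze(-1))            # (B, A, A)
        gamma = F.softmax(e, dim=-1)                              # edge weights
        global_emb = self.global_proj(torch.bmm(gamma, h).mean(dim=1))
        # --- fuse local/global and match against class semantics ---
        joint = local + global_emb          # simple additive fusion (assumed)
        return joint @ class_attrs.t()      # (B, C) compatibility scores
```

Calling `GNDANSketch()(torch.randn(4, 49, 2048), torch.randn(200, 312))` returns a 4 x 200 score matrix. The key design choice visible here is one attention map per attribute: the subsequent graph-attention layer then reasons over attribute-level nodes rather than raw image regions.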
Pages: 4516 - 4529
Page count: 14
Related Papers
50 records in total
  • [11] Generative Dual Adversarial Network for Generalized Zero-shot Learning
    Huang, He
    Wang, Changhu
    Yu, Philip S.
    Wang, Chang-Dong
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 801 - 810
  • [12] Dual Progressive Prototype Network for Generalized Zero-Shot Learning
    Wang, Chaoqun
    Min, Shaobo
    Chen, Xuejin
    Sun, Xiaoyan
    Li, Houqiang
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [13] Learning Graph Embeddings for Compositional Zero-shot Learning
    Naeem, Muhammad Ferjad
    Xian, Yongqin
    Tombari, Federico
    Akata, Zeynep
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 953 - 962
  • [14] Learning Attention Propagation for Compositional Zero-Shot Learning
    Khan, Muhammad Gul Zain Ali
    Naeem, Muhammad Ferjad
    Van Gool, Luc
    Pagani, A.
    Stricker, Didier
    Afzal, Muhammad Zeshan
    2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2023, : 3817 - 3826
  • [15] Learning Attention as Disentangler for Compositional Zero-shot Learning
    Hao, Shaozhe
    Han, Kai
    Wong, Kwan-Yee K.
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 15315 - 15324
  • [17] DVAMN: Dual Visual Attention Matching Network for Zero-Shot Action Recognition
    Qi, Cheng
    Feng, Zhiyong
    Xing, Meng
    Su, Yong
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2021, PT V, 2021, 12895 : 564 - 575
  • [18] Dual-level contrastive learning network for generalized zero-shot learning
    Guan, Jiaqi
    Meng, Min
    Liang, Tianyou
    Liu, Jigang
    Wu, Jigang
    VISUAL COMPUTER, 2022, 38 (9-10): 3087 - 3095
  • [19] Dual Generative Network with Discriminative Information for Generalized Zero-Shot Learning
    Xu, Tingting
    Zhao, Ye
    Liu, Xueliang
    COMPLEXITY, 2021, 2021
  • [20] Correlated Dual Autoencoder for Zero-Shot Learning
    Jiang, Ming
    Liu, Zhiyong
    Li, Pengfei
    Zhang, Min
    Tang, Jingfan
    UNIVERSITY POLITEHNICA OF BUCHAREST SCIENTIFIC BULLETIN SERIES C-ELECTRICAL ENGINEERING AND COMPUTER SCIENCE, 2020, 82 (01): 65 - 76