Dual Projective Zero-Shot Learning Using Text Descriptions

Cited by: 7
Authors
Rao, Yunbo [1 ]
Yang, Ziqiang [1 ]
Zeng, Shaoning [2 ]
Wang, Qifeng [3 ]
Pu, Jiansu [4 ]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Informat & Software Engn, 4,Sect 2,North Jianshe Rd, Chengdu 610054, Sichuan, Peoples R China
[2] Univ Elect Sci & Technol China, Yangtze Delta Reg Inst Huzhou, Huzhou 313000, Zhejiang, Peoples R China
[3] Google Berkeley, Berkeley, CA 94720 USA
[4] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, 4,Sect 2,North Jianshe Rd, Chengdu 610054, Sichuan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Zero-shot learning; generalized zero-shot learning; autoencoder; inductive zero-shot learning;
DOI
10.1145/3514247
CLC Number
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
Zero-shot learning (ZSL) aims to recognize image instances of unseen classes based solely on the semantic descriptions of those classes. Within this field, Generalized Zero-Shot Learning (GZSL) is a challenging problem in which images of both seen and unseen classes are mixed during the testing phase. Existing methods formulate GZSL as a semantic-visual correspondence problem and apply generative models such as Generative Adversarial Networks and Variational Autoencoders to solve it. However, these methods suffer from a bias problem: images of unseen classes are often misclassified into seen classes. In this work, a novel model named the Dual Projective model for Zero-Shot Learning (DPZSL) is proposed using text descriptions. To alleviate the bias problem, we leverage two autoencoders to project the visual and semantic features into a latent space and evaluate the embeddings with a visual-semantic correspondence loss function. A novel classifier is also introduced to ensure the discriminability of the embedded features. Our method targets the more challenging inductive ZSL setting, in which only labeled data from seen classes are used during training. Experimental results on two popular datasets, Caltech-UCSD Birds-200-2011 (CUB) and North America Birds (NAB), show that the proposed DPZSL model significantly outperforms prior methods in both the inductive ZSL and GZSL settings. In the GZSL setting in particular, our model yields an improvement of up to 15.2% over the state-of-the-art CANZSL on the CUB and NAB datasets with two splits.
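The abstract's core idea, two autoencoders projecting visual and semantic features into a shared latent space tied together by a correspondence loss, can be illustrated with a minimal sketch. This is not the paper's implementation: the dimensions, the linear encoders/decoders, the squared-distance correspondence term, and the nearest-neighbor classifier below are all simplifying assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not from the paper): visual features such as
# CNN activations, semantic features such as text-description embeddings.
D_VIS, D_SEM, D_LAT = 2048, 300, 64

def make_autoencoder(d_in, d_lat, rng):
    """Return randomly initialised linear encoder/decoder weights."""
    scale = 1.0 / np.sqrt(d_in)
    enc = rng.normal(0.0, scale, size=(d_in, d_lat))
    dec = rng.normal(0.0, scale, size=(d_lat, d_in))
    return enc, dec

vis_enc, vis_dec = make_autoencoder(D_VIS, D_LAT, rng)
sem_enc, sem_dec = make_autoencoder(D_SEM, D_LAT, rng)

def dpzsl_loss(x_vis, x_sem, lam=1.0):
    """Reconstruction losses of the two autoencoders plus a
    visual-semantic correspondence term pulling the paired latent
    codes together (a simplified stand-in for the paper's loss)."""
    z_vis = x_vis @ vis_enc
    z_sem = x_sem @ sem_enc
    rec_vis = np.mean((z_vis @ vis_dec - x_vis) ** 2)
    rec_sem = np.mean((z_sem @ sem_dec - x_sem) ** 2)
    align = np.mean((z_vis - z_sem) ** 2)  # correspondence loss
    return rec_vis + rec_sem + lam * align

def classify(x_vis, class_sem):
    """Inductive ZSL inference sketch: embed the image and every class
    description, return the index of the nearest class in latent space.
    Unseen classes need only their semantic description to participate."""
    z = x_vis @ vis_enc
    z_cls = class_sem @ sem_enc
    return int(np.argmin(np.linalg.norm(z_cls - z, axis=1)))

batch_vis = rng.normal(size=(8, D_VIS))
batch_sem = rng.normal(size=(8, D_SEM))
loss = dpzsl_loss(batch_vis, batch_sem)
pred = classify(batch_vis[0], rng.normal(size=(5, D_SEM)))
```

Because both modalities are mapped into the same latent space, an image of an unseen class can be matched against class text descriptions directly, which is what lets the correspondence loss counteract the bias toward seen classes.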
Pages: 17