Scalable Zero-Shot Learning via Binary Visual-Semantic Embeddings

Cited by: 28
Authors
Shen, Fumin [1, 2]
Zhou, Xiang [1, 2]
Yu, Jun [3]
Yang, Yang [1, 2]
Liu, Li [4]
Shen, Heng Tao [1, 2]
Affiliations
[1] Univ Elect Sci & Technol China, Ctr Future Media, Chengdu 610054, Sichuan, Peoples R China
[2] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu 610054, Sichuan, Peoples R China
[3] Hangzhou Dianzi Univ, Sch Comp Sci & Technol, Hangzhou 310018, Zhejiang, Peoples R China
[4] Incept Inst Artificial Intelligence, Abu Dhabi, U Arab Emirates
Funding
National Natural Science Foundation of China
Keywords
Zero-shot learning; binary embeddings;
DOI
10.1109/TIP.2019.2899987
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Zero-shot learning aims to classify visual instances from unseen classes in the absence of training examples. This is typically achieved by directly mapping visual features to a semantic embedding space of classes (e.g., attributes or word vectors), where the similarity between the two modalities can be readily measured. However, the semantic space may not be reliable for recognition because of noisy class embeddings or the visual bias problem. In this paper, we propose a novel binary embedding-based zero-shot learning (BZSL) method, which recognizes visual instances from unseen classes through an intermediate discriminative Hamming space. Specifically, BZSL jointly learns two binary coding functions to encode both visual instances and class embeddings into the Hamming space, which alleviates the visual-semantic bias problem. As a desirable property, an unseen instance can then be classified efficiently by retrieving the class code with minimal Hamming distance to it. During training, by introducing two auxiliary variables for the coding functions, we formulate an equivalent correlation maximization problem, which admits an analytical solution. The resulting algorithm thus enjoys both highly efficient training and scalable inference of novel classes. Extensive experiments on four benchmark datasets, including the full ImageNet Fall 2011 dataset with over 20k unseen classes, demonstrate the superiority of our method on the zero-shot learning task. Particularly, we show that increasing the binary embedding dimension can inevitably improve the recognition accuracy.
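The inference step described in the abstract (assign an instance to the class whose binary code is nearest in Hamming distance) can be illustrated with a minimal sketch. This is not the authors' BZSL implementation: the learned coding functions are not reproduced here, and the helper names (sign_code, hamming, nearest_class), the class labels, and the 8-bit codes are all hypothetical values chosen for illustration.

```python
from typing import Dict, Sequence


def sign_code(vec: Sequence[float]) -> int:
    """Pack the signs of a real vector into an integer bit code.

    Bit i is set when vec[i] > 0. This stands in for a learned binary
    coding function, which the abstract does not specify in detail.
    """
    code = 0
    for i, v in enumerate(vec):
        if v > 0:
            code |= 1 << i
    return code


def hamming(a: int, b: int) -> int:
    """Count the bit positions in which two codes differ."""
    return bin(a ^ b).count("1")


def nearest_class(instance_code: int, class_codes: Dict[str, int]) -> str:
    """Return the label whose code has minimal Hamming distance to the instance."""
    return min(class_codes, key=lambda label: hamming(instance_code, class_codes[label]))


# Hypothetical 8-bit codes for three unseen classes (illustrative only).
class_codes = {
    "zebra": 0b10110010,
    "whale": 0b01001101,
    "otter": 0b11100001,
}

# Hypothetical instance code that differs from the "zebra" code in one bit.
query = 0b10110110
predicted = nearest_class(query, class_codes)
print(predicted)
```

Because each comparison is an XOR followed by a bit count, the cost per class is small and grows linearly with the number of class codes, which is one reason a Hamming-space formulation is attractive when there are many unseen classes.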
Pages: 3662 - 3674
Number of pages: 13