Generalized Zero-Shot Learning for Action Recognition Fusing Text and Image GANs

Cited by: 1
Authors
Huang, Kaiqiang [1]
McKeever, Susan [1]
Miralles-Pechuan, Luis [1]
Affiliations
[1] Technol Univ Dublin, Sch Comp Sci, Grangegorman, Dublin 7, Ireland
Keywords
Generalized zero-shot action recognition; generalised zero-shot learning; generative adversarial networks; human action recognition
DOI
10.1109/ACCESS.2024.3349510
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Generalized Zero-Shot Action Recognition (GZSAR) aims to recognize classes that the model has not been trained on while maintaining robust performance on the seen, trained classes. This reduces the need for large amounts of labeled training data and makes more efficient use of available datasets. The main contribution of this paper is a novel GZSAR approach that combines two Generative Adversarial Networks (GANs): one generates embeddings from visual representations, and the other generates embeddings from textual representations. The generated embeddings are fused by selecting the maximum value from the arrays that represent the embeddings, and the fused data is then used to train a GZSAR classifier in a supervised manner. The framework also incorporates a feature refinement component and an out-of-distribution detector to mitigate the domain shift problem between seen and unseen classes. On the UCF101 benchmark dataset, performance rose from 50.93% (using images and Word2Vec alone) to 54.71% with the two GANs, a relative increase of 7.43%. On the HMDB51 dataset, performance rose from 36.11% (using text and Word2Vec) to 38.66% with the dual-GAN approach, a relative improvement of 7.06%. These results underscore the efficacy of the dual-GAN framework in enhancing GZSAR performance. The rest of the paper details these contributions to GZSAR and outlines directions for future research.
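The fusion step and the reported gains in the abstract can be illustrated with a minimal sketch. It assumes that "selecting the maximum value" means an element-wise maximum over two equal-length embedding vectors, and that the reported percentage gains are relative to the baseline score. The function names `fuse_max` and `relative_gain` and the toy vectors are hypothetical and are not taken from the paper.

```python
def fuse_max(visual_emb, text_emb):
    """Fuse two equal-length embeddings by element-wise maximum (assumed reading)."""
    if len(visual_emb) != len(text_emb):
        raise ValueError("embeddings must have the same length")
    return [max(v, t) for v, t in zip(visual_emb, text_emb)]


def relative_gain(baseline, improved):
    """Relative improvement of `improved` over `baseline`, in percent."""
    return (improved - baseline) / baseline * 100.0


# Toy embeddings for illustration only; not data from the paper.
visual = [0.2, 0.9, 0.1, 0.5]
text = [0.4, 0.3, 0.1, 0.7]
fused = fuse_max(visual, text)

# Scores quoted in the abstract.
ucf101_gain = relative_gain(50.93, 54.71)
hmdb51_gain = relative_gain(36.11, 38.66)

print(fused)
print(f"UCF101: {ucf101_gain:.2f}%  HMDB51: {hmdb51_gain:.2f}%")
```

Under these assumptions the computed relative gains are about 7.42% for UCF101 and 7.06% for HMDB51, which matches the reported 7.43% and 7.06% up to rounding of the underlying scores.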
Pages: 5188 - 5202
Page count: 15
Related papers
50 in total
  • [1] Combining Text and Image Knowledge with GANs for Zero-Shot Action Recognition in Videos
    Huang, Kaiqiang
    Miralles-Pechuan, Luis
    McKeever, Susan
    PROCEEDINGS OF THE 17TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS (VISAPP), VOL 5, 2022, : 623 - 631
  • [2] Enhancing Zero-Shot Action Recognition in Videos by Combining GANs with Text and Images
    Huang K.
    Miralles-Pechuán L.
    McKeever S.
    SN Computer Science, 4 (4)
  • [3] Transductive Learning With Prior Knowledge for Generalized Zero-Shot Action Recognition
    Su, Taiyi
    Wang, Hanli
    Qi, Qiuping
    Wang, Lei
    He, Bin
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (01) : 260 - 273
  • [4] Dissimilarity Representation Learning for Generalized Zero-Shot Recognition
    Yang, Gang
    Liu, Jinlu
    Xu, Jieping
    Li, Xirong
    PROCEEDINGS OF THE 2018 ACM MULTIMEDIA CONFERENCE (MM'18), 2018, : 2032 - 2039
  • [5] Harnessing GANs for Zero-Shot Learning of New Classes in Visual Speech Recognition
    Kumar, Yaman
    Sahrawat, Dhruva
    Maheshwari, Shubham
    Mahata, Debanjan
    Stent, Amanda
    Yin, Yifang
    Shah, Rajiv Ratn
    Zimmermann, Roger
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 2645 - 2652
  • [6] Generalized zero-shot learning for action recognition with web-scale video data
    Kun Liu
    Wu Liu
    Huadong Ma
    Wenbing Huang
    Xiongxiong Dong
    World Wide Web, 2019, 22 : 807 - 824
  • [7] Generalized zero-shot learning for action recognition with web-scale video data
    Liu, Kun
    Liu, Wu
    Ma, Huadong
    Huang, Wenbing
    Dong, Xiongxiong
    WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2019, 22 (02) : 807 - 824
  • [8] CLASTER: Clustering with Reinforcement Learning for Zero-Shot Action Recognition
    Gowda, Shreyank N.
    Sevilla-Lara, Laura
    Keller, Frank
    Rohrbach, Marcus
    COMPUTER VISION, ECCV 2022, PT XX, 2022, 13680 : 187 - 203
  • [9] Learning Using Privileged Information for Zero-Shot Action Recognition
    Gao, Zhiyi
    Hou, Yonghong
    Li, Wanqing
    Guo, Zihui
    Yu, Bin
    COMPUTER VISION - ACCV 2022, PT IV, 2023, 13844 : 347 - 362
  • [10] Zero-shot learning for action recognition using synthesized features
    Mishra, Ashish
    Pandey, Anubha
    Murthy, Hema A.
    NEUROCOMPUTING, 2020, 390 : 117 - 130