Hybrid attentive prototypical network for few-shot action recognition

Cited by: 1
Authors
Ruan, Zanxi [1 ]
Wei, Yingmei [1 ]
Guo, Yanming [1 ]
Xie, Yuxiang [1 ]
Affiliations
[1] Natl Univ Def Technol, Lab Big Data & Decis, Changsha, Hunan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Few-shot action recognition; Few-shot learning; Video understanding; Metric learning;
DOI
10.1007/s40747-024-01571-4
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Most previous few-shot action recognition works process the temporal and spatial features of video separately, which leads to insufficient extraction of comprehensive features. In this paper, a novel hybrid attentive prototypical network (HAPN) framework for few-shot action recognition is proposed. The HAPN framework processes temporal and spatial information jointly, from feature extraction through to the attention module, which enhances its ability to perform action recognition tasks. Our framework uses the R(2+1)D backbone network to extract integrated temporal and spatial features, ensuring a comprehensive understanding of video content. Additionally, our framework introduces the novel Residual Tri-dimensional Attention (ResTriDA) mechanism, specifically designed to augment feature information across the temporal, spatial, and channel dimensions. ResTriDA dynamically enhances crucial aspects of video features by amplifying the channel-wise features that are significant for distinguishing actions, accentuating the spatial details vital for capturing the essence of actions within frames, and emphasizing temporal dynamics to capture movement over time. We further propose a prototypical attentive matching module (PAM), built on the concept of metric learning, to resolve the overfitting issue common in few-shot tasks. We evaluate our HAPN framework on three classical few-shot action recognition datasets: Kinetics-100, UCF101, and HMDB51. The results indicate that our framework significantly outperforms state-of-the-art methods. Notably, on the 1-shot task, accuracy increases by 9.8% on UCF101, 3.9% on HMDB51, and 12.4% on Kinetics-100. These gains confirm the robustness and effectiveness of our approach in leveraging limited data for precise action recognition.
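The abstract says the matching module builds on metric learning with prototypes. As a rough illustration of that general idea only, the sketch below shows plain prototypical classification: each class prototype is the mean of its support embeddings, and a query is assigned to the nearest prototype. This is not the paper's PAM module and contains no attention. The function names, the Euclidean distance, and the toy 3-dimensional embeddings are all assumptions made for the example.

```python
import math


def class_prototypes(support):
    """Average the support embeddings of each class into one prototype."""
    prototypes = {}
    for label, embeddings in support.items():
        dim = len(embeddings[0])
        prototypes[label] = [
            sum(vec[i] for vec in embeddings) / len(embeddings)
            for i in range(dim)
        ]
    return prototypes


def classify(query, prototypes):
    """Return the label whose prototype is nearest to the query (Euclidean)."""
    return min(prototypes, key=lambda label: math.dist(query, prototypes[label]))


# Toy 2-way 2-shot episode; the embeddings are hypothetical stand-ins for
# features that a video backbone would produce.
support = {
    "run": [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0]],
    "jump": [[0.0, 1.0, 0.0], [0.0, 0.8, 0.2]],
}
prototypes = class_prototypes(support)
prediction = classify([0.9, 0.1, 0.0], prototypes)
```

In a 1-shot episode each class has a single support embedding, so the prototype is simply that embedding.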
Pages: 8249 - 8272
Page count: 24
Related papers
50 in total
  • [1] A prototypical network for few-shot recognition of speech imagery data
    Hernandez-Galvan, Alan
    Ramirez-Alonso, Graciela
    Ramirez-Quintana, Juan
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2023, 86
  • [2] VDARN: Video Disentangling Attentive Relation Network for Few-Shot and Zero-Shot Action Recognition
    Su, Yong
    Xing, Meng
    An, Simin
    Peng, Weilong
    Feng, Zhiyong
    AD HOC NETWORKS, 2021, 113
  • [3] HYBRID CONTRASTIVE PROTOTYPICAL NETWORK FOR FEW-SHOT SCENE CLASSIFICATION
    Zhu, Junjie
    Yang, Ke
    Qiu, Chunping
    Dai, Mengyuan
    Guan, Naiyang
    Yi, Xiaodong
    2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2023, : 3588 - 3592
  • [4] Few-Shot Learning Sensitive Recognition Method Based on Prototypical Network
    Yuan, Guoquan
    Zhao, Xinjian
    Li, Liu
    Zhang, Song
    Wei, Shanming
    MATHEMATICS, 2024, 12 (17)
  • [5] Transductive Prototypical Attention Network for Few-shot SAR Target Recognition
    Yu, Xuelian
    Liu, Sen
    Ren, Haohao
    Zou, Lin
    Zhou, Yun
    Wang, Xuegang
    2023 IEEE RADAR CONFERENCE, RADARCONF23, 2023,
  • [6] FEW-SHOT RADAR HRRP RECOGNITION BASED ON IMPROVED PROTOTYPICAL NETWORK
    Li, Jixi
    Li, Dongying
    Jiang, Yong
    Yu, Wenxian
    IGARSS 2023 - 2023 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM, 2023, : 5277 - 5280
  • [7] TRANSDUCTIVE PROTOTYPICAL NETWORK FOR FEW-SHOT CLASSIFICATION
    Liu, Xinyue
    Liu, Pengxin
    Zong, Linlin
    2020 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2020, : 1671 - 1675
  • [8] Attentive matching network for few-shot learning
    Mai, Sijie
    Hu, Haifeng
    Xu, Jia
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2019, 187
  • [9] Hybrid Attention-Based Prototypical Network for Unfamiliar Restaurant Food Image Few-Shot Recognition
    Song, Gege
    Tao, Zhulin
    Huang, Xianglin
    Cao, Gang
    Liu, Wei
    Yang, Lifang
    IEEE ACCESS, 2020, 8 (08): 14893 - 14900
  • [10] Transductive Prototypical Attention Reasoning Network for Few-Shot SAR Target Recognition
    Ren, Haohao
    Liu, Sen
    Yu, Xuelian
    Zou, Lin
    Zhou, Yun
    Wang, Xuegang
    Tang, Hao
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61