Spatio-Temporal Self-supervision for Few-Shot Action Recognition

Cited by: 0
Authors
Yu, Wanchuan [1]
Guo, Hanyu [1]
Yan, Yan [1]
Li, Jie [2]
Wang, Hanzi [1]
Affiliations
[1] Xiamen Univ, Sch Informat, Fujian Key Lab Sensing & Comp Smart City, Xiamen, Peoples R China
[2] Xidian Univ, Sch Elect Engn, Video & Image Proc Syst Lab, Xian, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Few-shot learning; Action recognition; Self-supervised learning;
DOI
10.1007/978-981-99-8429-9_7
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Few-shot action recognition aims to classify unseen action classes from only a few labeled training samples. Most current works follow the metric-learning paradigm, learning a good embedding together with an appropriate comparison metric. Because labeled data are scarce, embedding networks trained by meta-learning on episodic tasks have limited generalization. In this paper, we repurpose self-supervised learning to learn a more generalizable few-shot embedding model. Specifically, we propose a Spatio-Temporal Self-supervision (STS) framework for few-shot action recognition that generates self-supervision losses at the spatial and temporal levels and uses them as auxiliary losses. In this way, STS provides a robust representation for few-shot action recognition. Furthermore, we propose a Spatio-Temporal Aggregation (STA) module that accounts for the spatial relationships among all frames within a video sequence to obtain an optimal video embedding. Experiments on several challenging few-shot action recognition benchmarks show that the proposed method achieves state-of-the-art performance.
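The abstract describes adding spatial and temporal self-supervision losses to the few-shot objective as auxiliary terms. A minimal sketch of that kind of combined objective follows. It is not the authors' implementation: the abstract does not specify the pretext tasks, the loss form, or the weighting, so the cross-entropy form, the function names, and the weights `lambda_spatial` and `lambda_temporal` are all assumptions made for illustration.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy of one example from raw logits and an integer label."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def mean_loss(batch_logits, batch_labels):
    """Average cross-entropy over a batch of (logits, label) pairs."""
    losses = [softmax_cross_entropy(l, y)
              for l, y in zip(batch_logits, batch_labels)]
    return sum(losses) / len(losses)


def total_loss(fewshot, spatial, temporal,
               lambda_spatial=0.5, lambda_temporal=0.5):
    """Few-shot loss plus weighted auxiliary self-supervision losses.

    Each argument is a (batch_logits, batch_labels) pair. The spatial and
    temporal pairs stand for the outputs of hypothetical pretext
    classifiers; the weights are assumed hyperparameters, not values
    taken from the paper.
    """
    main = mean_loss(*fewshot)
    aux_spatial = mean_loss(*spatial)
    aux_temporal = mean_loss(*temporal)
    return main + lambda_spatial * aux_spatial + lambda_temporal * aux_temporal


# Toy usage: one query video, 2-way episode, 2-class pretext tasks.
episode = ([[2.0, 0.5]], [0])
spatial_task = ([[0.1, 0.3]], [1])
temporal_task = ([[1.2, 0.2]], [0])
loss = total_loss(episode, spatial_task, temporal_task)
```

Setting both weights to zero recovers the plain few-shot loss, which makes it easy to compare training with and without the auxiliary terms.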
Pages: 84-96
Number of pages: 13
Related papers
50 in total
  • [31] RS-SSKD: Self-Supervision Equipped with Knowledge Distillation for Few-Shot Remote Sensing Scene Classification
    Zhang, Pei
    Li, Ying
    Wang, Dong
    Wang, Jiyue
    SENSORS, 2021, 21 (05) : 1 - 23
  • [32] Edge-Based Self-supervision for Semi-supervised Few-Shot Microscopy Image Cell Segmentation
    Dawoud, Youssef
    Ernst, Katharina
    Carneiro, Gustavo
    Belagiannis, Vasileios
    MEDICAL OPTICAL IMAGING AND VIRTUAL MICROSCOPY IMAGE ANALYSIS, MOVI 2022, 2022, 13578 : 22 - 31
  • [33] Video-based spatio-temporal scene graph generation with efficient self-supervision tasks
    Chen, Lianggangxu
    Cai, Yiqing
    Lu, Changhong
    Wang, Changbo
    He, Gaoqi
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 (25) : 38947 - 38966
  • [34] Forecasting Fine-Grained Urban Flows Via Spatio-Temporal Contrastive Self-Supervision
    Qu, Hao
    Gong, Yongshun
    Chen, Meng
    Zhang, Junbo
    Zheng, Yu
    Yin, Yilong
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2023, 35 (08) : 8008 - 8023
  • [35] Few-Shot Point Cloud Semantic Segmentation via Contrastive Self-Supervision and Multi-Resolution Attention
    Wang, Jiahui
    Zhu, Haiyue
    Guo, Haoren
    Al Mamun, Abdullah
    Xiang, Cheng
    Lee, Tong Heng
    2023 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA, 2023, : 2811 - 2817
  • [36] Few-shot human motion prediction using deformable spatio-temporal CNN with parameter generation
    Zang, Chuanqi
    Li, Menghao
    Pei, Mingtao
    NEUROCOMPUTING, 2022, 513 : 46 - 58
  • [37] Task-Agnostic Self-Distillation for Few-Shot Action Recognition
    Zhang, Bin
    Dan, Yuanjie
    Chen, Peng
    Li, Ronghua
    Gao, Nan
    Hum, Ruohong
    He, Xiaofei
    PROCEEDINGS OF THE THIRTY-THIRD INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2024, 2024, : 5425 - 5433
  • [38] Temporal Relational Modeling with Self-Supervision for Action Segmentation
    Wang, Dong
    Hu, Di
    Li, Xingjian
    Dou, Dejing
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 2729 - 2737
  • [39] On the Importance of Spatial Relations for Few-shot Action Recognition
    Zhang, Yilun
    Fu, Yuqian
    Ma, Xingjun
    Qi, Lizhe
    Chen, Jingjing
    Wu, Zuxuan
    Jiang, Yu-Gang
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 2243 - 2251
  • [40] Compound Prototype Matching for Few-Shot Action Recognition
    Huang, Yifei
    Yang, Lijin
    Sato, Yoichi
    COMPUTER VISION - ECCV 2022, PT IV, 2022, 13664 : 351 - 368