Spatio-Temporal Self-supervision for Few-Shot Action Recognition

Cited by: 0

Authors:

Yu, Wanchuan [1]

Guo, Hanyu [1]

Yan, Yan [1]

Li, Jie [2]

Wang, Hanzi [1]

Affiliations:

[1] Xiamen Univ, Sch Informat, Fujian Key Lab Sensing & Comp Smart City, Xiamen, Peoples R China

[2] Xidian Univ, Sch Elect Engn, Video & Image Proc Syst Lab, Xian, Peoples R China

Source:

PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT I | 2024 / Vol. 14425

Funding:

National Natural Science Foundation of China;

Keywords:

Few-shot learning; Action recognition; Self-supervised learning;

DOI:

10.1007/978-981-99-8429-9_7

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Few-shot action recognition aims to classify unseen action classes with limited labeled training samples. Most current works follow the metric-learning paradigm to learn a good embedding and an appropriate comparison metric. Because labeled data is scarce, embedding networks generalize poorly when trained through meta-learning with episodic tasks. In this paper, we aim to repurpose self-supervised learning to learn a more generalized few-shot embedding model. Specifically, a Spatio-Temporal Self-supervision (STS) framework for few-shot action recognition is proposed to generate self-supervision losses at the spatial and temporal levels as auxiliary losses. In this way, the proposed STS can provide a robust representation for few-shot action recognition. Furthermore, we propose a Spatio-Temporal Aggregation (STA) module that accounts for the spatial information relationship among all frames within a video sequence to achieve optimal video embedding. Experiments on several challenging few-shot action recognition benchmarks show the effectiveness of the proposed method in achieving state-of-the-art performance for few-shot action recognition.


Pages: 84-96

Page count: 13

50 items in total

[31] RS-SSKD: Self-Supervision Equipped with Knowledge Distillation for Few-Shot Remote Sensing Scene Classification
Zhang, Pei
Li, Ying
Wang, Dong
Wang, Jiyue
SENSORS, 2021, 21 (05) : 1 - 23
[32] Edge-Based Self-supervision for Semi-supervised Few-Shot Microscopy Image Cell Segmentation
Dawoud, Youssef
Ernst, Katharina
Carneiro, Gustavo
Belagiannis, Vasileios
MEDICAL OPTICAL IMAGING AND VIRTUAL MICROSCOPY IMAGE ANALYSIS, MOVI 2022, 2022, 13578 : 22 - 31
[33] Video-based spatio-temporal scene graph generation with efficient self-supervision tasks
Chen, Lianggangxu
Cai, Yiqing
Lu, Changhong
Wang, Changbo
He, Gaoqi
MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 (25) : 38947 - 38966
[34] Forecasting Fine-Grained Urban Flows Via Spatio-Temporal Contrastive Self-Supervision
Qu, Hao
Gong, Yongshun
Chen, Meng
Zhang, Junbo
Zheng, Yu
Yin, Yilong
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2023, 35 (08) : 8008 - 8023
[35] Few-Shot Point Cloud Semantic Segmentation via Contrastive Self-Supervision and Multi-Resolution Attention
Wang, Jiahui
Zhu, Haiyue
Guo, Haoren
Al Mamun, Abdullah
Xiang, Cheng
Lee, Tong Heng
2023 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA, 2023, : 2811 - 2817
[36] Few-shot human motion prediction using deformable spatio-temporal CNN with parameter generation
Zang, Chuanqi
Li, Menghao
Pei, Mingtao
NEUROCOMPUTING, 2022, 513 : 46 - 58
[37] Task-Agnostic Self-Distillation for Few-Shot Action Recognition
Zhang, Bin
Dan, Yuanjie
Chen, Peng
Li, Ronghua
Gao, Nan
Hum, Ruohong
He, Xiaofei
PROCEEDINGS OF THE THIRTY-THIRD INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2024, 2024, : 5425 - 5433
[38] Temporal Relational Modeling with Self-Supervision for Action Segmentation
Wang, Dong
Hu, Di
Li, Xingjian
Dou, Dejing
THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 2729 - 2737
[39] On the Importance of Spatial Relations for Few-shot Action Recognition
Zhang, Yilun
Fu, Yuqian
Ma, Xingjun
Qi, Lizhe
Chen, Jingjing
Wu, Zuxuan
Jiang, Yu-Gang
PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 2243 - 2251
[40] Compound Prototype Matching for Few-Shot Action Recognition
Huang, Yifei
Yang, Lijin
Sato, Yoichi
COMPUTER VISION - ECCV 2022, PT IV, 2022, 13664 : 351 - 368
