Spatio-Temporal Self-supervision for Few-Shot Action Recognition

Cited: 0
Authors
Yu, Wanchuan [1 ]
Guo, Hanyu [1 ]
Yan, Yan [1 ]
Li, Jie [2 ]
Wang, Hanzi [1 ]
Affiliations
[1] Xiamen Univ, Sch Informat, Fujian Key Lab Sensing & Comp Smart City, Xiamen, Peoples R China
[2] Xidian Univ, Sch Elect Engn, Video & Image Proc Syst Lab, Xian, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Few-shot learning; Action recognition; Self-supervised learning;
DOI
10.1007/978-981-99-8429-9_7
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
Few-shot action recognition aims to classify unseen action classes from only a few labeled training samples. Most current works follow the metric-learning paradigm, learning a good embedding together with an appropriate comparison metric. Because labeled data are scarce, however, embedding networks trained via the meta-learning process with episodic tasks generalize poorly. In this paper, we repurpose self-supervised learning to learn a more generalizable few-shot embedding model. Specifically, we propose a Spatio-Temporal Self-supervision (STS) framework for few-shot action recognition that generates self-supervision losses at the spatial and temporal levels as auxiliary losses. In this way, the proposed STS provides robust representations for few-shot action recognition. Furthermore, we propose a Spatio-Temporal Aggregation (STA) module that models the spatial relationships among all frames within a video sequence to obtain an optimal video embedding. Experiments on several challenging few-shot action recognition benchmarks demonstrate that the proposed method achieves state-of-the-art performance.
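The abstract does not specify which pretext tasks produce the spatial and temporal self-supervision losses, so the sketch below shows one plausible instantiation of the auxiliary-loss scheme it describes: a per-frame rotation-prediction loss at the spatial level and a clip-level order-prediction loss at the temporal level, weighted and added to the episodic few-shot loss. The module name, pretext tasks, and weights (lambda_spatial, lambda_temporal) are illustrative assumptions, not the paper's exact design.

```python
# Minimal sketch (PyTorch) of adding spatial and temporal self-supervision
# losses to a few-shot objective. The pretext tasks used here (4-way frame
# rotation prediction, binary shuffled-vs-ordered clip prediction) are
# assumptions for illustration; the paper's STS losses may differ.
import torch
import torch.nn as nn
import torch.nn.functional as F

class STSAuxiliaryLosses(nn.Module):
    def __init__(self, feat_dim: int, lambda_spatial: float = 0.1,
                 lambda_temporal: float = 0.1):
        super().__init__()
        self.spatial_head = nn.Linear(feat_dim, 4)   # rotation: 0/90/180/270 degrees
        self.temporal_head = nn.Linear(feat_dim, 2)  # clip order: shuffled vs. ordered
        self.lambda_spatial = lambda_spatial
        self.lambda_temporal = lambda_temporal

    def forward(self, frame_feats, rot_labels, clip_feats, order_labels):
        # frame_feats: (B*T, D) features of randomly rotated frames
        # clip_feats:  (B, D)   features of possibly frame-shuffled clips
        loss_sp = F.cross_entropy(self.spatial_head(frame_feats), rot_labels)
        loss_tp = F.cross_entropy(self.temporal_head(clip_feats), order_labels)
        return self.lambda_spatial * loss_sp + self.lambda_temporal * loss_tp

# Usage inside one episodic task: the auxiliary losses are simply added to the
# metric-based few-shot classification loss before backpropagation, e.g.
# total_loss = few_shot_loss + aux(frame_feats, rot_y, clip_feats, order_y)
```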
Pages: 84-96
Page count: 13