TVENet: Temporal variance embedding network for fine-grained action representation

Cited by: 9

Authors:

Han, Tingting [1,2]

Yao, Hongxun [2]

Xie, Wenlong [2]

Sun, Xiaoshuai [2]

Zhao, Sicheng [3]

Yu, Jun [1]

Affiliations:

[1] Hangzhou Dianzi Univ, Sch Comp Sci & Technol, Hangzhou, Zhejiang, Peoples R China

[2] Harbin Inst Technol, Sch Comp Sci & Technol, 612 Zonghe Bldg, Harbin, Peoples R China

[3] Univ Calif Berkeley, Dept Elect Engn & Comp Sci, Berkeley, CA 94720 USA

Source:

PATTERN RECOGNITION | 2020 / Volume 103 / Issue 103

Funding:

National Natural Science Foundation of China;

Keywords:

Fine-grained action representation; temporal variance embedding network (TVENet); joint optimization; temporal triplet loss; action search; DEEP; MODEL;

DOI:

10.1016/j.patcog.2020.107267

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

With the breakthroughs in general action understanding, analyzing actions at a finer granularity has become an inevitable trend. However, related research has been largely hindered by the lack of fine-grained datasets and by the difficulty of capturing subtle differences between fine-grained actions that are highly similar overall. In this paper, we address these challenges by constructing a fine-grained action dataset, Figure Skating, which can be used for end-to-end network training, and by presenting a framework for the joint optimization of classification and similarity constraints. We propose to incorporate the triplet loss into the training of a convolutional neural network, which learns a mapping from fine-grained actions to a compact Euclidean space where distances directly correspond to a measure of action similarity. The triplet loss compels actions of distinct classes to have larger distances than actions of the same class. In addition, to boost the discrimination of fine-grained actions, we propose a temporal variance embedding network (TVENet) that embeds temporal context variances into the feature embeddings during the joint network training. Experimental results on the Figure Skating, HMDB51, and UCF101 datasets demonstrate the effectiveness of the TVENet representation for fine-grained action search. (C) 2020 Elsevier Ltd. All rights reserved.
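The joint objective described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the standard hinge-style triplet loss on Euclidean distances and a plain weighted sum with a softmax cross-entropy term. The function names, the `margin` and `weight` parameters, and their default values are all hypothetical, and the temporal variance embedding itself is not modeled because the abstract does not specify it.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss (assumed form, margin is hypothetical).

    The loss is zero once the negative (different class) is at least
    `margin` farther from the anchor than the positive (same class).
    """
    d_ap = euclidean(anchor, positive)
    d_an = euclidean(anchor, negative)
    return max(0.0, d_ap - d_an + margin)


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, computed stably."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def joint_loss(logits, label, anchor, positive, negative,
               weight=1.0, margin=1.0):
    """Classification loss plus weighted similarity (triplet) loss.

    The weighted-sum combination is an assumption made for illustration.
    """
    return (cross_entropy(logits, label)
            + weight * triplet_loss(anchor, positive, negative, margin))
```

In this sketch a triplet whose negative already lies far enough from the anchor contributes nothing to the similarity term, so only the classification term drives the update for that sample.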

Pages: 16
