TVENet: Temporal variance embedding network for fine-grained action representation

Cited by: 9
Authors
Han, Tingting [1, 2]
Yao, Hongxun [2]
Xie, Wenlong [2]
Sun, Xiaoshuai [2]
Zhao, Sicheng [3]
Yu, Jun [1]
Affiliations
[1] Hangzhou Dianzi Univ, Sch Comp Sci & Technol, Hangzhou, Zhejiang, Peoples R China
[2] Harbin Inst Technol, Sch Comp Sci & Technol, 612 Zonghe Bldg, Harbin, Peoples R China
[3] Univ Calif Berkeley, Dept Elect Engn & Comp Sci, Berkeley, CA 94720 USA
Funding
National Natural Science Foundation of China
Keywords
Fine-grained action representation; temporal variance embedding network (TVENet); joint optimization; temporal triplet loss; action search; DEEP; MODEL;
DOI
10.1016/j.patcog.2020.107267
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
With the breakthroughs in general action understanding, analyzing actions at a finer granularity has become an inevitable trend. However, related research has been largely hindered by the lack of fine-grained datasets and by the difficulty of capturing subtle differences between fine-grained actions that are highly similar overall. In this paper, we address these challenges by constructing a fine-grained action dataset, Figure Skating, which can be used for end-to-end network training, and by presenting a framework for the joint optimization of classification and similarity constraints. We propose to incorporate the triplet loss into the training of a Convolutional Neural Network, which learns a mapping from fine-grained actions to a compact Euclidean space where distances directly correspond to a measure of action similarity. The triplet loss compels actions of distinct classes to have larger distances than actions of the same class. In addition, to improve discrimination between fine-grained actions, we propose a temporal variance embedding network (TVENet) that embeds temporal context variances into the feature embeddings during the joint network training. Experimental results on the Figure Skating, HMDB51, and UCF101 datasets demonstrate the effectiveness of the TVENet representation for fine-grained action search. © 2020 Elsevier Ltd. All rights reserved.
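The joint objective described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything below is an assumption made for illustration: the standard squared-Euclidean triplet loss with a margin, a softmax cross-entropy term for classification, and a hypothetical weight `lam` that balances the two. The function names, the `margin` default, and `lam` are not taken from the paper, and the temporal variance embedding itself is not modeled here.

```python
import math


def squared_distance(u, v):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss (assumed form, not confirmed by the abstract).

    It is zero once the anchor is closer to the same-class sample than to
    the different-class sample by at least `margin`.
    """
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, using a stable log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[label]


def joint_loss(logits, label, anchor, positive, negative, lam=1.0, margin=0.2):
    """Classification loss plus a weighted similarity (triplet) term.

    `lam` is a hypothetical balancing weight introduced for this sketch.
    """
    return cross_entropy(logits, label) + lam * triplet_loss(
        anchor, positive, negative, margin
    )


if __name__ == "__main__":
    # Toy 2-D embeddings: the positive is near the anchor, the negative is far.
    print(joint_loss([2.0, 0.5], 0, [0.0, 0.0], [0.1, 0.0], [1.0, 1.0]))
```

In this toy call the triplet term is already satisfied, so only the classification term contributes. Swapping the positive and negative embeddings makes the similarity term dominate instead.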
Pages: 16