Skip-Pose Vectors: Pose-based motion embedding using Encoder-Decoder models

Authors
Shirakawa, Yuta [1]
Kozakaya, Tatsuo [1]
Affiliation
[1] Toshiba Co Ltd, Corp Res & Dev Ctr, Tokyo, Japan
DOI
10.23919/mva.2019.8757937
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper proposes a pose-based unsupervised embedding learning method for action recognition. To classify human actions based on the similarity of motions, it is important to establish a good feature space in which similar motions are mapped to similar vector representations. However, learning a feature space with this property through a supervised approach requires a huge number of training samples, tailored supervised keypoints, and action categories. Although the cost of labeling keypoints is steadily decreasing as 2D pose estimation methods improve, labeling video categories remains problematic because of the variety of categories and the ambiguity and variation of videos. To avoid the need for such expensive category labeling, we follow the success of "Skip-Thought Vectors", an unsupervised approach to modeling the similarity of sentences, and apply its idea to contiguous pose sequences to learn feature representations for measuring motion similarity. Because human action is handled as 2D poses instead of images, the model can be small and easy to handle, and the training data can be augmented by projecting 3D motion capture data to 2D. Through evaluation on the JHMDB dataset, we explore various design choices, such as whether to handle actions as a sequence of poses or as a sequence of images. Our approach leverages pose sequences from 3D motion capture and improves its performance as much as 61.6% on JHMDB.
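The abstract mentions augmenting the training data by projecting 3D motion capture data to 2D. A minimal sketch of that idea follows, assuming an orthographic camera and a random rotation about the vertical axis. The function names `project_to_2d` and `augment_sequence`, the array layout (frames, joints, coordinates), and the choice of camera model are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def project_to_2d(joints_3d, yaw_rad):
    """Rotate 3D joints about the vertical (y) axis, then drop the depth axis.

    joints_3d: array of shape (..., 3) holding (x, y, z) joint coordinates.
    Orthographic projection is an assumption; the paper may use another camera model.
    """
    c, s = np.cos(yaw_rad), np.sin(yaw_rad)
    rot_y = np.array([[c, 0.0, s],
                      [0.0, 1.0, 0.0],
                      [-s, 0.0, c]])
    rotated = joints_3d @ rot_y.T  # apply the rotation to every joint
    return rotated[..., :2]        # keep (x, y), discard depth


def augment_sequence(seq_3d, num_views, rng):
    """Produce several 2D pose sequences from one 3D sequence.

    seq_3d: array of shape (frames, joints, 3).
    Returns an array of shape (num_views, frames, joints, 2).
    """
    yaws = rng.uniform(0.0, 2.0 * np.pi, size=num_views)
    return np.stack([project_to_2d(seq_3d, yaw) for yaw in yaws])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mocap = rng.normal(size=(30, 15, 3))  # hypothetical clip: 30 frames, 15 joints
    views = augment_sequence(mocap, num_views=4, rng=rng)
    print(views.shape)
```

Each simulated viewpoint yields a different 2D sequence of the same motion, which is one plausible way a single motion capture clip could supply several training sequences.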
Pages: 6