Skip-Pose Vectors: Pose-based motion embedding using Encoder-Decoder models

Cited: 0
Authors
Shirakawa, Yuta [1 ]
Kozakaya, Tatsuo [1 ]
Affiliations
[1] Toshiba Co Ltd, Corp Res & Dev Ctr, Tokyo, Japan
DOI
10.23919/mva.2019.8757937
CLC Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper proposes a pose-based unsupervised embedding learning method for action recognition. To classify human actions by motion similarity, it is important to establish a feature space in which similar motions map to similar vector representations. Learning such a feature space with a supervised approach, however, requires huge numbers of training samples, tailored keypoint supervision, and action-category labels. Although the cost of keypoint labeling keeps falling as 2D pose estimation improves, labeling video categories remains problematic because of the variety of categories and the ambiguity and variation of videos. To avoid such expensive category labeling, and following the success of "Skip-Thought Vectors", an unsupervised approach to modeling sentence similarity, we apply its idea to contiguous pose sequences to learn feature representations for measuring motion similarity. Because human actions are handled as 2D poses instead of images, the model can be kept small and easy to handle, and the training data can be augmented by projecting 3D motion-capture data to 2D. Through evaluation on the JHMDB dataset, we explore various design choices, such as whether to treat an action as a sequence of poses or as a sequence of images. Our approach leverages pose sequences from 3D motion capture and improves performance by as much as 61.6% on JHMDB.
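The abstract mentions augmenting training data by projecting 3D motion-capture data to 2D but gives no implementation details. As a rough illustration only (the function name, the orthographic camera model, and the toy skeleton are all assumptions, not taken from the paper), a single 3D skeleton can yield many synthetic 2D pose samples by rotating it about the vertical axis before dropping the depth coordinate:

```python
import math

def project_to_2d(joints_3d, yaw_deg=0.0):
    """Orthographically project 3D joints (x, y, z) onto the image plane
    after rotating the skeleton about the vertical (y) axis.
    Each distinct yaw angle produces a new synthetic 2D pose sample,
    which is the kind of viewpoint augmentation the abstract describes."""
    theta = math.radians(yaw_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    pose_2d = []
    for x, y, z in joints_3d:
        # Rotate about the y-axis, then discard depth (orthographic camera).
        x_rot = cos_t * x + sin_t * z
        pose_2d.append((x_rot, y))
    return pose_2d

# Toy 3-joint skeleton (head, left hand, right hand) in meters -- hypothetical.
skeleton = [(0.0, 1.7, 0.0), (0.3, 1.0, 0.1), (-0.3, 1.0, 0.1)]
front_view = project_to_2d(skeleton, yaw_deg=0.0)   # camera facing the subject
side_view = project_to_2d(skeleton, yaw_deg=90.0)   # camera rotated 90 degrees
```

Sweeping `yaw_deg` over many angles per mocap clip multiplies the number of 2D pose sequences available for the encoder-decoder training, without any additional labeling.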
Pages: 6