Skip-Pose Vectors: Pose-based motion embedding using Encoder-Decoder models

Cited by: 0

Authors:

Shirakawa, Yuta [1]

Kozakaya, Tatsuo [1]

Affiliations:

[1] Toshiba Co Ltd, Corp Res & Dev Ctr, Tokyo, Japan

Source:

PROCEEDINGS OF MVA 2019 16TH INTERNATIONAL CONFERENCE ON MACHINE VISION APPLICATIONS (MVA) | 2019

Keywords:

DOI:

10.23919/mva.2019.8757937

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

This paper proposes a pose-based unsupervised embedding learning method for action recognition. To classify human actions by motion similarity, it is important to establish a feature space in which similar motions are mapped to similar vector representations. However, learning such a feature space with a supervised approach requires a huge number of training samples, tailored supervised keypoints, and action categories. Although the cost of labeling keypoints is steadily decreasing as 2D pose estimation methods improve, labeling video categories remains difficult because of the variety of categories and the ambiguity and variation of videos. To avoid such expensive category labeling, we follow the success of "Skip-Thought Vectors", an unsupervised approach to modeling sentence similarity, and apply its idea to contiguous pose sequences to learn feature representations for measuring motion similarity. Because human actions are handled as 2D poses instead of images, the model can be small and easy to handle, and the training data can be augmented by projecting 3D motion capture data to 2D. Through evaluation on the JHMDB dataset, we explore various design choices, such as whether to handle actions as a sequence of poses or as a sequence of images. Our approach leverages pose sequences from 3D motion capture and improves its performance as much as 61.6% on JHMDB.
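
The augmentation step mentioned in the abstract, projecting 3D motion capture data to 2D, can be illustrated with a minimal sketch. The abstract does not specify the camera model, so everything below is an assumption made for illustration: a pinhole (perspective) camera, rotation about the vertical axis only, and the hypothetical names `project_pose_to_2d`, `augment_views`, `yaw_deg`, `focal_length`, and `camera_distance`. It is not the authors' implementation.

```python
import math
from typing import List, Sequence, Tuple

Joint3D = Tuple[float, float, float]
Joint2D = Tuple[float, float]


def project_pose_to_2d(
    joints: Sequence[Joint3D],
    yaw_deg: float = 0.0,
    focal_length: float = 1.0,
    camera_distance: float = 5.0,
) -> List[Joint2D]:
    """Project 3D joints to 2D with an assumed pinhole camera.

    The pose is first rotated about the vertical (y) axis by yaw_deg,
    then pushed camera_distance units along z in front of the camera.
    """
    yaw = math.radians(yaw_deg)
    cos_y, sin_y = math.cos(yaw), math.sin(yaw)
    projected: List[Joint2D] = []
    for x, y, z in joints:
        # Rotation about the y axis leaves the height (y) unchanged.
        x_rot = cos_y * x + sin_y * z
        z_rot = -sin_y * x + cos_y * z
        depth = z_rot + camera_distance
        if depth <= 0.0:
            raise ValueError("joint lies on or behind the camera plane")
        # Perspective divide: farther joints move toward the image center.
        projected.append((focal_length * x_rot / depth,
                          focal_length * y / depth))
    return projected


def augment_views(
    joints: Sequence[Joint3D], yaw_angles_deg: Sequence[float]
) -> List[List[Joint2D]]:
    """Return one 2D pose per assumed virtual camera angle."""
    return [project_pose_to_2d(joints, yaw_deg=a) for a in yaw_angles_deg]


# Usage: a toy three-joint "pose" viewed from three virtual cameras.
toy_pose = [(0.0, 1.0, 0.0), (0.5, 0.0, 0.0), (-0.5, 0.0, 0.0)]
views = augment_views(toy_pose, [0.0, 30.0, 60.0])
```

Under these assumptions, a single 3D capture yields several distinct 2D pose sequences, which is one plausible way such data could enlarge a 2D training set.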

Pages: 6

