Action recognition using kinematics posture feature on 3D skeleton joint locations

Cited by: 48
Authors
Ahad, Md Atiqur Rahman [1,4]
Ahmed, Masud [2 ]
Das Antar, Anindya [3 ]
Makihara, Yasushi [1 ]
Yagi, Yasushi [1 ]
Affiliations
[1] Osaka Univ, Suita, Osaka, Japan
[2] Univ Maryland, Baltimore, MD 21201 USA
[3] Univ Michigan, Ann Arbor, MI 48109 USA
[4] Univ Dhaka, Dhaka, Bangladesh
Keywords
Action recognition; Skeleton data; Kinematics posture feature (KPF); Position-based statistical feature (PSF); Joint angle; Joint position; Deep neural network; Ensemble architecture; ConvRNN; Benchmark datasets; Linear joint position feature (LJPF); Angular joint position feature (AJPF); DEPTH; FRAMEWORK
DOI
10.1016/j.patrec.2021.02.013
Chinese Library Classification (CLC) code
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Action recognition is a widely explored research area in computer vision and related fields. We propose Kinematics Posture Feature (KPF) extraction from 3D joint positions based on skeleton data for improving the performance of action recognition. In this approach, we treat the skeleton's 3D joints as kinematics sensors. We propose the Linear Joint Position Feature (LJPF) and the Angular Joint Position Feature (AJPF), based on 3D linear joint positions and on angles between bone segments, and combine these two kinematics features for each video frame of each action to create the KPF feature sets. These feature sets encode the variation of motion in the temporal domain as if each body joint represented a kinematic position and orientation sensor. In the next stage, we process the extracted KPF feature descriptors with a low-pass filter and segment them using sliding windows of optimized length, mirroring the way kinematics sensor data are processed. From the segmented windows, we compute the Position-based Statistical Feature (PSF), which consists of temporal-domain statistical features (e.g., mean, standard deviation, and variance). These statistical features encode the variation of postures (i.e., joint positions and angles) across the video frames. For classification, we explore a linear Support Vector Machine (SVM) as well as RNN, CNN-RNN, and ConvRNN models. The proposed PSF feature sets demonstrate prominent performance in both statistical machine learning- and deep learning-based models. For evaluation, we use five benchmark datasets, namely UTKinect-Action3D, the Kinect Activity Recognition Dataset (KARD), MSR 3D Action Pairs, Florence 3D, and the Office Activity Dataset (OAD). To prevent overfitting, we adopt a leave-one-subject-out experimental setup and perform 10-fold cross-validation. Our approach outperforms several existing methods on these benchmark datasets and achieves very promising classification performance. (c) 2021 Elsevier B.V. All rights reserved.
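The abstract outlines a concrete pipeline: per-frame LJPF and AJPF are concatenated into KPF, the sequence is low-pass filtered and cut into sliding windows, window statistics form the PSF descriptor, and the descriptor is classified (e.g., with a linear SVM). The following Python sketch illustrates that pipeline under stated assumptions; the bone-triple list, filter order and cutoff, window length, and all function names are illustrative placeholders, not the authors' implementation.

```python
# A minimal sketch of the KPF -> PSF pipeline summarized in the abstract.
# The bone-triple list, filter order/cutoff, window length, and function
# names are illustrative assumptions, not the authors' code.
import numpy as np
from scipy.signal import butter, filtfilt
from sklearn.svm import LinearSVC


def angular_features(frames, bone_triples):
    """AJPF-style features: angle at joint b between bones (a-b) and (c-b).

    frames: (T, J, 3) array of 3D joint positions per video frame.
    bone_triples: list of (a, b, c) joint indices (assumed skeleton layout).
    """
    angles = []
    for a, b, c in bone_triples:
        v1 = frames[:, a] - frames[:, b]
        v2 = frames[:, c] - frames[:, b]
        cos = np.sum(v1 * v2, axis=1) / (
            np.linalg.norm(v1, axis=1) * np.linalg.norm(v2, axis=1) + 1e-8)
        angles.append(np.arccos(np.clip(cos, -1.0, 1.0)))
    return np.stack(angles, axis=1)                 # (T, len(bone_triples))


def kpf(frames, bone_triples):
    """KPF = per-frame linear joint positions (LJPF) + joint angles (AJPF)."""
    ljpf = frames.reshape(frames.shape[0], -1)      # (T, 3 * J)
    ajpf = angular_features(frames, bone_triples)   # (T, A)
    return np.concatenate([ljpf, ajpf], axis=1)


def psf(kpf_seq, win=30, step=15, cutoff=0.2):
    """Low-pass filter the KPF sequence, slide windows, pool statistics (PSF)."""
    b, a = butter(4, cutoff)                        # assumed 4th-order Butterworth
    smooth = filtfilt(b, a, kpf_seq, axis=0)        # needs T > filter pad length
    feats = []
    for start in range(0, max(1, len(smooth) - win + 1), step):
        w = smooth[start:start + win]
        feats.append(np.concatenate([w.mean(0), w.std(0), w.var(0)]))
    return np.stack(feats)                          # (n_windows, 3 * D)


# Toy usage with synthetic skeletons (2 classes, 20 joints, 60 frames per clip);
# one averaged PSF descriptor per clip is fed to a linear SVM.
rng = np.random.default_rng(0)
triples = [(0, 1, 2), (1, 2, 3)]                    # placeholder bone triples
X, y = [], []
for label in (0, 1):
    for _ in range(10):
        seq = rng.normal(loc=label, scale=1.0, size=(60, 20, 3))
        X.append(psf(kpf(seq, triples)).mean(axis=0))
        y.append(label)
clf = LinearSVC().fit(np.stack(X), y)
```

Instead of averaging the window descriptors into a single clip vector, the per-window PSF vectors could also be kept as a sequence and fed to the RNN, CNN-RNN, or ConvRNN models mentioned in the abstract; that variant is not shown here.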
Pages: 216-224 (9 pages)