Action recognition using kinematics posture feature on 3D skeleton joint locations

Cited by: 48
Authors
Ahad, Md Atiqur Rahman [1, 4]
Ahmed, Masud [2]
Das Antar, Anindya [3]
Makihara, Yasushi [1]
Yagi, Yasushi [1]
Affiliations
[1] Osaka Univ, Suita, Osaka, Japan
[2] Univ Maryland, Baltimore, MD 21201 USA
[3] Univ Michigan, Ann Arbor, MI 48109 USA
[4] Univ Dhaka, Dhaka, Bangladesh
Keywords
Action recognition; Skeleton data; Kinematics posture feature (KPF); Position-based statistical feature (PSF); Joint angle; Joint position; Deep neural network; Ensemble architecture; ConvRNN; Benchmark datasets; Linear joint position feature (LJPF); Angular joint position feature (AJPF); DEPTH; FRAMEWORK
DOI
10.1016/j.patrec.2021.02.013
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Action recognition is a widely explored research area in computer vision and related fields. We propose Kinematics Posture Feature (KPF) extraction from 3D joint positions in skeleton data to improve the performance of action recognition. In this approach, we treat the 3D skeleton joints as kinematics sensors. We propose a Linear Joint Position Feature (LJPF) and an Angular Joint Position Feature (AJPF), based on 3D linear joint positions and on angles between bone segments, respectively. We then combine these two kinematics features for each video frame of each action to create the KPF feature sets. These feature sets encode the variation of motion in the temporal domain as if each body joint were a kinematics position and orientation sensor. In the next stage, we process the extracted KPF feature descriptors with a low-pass filter and segment them using sliding windows of optimized length. This concept resembles the approach used for processing kinematics sensor data. From the segmented windows, we compute the Position-based Statistical Feature (PSF). These features consist of temporal-domain statistical features (e.g., mean, standard deviation, variance, etc.) and encode the variation of postures (i.e., joint positions and angles) across the video frames. For classification, we explore Support Vector Machine (Linear), RNN, CNNRNN, and ConvRNN models. The proposed PSF feature sets demonstrate strong performance in both statistical machine learning- and deep learning-based models. For evaluation, we explore five benchmark datasets, namely UTKinect-Action3D, Kinect Activity Recognition Dataset (KARD), MSR 3D Action Pairs, Florence 3D, and Office Activity Dataset (OAD). To prevent overfitting, we consider the leave-one-subject-out framework as the experimental setup and perform 10-fold cross-validation. Our approach outperforms several existing methods on these benchmark datasets and achieves promising classification performance. © 2021 Elsevier B.V. All rights reserved.
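The feature pipeline described in the abstract (joint positions and bone-segment angles per frame, low-pass filtering, sliding windows, window statistics) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names, the `bones` triple convention, the moving-average filter (a stand-in, since the abstract does not name the filter), and the `width`, `window`, and `step` values are all assumptions chosen for illustration, and only the three statistics named in the abstract are computed.

```python
import numpy as np


def joint_angles(frames, bones):
    """Angle in radians at joint b between segments b->a and b->c, per frame.

    frames: array of shape (T, J, 3) holding 3D joint positions.
    bones:  list of (a, b, c) joint-index triples (hypothetical convention).
    """
    out = np.empty((frames.shape[0], len(bones)))
    for k, (a, b, c) in enumerate(bones):
        u = frames[:, a] - frames[:, b]
        v = frames[:, c] - frames[:, b]
        norms = np.linalg.norm(u, axis=1) * np.linalg.norm(v, axis=1)
        cos = np.sum(u * v, axis=1) / (norms + 1e-12)  # guard against /0
        out[:, k] = np.arccos(np.clip(cos, -1.0, 1.0))
    return out


def kpf(frames, bones):
    """Per-frame feature: flattened joint positions plus joint angles."""
    linear = frames.reshape(frames.shape[0], -1)   # shape (T, 3 * J)
    angular = joint_angles(frames, bones)          # shape (T, len(bones))
    return np.hstack([linear, angular])


def low_pass(features, width=5):
    """Moving average along time; an assumed stand-in low-pass filter."""
    kernel = np.ones(width) / width
    smooth = lambda x: np.convolve(x, kernel, mode="same")
    return np.apply_along_axis(smooth, 0, features)


def psf(features, window=20, step=10):
    """Mean, standard deviation and variance over each sliding window."""
    rows = []
    for start in range(0, features.shape[0] - window + 1, step):
        seg = features[start:start + window]
        rows.append(np.concatenate([seg.mean(0), seg.std(0), seg.var(0)]))
    return np.array(rows)


# Usage on synthetic data: 50 frames, 4 joints, 2 angle triples.
rng = np.random.default_rng(0)
frames = rng.normal(size=(50, 4, 3))
bones = [(0, 1, 2), (1, 2, 3)]
windows = psf(low_pass(kpf(frames, bones)))
```

Each row of `windows` is one candidate input vector for a classifier such as a linear SVM; the window length and step would need tuning per dataset, as the abstract's "optimized length" suggests.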
Pages: 216-224
Page count: 9
Related papers
50 in total
  • [21] An Efficient Method for Extracting Key-Frames from 3D Human Joint Locations for Action Recognition
    Kabir, Md Hasanul
    Ahmed, Ferdous
    Abdullah-Al-Tariq
    IMAGE ANALYSIS AND RECOGNITION (ICIAR 2015), 2015, 9164 : 277 - 284
  • [22] 3D-Posture Recognition Using Joint Angle Representation
    Al Alwani, Adnan
    Chahir, Youssef
    Goumidi, Djamal E.
    Molina, Michele
    Jouen, Francois
    INFORMATION PROCESSING AND MANAGEMENT OF UNCERTAINTY IN KNOWLEDGE-BASED SYSTEMS, PT II, 2014, 443 : 106 - 115
  • [23] RGB+2D skeleton: local hand-crafted and 3D convolution feature coding for action recognition
    Zhang, Yi-Xiang
    Zhang, Hong-Bo
    Du, Ji-Xiang
    Lei, Qing
    Yang, Lijie
    Zhong, Bineng
    SIGNAL IMAGE AND VIDEO PROCESSING, 2021, 15 (07) : 1379 - 1386
  • [24] Multi-Feature Fusion Real-Time Action Recognition Based on 2D to 3D Skeleton
    Ren Guoyin
    Lu Xiaoqi
    Li Yuhao
    LASER & OPTOELECTRONICS PROGRESS, 2021, 58 (24)
  • [25] 3D SPARSE QUANTIZATION FOR FEATURE LEARNING IN ACTION RECOGNITION
    Zhao, Yang
    Cheng, Hong
    Yang, Lu
    2015 IEEE CHINA SUMMIT & INTERNATIONAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING, 2015, : 263 - 267
  • [26] A New Feature Descriptor for 3D Human Action Recognition
    Asadi-Aghbolaghi, Maryam
    Ramezanpour, Sadegh
    Kasaei, Shohreh
    2014 22ND IRANIAN CONFERENCE ON ELECTRICAL ENGINEERING (ICEE), 2014, : 1157 - 1161
  • [28] Action Recognition Using Deep 3D CNNs with Sequential Feature Aggregation and Attention
    Anvarov, Fazliddin
    Kim, Dae Ha
    Song, Byung Cheol
    ELECTRONICS, 2020, 9 (01)
  • [29] Joint movement similarities for robust 3D action recognition using skeletal data
    Pazhoumand-Dar, Hossein
    Lam, Chiou-Peng
    Masek, Martin
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2015, 30 : 10 - 21
  • [30] Arm-hand Action Recognition Based on 3D Skeleton Joints
    Rui, Ling
    Ma, Shi-wei
    Wen, Jia-rui
    Liu, Li-na
    INTERNATIONAL CONFERENCE ON CONTROL AND AUTOMATION (ICCA 2016), 2016, : 326 - 332