Action recognition using kinematics posture feature on 3D skeleton joint locations

Cited by: 48

Authors:

Ahad, Md Atiqur Rahman [1,4]

Ahmed, Masud [2]

Das Antar, Anindya [3]

Makihara, Yasushi [1]

Yagi, Yasushi [1]

Affiliations:

[1] Osaka Univ, Suita, Osaka, Japan

[2] Univ Maryland, Baltimore, MD 21201 USA

[3] Univ Michigan, Ann Arbor, MI 48109 USA

[4] Univ Dhaka, Dhaka, Bangladesh

Source:

PATTERN RECOGNITION LETTERS | 2021 / Vol. 145 / Issue 145

Keywords:

Action recognition; Skeleton data; Kinematics posture feature (KPF); Position-based statistical feature (PSF); Joint angle; Joint position; Deep neural network; Ensemble architecture; ConvRNN; Benchmark datasets; Linear joint position feature (LJPF); Angular joint position feature (AJPF); DEPTH; FRAMEWORK

DOI:

10.1016/j.patrec.2021.02.013

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Action recognition is a widely explored research area in computer vision and related fields. We propose Kinematics Posture Feature (KPF) extraction from 3D joint positions in skeleton data to improve action recognition performance. In this approach, we treat the 3D skeleton joints as kinematics sensors. We propose the Linear Joint Position Feature (LJPF) and the Angular Joint Position Feature (AJPF), based on 3D linear joint positions and the angles between bone segments, respectively. We then combine these two kinematics features for each video frame of each action to create the KPF feature sets. These feature sets encode the variation of motion in the temporal domain as if each body joint were a kinematics position and orientation sensor. In the next stage, we process the extracted KPF descriptor with a low-pass filter and segment it using sliding windows of optimized length, which resembles how kinematics sensor data are processed. From the segmented windows, we compute the Position-based Statistical Feature (PSF), which consists of temporal-domain statistical features (e.g., mean, standard deviation, and variance). These statistical features encode the variation of postures (i.e., joint positions and angles) across the video frames. For classification, we explore a linear Support Vector Machine (SVM), RNN, CNNRNN, and ConvRNN models. The proposed PSF feature sets perform well with both statistical machine learning- and deep learning-based models. For evaluation, we use five benchmark datasets, namely UTKinect-Action3D, Kinect Activity Recognition Dataset (KARD), MSR 3D Action Pairs, Florence 3D, and Office Activity Dataset (OAD). To prevent overfitting, we consider the leave-one-subject-out framework as the experimental setup and perform 10-fold cross-validation. Our approach outperforms several existing methods on these benchmark datasets and achieves very promising classification performance. (c) 2021 Elsevier B.V. All rights reserved.
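
The feature pipeline described in the abstract (per-frame joint positions and bone angles, low-pass filtering, sliding windows, window statistics) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not name the filter type, so a moving average stands in for the low-pass filter, and the function names, the joint triplets, the window length, and the step size are all hypothetical choices.

```python
import numpy as np

def joint_angle(a, b, c):
    """Angle in radians at joint b between bone segments b->a and b->c."""
    u, v = a - b, c - b
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12)
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

def frame_features(joints, triplets):
    """Per-frame KPF-style vector: flattened 3D positions plus bone angles.

    joints: array of shape (J, 3); triplets: list of (i, j, k) joint indices
    (hypothetical; the real skeleton topology depends on the dataset).
    """
    linear = joints.reshape(-1)
    angular = np.array([joint_angle(joints[i], joints[j], joints[k])
                        for i, j, k in triplets])
    return np.concatenate([linear, angular])

def moving_average(x, k):
    """Assumed low-pass filter: moving average of width k along time.

    x has shape (T, D); each feature column is smoothed independently.
    """
    kernel = np.ones(k) / k
    cols = [np.convolve(x[:, d], kernel, mode="same") for d in range(x.shape[1])]
    return np.stack(cols, axis=1)

def window_stats(x, length, step):
    """PSF-style statistics (mean, std, variance) per sliding window.

    Returns an array of shape (num_windows, 3 * D).
    """
    feats = []
    for start in range(0, x.shape[0] - length + 1, step):
        w = x[start:start + length]
        feats.append(np.concatenate([w.mean(axis=0), w.std(axis=0), w.var(axis=0)]))
    return np.array(feats)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sequence = rng.normal(size=(20, 3, 3))   # 20 frames, 3 joints, xyz (synthetic)
    triplets = [(0, 1, 2)]                   # one angle at joint 1
    kpf = np.stack([frame_features(f, triplets) for f in sequence])
    psf = window_stats(moving_average(kpf, k=3), length=8, step=4)
    print(kpf.shape, psf.shape)
```

The resulting window-level vectors could then be passed to any of the classifiers mentioned in the abstract; the choice of statistics here covers only the three the abstract lists explicitly.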


Pages: 216-224

Number of pages: 9
