Skeleton-based human activity recognition using ConvLSTM and guided feature learning

Cited by: 41

Authors:

Yadav, Santosh Kumar [1,2,3]

Tiwari, Kamlesh [4]

Pandey, Hari Mohan [5]

Akbar, Shaik Ali [1,2]

Affiliations:

[1] Acad Sci & Innovat Res AcSIR, Ghaziabad 201002, Uttar Pradesh, India

[2] Cent Elect Engn Res Inst CEERI, CSIR, Pilani 333031, Rajasthan, India

[3] DeepBlink LLC, 30 N Gould St Ste R, Sheridan, WY 82801 USA

[4] Birla Inst Technol & Sci Pilani, Dept CSIS, Pilani Campus, Pilani 333031, Rajasthan, India

[5] Edge Hill Univ, Dept Comp Sci, Ormskirk, Lancs, England

Source:

SOFT COMPUTING | 2022 / Vol. 26 / Issue 02

Keywords:

Activity recognition; CNNs; LSTMs; ConvLSTM; Skeleton tracking; Fall detection

DOI:

10.1007/s00500-021-06238-7

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Human activity recognition aims to determine the actions performed by a human in an image or video. Examples of human activity include standing, running, sitting, and sleeping. These activities may involve intricate motion patterns and undesired events such as falling. This paper proposes a novel deep convolutional long short-term memory (ConvLSTM) network for skeleton-based activity recognition and fall detection. The proposed ConvLSTM network is a sequential fusion of convolutional neural networks (CNNs), long short-term memory (LSTM) networks, and fully connected layers. The acquisition system applies human detection and pose estimation to pre-calculate skeleton coordinates from the image/video sequence. The ConvLSTM model uses the raw skeleton coordinates along with their characteristic geometrical and kinematic features to construct the novel guided features. The geometrical and kinematic features are built upon the raw skeleton coordinates using relative joint position values, differences between joints, spherical joint angles between selected joints, and their angular velocities. The novel spatiotemporal guided features are obtained using a trained multi-layer CNN-LSTM combination. A classification head consisting of fully connected layers is subsequently applied. The proposed model has been evaluated on the KinectHAR dataset, which has 130,000 samples with 81 attribute values and was collected with a Kinect (v2) sensor. Experimental results are compared against the performance of isolated CNN and LSTM networks. The proposed ConvLSTM achieves an accuracy of 98.89%, which is better than the CNNs and LSTMs, with accuracies of 93.89% and 92.75%, respectively. The proposed system has been tested in real time and is found to be independent of the pose, the facing of the camera, individuals, clothing, etc. The code and dataset will be made publicly available.
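The geometrical and kinematic features named in the abstract (relative joint positions, spherical joint angles, angular velocities) can be illustrated with a minimal Python sketch. This is not the authors' code: the abstract does not state which joints are paired, which angle convention is used, or the frame interval, so the function names, the root joint index, the polar/azimuth convention, and `dt` below are all assumptions made for illustration.

```python
import math

def relative_positions(joints, root=0):
    """Express each (x, y, z) joint relative to a root joint.
    The choice of root index is an assumption, not taken from the paper."""
    rx, ry, rz = joints[root]
    return [(x - rx, y - ry, z - rz) for x, y, z in joints]

def spherical_angles(a, b):
    """Polar and azimuth angles (radians) of the vector from joint a to joint b.
    Assumed convention: polar angle measured from the z axis."""
    dx, dy, dz = b[0] - a[0], b[1] - a[1], b[2] - a[2]
    r = math.sqrt(dx * dx + dy * dy + dz * dz)
    if r == 0.0:
        return 0.0, 0.0  # coincident joints: angles are undefined, return zeros
    cos_theta = max(-1.0, min(1.0, dz / r))  # clamp against rounding error
    theta = math.acos(cos_theta)             # polar angle in [0, pi]
    phi = math.atan2(dy, dx)                 # azimuth in [-pi, pi]
    return theta, phi

def angular_velocity(prev_angle, curr_angle, dt):
    """Finite-difference angular velocity between two frames,
    with the difference wrapped to the shortest rotation."""
    diff = (curr_angle - prev_angle + math.pi) % (2.0 * math.pi) - math.pi
    return diff / dt
```

For example, `spherical_angles((0, 0, 0), (1, 0, 0))` gives a polar angle of pi/2 and an azimuth of 0, and applying `angular_velocity` to the same angle in two consecutive frames yields its rate of change. In a pipeline like the one the abstract describes, such values would be concatenated with the raw coordinates per frame before being passed to the CNN-LSTM stack.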


Pages: 877-890

Page count: 14
