Skeleton-based human activity recognition using ConvLSTM and guided feature learning

Cited by: 41
Authors
Yadav, Santosh Kumar [1 ,2 ,3 ]
Tiwari, Kamlesh [4 ]
Pandey, Hari Mohan [5 ]
Akbar, Shaik Ali [1 ,2 ]
Affiliations
[1] Acad Sci & Innovat Res AcSIR, Ghaziabad 201002, Uttar Pradesh, India
[2] Cent Elect Engn Res Inst CEERI, CSIR, Pilani 333031, Rajasthan, India
[3] DeepBlink LLC, 30 N Gould St Ste R, Sheridan, WY 82801 USA
[4] Birla Inst Technol & Sci Pilani, Dept CSIS, Pilani Campus, Pilani 333031, Rajasthan, India
[5] Edge Hill Univ, Dept Comp Sci, Ormskirk, Lancs, England
Keywords
Activity recognition; CNNs; LSTMs; ConvLSTM; Skeleton tracking; Fall detection
DOI
10.1007/s00500-021-06238-7
CLC number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Human activity recognition aims to determine the actions performed by a human in an image or video. Examples of human activities include standing, running, sitting, sleeping, etc. These activities may involve intricate motion patterns as well as undesired events such as falling. This paper proposes a novel deep convolutional long short-term memory (ConvLSTM) network for skeleton-based activity recognition and fall detection. The proposed ConvLSTM network is a sequential fusion of convolutional neural networks (CNNs), long short-term memory (LSTM) networks, and fully connected layers. The acquisition system applies human detection and pose estimation to pre-calculate skeleton coordinates from the image/video sequence. The ConvLSTM model uses the raw skeleton coordinates, along with their characteristic geometrical and kinematic features, to construct the novel guided features. The geometrical and kinematic features are built upon the raw skeleton coordinates using relative joint position values, differences between joints, spherical joint angles between selected joints, and their angular velocities. The novel spatiotemporal guided features are obtained using a trained multi-layer CNN-LSTM combination. A classification head comprising fully connected layers is subsequently applied. The proposed model has been evaluated on the KinectHAR dataset, which contains 130,000 samples with 81 attribute values collected with the help of a Kinect (v2) sensor. Experimental results are compared against the performance of isolated CNNs and LSTM networks. The proposed ConvLSTM achieves an accuracy of 98.89%, which is better than the CNNs and LSTMs with accuracies of 93.89% and 92.75%, respectively. The proposed system has been tested in real time and is found to be independent of pose, camera-facing direction, individual, clothing, etc. The code and dataset will be made publicly available.
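The abstract describes a sequential CNN, LSTM, and fully connected pipeline operating on per-frame skeleton feature vectors. The following is a minimal PyTorch sketch of that structure. Only the 81-attribute input dimension comes from the abstract; the layer sizes, clip length, and number of activity classes are illustrative assumptions, not values taken from the paper.

```python
# Minimal sketch of a CNN -> LSTM -> fully connected classifier over
# per-frame skeleton feature vectors (81 attributes per frame in KinectHAR).
# All layer sizes and the class count are assumed for illustration.
import torch
import torch.nn as nn

class ConvLSTMClassifier(nn.Module):
    def __init__(self, num_features=81, num_classes=10,
                 conv_channels=64, lstm_hidden=128):
        super().__init__()
        # CNN stage: 1-D convolutions over the feature dimension of each frame
        self.cnn = nn.Sequential(
            nn.Conv1d(1, conv_channels, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv1d(conv_channels, conv_channels, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),  # one descriptor per frame
        )
        # LSTM stage: models the temporal evolution of per-frame descriptors
        self.lstm = nn.LSTM(conv_channels, lstm_hidden, batch_first=True)
        # Fully connected classification head
        self.head = nn.Sequential(
            nn.Linear(lstm_hidden, 64),
            nn.ReLU(),
            nn.Linear(64, num_classes),
        )

    def forward(self, x):
        # x: (batch, time, num_features) skeleton feature sequences
        b, t, f = x.shape
        frames = x.reshape(b * t, 1, f)                        # fold time into the batch
        descriptors = self.cnn(frames).squeeze(-1).reshape(b, t, -1)
        _, (h_n, _) = self.lstm(descriptors)                   # last hidden state summarizes the clip
        return self.head(h_n[-1])                              # class logits

# Usage example: a batch of 4 clips, 30 frames each, 81 attributes per frame.
logits = ConvLSTMClassifier()(torch.randn(4, 30, 81))
print(logits.shape)  # torch.Size([4, 10])
```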
Pages: 877-890
Number of pages: 14
Related papers
50 items in total
  • [41] Improved semantic-guided network for skeleton-based action recognition
    Mansouri, Amine
    Bakir, Toufik
    Elzaar, Abdellah
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2024, 104
  • [42] LEARNING EXPLICIT SHAPE AND MOTION EVOLUTION MAPS FOR SKELETON-BASED HUMAN ACTION RECOGNITION
    Liu, Hong
    Tu, Juanhui
    Liu, Mengyuan
    Ding, Runwei
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 1333 - 1337
  • [43] Reconstruction-driven contrastive learning for unsupervised skeleton-based human action recognition
    Liu, Xing
    Gao, Bo
JOURNAL OF SUPERCOMPUTING, 2025, 81 (01)
  • [44] Learning Graph Convolutional Network for Skeleton-Based Human Action Recognition by Neural Searching
    Peng, Wei
    Hong, Xiaopeng
    Chen, Haoyu
    Zhao, Guoying
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 2669 - 2676
  • [45] Skeleton-Based Action Recognition with Joint Coordinates as Feature Using Neural Oblivious Decision Ensembles
    Nasrul'Alam, Fakhrul Aniq Hakimi
    Shapiai, Mohd Ibrahim
    Batool, Uzma
    Ramli, Ahmad Kamal
    Elias, Khairil Ashraf
    NEW TRENDS IN INTELLIGENT SOFTWARE METHODOLOGIES, TOOLS AND TECHNIQUES, 2021, 337 : 380 - 392
  • [46] SKELETON-BASED ACTION RECOGNITION USING LSTM AND CNN
    Li, Chuankun
    Wang, Pichao
    Wang, Shuang
    Hou, Yonghong
    Li, Wanqing
2017 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA & EXPO WORKSHOPS (ICMEW), 2017
  • [47] Skeleton-based explainable human activity recognition for child gross-motor assessment
    Suzuki, Satoshi
    Amemiya, Yukie
    Sato, Maiko
    IECON 2020: THE 46TH ANNUAL CONFERENCE OF THE IEEE INDUSTRIAL ELECTRONICS SOCIETY, 2020, : 4015 - 4022
  • [48] Idempotent Unsupervised Representation Learning for Skeleton-Based Action Recognition
    Lin, Lilang
    Wu, Lehong
    Zhang, Jiahang
    Wang, Jiaying
    COMPUTER VISION - ECCV 2024, PT XXVI, 2025, 15084 : 75 - 92
  • [49] Deep Progressive Reinforcement Learning for Skeleton-based Action Recognition
    Tang, Yansong
    Tian, Yi
    Lu, Jiwen
    Li, Peiyang
    Zhou, Jie
    2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 5323 - 5332
  • [50] A Cross View Learning Approach for Skeleton-Based Action Recognition
    Zheng, Hui
    Zhang, Xinming
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (05) : 3061 - 3072