Cascading Pose Features with CNN-LSTM for Multiview Human Action Recognition

Cited by: 14
Authors
Malik, Najeeb ur Rehman [1]
Abu-Bakar, Syed Abdul Rahman [1]
Sheikh, Usman Ullah [1]
Channa, Asma [2]
Popescu, Nirvana [2]
Affiliations
[1] Univ Teknol Malaysia, Comp Vis Video & Image Proc Lab, ECE Dept, Johor Baharu 81310, Malaysia
[2] Univ Politehn Bucuresti, Comp Sci Dept, Bucharest 060042, Romania
Source
SIGNALS | 2023 / Volume 4 / Issue 01
Funding
European Union Horizon 2020;
Keywords
human action recognition (HAR); deep learning; CNN-LSTM; REPRESENTATION;
DOI
10.3390/signals4010002
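Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code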
0808 ; 0809 ;
Abstract
Human Action Recognition (HAR) is a branch of computer vision that deals with identifying human actions at various levels, including the low level, the action level, and the interaction level. A number of earlier HAR algorithms were based on handcrafted methods. However, handcrafted techniques are inefficient at recognizing interaction-level actions, which involve complex scenarios. Meanwhile, traditional deep learning-based approaches take the entire image as input and then extract large volumes of features, which greatly increases system complexity and results in significantly higher computation time and resource utilization. Therefore, this research focuses on developing an efficient multi-view, interaction-level action recognition system based on a deep learning architecture that uses 2D skeleton data to achieve higher accuracy while reducing computational complexity. The proposed system extracts 2D skeleton data from the dataset using the OpenPose technique. The extracted 2D skeleton features are then fed directly into a Convolutional Neural Network and Long Short-Term Memory (CNN-LSTM) architecture for action recognition. To reduce complexity, only the extracted features, rather than the whole image, are passed to the CNN-LSTM architecture, thus eliminating the need for feature extraction. The proposed method was compared with existing methods, and the outcomes confirm the potential of the proposed technique. The proposed OpenPose-CNNLSTM achieved an accuracy of 94.4% on MCAD (Multi-Camera Action Dataset) and 91.67% on IXMAS (INRIA Xmas Motion Acquisition Sequences). The proposed method also significantly decreases computational complexity by reducing the number of input features to 50.
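The input-shaping step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the 50 input features correspond to 25 2D keypoints per frame (as in OpenPose's BODY_25 layout), with x and y flattened into one vector. The abstract does not state this, and the function names, window length, and stride are hypothetical. The CNN-LSTM itself is not shown.

```python
from typing import List, Sequence, Tuple

# Assumption (not stated in the abstract): 25 joints x 2 coordinates = 50 features.
NUM_KEYPOINTS = 25

Keypoint = Tuple[float, float]  # (x, y) image coordinates of one joint


def frame_to_features(keypoints: Sequence[Keypoint]) -> List[float]:
    """Flatten one frame's 2D keypoints into [x0, y0, x1, y1, ...]."""
    if len(keypoints) != NUM_KEYPOINTS:
        raise ValueError(f"expected {NUM_KEYPOINTS} keypoints, got {len(keypoints)}")
    features: List[float] = []
    for x, y in keypoints:
        features.extend((float(x), float(y)))
    return features


def make_windows(
    frames: Sequence[Sequence[Keypoint]], window: int, stride: int
) -> List[List[List[float]]]:
    """Group per-frame feature vectors into fixed-length sequences.

    Each returned window has shape (window, 50), the kind of input a
    sequence model such as a CNN-LSTM could consume.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    feats = [frame_to_features(f) for f in frames]
    return [
        feats[start:start + window]
        for start in range(0, len(feats) - window + 1, stride)
    ]


if __name__ == "__main__":
    # Synthetic clip: 10 frames, joint j at x=j, y=frame index.
    clip = [[(float(j), float(t)) for j in range(NUM_KEYPOINTS)] for t in range(10)]
    windows = make_windows(clip, window=4, stride=2)
    print(len(windows), len(windows[0]), len(windows[0][0]))
```

Under these assumptions, each frame contributes a 50-dimensional vector instead of a full image, which is the source of the reduced input size that the abstract reports.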
Pages: 40 - 55
Page count: 16
Related papers
50 in total
  • [41] Data Augmentation for Recognition of Handwritten Words and Lines using a CNN-LSTM Network
    Wigington, Curtis
    Stewart, Seth
    Davis, Brian
    Barrett, Bill
    Price, Brian
    Cohen, Scott
    2017 14TH IAPR INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR), VOL 1, 2017, : 639 - 645
  • [42] An investigation of CNN-LSTM music recognition algorithm in ethnic vocal technique singing
    Dong, Fang
    INTERNATIONAL JOURNAL OF COMPUTATIONAL SCIENCE AND ENGINEERING, 2024, 27 (05) : 505 - 514
  • [43] Research on Transformer Partial Discharge UHF Pattern Recognition Based on Cnn-lstm
    Zhou, Xiu
    Wu, Xutao
    Ding, Pei
    Li, Xiuguang
    He, Ninghui
    Zhang, Guozhi
    Zhang, Xiaoxing
    ENERGIES, 2020, 13 (01)
  • [44] An improved CNN-LSTM network for modulation identification relying on periodic features of signal
    Zhou, Fan
    Li, Jinghui
    Wang, Yang
    IET COMMUNICATIONS, 2023, 17 (18) : 2097 - 2106
  • [45] Shallow hierarchical CNN-LSTM for activity recognition to integrate postural transition states
    Tilley, Douglas
    Martinez-Hernandez, Uriel
    2023 IEEE SENSORS, 2023,
  • [46] Action Recognition in Video Sequences using Deep Bi-Directional LSTM With CNN Features
    Ullah, Amin
    Ahmad, Jamil
    Muhammad, Khan
    Sajjad, Muhammad
    Baik, Sung Wook
    IEEE ACCESS, 2018, 6 : 1155 - 1166
  • [47] Human Activity Recognition Based on CNN and LSTM
    Tan, Xu-Nan
    Journal of Computers (Taiwan), 2023, 34 (03) : 221 - 235
  • [48] Human Action Recognition Based on Temporal Pose CNN and Multi-dimensional Fusion
    Huang, Yi
    Lai, Shang-Hong
    Tai, Shao-Heng
    COMPUTER VISION - ECCV 2018 WORKSHOPS, PT II, 2019, 11130 : 426 - 440
  • [49] Human Action Recognition Research Based on Fusion TS-CNN and LSTM Networks
    Zan, Hui
    Zhao, Gang
    ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING, 2023, 48 (02) : 2331 - 2345
  • [50] Human Action Recognition Research Based on Fusion TS-CNN and LSTM Networks
    Hui Zan
    Gang Zhao
    Arabian Journal for Science and Engineering, 2023, 48 : 2331 - 2345