Cascading Pose Features with CNN-LSTM for Multiview Human Action Recognition

Cited by: 14
Authors
Malik, Najeeb ur Rehman [1]
Abu-Bakar, Syed Abdul Rahman [1]
Sheikh, Usman Ullah [1]
Channa, Asma [2]
Popescu, Nirvana [2]
Affiliations
[1] Univ Teknol Malaysia, Comp Vis Video & Image Proc Lab, ECE Dept, Johor Baharu 81310, Malaysia
[2] Univ Politehn Bucuresti, Comp Sci Dept, Bucharest 060042, Romania
Source
SIGNALS | 2023 / Volume 4 / Issue 01
Funding
European Union Horizon 2020;
Keywords
human action recognition (HAR); deep learning; CNN-LSTM; REPRESENTATION;
DOI
10.3390/signals4010002
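Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline classification code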
0808 ; 0809 ;
Abstract
Human Action Recognition (HAR) is a branch of computer vision that deals with identifying human actions at various levels, including the low level, the action level, and the interaction level. A number of earlier HAR algorithms were based on handcrafted methods. However, handcrafted techniques are ineffective at recognizing interaction-level actions, which involve complex scenarios. Meanwhile, traditional deep learning-based approaches take the entire image as input and then extract large volumes of features, which greatly increases system complexity and results in significantly higher computation time and resource utilization. This research therefore focuses on developing an efficient multi-view, interaction-level action recognition system based on a deep learning architecture that uses 2D skeleton data to achieve higher accuracy while reducing computational complexity. The proposed system extracts 2D skeleton data from the dataset using the OpenPose technique. The extracted 2D skeleton features are then given directly as input to a Convolutional Neural Network and Long Short-Term Memory (CNN-LSTM) architecture for action recognition. To reduce complexity, only the extracted features are passed to the CNN-LSTM architecture instead of the whole image, thus eliminating the need for the network to extract features from raw images. The proposed method was compared with other existing methods, and the outcomes confirm the potential of the proposed technique. The proposed OpenPose-CNNLSTM achieved an accuracy of 94.4% on MCAD (Multi-camera action dataset) and 91.67% on IXMAS (INRIA Xmas Motion Acquisition Sequences). The proposed method also significantly decreases computational complexity by reducing the number of input features to 50.
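As a rough illustration of the input side of the pipeline described in the abstract, the sketch below flattens per-frame 2D keypoints into fixed-length feature vectors and packs a clip into a fixed-length sequence suitable for a sequence model. It is not the authors' code. The abstract gives only the total of 50 input features; the split into 25 keypoints with (x, y) coordinates each, the zero-padding policy, and the function names `frame_to_features` and `clip_to_sequence` are all assumptions made for illustration.

```python
# Hypothetical sketch: turn per-frame 2D keypoints into 50-value feature vectors.
# Assumption (not stated in the abstract): 25 keypoints x (x, y) = 50 features.

NUM_KEYPOINTS = 25                      # assumed keypoint count per person
FEATURES_PER_FRAME = NUM_KEYPOINTS * 2  # 50 input features per frame


def frame_to_features(keypoints):
    """Flatten [(x0, y0), (x1, y1), ...] into [x0, y0, x1, y1, ...]."""
    if len(keypoints) != NUM_KEYPOINTS:
        raise ValueError(
            f"expected {NUM_KEYPOINTS} keypoints, got {len(keypoints)}"
        )
    features = []
    for x, y in keypoints:
        features.extend((float(x), float(y)))
    return features


def clip_to_sequence(frames, seq_len):
    """Build a fixed-length sequence of per-frame feature vectors.

    Clips longer than seq_len are truncated; shorter clips are
    zero-padded at the end (an assumed policy, chosen for simplicity).
    """
    sequence = [frame_to_features(frame) for frame in frames[:seq_len]]
    while len(sequence) < seq_len:
        sequence.append([0.0] * FEATURES_PER_FRAME)
    return sequence
```

A clip of three frames padded to length five would yield a 5 x 50 nested list, which could then be converted to an array and fed to whichever CNN-LSTM implementation is in use.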
Pages: 40 - 55
Page count: 16
Related papers
50 in total
  • [1] Multiview Attention CNN-LSTM Network for SAR Automatic Target Recognition
    Wang, Chenwei
    Liu, Xiaoyu
    Pei, Jifang
    Huang, Yulin
    Zhang, Yin
    Yang, Jianyu
    IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2021, 14 : 12504 - 12513
  • [2] Human Motion Recognition by Micro-Doppler Features and Concatenated CNN-LSTM Network
    Xiong, Xiangrui
    Ren, Aifeng
    Yuan, Tongyang
    Zahid, Adnan
    IEEE SENSORS JOURNAL, 2025, 25 (07) : 12294 - 12302
  • [3] A Novel CNN-LSTM Hybrid Architecture for the Recognition of Human Activities
    Stylianou-Nikolaidou, Sofia
    Vernikos, Ioannis
    Mathe, Eirini
    Spyrou, Evaggelos
    Mylonas, Phivos
    PROCEEDINGS OF THE 22ND ENGINEERING APPLICATIONS OF NEURAL NETWORKS CONFERENCE, EANN 2021, 2021, 3 : 121 - 132
  • [4] Human activity recognition with fine-tuned CNN-LSTM
    Genc, Erdal
    Yildirim, Mustafa Eren
    Salman, Yucel Batu
    JOURNAL OF ELECTRICAL ENGINEERING-ELEKTROTECHNICKY CASOPIS, 2024, 75 (01) : 8 - 13
  • [5] Pose estimation-based lameness recognition in broiler using CNN-LSTM network
    Nasiri, Amin
    Yoder, Jonathan
    Zhao, Yang
    Hawkins, Shawn
    Prado, Maria
    Gan, Hao
    COMPUTERS AND ELECTRONICS IN AGRICULTURE, 2022, 197
  • [6] Facial Expression Recognition Based on CNN-LSTM
    Liu, Anping
    Yue, Hongjie
    PROCEEDINGS OF 2023 7TH INTERNATIONAL CONFERENCE ON ELECTRONIC INFORMATION TECHNOLOGY AND COMPUTER ENGINEERING, EITCE 2023, 2023, : 486 - 491
  • [7] Φ-OTDR pattern recognition based on CNN-LSTM
    Wang, Ming
    Feng, Hao
    Qi, Dunzhe
    Du, Lipu
    Sha, Zhou
    OPTIK, 2023, 272
  • [8] P-CNN: Pose-based CNN Features for Action Recognition
    Cheron, Guilhem
    Laptev, Ivan
    Schmid, Cordelia
    2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, : 3218 - 3226
  • [9] Human action recognition using attention based LSTM network with dilated CNN features
    Muhammad, Khan
    Mustaqeem
    Ullah, Amin
    Imran, Ali Shariq
    Sajjad, Muhammad
    Kiran, Mustafa Servet
    Sannino, Giovanna
    de Albuquerque, Victor Hugo C.
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2021, 125 : 820 - 830
  • [10] Harmonic Representation for CNN-LSTM Automatic Chord Recognition
    Ito, Tsuyoshi
    Arai, Shuichi
    3RD INTERNATIONAL CONFERENCE ON CYBERNETICS AND INTELLIGENT SYSTEMS (ICORIS 2021), 2021, : 196 - 200