Cascading Pose Features with CNN-LSTM for Multiview Human Action Recognition

Cited by: 14
Authors
Malik, Najeeb ur Rehman [1]
Abu-Bakar, Syed Abdul Rahman [1]
Sheikh, Usman Ullah [1]
Channa, Asma [2]
Popescu, Nirvana [2]
Affiliations
[1] Univ Teknol Malaysia, Comp Vis Video & Image Proc Lab, ECE Dept, Johor Baharu 81310, Malaysia
[2] Univ Politehn Bucuresti, Comp Sci Dept, Bucharest 060042, Romania
Source
SIGNALS | 2023 / Vol. 4 / Issue 01
Funding
European Union Horizon 2020;
Keywords
human action recognition (HAR); deep learning; CNN-LSTM; REPRESENTATION;
DOI
10.3390/signals4010002
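Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code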
0808 ; 0809 ;
Abstract
Human Action Recognition (HAR) is a branch of computer vision that deals with the identification of human actions at various levels, including low level, action level, and interaction level. A number of HAR algorithms based on handcrafted methods have previously been proposed. However, handcrafted techniques are inefficient at recognizing interaction-level actions, as these involve complex scenarios. Meanwhile, traditional deep learning-based approaches take the entire image as input and then extract large volumes of features, which greatly increases system complexity and results in significantly higher computational time and resource utilization. Therefore, this research focuses on the development of an efficient, deep learning-based, multi-view interaction-level action recognition system that uses 2D skeleton data to achieve higher accuracy while reducing computational complexity. The proposed system extracts 2D skeleton data from the dataset using the OpenPose technique. The extracted 2D skeleton features are then given directly as input to a Convolutional Neural Network and Long Short-Term Memory (CNN-LSTM) architecture for action recognition. To reduce complexity, only the extracted features, rather than the whole image, are passed to the CNN-LSTM architecture, thus eliminating the need for feature extraction. The proposed method was compared with other existing methods, and the outcomes confirm the potential of the proposed technique. The proposed OpenPose-CNNLSTM achieved an accuracy of 94.4% on MCAD (Multi-camera action dataset) and 91.67% on IXMAS (INRIA Xmas Motion Acquisition Sequences). The proposed method also significantly decreases the computational complexity by reducing the number of input features to 50.
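The input preparation described in the abstract can be sketched roughly as follows. This is an illustrative sketch, not the authors' code. It assumes the 50 input features are the (x, y) coordinates of the 25 keypoints in OpenPose's BODY_25 layout, with the per-keypoint confidence dropped; the abstract does not state this breakdown. The function names, window length, and stride are hypothetical, the keypoints are synthetic, and the CNN-LSTM classifier itself is not shown.

```python
from typing import List, Sequence

NUM_KEYPOINTS = 25                      # assumption: OpenPose BODY_25 layout
FEATURES_PER_FRAME = NUM_KEYPOINTS * 2  # x and y only, giving 50 features


def frame_features(pose_keypoints_2d: Sequence[float]) -> List[float]:
    """Keep x and y from a flat [x, y, c, x, y, c, ...] keypoint list."""
    if len(pose_keypoints_2d) != NUM_KEYPOINTS * 3:
        raise ValueError("expected 25 (x, y, confidence) triplets")
    features: List[float] = []
    for i in range(0, len(pose_keypoints_2d), 3):
        features.append(float(pose_keypoints_2d[i]))      # x coordinate
        features.append(float(pose_keypoints_2d[i + 1]))  # y coordinate
    return features


def make_windows(frames: List[List[float]], window: int, stride: int):
    """Slice per-frame vectors into fixed-length sequences for a sequence model."""
    last_start = len(frames) - window
    return [frames[s:s + window] for s in range(0, last_start + 1, stride)]


# Synthetic stand-in for one video: 10 frames of made-up keypoints.
video = [[float(f + k) for k in range(NUM_KEYPOINTS * 3)] for f in range(10)]
frames = [frame_features(kp) for kp in video]
windows = make_windows(frames, window=4, stride=2)  # hypothetical settings
print(len(frames[0]), len(windows), len(windows[0]))
```

Each window here is a sequence of 50-dimensional vectors, which is the kind of compact input a CNN-LSTM could consume in place of full image frames.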
Pages: 40-55
Page count: 16