Multi-modal hybrid hierarchical classification approach with transformers to enhance complex human activity recognition

Cited: 3
Authors
Ezzeldin, Mustafa [1 ]
Ghoneim, Amr S. [2 ]
Abdelhamid, Laila [3 ]
Atia, Ayman [4 ]
Affiliations
[1] Helwan Univ, Fac Comp & Artificial Intelligence, Software Engn Dept, Cairo, Egypt
[2] Helwan Univ, Fac Comp & Artificial Intelligence, Comp Sci Dept, Cairo, Egypt
[3] Helwan Univ, Fac Comp & Artificial Intelligence, Informat Syst Dept, Cairo, Egypt
[4] Helwan Univ, October Univ Modern Sci & Arts MSA, HCI LAB FCAI, Fac Comp Sci, Cairo, Egypt
Keywords
Human Activity Recognition (HAR); Complex HAR; Multi-modal; Hierarchical classification; Transformers;
DOI
10.1007/s11760-024-03552-z
CLC Classification
TM [Electrical Engineering]; TN [Electronics & Communication Technology];
Subject Classification
0808; 0809;
Abstract
Human Activity Recognition (HAR) plays a crucial role in computer vision and signal processing, with extensive applications in domains such as security, surveillance, and healthcare. Traditional machine learning (ML) approaches for HAR have achieved commendable success but face limitations such as reliance on handcrafted features, sensitivity to noise, and challenges in handling complex temporal dependencies. These limitations have spurred interest in deep learning (DL) and hybrid models that address these issues more effectively. Thus, DL has emerged as a powerful approach for HAR, surpassing the performance of traditional methods. In this paper, a multi-modal hybrid hierarchical classification approach is proposed. It combines DL transformers with traditional ML techniques to improve both the accuracy and efficiency of HAR. The proposed classifier is evaluated on the activities of four widely used benchmark datasets: PAMAP2, CASAS, UCI HAR, and UCI HAPT. The experimental results demonstrate that the hybrid hierarchical classifier achieves remarkable accuracy rates of 99.69%, 97.4%, 98.7%, and 98.6%, respectively, outperforming traditional classification methods and significantly reducing training time compared to sequential LSTM models.
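The hierarchical classification idea described in the abstract can be sketched as a two-stage pipeline: a coarse classifier first routes a sample to an activity group, and a per-group fine classifier then picks the specific activity. The sketch below is a minimal, self-contained illustration of that routing structure only; the group names, toy features, and the nearest-centroid stand-in classifiers are hypothetical, not the paper's actual transformer/ML components.

```python
from statistics import mean

class NearestCentroid:
    """Tiny stand-in for any per-level classifier (hypothetical)."""
    def fit(self, X, y):
        buckets = {}
        for x, label in zip(X, y):
            buckets.setdefault(label, []).append(x)
        # One centroid per label: the per-feature mean of its samples.
        self.centroids = {label: [mean(col) for col in zip(*rows)]
                          for label, rows in buckets.items()}
        return self

    def predict(self, x):
        dist = lambda c: sum((a - b) ** 2 for a, b in zip(x, c))
        return min(self.centroids, key=lambda lbl: dist(self.centroids[lbl]))

class HierarchicalClassifier:
    """Stage 1 picks a coarse group; stage 2 picks the activity within it."""
    def fit(self, X, activities, group_of):
        self.coarse = NearestCentroid().fit(X, [group_of[a] for a in activities])
        self.fine = {}
        for g in set(group_of.values()):
            idx = [i for i, a in enumerate(activities) if group_of[a] == g]
            self.fine[g] = NearestCentroid().fit(
                [X[i] for i in idx], [activities[i] for i in idx])
        return self

    def predict(self, x):
        return self.fine[self.coarse.predict(x)].predict(x)

# Toy sensor features: [mean acceleration, dominant frequency] (illustrative).
X = [[0.1, 0.0], [0.2, 0.1], [1.0, 2.0], [1.1, 2.2], [1.0, 4.0], [1.2, 4.1]]
y = ["sitting", "standing", "walking", "walking", "running", "running"]
group_of = {"sitting": "static", "standing": "static",
            "walking": "dynamic", "running": "dynamic"}

clf = HierarchicalClassifier().fit(X, y, group_of)
print(clf.predict([1.05, 2.1]))  # routed to "dynamic", then classified "walking"
```

A practical motivation for this structure, consistent with the abstract's efficiency claim, is that each fine classifier only has to separate the few activities inside its group, which keeps the per-stage models small.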
Pages: 9375-9385
Page count: 11
Related Papers
50 in total
  • [31] Multi-modal Transformer for Indoor Human Action Recognition
    Do, Jeonghyeok
    Kim, Munchurl
    2022 22ND INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND SYSTEMS (ICCAS 2022), 2022, : 1155 - 1160
  • [33] Deep learning-based multi-modal approach using RGB and skeleton sequences for human activity recognition
    Verma, Pratishtha
    Sah, Animesh
    Srivastava, Rajeev
    MULTIMEDIA SYSTEMS, 2020, 26 (06) : 671 - 685
  • [34] MMTSA: Multi-Modal Temporal Segment Attention Network for Efficient Human Activity Recognition
    Gao, Ziqi
    Wang, Yuntao
    Chen, Jianguo
    Xing, Junliang
    Patel, Shwetak
    Liu, Xin
    Shi, Yuanchun
    PROCEEDINGS OF THE ACM ON INTERACTIVE MOBILE WEARABLE AND UBIQUITOUS TECHNOLOGIES-IMWUT, 2023, 7 (03):
  • [35] Multi-modal Multi-label Emotion Recognition with Heterogeneous Hierarchical Message Passing
    Zhang, Dong
    Ju, Xincheng
    Zhang, Wei
    Li, Junhui
    Li, Shoushan
    Zhu, Qiaoming
    Zhou, Guodong
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 14338 - 14346
  • [36] Cascaded Multi-Modal Mixing Transformers for Alzheimer's Disease Classification with Incomplete Data
    Liu, Linfeng
    Liu, Siyu
    Zhang, Lu
    To, Xuan Vinh
    Nasrallah, Fatima
    Chandra, Shekhar S.
    NEUROIMAGE, 2023, 277
  • [38] A multi-modal approach for high-dimensional feature recognition
    Ahmadian, Kushan
    Gavrilova, Marina
    VISUAL COMPUTER, 2013, 29 (02): : 123 - 130
  • [39] On the Impact of Wireless Multimedia Network for Multi-Modal Activity Recognition
    Yamashita, Akika
    Lua, Eng Keong
    Oguchi, Masato
    2014 IEEE SYMPOSIUM ON COMPUTERS AND COMMUNICATION (ISCC), 2014,
  • [40] Empirical Mode Decomposition Based Multi-Modal Activity Recognition
    Hu, Lingyue
    Zhao, Kailong
    Zhou, Xueling
    Ling, Bingo Wing-Kuen
    Liao, Guozhao
    SENSORS, 2020, 20 (21) : 1 - 15