Multi-modal hybrid hierarchical classification approach with transformers to enhance complex human activity recognition

Cited by: 3
Authors
Ezzeldin, Mustafa [1 ]
Ghoneim, Amr S. [2 ]
Abdelhamid, Laila [3 ]
Atia, Ayman [4 ]
Institutions
[1] Helwan Univ, Fac Comp & Artificial Intelligence, Software Engn Dept, Cairo, Egypt
[2] Helwan Univ, Fac Comp & Artificial Intelligence, Comp Sci Dept, Cairo, Egypt
[3] Helwan Univ, Fac Comp & Artificial Intelligence, Informat Syst Dept, Cairo, Egypt
[4] Helwan Univ, October Univ Modern Sci & Arts MSA, HCI LAB FCAI, Fac Comp Sci, Cairo, Egypt
Keywords
Human Activity Recognition (HAR); Complex HAR; Multi-modal; Hierarchical classification; Transformers;
DOI
10.1007/s11760-024-03552-z
CLC classification
TM [Electrical technology]; TN [Electronic technology, communication technology];
Discipline codes
0808 ; 0809 ;
Abstract
Human Activity Recognition (HAR) plays a crucial role in computer vision and signal processing, with extensive applications in domains such as security, surveillance, and healthcare. Traditional machine learning (ML) approaches for HAR have achieved commendable success but face limitations such as reliance on handcrafted features, sensitivity to noise, and difficulty handling complex temporal dependencies. These limitations have spurred interest in deep learning (DL) and hybrid models that address them more effectively, and DL has emerged as a powerful approach for HAR, surpassing traditional methods. In this paper, a multi-modal hybrid hierarchical classification approach is proposed that combines DL transformers with traditional ML techniques to improve both the accuracy and efficiency of HAR. The proposed classifier is evaluated on four widely used benchmark datasets: PAMAP2, CASAS, UCI HAR, and UCI HAPT. The experimental results demonstrate that the hybrid hierarchical classifier achieves accuracy rates of 99.69%, 97.4%, 98.7%, and 98.6%, respectively, outperforming traditional classification methods and significantly reducing training time compared to sequential LSTM models.
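The record does not reproduce the paper's pipeline, but the hierarchical idea described in the abstract (a coarse stage routing samples to fine-grained per-group classifiers, with transformer-derived features standing in for handcrafted ones) can be sketched roughly as follows. This is a minimal illustration on synthetic features with hypothetical activity groups (static vs. dynamic) and scikit-learn classifiers, not the authors' actual method:

```python
# Two-stage hierarchical activity classification sketch: a coarse
# classifier predicts an activity group, then a per-group classifier
# predicts the specific activity. Synthetic features stand in for the
# transformer embeddings used in the paper.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic features: 4 activities in 2 coarse groups.
# Group 0 (static): 0=sit, 1=stand; group 1 (dynamic): 2=walk, 3=run.
n_per_class = 100
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(n_per_class, 6))
               for c in range(4)])
y = np.repeat(np.arange(4), n_per_class)
group_of = np.array([0, 0, 1, 1])   # activity label -> coarse group
y_group = group_of[y]

# Stage 1: coarse group classifier over all samples.
coarse = RandomForestClassifier(random_state=0).fit(X, y_group)

# Stage 2: one fine-grained classifier trained per coarse group.
fine = {}
for g in (0, 1):
    mask = y_group == g
    fine[g] = RandomForestClassifier(random_state=0).fit(X[mask], y[mask])

def predict_hierarchical(x):
    """Route a sample through the coarse, then the fine classifier."""
    x = np.asarray(x).reshape(1, -1)
    g = int(coarse.predict(x)[0])
    return int(fine[g].predict(x)[0])

sample = rng.normal(loc=3, scale=0.3, size=6)  # near activity 3 ("run")
print(predict_hierarchical(sample))  # → 3
```

Splitting the decision this way means each fine-grained model only has to separate a few similar activities, which is one common motivation for hierarchical HAR classifiers.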
Pages: 9375 - 9385
Page count: 11
Related papers
50 records total
  • [1] Multi-modal lifelog data fusion for improved human activity recognition: A hybrid approach
    Oh, Yongkyung
    Kim, Sungil
    INFORMATION FUSION, 2024, 110
  • [2] Multi-modal Sensing for Human Activity Recognition
    Bruno, Barbara
    Grosinger, Jasmin
    Mastrogiovanni, Fulvio
    Pecora, Federico
    Saffiotti, Alessandro
    Sathyakeerthy, Subhash
    Sgorbissa, Antonio
    2015 24TH IEEE INTERNATIONAL SYMPOSIUM ON ROBOT AND HUMAN INTERACTIVE COMMUNICATION (RO-MAN), 2015, : 594 - 600
  • [3] HuMAn: Complex Activity Recognition with Multi-Modal Multi-Positional Body Sensing
    Bharti, Pratool
    De, Debraj
    Chellappan, Sriram
    Das, Sajal K.
    IEEE TRANSACTIONS ON MOBILE COMPUTING, 2019, 18 (04) : 857 - 870
  • [4] Hybrid Multi-modal Fusion for Human Action Recognition
    Seddik, Bassem
    Gazzah, Sami
    Ben Amara, Najoua Essoukri
    IMAGE ANALYSIS AND RECOGNITION, ICIAR 2017, 2017, 10317 : 201 - 209
  • [5] A multi-modal approach for activity classification and fall detection
    Carlos Castillo, Jose
    Carneiro, Davide
    Serrano-Cuerda, Juan
    Novais, Paulo
    Fernandez-Caballero, Antonio
    Neves, Jose
    INTERNATIONAL JOURNAL OF SYSTEMS SCIENCE, 2014, 45 (04) : 810 - 824
  • [6] Hierarchical Multi-Modal Prompting Transformer for Multi-Modal Long Document Classification
    Liu, Tengfei
    Hu, Yongli
    Gao, Junbin
    Sun, Yanfeng
    Yin, Baocai
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (07) : 6376 - 6390
  • [7] Human activity recognition based on multi-modal fusion
    Zhang, Cheng
    Zu, Tianqi
    Hou, Yibin
    He, Jian
    Yang, Shengqi
    Dong, Ruihai
    CCF TRANSACTIONS ON PERVASIVE COMPUTING AND INTERACTION, 2023, 5 (03) : 321 - 332
  • [9] A hybrid approach to news video classification with multi-modal features
    Wang, P
    Cai, R
    Yang, SQ
    ICICS-PCM 2003, VOLS 1-3, PROCEEDINGS, 2003, : 787 - 791
  • [10] HMGAN: A Hierarchical Multi-Modal Generative Adversarial Network Model for Wearable Human Activity Recognition
    Chen, Ling
    Hu, Rong
    Wu, Menghan
    Zhou, Xin
    PROCEEDINGS OF THE ACM ON INTERACTIVE MOBILE WEARABLE AND UBIQUITOUS TECHNOLOGIES-IMWUT, 2023, 7 (03):