Multi-modal hybrid hierarchical classification approach with transformers to enhance complex human activity recognition

Cited by: 3
Authors
Ezzeldin, Mustafa [1 ]
Ghoneim, Amr S. [2 ]
Abdelhamid, Laila [3 ]
Atia, Ayman [4 ]
Affiliations
[1] Helwan Univ, Fac Comp & Artificial Intelligence, Software Engn Dept, Cairo, Egypt
[2] Helwan Univ, Fac Comp & Artificial Intelligence, Comp Sci Dept, Cairo, Egypt
[3] Helwan Univ, Fac Comp & Artificial Intelligence, Informat Syst Dept, Cairo, Egypt
[4] Helwan Univ, October Univ Modern Sci & Arts MSA, HCI LAB FCAI, Fac Comp Sci, Cairo, Egypt
Keywords
Human Activity Recognition (HAR); Complex HAR; Multi-modal; Hierarchical classification; Transformers;
DOI
10.1007/s11760-024-03552-z
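Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification codes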
0808 ; 0809 ;
Abstract
Human Activity Recognition (HAR) plays a crucial role in computer vision and signal processing, with extensive applications in domains such as security, surveillance, and healthcare. Traditional machine learning (ML) approaches for HAR have achieved commendable success but face limitations such as reliance on handcrafted features, sensitivity to noise, and challenges in handling complex temporal dependencies. These limitations have spurred interest in deep learning (DL) and hybrid models that address these issues more effectively. Thus, DL has emerged as a powerful approach for HAR, surpassing the performance of traditional methods. In this paper, a multi-modal hybrid hierarchical classification approach is proposed. It combines DL transformers with traditional ML techniques to improve both the accuracy and efficiency of HAR. The proposed classifier is evaluated based on the different activities of four widely used benchmark datasets: PAMAP2, CASAS, UCI HAR, and UCI HAPT. The experimental results demonstrate that the hybrid hierarchical classifier achieves remarkable accuracy rates of 99.69%, 97.4%, 98.7%, and 98.6%, respectively, outperforming traditional classification methods and significantly reducing training time compared to sequential LSTM models.
Pages: 9375-9385
Page count: 11