Multi-modal hybrid hierarchical classification approach with transformers to enhance complex human activity recognition

Cited by: 3

Authors:

Ezzeldin, Mustafa ^{[1]}

Ghoneim, Amr S. ^{[2]}

Abdelhamid, Laila ^{[3]}

Atia, Ayman ^{[4]}

Affiliations:

[1] Helwan Univ, Fac Comp & Artificial Intelligence, Software Engn Dept, Cairo, Egypt

[2] Helwan Univ, Fac Comp & Artificial Intelligence, Comp Sci Dept, Cairo, Egypt

[3] Helwan Univ, Fac Comp & Artificial Intelligence, Informat Syst Dept, Cairo, Egypt

[4] Helwan Univ, October Univ Modern Sci & Arts MSA, HCI LAB FCAI, Fac Comp Sci, Cairo, Egypt

Source:

SIGNAL IMAGE AND VIDEO PROCESSING | 2024 / Vol. 18 / Issue 12

Keywords:

Human Activity Recognition (HAR); Complex HAR; Multi-modal; Hierarchical classification; Transformers

DOI:

10.1007/s11760-024-03552-z

Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronic technology, communication technology]

Discipline classification codes:

0808; 0809

Abstract:

Human Activity Recognition (HAR) plays a crucial role in computer vision and signal processing, with extensive applications in domains such as security, surveillance, and healthcare. Traditional machine learning (ML) approaches for HAR have achieved commendable success but face limitations such as reliance on handcrafted features, sensitivity to noise, and challenges in handling complex temporal dependencies. These limitations have spurred interest in deep learning (DL) and hybrid models that address these issues more effectively. Thus, DL has emerged as a powerful approach for HAR, surpassing the performance of traditional methods. In this paper, a multi-modal hybrid hierarchical classification approach is proposed. It combines DL transformers with traditional ML techniques to improve both the accuracy and efficiency of HAR. The proposed classifier is evaluated based on the different activities of four widely used benchmark datasets: PAMAP2, CASAS, UCI HAR, and UCI HAPT. The experimental results demonstrate that the hybrid hierarchical classifier achieves remarkable accuracy rates of 99.69%, 97.4%, 98.7%, and 98.6%, respectively, outperforming traditional classification methods and significantly reducing training time compared to sequential LSTM models.

Pages: 9375-9385

Page count: 11
