Multi-modal hybrid hierarchical classification approach with transformers to enhance complex human activity recognition

Cited by: 3
Authors
Ezzeldin, Mustafa [1 ]
Ghoneim, Amr S. [2 ]
Abdelhamid, Laila [3 ]
Atia, Ayman [4 ]
Institutions
[1] Helwan Univ, Fac Comp & Artificial Intelligence, Software Engn Dept, Cairo, Egypt
[2] Helwan Univ, Fac Comp & Artificial Intelligence, Comp Sci Dept, Cairo, Egypt
[3] Helwan Univ, Fac Comp & Artificial Intelligence, Informat Syst Dept, Cairo, Egypt
[4] Helwan Univ, October Univ Modern Sci & Arts MSA, HCI LAB FCAI, Fac Comp Sci, Cairo, Egypt
Keywords
Human Activity Recognition (HAR); Complex HAR; Multi-modal; Hierarchical classification; Transformers;
DOI
10.1007/s11760-024-03552-z
CLC classification
TM [Electrical technology]; TN [Electronic technology, communication technology];
Discipline codes
0808 ; 0809 ;
Abstract
Human Activity Recognition (HAR) plays a crucial role in computer vision and signal processing, with extensive applications in domains such as security, surveillance, and healthcare. Traditional machine learning (ML) approaches for HAR have achieved commendable success but face limitations such as reliance on handcrafted features, sensitivity to noise, and difficulty handling complex temporal dependencies. These limitations have spurred interest in deep learning (DL) and hybrid models that address them more effectively, and DL has emerged as a powerful approach for HAR, surpassing traditional methods. In this paper, a multi-modal hybrid hierarchical classification approach is proposed that combines DL transformers with traditional ML techniques to improve both the accuracy and efficiency of HAR. The proposed classifier is evaluated on four widely used benchmark datasets: PAMAP2, CASAS, UCI HAR, and UCI HAPT. The experimental results demonstrate that the hybrid hierarchical classifier achieves accuracy rates of 99.69%, 97.4%, 98.7%, and 98.6%, respectively, outperforming traditional classification methods and significantly reducing training time compared to sequential LSTM models.
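The record does not reproduce the paper's pipeline, but the hierarchical idea described in the abstract (a coarse stage routing samples to fine-grained per-group classifiers, with transformer-derived features standing in for handcrafted ones) can be sketched roughly as follows. This is a minimal illustration on synthetic features with hypothetical activity groups (static vs. dynamic) and scikit-learn classifiers, not the authors' actual method:

```python
# Two-stage hierarchical activity classification sketch: a coarse
# classifier predicts an activity group, then a per-group classifier
# predicts the specific activity. Synthetic features stand in for the
# transformer embeddings used in the paper.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic features: 4 activities in 2 coarse groups.
# Group 0 (static): 0=sit, 1=stand; group 1 (dynamic): 2=walk, 3=run.
n_per_class = 100
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(n_per_class, 6))
               for c in range(4)])
y = np.repeat(np.arange(4), n_per_class)
group_of = np.array([0, 0, 1, 1])   # activity label -> coarse group
y_group = group_of[y]

# Stage 1: coarse group classifier over all samples.
coarse = RandomForestClassifier(random_state=0).fit(X, y_group)

# Stage 2: one fine-grained classifier trained per coarse group.
fine = {}
for g in (0, 1):
    mask = y_group == g
    fine[g] = RandomForestClassifier(random_state=0).fit(X[mask], y[mask])

def predict_hierarchical(x):
    """Route a sample through the coarse, then the fine classifier."""
    x = np.asarray(x).reshape(1, -1)
    g = int(coarse.predict(x)[0])
    return int(fine[g].predict(x)[0])

sample = rng.normal(loc=3, scale=0.3, size=6)  # near activity 3 ("run")
print(predict_hierarchical(sample))  # → 3
```

Splitting the decision this way means each fine-grained model only has to separate a few similar activities, which is one common motivation for hierarchical HAR classifiers.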
Pages: 9375 - 9385
Page count: 11
Related papers
50 records total
  • [1] Multi-modal lifelog data fusion for improved human activity recognition: A hybrid approach
    Oh, Yongkyung
    Kim, Sungil
    INFORMATION FUSION, 2024, 110
  • [2] Multi-modal Sensing for Human Activity Recognition
    Bruno, Barbara
    Grosinger, Jasmin
    Mastrogiovanni, Fulvio
    Pecora, Federico
    Saffiotti, Alessandro
    Sathyakeerthy, Subhash
    Sgorbissa, Antonio
    2015 24TH IEEE INTERNATIONAL SYMPOSIUM ON ROBOT AND HUMAN INTERACTIVE COMMUNICATION (RO-MAN), 2015, : 594 - 600
  • [3] HuMAn: Complex Activity Recognition with Multi-Modal Multi-Positional Body Sensing
    Bharti, Pratool
    De, Debraj
    Chellappan, Sriram
    Das, Sajal K.
    IEEE TRANSACTIONS ON MOBILE COMPUTING, 2019, 18 (04) : 857 - 870
  • [4] Hybrid Multi-modal Fusion for Human Action Recognition
    Seddik, Bassem
    Gazzah, Sami
    Ben Amara, Najoua Essoukri
    IMAGE ANALYSIS AND RECOGNITION, ICIAR 2017, 2017, 10317 : 201 - 209
  • [5] A multi-modal approach for activity classification and fall detection
    Carlos Castillo, Jose
    Carneiro, Davide
    Serrano-Cuerda, Juan
    Novais, Paulo
    Fernandez-Caballero, Antonio
    Neves, Jose
    INTERNATIONAL JOURNAL OF SYSTEMS SCIENCE, 2014, 45 (04) : 810 - 824
  • [6] Hierarchical Multi-Modal Prompting Transformer for Multi-Modal Long Document Classification
    Liu, Tengfei
    Hu, Yongli
    Gao, Junbin
    Sun, Yanfeng
    Yin, Baocai
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (07) : 6376 - 6390
  • [7] Human activity recognition based on multi-modal fusion
    Zhang, Cheng
    Zu, Tianqi
    Hou, Yibin
    He, Jian
    Yang, Shengqi
    Dong, Ruihai
    CCF TRANSACTIONS ON PERVASIVE COMPUTING AND INTERACTION, 2023, 5 (03) : 321 - 332
  • [9] A hybrid approach to news video classification with multi-modal features
    Wang, P
    Cai, R
    Yang, SQ
    ICICS-PCM 2003, VOLS 1-3, PROCEEDINGS, 2003, : 787 - 791
  • [10] HMGAN: A Hierarchical Multi-Modal Generative Adversarial Network Model for Wearable Human Activity Recognition
    Chen, Ling
    Hu, Rong
    Wu, Menghan
    Zhou, Xin
    PROCEEDINGS OF THE ACM ON INTERACTIVE MOBILE WEARABLE AND UBIQUITOUS TECHNOLOGIES-IMWUT, 2023, 7 (03):