Multi-modal hybrid hierarchical classification approach with transformers to enhance complex human activity recognition

Cited by: 3
Authors
Ezzeldin, Mustafa [1 ]
Ghoneim, Amr S. [2 ]
Abdelhamid, Laila [3 ]
Atia, Ayman [4 ]
Affiliations
[1] Helwan Univ, Fac Comp & Artificial Intelligence, Software Engn Dept, Cairo, Egypt
[2] Helwan Univ, Fac Comp & Artificial Intelligence, Comp Sci Dept, Cairo, Egypt
[3] Helwan Univ, Fac Comp & Artificial Intelligence, Informat Syst Dept, Cairo, Egypt
[4] Helwan Univ, October Univ Modern Sci & Arts MSA, HCI LAB FCAI, Fac Comp Sci, Cairo, Egypt
Keywords
Human Activity Recognition (HAR); Complex HAR; Multi-modal; Hierarchical classification; Transformers;
DOI
10.1007/s11760-024-03552-z
CLC Number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline Codes
0808; 0809;
Abstract
Human Activity Recognition (HAR) plays a crucial role in computer vision and signal processing, with extensive applications in domains such as security, surveillance, and healthcare. Traditional machine learning (ML) approaches for HAR have achieved commendable success but face limitations such as reliance on handcrafted features, sensitivity to noise, and difficulty modeling complex temporal dependencies. These limitations have spurred interest in deep learning (DL) and hybrid models, and DL has emerged as a powerful approach for HAR, surpassing the performance of traditional methods. In this paper, a multi-modal hybrid hierarchical classification approach is proposed that combines DL transformers with traditional ML techniques to improve both the accuracy and efficiency of HAR. The proposed classifier is evaluated on the activities of four widely used benchmark datasets: PAMAP2, CASAS, UCI HAR, and UCI HAPT. The experimental results demonstrate that the hybrid hierarchical classifier achieves accuracy rates of 99.69%, 97.4%, 98.7%, and 98.6%, respectively, outperforming traditional classification methods while significantly reducing training time compared to sequential LSTM models.
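The core idea of hierarchical classification described in the abstract — route each sample through a coarse-level classifier, then hand it to a fine-grained classifier specialized for the predicted group — can be sketched as follows. This is a minimal illustration only, not the paper's method: the activity labels, the two-level grouping, and the nearest-centroid models are all stand-ins for the transformer and ML stages the paper actually uses.

```python
# Hypothetical sketch of two-level hierarchical classification for HAR:
# a coarse model first predicts an activity group (e.g. static vs. dynamic),
# then a per-group fine model predicts the specific activity.
from collections import defaultdict
import math


def centroid(rows):
    """Component-wise mean of a list of feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class NearestCentroid:
    """Tiny stand-in classifier: predict the label with the closest centroid."""

    def fit(self, X, y):
        buckets = defaultdict(list)
        for x, label in zip(X, y):
            buckets[label].append(x)
        self.centroids = {lbl: centroid(rows) for lbl, rows in buckets.items()}
        return self

    def predict_one(self, x):
        return min(self.centroids, key=lambda lbl: dist(x, self.centroids[lbl]))


class HierarchicalClassifier:
    """Coarse classifier routes a sample to one fine classifier per group."""

    def __init__(self, group_of):
        self.group_of = group_of  # maps fine label -> coarse group label

    def fit(self, X, y):
        groups = [self.group_of[lbl] for lbl in y]
        self.coarse = NearestCentroid().fit(X, groups)
        self.fine = {}
        for g in set(groups):
            Xg = [x for x, gg in zip(X, groups) if gg == g]
            yg = [lbl for lbl, gg in zip(y, groups) if gg == g]
            self.fine[g] = NearestCentroid().fit(Xg, yg)
        return self

    def predict_one(self, x):
        # Stage 1: coarse group; stage 2: fine activity within that group.
        return self.fine[self.coarse.predict_one(x)].predict_one(x)


# Toy usage with made-up 2-D features and a hypothetical activity hierarchy.
group_of = {"sit": "static", "stand": "static", "walk": "dynamic", "run": "dynamic"}
X = [[0.0, 0.0], [0.2, 0.1], [1.0, 0.0], [1.2, 0.1],
     [5.0, 5.0], [5.2, 5.1], [8.0, 8.0], [8.2, 8.1]]
y = ["sit", "sit", "stand", "stand", "walk", "walk", "run", "run"]
hc = HierarchicalClassifier(group_of).fit(X, y)
print(hc.predict_one([0.1, 0.05]))  # -> sit
print(hc.predict_one([8.1, 8.0]))   # -> run
```

Splitting the label space this way keeps each fine classifier's problem small, which is one reason hierarchical schemes can train faster than a single flat sequential model over all activities.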
Pages: 9375-9385
Page count: 11
Related Papers
50 records total
  • [21] A Multi-Modal Egocentric Activity Recognition Approach towards Video Domain Generalization
    Papadakis, Antonios
    Spyrou, Evaggelos
    SENSORS, 2024, 24 (08)
  • [22] Interpretable Passive Multi-Modal Sensor Fusion for Human Identification and Activity Recognition
    Yuan, Liangqi
    Andrews, Jack
    Mu, Huaizheng
    Vakil, Asad
    Ewing, Robert
    Blasch, Erik
    Li, Jia
    SENSORS, 2022, 22 (15)
  • [23] Template co-updating in multi-modal human activity recognition systems
    Franco, Annalisa
    Magnani, Antonio
    Maio, Dario
    PROCEEDINGS OF THE 35TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING (SAC'20), 2020, : 2113 - 2116
  • [24] Multi-modal recognition of worker activity for human-centered intelligent manufacturing
    Tao, Wenjin
    Leu, Ming C.
    Yin, Zhaozheng
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2020, 95 (95)
  • [25] A Multi-Modal Approach to Patient Activity Monitoring
    Steele, Alec M.
    Nourani, Mehrdad
    Bopp, Melinda M.
    Taylor, Tanya S.
    Sullivan, Dennis H.
    2020 8TH IEEE INTERNATIONAL CONFERENCE ON HEALTHCARE INFORMATICS (ICHI 2020), 2020, : 423 - 427
  • [26] A Multi-Modal Deep Learning Approach for Emotion Recognition
    Shahzad, H. M.
    Bhatti, Sohail Masood
    Jaffar, Arfan
    Rashid, Muhammad
    INTELLIGENT AUTOMATION AND SOFT COMPUTING, 2023, 36 (02): : 1561 - 1570
  • [27] Multi-Modal Convolutional Neural Networks for Activity Recognition
    Ha, Sojeong
    Yun, Jeong-Min
    Choi, Seungjin
    2015 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC 2015): BIG DATA ANALYTICS FOR HUMAN-CENTRIC SYSTEMS, 2015, : 3017 - 3022
  • [28] A MULTI-MODAL TRANSFORMER APPROACH FOR FOOTBALL EVENT CLASSIFICATION
    Zhang, Yixiao
    Li, Baihua
    Fang, Hui
    Meng, Qinggang
    2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2023, : 2220 - 2224
  • [29] Multi-modal interaction with transformers: bridging robots and human with natural language
    Wang, Shaochen
    Zhou, Zhangli
    Li, Bin
    Li, Zhijun
    Kan, Zhen
    ROBOTICA, 2024, 42 (02) : 415 - 434
  • [30] A Multi-Modal Approach to Sensing Human Emotion
    Gibilisco, Hannah
    Laubenberger, Michael
    Spiridonov, Valerii
    Belga, Jacob
    Hallstrom, Jason O.
    Peluso, Paul R.
    2018 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2018, : 2499 - 2502