Multi-modal hybrid hierarchical classification approach with transformers to enhance complex human activity recognition

Cited by: 3
Authors
Ezzeldin, Mustafa [1 ]
Ghoneim, Amr S. [2 ]
Abdelhamid, Laila [3 ]
Atia, Ayman [4 ]
Affiliations
[1] Helwan Univ, Fac Comp & Artificial Intelligence, Software Engn Dept, Cairo, Egypt
[2] Helwan Univ, Fac Comp & Artificial Intelligence, Comp Sci Dept, Cairo, Egypt
[3] Helwan Univ, Fac Comp & Artificial Intelligence, Informat Syst Dept, Cairo, Egypt
[4] Helwan Univ, October Univ Modern Sci & Arts MSA, HCI LAB FCAI, Fac Comp Sci, Cairo, Egypt
Keywords
Human Activity Recognition (HAR); Complex HAR; Multi-modal; Hierarchical classification; Transformers;
DOI
10.1007/s11760-024-03552-z
CLC Number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline Codes
0808; 0809;
Abstract
Human Activity Recognition (HAR) plays a crucial role in computer vision and signal processing, with extensive applications in domains such as security, surveillance, and healthcare. Traditional machine learning (ML) approaches for HAR have achieved commendable success but face limitations such as reliance on handcrafted features, sensitivity to noise, and difficulty modeling complex temporal dependencies. These limitations have spurred interest in deep learning (DL) and hybrid models, and DL has emerged as a powerful approach for HAR, surpassing the performance of traditional methods. In this paper, a multi-modal hybrid hierarchical classification approach is proposed that combines DL transformers with traditional ML techniques to improve both the accuracy and efficiency of HAR. The proposed classifier is evaluated on the activities of four widely used benchmark datasets: PAMAP2, CASAS, UCI HAR, and UCI HAPT. The experimental results demonstrate that the hybrid hierarchical classifier achieves accuracy rates of 99.69%, 97.4%, 98.7%, and 98.6%, respectively, outperforming traditional classification methods while significantly reducing training time compared to sequential LSTM models.
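The core idea of hierarchical classification described in the abstract — route each sample through a coarse-level classifier, then hand it to a fine-grained classifier specialized for the predicted group — can be sketched as follows. This is a minimal illustration only, not the paper's method: the activity labels, the two-level grouping, and the nearest-centroid models are all stand-ins for the transformer and ML stages the paper actually uses.

```python
# Hypothetical sketch of two-level hierarchical classification for HAR:
# a coarse model first predicts an activity group (e.g. static vs. dynamic),
# then a per-group fine model predicts the specific activity.
from collections import defaultdict
import math


def centroid(rows):
    """Component-wise mean of a list of feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class NearestCentroid:
    """Tiny stand-in classifier: predict the label with the closest centroid."""

    def fit(self, X, y):
        buckets = defaultdict(list)
        for x, label in zip(X, y):
            buckets[label].append(x)
        self.centroids = {lbl: centroid(rows) for lbl, rows in buckets.items()}
        return self

    def predict_one(self, x):
        return min(self.centroids, key=lambda lbl: dist(x, self.centroids[lbl]))


class HierarchicalClassifier:
    """Coarse classifier routes a sample to one fine classifier per group."""

    def __init__(self, group_of):
        self.group_of = group_of  # maps fine label -> coarse group label

    def fit(self, X, y):
        groups = [self.group_of[lbl] for lbl in y]
        self.coarse = NearestCentroid().fit(X, groups)
        self.fine = {}
        for g in set(groups):
            Xg = [x for x, gg in zip(X, groups) if gg == g]
            yg = [lbl for lbl, gg in zip(y, groups) if gg == g]
            self.fine[g] = NearestCentroid().fit(Xg, yg)
        return self

    def predict_one(self, x):
        # Stage 1: coarse group; stage 2: fine activity within that group.
        return self.fine[self.coarse.predict_one(x)].predict_one(x)


# Toy usage with made-up 2-D features and a hypothetical activity hierarchy.
group_of = {"sit": "static", "stand": "static", "walk": "dynamic", "run": "dynamic"}
X = [[0.0, 0.0], [0.2, 0.1], [1.0, 0.0], [1.2, 0.1],
     [5.0, 5.0], [5.2, 5.1], [8.0, 8.0], [8.2, 8.1]]
y = ["sit", "sit", "stand", "stand", "walk", "walk", "run", "run"]
hc = HierarchicalClassifier(group_of).fit(X, y)
print(hc.predict_one([0.1, 0.05]))  # -> sit
print(hc.predict_one([8.1, 8.0]))   # -> run
```

Splitting the label space this way keeps each fine classifier's problem small, which is one reason hierarchical schemes can train faster than a single flat sequential model over all activities.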
Pages: 9375-9385
Page count: 11
Related Papers
50 records total
  • [21] A Multi-Modal Egocentric Activity Recognition Approach towards Video Domain Generalization
    Papadakis, Antonios
    Spyrou, Evaggelos
    SENSORS, 2024, 24 (08)
  • [22] Interpretable Passive Multi-Modal Sensor Fusion for Human Identification and Activity Recognition
    Yuan, Liangqi
    Andrews, Jack
    Mu, Huaizheng
    Vakil, Asad
    Ewing, Robert
    Blasch, Erik
    Li, Jia
    SENSORS, 2022, 22 (15)
  • [23] Template co-updating in multi-modal human activity recognition systems
    Franco, Annalisa
    Magnani, Antonio
    Maio, Dario
    PROCEEDINGS OF THE 35TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING (SAC'20), 2020, : 2113 - 2116
  • [24] Multi-modal recognition of worker activity for human-centered intelligent manufacturing
    Tao, Wenjin
    Leu, Ming C.
    Yin, Zhaozheng
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2020, 95 (95)
  • [25] A Multi-Modal Approach to Patient Activity Monitoring
    Steele, Alec M.
    Nourani, Mehrdad
    Bopp, Melinda M.
    Taylor, Tanya S.
    Sullivan, Dennis H.
    2020 8TH IEEE INTERNATIONAL CONFERENCE ON HEALTHCARE INFORMATICS (ICHI 2020), 2020, : 423 - 427
  • [26] A Multi-Modal Deep Learning Approach for Emotion Recognition
    Shahzad, H. M.
    Bhatti, Sohail Masood
    Jaffar, Arfan
    Rashid, Muhammad
    INTELLIGENT AUTOMATION AND SOFT COMPUTING, 2023, 36 (02): : 1561 - 1570
  • [27] Multi-Modal Convolutional Neural Networks for Activity Recognition
    Ha, Sojeong
    Yun, Jeong-Min
    Choi, Seungjin
    2015 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC 2015): BIG DATA ANALYTICS FOR HUMAN-CENTRIC SYSTEMS, 2015, : 3017 - 3022
  • [28] A MULTI-MODAL TRANSFORMER APPROACH FOR FOOTBALL EVENT CLASSIFICATION
    Zhang, Yixiao
    Li, Baihua
    Fang, Hui
    Meng, Qinggang
    2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2023, : 2220 - 2224
  • [29] Multi-modal interaction with transformers: bridging robots and human with natural language
    Wang, Shaochen
    Zhou, Zhangli
    Li, Bin
    Li, Zhijun
    Kan, Zhen
    ROBOTICA, 2024, 42 (02) : 415 - 434
  • [30] A Multi-Modal Approach to Sensing Human Emotion
    Gibilisco, Hannah
    Laubenberger, Michael
    Spiridonov, Valerii
    Belga, Jacob
    Hallstrom, Jason O.
    Peluso, Paul R.
    2018 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2018, : 2499 - 2502