Human activity recognition based on multi-modal fusion

Cited: 0
Authors
Zhang, Cheng [1 ]
Zu, Tianqi [1 ]
Hou, Yibin [1 ,2 ]
He, Jian [1 ,2 ]
Yang, Shengqi [1 ,2 ]
Dong, Ruihai [3 ]
Affiliations
[1] Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China
[2] Beijing Univ Technol, Beijing Engn Res Ctr IoT Software & Syst, Beijing 100124, Peoples R China
[3] Univ Coll Dublin, Insight Ctr Data Analyt, Dublin, Ireland
Keywords
Human activity recognition; Multi-modal fusion; Fall detection; Convolutional network; Wearable device;
DOI
10.1007/s42486-023-00132-x
CLC number
TP18 [Artificial intelligence theory]
Discipline codes
081104 ; 0812 ; 0835 ; 1405
Abstract
In recent years, human activity recognition (HAR) methods have developed rapidly. However, most existing methods rely on a single input data modality and suffer from accuracy and robustness issues. In this paper, we present a novel multi-modal HAR architecture that fuses signals from RGB visual data and Inertial Measurement Unit (IMU) data. For the RGB modality, a speed-weighted star RGB representation is proposed to aggregate temporal information, and a convolutional network is employed to extract features. For the IMU modality, the Fast Fourier transform and a multi-layer perceptron are employed to extract the dynamical features of the IMU data. For feature fusion, a global soft attention layer is designed to adjust the weights according to the concatenated features, and L-softmax with soft voting is adopted to classify activities. The proposed method is evaluated on the UP-Fall dataset; the F1-scores are 0.92 for the 11-class classification task and 1.00 for the fall/non-fall binary classification task.
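The abstract's pipeline (FFT features from the IMU branch, softmax attention weights over the two modality feature vectors, weighted concatenation) can be sketched as follows. This is a hypothetical illustration, not the authors' code: the function names are invented, and the mean-based attention scoring is a stand-in for the learned scoring layer the paper applies to the concatenated features.

```python
import numpy as np

def imu_fft_features(window, k=8):
    """FFT-based dynamical features from an IMU window of shape (T, C):
    magnitudes of the first k rFFT coefficients per channel, flattened."""
    spec = np.abs(np.fft.rfft(window, axis=0))[:k]   # (k, C)
    return spec.flatten()                            # (k * C,)

def soft_attention_fuse(rgb_feat, imu_feat):
    """Global soft attention over the two modalities. The paper learns the
    weights from the concatenated features; a simple per-modality mean
    stands in here for that learned scoring layer."""
    scores = np.array([rgb_feat.mean(), imu_feat.mean()])
    w = np.exp(scores - scores.max())
    w /= w.sum()                                     # softmax over modalities
    return np.concatenate([w[0] * rgb_feat, w[1] * imu_feat]), w

# Toy usage: a 128-d RGB feature vector and a 64-sample, 6-axis IMU window
rgb = np.random.rand(128)
imu = imu_fft_features(np.random.rand(64, 6))        # 8 coeffs x 6 axes = 48-d
fused, w = soft_attention_fuse(rgb, imu)
```

The fused vector would then feed the L-softmax classifier; soft voting would average class probabilities across the modality-specific and fused predictions.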
Pages: 321 - 332 (12 pages)
Related papers
50 in total
  • [31] 3D shape recognition based on multi-modal information fusion
    Liang, Qi
    Xiao, Mengmeng
    Song, Dan
    MULTIMEDIA TOOLS AND APPLICATIONS, 2021, 80 (11) : 16173 - 16184
  • [32] Multi-modal fusion for robust hand gesture recognition based on heterogeneous networks
    Zou, YongXiang
    Cheng, Long
    Han, LiJun
    Li, ZhengWei
    Science China (Technological Sciences), 2023, 66 (11) : 3219 - 3230
  • [33] Multi-modal video event recognition based on association rules and decision fusion
    Güder, Mennan
    Çiçekli, Nihan Kesim
    Multimedia Systems, 2018, 24 : 55 - 72
  • [34] HuMAn: Complex Activity Recognition with Multi-Modal Multi-Positional Body Sensing
    Bharti, Pratool
    De, Debraj
    Chellappan, Sriram
    Das, Sajal K.
    IEEE TRANSACTIONS ON MOBILE COMPUTING, 2019, 18 (04) : 857 - 870
  • [35] Multi-View and Multi-Modal Action Recognition with Learned Fusion
    Ardianto, Sandy
    Hang, Hsueh-Ming
    2018 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2018, : 1601 - 1604
  • [36] Human-Object Contour for Action Recognition with Attentional Multi-modal Fusion Network
    Yu, Miao
    Zhang, Weizhe
    Zeng, Qingxiang
    Wang, Chao
    Li, Jie
    2019 1ST INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE IN INFORMATION AND COMMUNICATION (ICAIIC 2019), 2019, : 241 - 246
  • [37] Multi-modal feature fusion based on multi-layers LSTM for video emotion recognition
    Nie, Weizhi
    Yan, Yan
    Song, Dan
    Wang, Kun
    MULTIMEDIA TOOLS AND APPLICATIONS, 2021, 80 (11) : 16205 - 16214
  • [39] Analysis of Deep Fusion Strategies for Multi-modal Gesture Recognition
    Roitberg, Alina
    Pollert, Tim
    Haurilet, Monica
    Martin, Manuel
    Stiefelhagen, Rainer
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2019), 2019, : 198 - 206
  • [40] Convolutional Transformer Fusion Blocks for Multi-Modal Gesture Recognition
    Hampiholi, Basavaraj
    Jarvers, Christian
    Mader, Wolfgang
    Neumann, Heiko
    IEEE ACCESS, 2023, 11 : 34094 - 34103