MF-Net: a multimodal fusion network for emotion recognition based on multiple physiological signals

Cited by: 0
Authors
Zhu, Lei [1]
Ding, Yu [1]
Huang, Aiai [1]
Tan, Xufei [2]
Zhang, Jianhai [3, 4]
Affiliations
[1] Hangzhou Dianzi Univ, Sch Automat, Hangzhou 310000, Peoples R China
[2] Hangzhou City Univ, Sch Med, Hangzhou 310015, Peoples R China
[3] Hangzhou Dianzi Univ, Sch Comp Sci, Hangzhou 310000, Peoples R China
[4] Hangzhou City Univ, Key Lab Brain Machine Collaborat Intelligence Zhej, Hangzhou 310015, Peoples R China
Keywords
Deep learning; Physiological signal; Multimodal fusion; Emotion recognition; EEG
DOI
10.1007/s11760-024-03632-0
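Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809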
Abstract
Research on emotion recognition has shown that multi-modal data fusion improves the accuracy and robustness of human emotion recognition compared with single-modal methods. Despite the promising results of existing methods, significant challenges remain in effectively fusing data from multiple modalities. Firstly, existing works tend to focus on generating a joint representation by fusing multi-modal data, with fewer methods considering the specific characteristics of each modality. Secondly, most methods fail to fully capture the intricate correlations among multiple modalities, often resorting to simplistic combinations of latent features. To address these challenges, we propose a novel fusion network for multi-modal emotion recognition. This network enhances the efficacy of multi-modal fusion while preserving the distinct characteristics of each modality. Specifically, a dual-stream multi-scale feature encoding (MFE) is designed to extract emotional information from temporal slices of both electroencephalogram (EEG) and peripheral physiological signals (PPS). Subsequently, a cross-modal global-local feature fusion module (CGFFM) is proposed to integrate global and local information from the multi-modal data and then assign a different importance to each modality, so that the fused representation is weighted toward the more important modalities. Meanwhile, a transformer module is employed to further learn the modality-specific information. Moreover, we introduce the adaptive collaboration block (ACB), which leverages both modality-specific and cross-modality relations for enhanced integration and feature representation. In extensive experiments on the DEAP and DREAMER multimodal datasets, our model achieves state-of-the-art performance.
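The abstract says the fusion module assigns a different importance to each modality. The sketch below shows one generic way to do that, a softmax gate over two feature vectors. It is not the authors' CGFFM: the function names, the scoring vector `w_score`, and the feature sizes are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)


def weighted_fusion(eeg_feat, pps_feat, w_score):
    """Fuse two modality features using per-sample importance weights.

    eeg_feat, pps_feat: arrays of shape (batch, dim).
    w_score: array of shape (dim,), a hypothetical scoring vector
             (it would be learned in a real model).
    Returns the fused features (batch, dim) and the weights (batch, 2).
    """
    stacked = np.stack([eeg_feat, pps_feat], axis=1)      # (batch, 2, dim)
    scores = stacked @ w_score                            # (batch, 2)
    weights = softmax(scores, axis=1)                     # each row sums to 1
    fused = np.sum(weights[..., None] * stacked, axis=1)  # (batch, dim)
    return fused, weights


# Random stand-ins for encoded EEG and PPS features (assumed sizes).
rng = np.random.default_rng(0)
eeg = rng.standard_normal((4, 8))
pps = rng.standard_normal((4, 8))
w = rng.standard_normal(8)

fused, weights = weighted_fusion(eeg, pps, w)
```

Each row of `weights` sums to 1, so the fused vector is a convex combination of the two modality features and leans toward whichever modality scores higher for that sample.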
Pages: 12