Attentive to Individual: A Multimodal Emotion Recognition Network with Personalized Attention Profile

Cited by: 19
Authors
Li, Jeng-Lin [1, 2]
Lee, Chi-Chun [1, 2]
Affiliations
[1] Natl Tsing Hua Univ, Dept Elect Engn, Hsinchu, Taiwan
[2] MOST Joint Res Ctr AI Technol & All Vista Healthc, Taipei, Taiwan
Source
Keywords
personal attribute; multimodal emotion recognition; attention; psycholinguistic norm;
DOI
10.21437/Interspeech.2019-2044
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology];
Discipline classification code
100104 ; 100213 ;
Abstract
A growing number of human-centered applications benefit from continuous advances in emotion recognition technology. Many emotion recognition algorithms model multimodal behavior cues to achieve high performance. However, most of them do not consider how an individual's personal attributes modulate his/her expressive behaviors. In this work, we propose a Personalized Attributes-Aware Attention Network (PAaAN) with a novel personalized attention mechanism to perform emotion recognition using speech and language cues. The attention profile is learned from embeddings of an individual's profile, acoustic, and lexical behavior data. The profile embedding is derived using Linguistic Inquiry and Word Count (LIWC) computed between the target speaker and a large set of movie scripts. Our method achieves a state-of-the-art 70.3% unweighted accuracy in a four-class emotion recognition task on IEMOCAP. Further analysis reveals that affect-related semantic categories are emphasized differently for each speaker in the corpus, showing the effectiveness of our attention mechanism for personalization.
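The abstract reports its result as unweighted accuracy. In the emotion recognition literature this term usually means the mean of the per-class recalls, so every class counts equally regardless of how many samples it has. The following is a minimal sketch of that metric under this assumption; it is not the authors' evaluation code. The function names, the four class labels, and the toy predictions are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def unweighted_accuracy(y_true, y_pred):
    """Mean of per-class recalls: each class contributes equally."""
    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


def weighted_accuracy(y_true, y_pred):
    """Plain accuracy: fraction of all samples predicted correctly."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return hits / len(y_true)


# Hypothetical, deliberately imbalanced toy data (not taken from IEMOCAP).
y_true = ["neutral"] * 6 + ["happy"] * 2 + ["sad"] + ["angry"]
y_pred = ["neutral"] * 6 + ["happy", "neutral"] + ["neutral"] + ["angry"]

ua = unweighted_accuracy(y_true, y_pred)  # recalls: 1.0, 0.5, 0.0, 1.0
wa = weighted_accuracy(y_true, y_pred)    # 8 of 10 samples correct
print(f"UA = {ua:.3f}, WA = {wa:.3f}")
```

On this toy data the two metrics diverge (0.625 versus 0.8) because the majority class dominates plain accuracy, which is why unweighted accuracy is often preferred on class-imbalanced corpora.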
Pages: 211 - 215
Number of pages: 5
Related papers
50 in total
  • [31] Hierarchical Attention Approach in Multimodal Emotion Recognition for Human Robot Interaction
    Abdullah, Muhammad
    Ahmad, Mobeen
    Han, Dongil
    2021 36TH INTERNATIONAL TECHNICAL CONFERENCE ON CIRCUITS/SYSTEMS, COMPUTERS AND COMMUNICATIONS (ITC-CSCC), 2021
  • [32] DUAL FOCUS ATTENTION NETWORK FOR VIDEO EMOTION RECOGNITION
    Qiu, Haonan
    He, Liang
    Wang, Feng
    2020 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2020
  • [33] Attention-based 3D convolutional recurrent neural network model for multimodal emotion recognition
    Du, Yiming
    Li, Penghai
    Cheng, Longlong
    Zhang, Xuanwei
    Li, Mingji
    Li, Fengzhou
    FRONTIERS IN NEUROSCIENCE, 2024, 17
  • [34] ISNet: Individual Standardization Network for Speech Emotion Recognition
    Fan, Weiquan
    Xu, Xiangmin
    Cai, Bolun
    Xing, Xiaofen
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2022, 30 : 1803 - 1814
  • [35] Multi-level attention fusion network assisted by relative entropy alignment for multimodal speech emotion recognition
    Lei, Jianjun
    Wang, Jing
    Wang, Ying
    APPLIED INTELLIGENCE, 2024, 54 (17-18) : 8478 - 8490
  • [36] Personalized Fashion Recommendation with Visual Explanations based on Multimodal Attention Network
    Chen, Xu
    Chen, Hanxiong
    Xu, Hongteng
    Zhang, Yongfeng
    Cao, Yixin
    Qin, Zheng
    Zha, Hongyuan
    PROCEEDINGS OF THE 42ND INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '19), 2019 : 765 - 774
  • [37] A Multimodal Low Complexity Neural Network Approach for Emotion Recognition
    Aguinaga, Adrian Rodriguez
    Ramirez, Margarita Ramirez
    Soto, Maria del Consuelo Salgado
    Cisnero, Maria de los Angeles Quezada
    HUMAN BEHAVIOR AND EMERGING TECHNOLOGIES, 2024, 2024
  • [38] Multimodal Emotion Recognition Based on Ensemble Convolutional Neural Network
    Huang, Haiping
    Hu, Zhenchao
    Wang, Wenming
    Wu, Min
    IEEE ACCESS, 2020, 8 : 3265 - 3271
  • [39] Topics Guided Multimodal Fusion Network for Conversational Emotion Recognition
    Yuan, Peicong
    Cai, Guoyong
    Chen, Ming
    Tang, Xiaolv
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT III, ICIC 2024, 2024, 14877 : 250 - 262
  • [40] Temporal Relation Inference Network for Multimodal Speech Emotion Recognition
    Dong, Guan-Nan
    Pun, Chi-Man
    Zhang, Zheng
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (09) : 6472 - 6485