Exploring the Contextual Factors Affecting Multimodal Emotion Recognition in Videos

Cited by: 19
Authors
Bhattacharya, Prasanta [1]
Gupta, Raj Kumar [1]
Yang, Yinping [1]
Affiliations
[1] Agency for Science, Technology and Research (A*STAR), Institute of High Performance Computing, Singapore 138632, Singapore
Keywords
Emotion recognition; Videos; Visualization; Feature extraction; Physiology; High performance computing; Distance measurement; Affective computing; Affect sensing and analysis; Modelling human emotions; Multi-modal recognition; Sentiment analysis; Technology & devices for affective computing; Sex differences; Facial expressions; Languages; Selection; Model
DOI
10.1109/TAFFC.2021.3071503
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Emotional expressions form a key part of user behavior on today's digital platforms. While multimodal emotion recognition techniques are gaining research attention, there is a lack of deeper understanding of how visual and non-visual features can be used to better recognize emotions in certain contexts, but not others. This study analyzes the interplay between the effects of multimodal emotion features derived from facial expressions, tone and text in conjunction with two key contextual factors: i) gender of the speaker, and ii) duration of the emotional episode. Using a large public dataset of 2,176 manually annotated YouTube videos, we found that while multimodal features consistently outperformed bimodal and unimodal features, their performance varied significantly across different emotions, genders and duration contexts. Multimodal features performed notably better for male speakers in recognizing most emotions. Furthermore, multimodal features performed notably better for shorter than for longer videos in recognizing neutral and happiness, but not sadness and anger. These findings offer new insights towards the development of more context-aware emotion recognition and empathetic systems.
Pages: 1547-1557
Page count: 11