Exploring the Contextual Factors Affecting Multimodal Emotion Recognition in Videos

Cited by: 19

Authors:

Bhattacharya, Prasanta [1]

Gupta, Raj Kumar [1]

Yang, Yinping [1]

Affiliations:

[1] Agency for Science, Technology and Research (A*STAR), Institute of High Performance Computing, Singapore 138632, Singapore

Source:

IEEE TRANSACTIONS ON AFFECTIVE COMPUTING | 2023 / Vol. 14 / No. 02

Keywords:

Emotion recognition; Videos; Visualization; Feature extraction; Physiology; High performance computing; Distance measurement; Affective computing; affect sensing and analysis; modelling human emotions; multi-modal recognition; sentiment analysis; technology & devices for affective computing; SEX-DIFFERENCES; FACIAL EXPRESSIONS; LANGUAGES; SELECTION; MODEL;

DOI:

10.1109/TAFFC.2021.3071503

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Emotional expressions form a key part of user behavior on today's digital platforms. While multimodal emotion recognition techniques are gaining research attention, there is a lack of deeper understanding on how visual and non-visual features can be used to better recognize emotions in certain contexts, but not others. This study analyzes the interplay between the effects of multimodal emotion features derived from facial expressions, tone and text in conjunction with two key contextual factors: i) gender of the speaker, and ii) duration of the emotional episode. Using a large public dataset of 2,176 manually annotated YouTube videos, we found that while multimodal features consistently outperformed bimodal and unimodal features, their performance varied significantly across different emotions, gender and duration contexts. Multimodal features performed particularly better for male speakers in recognizing most emotions. Furthermore, multimodal features performed particularly better for shorter than for longer videos in recognizing neutral and happiness, but not sadness and anger. These findings offer new insights towards the development of more context-aware emotion recognition and empathetic systems.

Citation

Pages: 1547-1557

Page count: 11

50 records in total

[41] Socializing the Videos: A Multimodal Approach for Social Relation Recognition
Xu, Tong
Zhou, Peilun
Hu, Linkang
He, Xiangnan
Hu, Yao
Chen, Enhong
ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2021, 17 (01)
[42] Contextual factors affecting hint utility
Inventado, Paul Salvador
Scupelli, Peter
Ostrow, Korinn
Heffernan, Neil, III
Ocumpaugh, Jaclyn
Almeda, Victoria
Slater, Stefan
INTERNATIONAL JOURNAL OF STEM EDUCATION, 2018, 5
[43] Emotion Recognition from Videos Using Facial Expressions
Selvi, P. Tamil
Vyshnavi, P.
Jagadish, R.
Srikumar, Shravan
Veni, S.
ARTIFICIAL INTELLIGENCE AND EVOLUTIONARY COMPUTATIONS IN ENGINEERING SYSTEMS, ICAIECES 2016, 2017, 517 : 565 - 576
[44] Multimodal Multipart Learning for Action Recognition in Depth Videos
Shahroudy, Amir
Ng, Tian-Tsong
Yang, Qingxiong
Wang, Gang
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2016, 38 (10) : 2123 - 2129
[45] Emotion Recognition in the Wild from Videos using Images
Bargal, Sarah Adel
Barsoum, Emad
Ferrer, Cristian Canton
Zhang, Cha
ICMI'16: PROCEEDINGS OF THE 18TH ACM INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION, 2016, : 433 - 436
[46] Contextual factors affecting hint utility
Paul Salvador Inventado
Peter Scupelli
Korinn Ostrow
Neil Heffernan
Jaclyn Ocumpaugh
Victoria Almeda
Stefan Slater
International Journal of STEM Education, 5
[47] TRANSFORMER BASED MULTIMODAL SCENE RECOGNITION IN SOCCER VIDEOS
Gan, Yaozong
Togo, Ren
Ogawa, Takahiro
Haseyama, Miki
2022 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO WORKSHOPS (IEEE ICMEW 2022), 2022,
[48] Multimodal Emotion Recognition for Human Robot Interaction
Adiga, Sharvari
Vaishnavi, D. V.
Saxena, Suchitra
Tripathi, Shikha
2020 7TH INTERNATIONAL CONFERENCE ON SOFT COMPUTING & MACHINE INTELLIGENCE (ISCMI 2020), 2020, : 197 - 203
[49] Multimodal Emotion Recognition With Temporal and Semantic Consistency
Chen, Bingzhi
Cao, Qi
Hou, Mixiao
Zhang, Zheng
Lu, Guangming
Zhang, David
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, 29 : 3592 - 3603
[50] Multimodal sentiment and emotion recognition in hyperbolic space
Arano, Keith April
Orsenigo, Carlotta
Soto, Mauricio
Vercellis, Carlo
EXPERT SYSTEMS WITH APPLICATIONS, 2021, 184
