EmoComicNet: A multi-task model for comic emotion recognition

Cited by: 4

Authors:

Dutta, Arpita [1,2]

Biswas, Samit [1]

Das, Amit Kumar [1]

Affiliations:

[1] Indian Inst Engn Science&Technol, Dept Comp Science&Technol, Howrah 711103, West Bengal, India

[2] Techno Main, Artificial Intelligence & Machine Learning, Dept Comp Sci & Engn, Kolkata 700091, West Bengal, India

Source:

PATTERN RECOGNITION | 2024 / Vol. 150

Keywords:

Comic analysis; Multi-modal emotion recognition; Document image processing; Deep learning; Multi-task learning

DOI:

10.1016/j.patcog.2024.110261

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

The emotion and sentiment associated with comic scenes can provide useful information for inferring the context of comic stories, which is an essential prerequisite for developing automatic content understanding tools for comics. Here, we address this open area of comic research by exploiting the multi-modal nature of comics. Multi-modal sentiment analysis methods generally assume that both image and text modalities are always present at the test phase. However, this assumption is not always satisfied for comics, since comic characters' facial expressions, gestures, etc., are not always clearly visible. Also, the underlying context of the dialogues between comic characters is often challenging to comprehend. To deal with these constraints of comic emotion analysis, we propose a multi-task-based framework, namely EmoComicNet, to fuse multi-modal information (i.e., both image and text) if it is available. However, the proposed EmoComicNet is designed to perform even when any modality is weak or completely missing. The proposed method potentially improves the overall performance. Besides, EmoComicNet can also deal with the problem of weak or absent modality during the training phase.


Pages: 11
