EmoComicNet: A multi-task model for comic emotion recognition

Cited by: 4

Authors:

Dutta, Arpita [1, 2]

Biswas, Samit [1]

Das, Amit Kumar [1]

Affiliations:

[1] Indian Inst Engn Science&Technol, Dept Comp Science&Technol, Howrah 711103, West Bengal, India

[2] Techno Main, Artificial Intelligence & Machine Learning, Dept Comp Sci & Engn, Kolkata 700091, West Bengal, India

Source:

PATTERN RECOGNITION | 2024 / Vol. 150

Keywords:

Comic analysis; Multi-modal emotion recognition; Document image processing; Deep learning; Multi-task learning;

DOI:

10.1016/j.patcog.2024.110261

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The emotion and sentiment associated with comic scenes can provide potential information for inferring the context of comic stories, which is an essential prerequisite for developing automatic content understanding tools for comics. Here, we address this open area of comic research by exploiting the multi-modal nature of comics. Multi-modal sentiment analysis methods generally assume that both image and text modalities are always present at the test phase. However, this assumption is not always satisfied for comics, since comic characters' facial expressions, gestures, etc., are not always clearly visible. Also, the underlying context of the dialogues between comic characters is often challenging to comprehend. To deal with these constraints of comic emotion analysis, we propose a multi-task-based framework, namely EmoComicNet, to fuse multi-modal information (i.e., both image and text) if it is available. However, the proposed EmoComicNet is designed to perform even when any modality is weak or completely missing. The proposed method potentially improves the overall performance. Besides, EmoComicNet can also deal with the problem of weak or absent modality during the training phase.


Pages: 11
