Deep learning-based multimodal emotion recognition from audio, visual, and text modalities: A systematic review of recent advancements and future prospects

Cited by: 46
Authors
Zhang, Shiqing [1]
Yang, Yijiao [1]
Chen, Chen [1]
Zhang, Xingnan [1]
Leng, Qingming [2]
Zhao, Xiaoming [1]
Affiliations
[1] Taizhou Univ, Inst Intelligent Informat Proc, Taizhou 318000, Zhejiang, Peoples R China
[2] Jiujiang Univ, Sch Elect & Informat Engn, Jiujiang 332005, Peoples R China
Funding
National Science Foundation (United States); National Natural Science Foundation of China;
Keywords
Multimodal emotion recognition; Deep learning; Feature extraction; Multimodal information fusion; review; FACIAL EXPRESSION RECOGNITION; INFORMATION FUSION; AFFECTIVE FEATURES; SENTIMENT ANALYSIS; NEURAL-NETWORKS; SPEECH; DATABASES; MODEL; DIMENSIONALITY; SIGNALS;
DOI
10.1016/j.eswa.2023.121692
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Emotion recognition has recently attracted extensive interest because of its significant applications in human-computer interaction. Human emotion is expressed through various verbal and non-verbal channels, such as audio, visual, and text. Emotion recognition is therefore better treated as a multimodal rather than a single-modal learning problem. Owing to their powerful feature learning capability, deep learning methods have recently been widely used to capture high-level emotional feature representations for multimodal emotion recognition (MER). This paper therefore makes the first effort to comprehensively summarize recent advances in deep learning-based multimodal emotion recognition (DL-MER) involving audio, visual, and text modalities. The review covers four aspects: (1) MER milestones are given to summarize the development trend of MER, and conventional multimodal emotional datasets are provided; (2) the core principles of typical deep learning models and their recent advancements are overviewed; (3) a systematic survey and taxonomy are provided to cover the state-of-the-art methods related to two key steps in a MER system, namely feature extraction and multimodal information fusion; (4) the research challenges and open issues in this field are discussed, and promising future directions are given.
Pages: 23
Related papers
50 in total
  • [21] A systematic review of trimodal affective computing approaches: Text, audio, and visual integration in emotion recognition and sentiment analysis
    Al-Saadawi, Hussein Farooq Tayeb
    Das, Bihter
    Das, Resul
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 255
  • [22] Deep Learning-Based Emotion Recognition from Real-Time Videos
    Zhou, Wenbin
    Cheng, Justin
    Lei, Xingyu
    Benes, Bedrich
    Adamo, Nicoletta
    HUMAN-COMPUTER INTERACTION. MULTIMODAL AND NATURAL INTERACTION, HCI 2020, PT II, 2020, 12182 : 321 - 332
  • [23] Advancements in Food Recognition: A Comprehensive Review of Deep Learning-Based Automated Food Item Identification
    Krutik, Rathod
    Thacker, Chintan
    Adhvaryu, Rachit
    PROGRAM OF THE 2ND INTERNATIONAL CONFERENCE ON ELECTRICAL ENGINEERING AND AUTOMATIC CONTROL, ICEEAC 2024, 2024,
  • [24] Emotion recognition using deep learning approach from audio-visual emotional big data
    Hossain, M. Shamim
    Muhammad, Ghulam
    INFORMATION FUSION, 2019, 49 : 69 - 78
  • [25] Deep Learning-Based Dermatological Condition Detection: A Systematic Review With Recent Methods, Datasets, Challenges, and Future Directions
    Noronha, Stephanie S.
    Mehta, Mayuri A.
    Garg, Dweepna
    Kotecha, Ketan
    Abraham, Ajith
    IEEE ACCESS, 2023, 11 : 140348 - 140381
  • [26] Deep learning-based depression recognition through facial expression: A systematic review
    Cao, Xiaoming
    Zhai, Lingling
    Zhai, Pengpeng
    Li, Fangfei
    He, Tao
    He, Lang
    NEUROCOMPUTING, 2025, 627
  • [27] Deep Learning-Based Automated Emotion Recognition Using Multimodal Physiological Signals and Time-Frequency Methods
    Sriram Kumar, P.
    Govarthan, Praveen Kumar
    Gadda, Abdul Aleem Shaik
    Ganapathy, Nagarajan
    Ronickom, Jac Fredo Agastinose
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2024, 73 : 1 - 1
  • [28] Deep learning-based personality recognition from text posts of online social networks
    Di Xue
    Lifa Wu
    Zheng Hong
    Shize Guo
    Liang Gao
    Zhiyong Wu
    Xiaofeng Zhong
    Jianshan Sun
    Applied Intelligence, 2018, 48 : 4232 - 4246
  • [29] Deep learning-based personality recognition from text posts of online social networks
    Xue, Di
    Wu, Lifa
    Hong, Zheng
    Guo, Shize
    Gao, Liang
    Wu, Zhiyong
    Zhong, Xiaofeng
    Sun, Jianshan
    APPLIED INTELLIGENCE, 2018, 48 (11) : 4232 - 4246
  • [30] A Systematic Review on Recent Advancements in Deep and Machine Learning Based Detection and Classification of Acute Lymphoblastic Leukemia
    Das, Pradeep Kumar
    Diya, V. A.
    Meher, Sukadev
    Panda, Rutuparna
    Abraham, Ajith
    IEEE ACCESS, 2022, 10 : 81741 - 81763