Deep learning-based multimodal emotion recognition from audio, visual, and text modalities: A systematic review of recent advancements and future prospects

Cited by: 46
Authors
Zhang, Shiqing [1]
Yang, Yijiao [1]
Chen, Chen [1]
Zhang, Xingnan [1]
Leng, Qingming [2]
Zhao, Xiaoming [1]
Affiliations
[1] Taizhou Univ, Inst Intelligent Informat Proc, Taizhou 318000, Zhejiang, Peoples R China
[2] Jiujiang Univ, Sch Elect & Informat Engn, Jiujiang 332005, Peoples R China
Funding
National Science Foundation (USA); National Natural Science Foundation of China;
Keywords
Multimodal emotion recognition; Deep learning; Feature extraction; Multimodal information fusion; review; FACIAL EXPRESSION RECOGNITION; INFORMATION FUSION; AFFECTIVE FEATURES; SENTIMENT ANALYSIS; NEURAL-NETWORKS; SPEECH; DATABASES; MODEL; DIMENSIONALITY; SIGNALS;
DOI
10.1016/j.eswa.2023.121692
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Emotion recognition has recently attracted extensive interest due to its significant applications to human-computer interaction. The expression of human emotion depends on various verbal and non-verbal channels such as audio, visual, and text. Emotion recognition is thus better suited to multimodal than to single-modal learning. Owing to their powerful feature learning capability, deep learning methods have recently been widely used to capture high-level emotional feature representations for multimodal emotion recognition (MER). This paper therefore makes the first effort to comprehensively summarize recent advances in deep learning-based multimodal emotion recognition (DL-MER) involving audio, visual, and text modalities. We focus on four aspects: (1) MER milestones are given to summarize the development trend of MER, and conventional multimodal emotional datasets are provided; (2) the core principles of typical deep learning models and their recent advancements are overviewed; (3) a systematic survey and taxonomy is provided to cover the state-of-the-art methods related to two key steps in a MER system, namely feature extraction and multimodal information fusion; (4) the research challenges and open issues in this field are discussed, and promising future directions are given.
Pages: 23
Related papers
50 in total
  • [41] A multimodal fusion-based deep learning framework combined with local-global contextual TCNs for continuous emotion recognition from videos
    Congbao Shi
    Yuanyuan Zhang
    Baolin Liu
    Applied Intelligence, 2024, 54 : 3040 - 3057
  • [42] Framework for Deep Learning-Based Language Models Using Multi-Task Learning in Natural Language Understanding: A Systematic Literature Review and Future Directions
    Samant, Rahul Manohar
    Bachute, Mrinal R.
    Gite, Shilpa
    Kotecha, Ketan
    IEEE ACCESS, 2022, 10 : 17078 - 17097
  • [43] A systematic review of deep learning-based cervical cytology screening: from cell identification to whole slide image analysis
    Jiang, Peng
    Li, Xuekong
    Shen, Hui
    Chen, Yuqi
    Wang, Lang
    Chen, Hua
    Feng, Jing
    Liu, Juan
    ARTIFICIAL INTELLIGENCE REVIEW, 2023,
  • [44] A systematic review of deep learning-based cervical cytology screening: from cell identification to whole slide image analysis
    Jiang, Peng
    Li, Xuekong
    Shen, Hui
    Chen, Yuqi
    Wang, Lang
    Chen, Hua
    Feng, Jing
    Liu, Juan
    ARTIFICIAL INTELLIGENCE REVIEW, 2023, 56 (03) : S2687 - S2758
  • [45] A systematic review of deep learning-based cervical cytology screening: from cell identification to whole slide image analysis
    Peng Jiang
    Xuekong Li
    Hui Shen
    Yuqi Chen
    Lang Wang
    Hua Chen
    Jing Feng
    Juan Liu
    Artificial Intelligence Review, 2023, 56 : 2687 - 2758
  • [46] A systematic review of deep learning-based denoising for low-dose computed tomography from a perceptual quality perspective
    Kim, Wonjin
    Jeon, Sun-Young
    Byun, Gyuri
    Yoo, Hongki
    Choi, Jang-Hwan
    BIOMEDICAL ENGINEERING LETTERS, 2024, 14 (06) : 1153 - 1173
  • [47] A Review on Visual-SLAM: Advancements from Geometric Modelling to Learning-Based Semantic Scene Understanding Using Multi-Modal Sensor Fusion
    Lai, Tin
    SENSORS, 2022, 22 (19)
  • [48] RETRACTED: Machine Learning-Based Automated Diagnostic Systems Developed for Heart Failure Prediction Using Different Types of Data Modalities: A Systematic Review and Future Directions (Retracted Article)
    Javeed, Ashir
    Khan, Shafqat Ullah
    Ali, Liaqat
    Ali, Sardar
    Imrana, Yakubu
    Rahman, Atiqur
    COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE, 2022, 2022
  • [49] Forecasting vegetation indices from spatio-temporal remotely sensed data using deep learning-based approaches: A systematic literature review
    Ferchichi, Aya
    Ben Abbes, Ali
    Barra, Vincent
    Farah, Imed Riadh
    ECOLOGICAL INFORMATICS, 2022, 68
  • [50] Deep learning-based techniques for estimating high-quality full-dose positron emission tomography images from low-dose scans: a systematic review
    Seyyedi, Negisa
    Ghafari, Ali
    Seyyedi, Navisa
    Sheikhzadeh, Peyman
    BMC MEDICAL IMAGING, 2024, 24 (01):