A Combined Rule-Based & Machine Learning Audio-Visual Emotion Recognition Approach

Cited by: 52
Authors
Seng, Kah Phooi [1]
Ang, Li-Minn [1]
Ooi, Chien Shing [2]
Affiliations
[1] Charles Sturt Univ, Sch Comp & Math, Bathurst, NSW 2678, Australia
[2] Sunway Univ, Dept Comp Sci & Networked Syst, Subang Jaya 47500, Malaysia
Keywords
Emotion recognition; audio-visual processing; rule-based; machine learning; multimodal system; linear discriminant analysis; efficient approach; face; framework; fusion; audio; LDA
DOI
10.1109/TAFFC.2016.2588488
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper proposes an audio-visual emotion recognition system that combines rule-based and machine learning techniques to improve recognition efficacy in the audio and video paths. The visual path uses Bi-directional Principal Component Analysis (BDPCA) and Least-Square Linear Discriminant Analysis (LSLDA) for dimensionality reduction and discrimination. The extracted visual features are passed to a newly designed Optimized Kernel-Laplacian Radial Basis Function (OKL-RBF) neural classifier. The audio path combines prosodic features (pitch, log-energy, zero-crossing rate, and Teager energy operator) with spectral features (Mel-scale frequency cepstral coefficients). The extracted audio features are passed to an audio feature-level fusion module that uses a set of rules to determine the most likely emotion contained in the audio signal. An audio-visual fusion module then fuses the outputs of both paths. The proposed audio path, visual path, and final system are evaluated on standard databases. Experimental results and comparisons indicate that the proposed system performs well.
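As an illustration of three of the prosodic features named in the abstract (log-energy, zero-crossing rate, and the Teager energy operator), the following is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the per-frame formulation, the small logarithm floor, and the test signal parameters are all assumptions made for this example, and pitch and MFCC extraction are omitted. The Teager operator uses its standard discrete form, psi[n] = x[n]^2 - x[n-1] * x[n+1].

```python
import math

def log_energy(frame):
    """Log of the summed squared amplitude; the 1e-12 floor (an assumption) avoids log(0)."""
    energy = sum(x * x for x in frame)
    return math.log(energy + 1e-12)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (zero counts as non-negative)."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def teager_energy(frame):
    """Discrete Teager energy operator for the interior samples of the frame."""
    return [frame[n] ** 2 - frame[n - 1] * frame[n + 1]
            for n in range(1, len(frame) - 1)]

def prosodic_features(frame):
    """Collect the three features for one frame of at least three samples."""
    teo = teager_energy(frame)
    return {
        "log_energy": log_energy(frame),
        "zcr": zero_crossing_rate(frame),
        "mean_teo": sum(teo) / len(teo),
    }

# Hypothetical test frame: a 200 Hz sine sampled at 8 kHz with amplitude 0.5.
AMPLITUDE = 0.5
OMEGA = 2 * math.pi * 200 / 8000
frame = [AMPLITUDE * math.sin(OMEGA * n) for n in range(256)]
features = prosodic_features(frame)
```

For a pure sinusoid of amplitude A and digital frequency omega, the Teager operator evaluates to A^2 * sin^2(omega) at every interior sample, which makes the sine frame above a convenient sanity check before applying the functions to real speech frames.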
Pages: 3-13
Page count: 11