A Combined Rule-Based & Machine Learning Audio-Visual Emotion Recognition Approach

Cited by: 52
Authors
Seng, Kah Phooi [1]
Ang, Li-Minn [1]
Ooi, Chien Shing [2]
Affiliations
[1] Charles Sturt Univ, Sch Comp & Math, Bathurst, NSW 2678, Australia
[2] Sunway Univ, Dept Comp Sci & Networked Syst, Subang Jaya 47500, Malaysia
Keywords
Emotion recognition; audio-visual processing; rule-based; machine learning; multimodal system; LINEAR DISCRIMINANT-ANALYSIS; EFFICIENT APPROACH; FACE; FRAMEWORK; FUSION; AUDIO; LDA;
DOI
10.1109/TAFFC.2016.2588488
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper proposes an audio-visual emotion recognition system that combines rule-based and machine learning techniques to improve recognition efficacy in the audio and video paths. The visual path uses Bi-directional Principal Component Analysis (BDPCA) and Least-Square Linear Discriminant Analysis (LSLDA) for dimensionality reduction and discrimination. The extracted visual features are passed into a newly designed Optimized Kernel-Laplacian Radial Basis Function (OKL-RBF) neural classifier. The audio path uses a combination of prosodic features (pitch, log-energy, zero-crossing rate, and Teager energy operator) and spectral features (Mel-scale frequency cepstral coefficients). The extracted audio features are passed into an audio feature-level fusion module that uses a set of rules to determine the most likely emotion contained in the audio signal. An audio-visual fusion module fuses the outputs from both paths. The performance of the proposed audio path, visual path, and final system is evaluated on standard databases. Experimental results and comparisons show that the proposed system performs well.
Pages: 3-13
Number of pages: 11
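
Three of the prosodic features named in the abstract (log-energy, zero-crossing rate, and the Teager energy operator) have standard textbook definitions, and the sketch below computes them per frame with NumPy. It is an illustration only, not the authors' implementation: the function names, the frame length of 400 samples, the hop of 160 samples (25 ms and 10 ms at an assumed 16 kHz sampling rate), and the use of the mean absolute Teager energy per frame are all assumptions. Pitch and MFCC extraction are omitted.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    x = np.asarray(x, dtype=float)
    if len(x) < frame_len:
        raise ValueError("signal shorter than one frame")
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def log_energy(frames, eps=1e-12):
    """Natural log of the sum of squared samples in each frame."""
    return np.log(np.sum(frames ** 2, axis=1) + eps)

def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs whose signs differ, per frame."""
    signs = np.signbit(frames)
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

def teager_energy(x):
    """Discrete Teager energy operator: psi[n] = x[n]^2 - x[n-1] * x[n+1]."""
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]

def prosodic_features(x, frame_len=400, hop=160):
    """Stack [log-energy, ZCR, mean |TEO|] per frame (assumed layout)."""
    frames = frame_signal(x, frame_len, hop)
    teo = np.array([np.mean(np.abs(teager_energy(f))) for f in frames])
    return np.column_stack([log_energy(frames), zero_crossing_rate(frames), teo])
```

For a pure sinusoid A*sin(w*n), the Teager energy operator returns the constant A^2 * sin(w)^2, which makes a convenient sanity check for the implementation.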
Related papers
50 in total
  • [1] Audio-visual based emotion recognition - A new approach
    Song, ML
    Bu, JJ
    Chen, C
    Li, N
    PROCEEDINGS OF THE 2004 IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOL 2, 2004, : 1020 - 1025
  • [2] Audio-Visual Learning for Multimodal Emotion Recognition
    Fan, Siyu
    Jing, Jianan
    Wang, Chongwen
    SYMMETRY-BASEL, 2025, 17 (03):
  • [3] Deep Learning Based Audio-Visual Emotion Recognition in a Smart Learning Environment
    Ivleva, Natalja
    Pentel, Avar
    Dunajeva, Olga
    Justsenko, Valeria
    TOWARDS A HYBRID, FLEXIBLE AND SOCIALLY ENGAGED HIGHER EDUCATION, VOL 1, ICL 2023, 2024, 899 : 420 - 431
  • [4] Metric Learning-Based Multimodal Audio-Visual Emotion Recognition
    Ghaleb, Esam
    Popa, Mirela
    Asteriadis, Stylianos
    IEEE MULTIMEDIA, 2020, 27 (01) : 37 - 48
  • [5] Audio-visual spontaneous emotion recognition
    Zeng, Zhihong
    Hu, Yuxiao
    Roisman, Glenn I.
    Wen, Zhen
    Fu, Yun
    Huang, Thomas S.
    ARTIFICIAL INTELLIGENCE FOR HUMAN COMPUTING, 2007, 4451 : 72 - +
  • [6] Deep emotion recognition based on audio-visual correlation
    Hajarolasvadi, Noushin
    Demirel, Hasan
    IET COMPUTER VISION, 2020, 14 (07) : 517 - 527
  • [7] An Active Learning Paradigm for Online Audio-Visual Emotion Recognition
    Kansizoglou, Ioannis
    Bampis, Loukas
    Gasteratos, Antonios
    IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2022, 13 (02) : 756 - 768
  • [8] Emotion Recognition From Audio-Visual Data Using Rule Based Decision Level Fusion
    Sahoo, Subhasmita
    Routray, Aurobinda
    PROCEEDINGS OF THE 2016 IEEE STUDENTS' TECHNOLOGY SYMPOSIUM (TECHSYM), 2016, : 7 - 12
  • [9] Audio-Visual Attention Networks for Emotion Recognition
    Lee, Jiyoung
    Kim, Sunok
    Kim, Seungryong
    Sohn, Kwanghoon
    AVSU'18: PROCEEDINGS OF THE 2018 WORKSHOP ON AUDIO-VISUAL SCENE UNDERSTANDING FOR IMMERSIVE MULTIMEDIA, 2018, : 27 - 32
  • [10] Deep operational audio-visual emotion recognition
    Akturk, Kaan
    Keceli, Ali Seydi
    NEUROCOMPUTING, 2024, 588