EMOTION RECOGNITION FROM SPEECH VIA BOOSTED GAUSSIAN MIXTURE MODELS

Cited by: 0
Authors
Tang, Hao [1]
Chu, Stephen M. [2]
Hasegawa-Johnson, Mark [1]
Huang, Thomas S. [1]
Affiliations
[1] Univ Illinois, Dept Elect & Comp Engn, 1406 W Green St, Urbana, IL 61801 USA
[2] IBM TJ Watson Res Ctr, Yorktown Hts, NY 10598 USA
Keywords
Emotion recognition; Gaussian mixture model; Bayesian optimal classifier; EM algorithm; boosting
DOI
Not available
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Gaussian mixture models (GMMs) and the minimum error rate classifier (i.e., the Bayesian optimal classifier) are popular and effective tools for speech emotion recognition. Typically, GMMs are used to model the class-conditional distributions of acoustic features, and their parameters are estimated by the expectation maximization (EM) algorithm from a training data set. Classification is then performed to minimize the classification error w.r.t. the estimated class-conditional distributions. We call this method the EM-GMM algorithm. In this paper, we introduce a boosting algorithm for reliably and accurately estimating the class-conditional GMMs. The resulting algorithm is named the Boosted-GMM algorithm. Our speech emotion recognition experiments show that the emotion recognition rates are effectively and significantly "boosted" by the Boosted-GMM algorithm as compared to the EM-GMM algorithm. This is because the boosting algorithm can lead to more accurate estimates of the class-conditional GMMs, namely the class-conditional distributions of acoustic features.
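The EM-GMM baseline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it does not attempt the Boosted-GMM algorithm, whose details the abstract does not give. Everything here is an assumption made for illustration: the features are one-dimensional, the data are synthetic, the class labels "calm" and "angry" are invented, and the function names (`fit_gmm_em`, `train_em_gmm`, `classify`) are hypothetical. One GMM is fitted per class by EM, and a point is assigned to the class with the largest prior times class-conditional likelihood.

```python
import math
import random


def gauss_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_em(data, k=2, iters=50, var_floor=1e-3):
    """Fit a 1-D GMM with k components by EM; returns (weights, means, variances)."""
    n = len(data)
    srt = sorted(data)
    # Deterministic initialisation (an assumption): means at spaced quantiles.
    means = [srt[int((j + 0.5) * n / k)] for j in range(k)]
    mu = sum(data) / n
    var0 = max(sum((x - mu) ** 2 for x in data) / n, var_floor)
    variances = [var0] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each point.
        resp = []
        for x in data:
            p = [w * gauss_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            s = sum(p) or 1e-300  # guard against underflow to zero
            resp.append([pj / s for pj in p])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, var_floor)
    return weights, means, variances


def gmm_likelihood(x, gmm):
    """Class-conditional likelihood p(x | class) under a fitted GMM."""
    weights, means, variances = gmm
    return sum(w * gauss_pdf(x, m, v) for w, m, v in zip(weights, means, variances))


def train_em_gmm(labelled, k=2):
    """Fit one GMM per class; returns {label: (prior, gmm)}."""
    by_class = {}
    for x, y in labelled:
        by_class.setdefault(y, []).append(x)
    n = len(labelled)
    return {y: (len(xs) / n, fit_gmm_em(xs, k)) for y, xs in by_class.items()}


def classify(x, model):
    """Minimum error rate (Bayes) decision: argmax of prior * likelihood."""
    return max(model, key=lambda y: model[y][0] * gmm_likelihood(x, model[y][1]))


# Synthetic toy data (assumed): each class is itself a two-component mixture.
rng = random.Random(0)
calm = [rng.gauss(rng.choice([-2.0, 0.0]), 0.5) for _ in range(200)]
angry = [rng.gauss(rng.choice([3.0, 5.0]), 0.5) for _ in range(200)]
labelled = [(x, "calm") for x in calm] + [(x, "angry") for x in angry]

model = train_em_gmm(labelled, k=2)
print(classify(-1.0, model), classify(4.0, model))
```

On this well-separated toy data, a point near -1 falls in the "calm" class and a point near 4 in the "angry" class. Real acoustic features are multi-dimensional, so a practical version would need multivariate Gaussians and a more careful initialisation.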
Pages: 294 / +
Page count: 2
Related papers
50 in total
  • [31] UNSUPERVISED TRAINING OF SUBSPACE GAUSSIAN MIXTURE MODELS FOR CONVERSATIONAL TELEPHONE SPEECH RECOGNITION
    Ma, Zejun
    Wang, Xiaorui
    Xu, Bo
    2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 4829 - 4832
  • [33] Age Approximation from Speech using Gaussian Mixture Models
    Mittal, Tanushri
    Barthwal, Anurag
    Koolagudi, Shashidhar G.
    2013 SECOND INTERNATIONAL CONFERENCE ON ADVANCED COMPUTING, NETWORKING AND SECURITY (ADCONS 2013), 2013, : 74 - 78
  • [34] Audio-Visual Emotion Recognition using Gaussian Mixture Models for Face and Voice
    Metallinou, Angeliki
    Lee, Sungbok
    Narayanan, Shrikanth
    ISM: 2008 IEEE INTERNATIONAL SYMPOSIUM ON MULTIMEDIA, 2008, : 250 - 257
  • [35] Loss-Scaled Large-Margin Gaussian Mixture Models for Speech Emotion Classification
    Yun, Sungrack
    Yoo, Chang D.
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2012, 20 (02): : 585 - 598
  • [36] DEALING WITH ACOUSTIC MISMATCH FOR TRAINING MULTILINGUAL SUBSPACE GAUSSIAN MIXTURE MODELS FOR SPEECH RECOGNITION
    Mohan, Aanchan
    Ghalehjegh, Sina Hamidi
    Rose, Richard C.
    2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 4893 - 4896
  • [37] Improved Emotion Recognition Using Gaussian Mixture Model and Extreme Learning Machine in Speech and Glottal Signals
    Muthusamy, Hariharan
    Polat, Kemal
    Yaacob, Sazali
    MATHEMATICAL PROBLEMS IN ENGINEERING, 2015, 2015
  • [38] Gaussian Process Dynamical Models for Emotion Recognition
    Garcia, Hernan F.
    Alvarez, Mauricio A.
    Orozco, Alvaro
    ADVANCES IN VISUAL COMPUTING (ISVC 2014), PT II, 2014, 8888 : 799 - 808
  • [39] A Preliminary Study of Emotion Recognition Employing Adaptive Gaussian Mixture Models with the Maximum A Posteriori Principle
    Yang, Jing-Hsiang
    Hung, Jeih-weih
    2014 INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE, ELECTRONICS AND ELECTRICAL ENGINEERING (ISEEE), VOLS 1-3, 2014, : 1575 - +
  • [40] Speaker recognition using Gaussian mixture models
    Kamarauskas, J.
    ELEKTRONIKA IR ELEKTROTECHNIKA, 2008, (05) : 29 - 32