EMOTION RECOGNITION FROM SPEECH VIA BOOSTED GAUSSIAN MIXTURE MODELS

Cited by: 0

Authors:

Tang, Hao [1]

Chu, Stephen M. [2]

Hasegawa-Johnson, Mark [1]

Huang, Thomas S. [1]

Affiliations:

[1] Univ Illinois, Dept Elect & Comp Engn, 1406 W Green St, Urbana, IL 61801 USA

[2] IBM TJ Watson Res Ctr, Yorktown Hts, NY 10598 USA

Source:

ICME: 2009 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOLS 1-3 | 2009

Keywords:

Emotion recognition; Gaussian mixture model; Bayesian optimal classifier; EM algorithm; boosting

DOI:

Not available

Chinese Library Classification (CLC) number:

TP31 [Computer software]

Discipline classification code:

081202; 0835

Abstract:

Gaussian mixture models (GMMs) and the minimum error rate classifier (i.e., the Bayesian optimal classifier) are popular and effective tools for speech emotion recognition. Typically, GMMs are used to model the class-conditional distributions of acoustic features, and their parameters are estimated by the expectation maximization (EM) algorithm from a training data set. Classification is then performed to minimize the classification error with respect to the estimated class-conditional distributions. We call this method the EM-GMM algorithm. In this paper, we introduce a boosting algorithm for reliably and accurately estimating the class-conditional GMMs. The resulting algorithm is named the Boosted-GMM algorithm. Our speech emotion recognition experiments show that the emotion recognition rates are effectively and significantly "boosted" by the Boosted-GMM algorithm as compared to the EM-GMM algorithm. This is because the boosting algorithm can lead to more accurate estimates of the class-conditional GMMs, namely the class-conditional distributions of acoustic features.
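The EM-GMM baseline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation and it does not cover the boosting step, which the abstract does not specify. It assumes a single scalar feature per sample, synthetic data, two hypothetical class labels ("neutral" and "angry"), and hypothetical helper names (`fit_gmm_em`, `classify`). One GMM is fitted per class with plain EM, and a sample is assigned to the class that maximizes prior times class-conditional density.

```python
import math
import random

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def gmm_pdf(x, gmm):
    """Density of a 1-D GMM given as a list of (weight, mean, variance)."""
    return sum(w * gaussian_pdf(x, m, v) for w, m, v in gmm)

def fit_gmm_em(data, n_components=2, n_iter=50, var_floor=1e-3):
    """Fit a 1-D GMM with plain EM (hypothetical helper for illustration)."""
    n = len(data)
    data_sorted = sorted(data)
    mean_all = sum(data) / n
    var_all = max(sum((x - mean_all) ** 2 for x in data) / n, var_floor)
    # Deterministic start: means at evenly spaced quantiles, shared variance.
    gmm = [(1.0 / n_components,
            data_sorted[int((k + 0.5) * n / n_components)],
            var_all) for k in range(n_components)]
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each sample.
        resp = []
        for x in data:
            p = [w * gaussian_pdf(x, m, v) for w, m, v in gmm]
            total = sum(p) or 1e-300
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weights, means, and (floored) variances.
        updated = []
        for k in range(n_components):
            nk = sum(r[k] for r in resp) or 1e-300
            mean = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - mean) ** 2 for r, x in zip(resp, data)) / nk
            updated.append((nk / n, mean, max(var, var_floor)))
        gmm = updated
    return gmm

def classify(x, models, priors):
    """Bayes decision rule: argmax over classes of prior * p(x | class)."""
    return max(models, key=lambda c: priors[c] * gmm_pdf(x, models[c]))

# Synthetic training data; the labels and values are illustrative assumptions.
rng = random.Random(0)
train = {
    "neutral": [rng.gauss(-2.0, 0.5) for _ in range(100)]
               + [rng.gauss(0.0, 0.5) for _ in range(100)],
    "angry": [rng.gauss(3.0, 0.5) for _ in range(100)]
             + [rng.gauss(5.0, 0.5) for _ in range(100)],
}
n_total = sum(len(xs) for xs in train.values())
models = {c: fit_gmm_em(xs) for c, xs in train.items()}
priors = {c: len(xs) / n_total for c, xs in train.items()}

print(classify(-2.0, models, priors), classify(5.0, models, priors))
```

Real acoustic features are multivariate, so a practical system would use multivariate Gaussians (often with diagonal covariances); the scalar case is used here only to keep the sketch short.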

Citation

Pages: 294 / +

Page count: 2

