A Preliminary Study of Emotion Recognition Employing Adaptive Gaussian Mixture Models with the Maximum A Posteriori Principle

Cited by: 0
Authors
Yang, Jing-Hsiang [1]
Hung, Jeih-weih [1]
Affiliations
[1] Natl Chi Nan Univ, Dept Elect Engn, Puli, Taiwan
Source
2014 INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE, ELECTRONICS AND ELECTRICAL ENGINEERING (ISEEE), VOLS 1-3 | 2014
Keywords
emotion recognition; MFCC; PLPCC; GMM; MAP adaptation;
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
In this paper, we present a novel processing structure to improve the performance of automatic speech emotion recognition. In this structure, a Gaussian mixture model (GMM) is first trained for each emotion type using speech features from the training set, which consists of utterances produced by several speakers. Next, the emotion GMMs are adapted with a portion of the speaker-specific data in the training set using the maximum a posteriori (MAP) criterion, so the resulting GMMs are expected to suit test utterances from that specific speaker better than the original speaker-independent GMMs. Experimental results show that MAP adaptation of the GMMs significantly improves emotion recognition accuracy, regardless of whether the speech features are mel-frequency cepstral coefficients (MFCC) or perceptual linear predictive cepstral coefficients (PLPCC).
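The adaptation step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which GMM parameters are adapted or how the prior is weighted, so the code below assumes the common mean-only MAP update with a relevance factor (here `relevance=16.0`, an assumed hyperparameter) and diagonal covariances. The function names `map_adapt_means` and `gmm_loglik` are hypothetical and are not taken from the paper.

```python
import math


def _component_logs(x, weights, means, variances):
    """Log of (weight * diagonal Gaussian density) for each component at frame x."""
    logs = []
    for w, mean, var in zip(weights, means, variances):
        log_pdf = -0.5 * sum(
            math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
            for xi, m, v in zip(x, mean, var)
        )
        logs.append(math.log(w) + log_pdf)
    return logs


def gmm_loglik(frames, weights, means, variances):
    """Total log-likelihood of the frames under the GMM (log-sum-exp per frame)."""
    total = 0.0
    for x in frames:
        logs = _component_logs(x, weights, means, variances)
        top = max(logs)
        total += top + math.log(sum(math.exp(l - top) for l in logs))
    return total


def map_adapt_means(frames, weights, means, variances, relevance=16.0):
    """Mean-only MAP adaptation (assumed variant; weights and variances stay fixed).

    Each adapted mean is (sum of responsibility-weighted frames
    + relevance * prior mean) / (soft count + relevance).
    """
    k, dim = len(weights), len(means[0])
    soft_count = [0.0] * k
    weighted_sum = [[0.0] * dim for _ in range(k)]
    for x in frames:
        logs = _component_logs(x, weights, means, variances)
        top = max(logs)
        exps = [math.exp(l - top) for l in logs]
        norm = sum(exps)
        for c in range(k):
            post = exps[c] / norm  # responsibility of component c for frame x
            soft_count[c] += post
            for d in range(dim):
                weighted_sum[c][d] += post * x[d]
    return [
        [
            (weighted_sum[c][d] + relevance * means[c][d])
            / (soft_count[c] + relevance)
            for d in range(dim)
        ]
        for c in range(k)
    ]
```

Under these assumptions, recognition would score a test utterance with `gmm_loglik` against each speaker-adapted emotion GMM and pick the emotion with the highest value. A component that sees little speaker-specific data keeps a mean close to the speaker-independent one, which is the usual motivation for the relevance factor.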
Pages: 1575 / +
Page count: 2
Related papers
50 in total
  • [21] Skew Gaussian mixture models for speaker recognition
    Matza, Avi
    Bistritz, Yuval
    12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 12 - 15
  • [22] Population pharmacokinetic/pharmacodynamic mixture models via maximum a posteriori estimation
    Wang, Xiaoning
    Schumitzky, Alan
    D'Argenio, David Z.
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2009, 53 (12) : 3907 - 3915
  • [23] Gaussian mixture language models for speech recognition
    Afify, Mohamed
    Siohan, Olivier
    Sarikaya, Ruhi
    2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007, : 29 - +
  • [24] Skew Gaussian mixture models for speaker recognition
    Matza, Avi
    Bistritz, Yuval
    IET SIGNAL PROCESSING, 2014, 8 (08) : 860 - 867
  • [25] The Research of Speech Emotion Recognition Based on Gaussian Mixture Model
    Zhang, Wanli
    Li, Guoxin
    Gao, Wei
    MECHANICAL COMPONENTS AND CONTROL ENGINEERING III, 2014, 668-669 : 1126 - +
  • [26] Complementary Gaussian Mixture Models for Multimodal Speech Recognition
    Sad, Gonzalo D.
    Terissi, Lucas D.
    Gomez, Juan C.
    MULTIMODAL PATTERN RECOGNITION OF SOCIAL SIGNALS IN HUMAN-COMPUTER-INTERACTION, 2015, 8869 : 54 - 65
  • [27] Bayesian face recognition based on Gaussian mixture models
    Wang, XG
    Tang, XO
    PROCEEDINGS OF THE 17TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL 4, 2004, : 142 - 145
  • [28] License plate recognition based on Gaussian mixture models
    College of Electronics and Information Engineering, Sichuan University, Chengdu 610064, China
    Guangdianzi Jiguang, 2007, 4 (487-490):
  • [29] Gaussian mixture models of phonetic boundaries for speech recognition
    Omar, MK
    Hasegawa-Johnson, M
    Levinson, S
    ASRU 2001: IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, CONFERENCE PROCEEDINGS, 2001, : 33 - 36
  • [30] EMPLOYMENT OF SUBSPACE GAUSSIAN MIXTURE MODELS IN SPEAKER RECOGNITION
    Motlicek, Petr
    Dey, Subhadeep
    Madikeri, Srikanth
    Burget, Lukas
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 4445 - 4449