Emotion recognition from speech using sub-syllabic and pitch synchronous spectral features

Cited by: 0
Authors
Koolagudi, Shashidhar [1]
Krothapalli, Sreenivasa [1]
Affiliations
[1] Indian Inst Technol Kharagpur, Sch Informat Technol, Kharagpur 721302, W Bengal, India
Keywords
Emotion recognition; Consonant region; CV transition region; Pitch synchronous analysis; Spectral features; Vowel onset point; Vowel region;
DOI
10.1007/s10772-012-9150-8
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code
0808 ; 0809 ;
Abstract
In this work, spectral features extracted from sub-syllabic regions and through pitch synchronous analysis are proposed for speech emotion recognition. Linear prediction cepstral coefficients, mel frequency cepstral coefficients and features extracted from high amplitude regions of the spectrum are used to represent emotion specific spectral information. These features are extracted from the consonant, vowel and transition regions of each syllable to study the contribution of these regions to the recognition of emotions. The consonant, vowel and transition regions are determined using vowel onset points. Spectral features extracted from each pitch cycle are also used to recognize the emotions present in speech. The emotions used in this study are anger, fear, happy, neutral and sad. The emotion recognition performance obtained using sub-syllabic speech segments is compared with the results of the conventional block processing approach, where the entire speech signal is processed frame by frame. The proposed emotion specific features are evaluated using a simulated emotion speech corpus, IITKGP-SESC (Indian Institute of Technology, KharaGPur-Simulated Emotion Speech Corpus). The emotion recognition results obtained using IITKGP-SESC are compared with the results obtained on the Berlin emotion speech corpus. Emotion recognition systems are developed using Gaussian mixture models and auto-associative neural networks. The purpose of this study is to explore sub-syllabic regions to identify the emotions embedded in a speech signal and, if possible, to avoid processing the entire speech signal for emotion recognition without seriously compromising performance.
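The abstract states that the consonant, vowel and transition regions of each syllable are determined from vowel onset points (VOPs), but it does not give the region boundaries. The following is a minimal sketch of one way such a split could be expressed, not the authors' procedure. The function name `split_syllable`, the symmetric window around the VOP, and the `half_transition` width are all assumptions made for illustration.

```python
from typing import Dict, Tuple


def split_syllable(
    start: int, vop: int, end: int, half_transition: int
) -> Dict[str, Tuple[int, int]]:
    """Split the syllable [start, end) into three sample ranges around a VOP.

    Assumption (not from the abstract): the transition region is a symmetric
    window of `half_transition` samples on each side of the vowel onset point,
    clipped to the syllable boundaries.
    """
    if not (start <= vop <= end):
        raise ValueError("expected start <= vop <= end")
    if half_transition < 0:
        raise ValueError("half_transition must be non-negative")

    t_begin = max(start, vop - half_transition)
    t_end = min(end, vop + half_transition)
    return {
        "consonant": (start, t_begin),   # samples before the transition window
        "transition": (t_begin, t_end),  # window centred on the VOP
        "vowel": (t_end, end),           # samples after the transition window
    }


# Hypothetical syllable: 3200 samples long, VOP at sample 800,
# 160 samples (10 ms at an assumed 16 kHz sampling rate) on each side.
regions = split_syllable(start=0, vop=800, end=3200, half_transition=160)
for name, (lo, hi) in regions.items():
    print(f"{name}: samples {lo}..{hi}")
```

Spectral features such as the cepstral coefficients named in the abstract would then be computed per region rather than over the whole utterance, which is the contrast with block processing that the abstract draws.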
Pages: 495-511
Page count: 17
Related papers
50 in total
  • [21] RECOGNITION OF EMOTION IN SPEECH USING SPECTRAL PATTERNS
    Shahzadi, Ali
    Ahmadyfard, Alireza
    Yaghmaie, Khashayar
    Harimi, Ali
    MALAYSIAN JOURNAL OF COMPUTER SCIENCE, 2013, 26 (02) : 140 - 158
  • [22] Speech Emotion Recognition Using Spectral Entropy
    Lee, Woo-Seok
    Roh, Yong-Wan
    Kim, Dong-Ju
    Kim, Jung-Hyun
    Hong, Kwang-Seok
    INTELLIGENT ROBOTICS AND APPLICATIONS, PT II, PROCEEDINGS, 2008, 5315 : 45 - 54
  • [23] Speech emotion recognition using emotion perception spectral feature
    Jiang, Lin
    Tan, Ping
    Yang, Junfeng
    Liu, Xingbao
    Wang, Chao
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2021, 33 (11):
  • [24] Speech Emotion Recognition using Combination of Features
    Zhang, Qingli
    An, Ning
    Wang, Kunxia
    Ren, Fuji
    Li, Lian
    PROCEEDINGS OF THE 2013 FOURTH INTERNATIONAL CONFERENCE ON INTELLIGENT CONTROL AND INFORMATION PROCESSING (ICICIP), 2013, : 523 - 528
  • [25] Brhamo: metaheuristic optimization algorithm for speech emotion recognition using spectral and hybrid features
    Agrawal, Akshat
    Jain, Anurag
    EVOLUTIONARY INTELLIGENCE, 2025, 18 (01)
  • [26] Emotion recognition from speech using global and local prosodic features
    Rao K.S.
    Koolagudi S.G.
    Vempada R.R.
    International Journal of Speech Technology, 2013, 16 (2) : 143 - 160
  • [27] Emotion recognition from telephone speech using acoustic and nonlinear features
    Bedoya-Jaramillo, S.
    Orozco-Arroyave, J. R.
    Arias-Londono, J. D.
    Vargas-Bonilla, J. F.
    2013 47TH INTERNATIONAL CARNAHAN CONFERENCE ON SECURITY TECHNOLOGY (ICCST), 2013,
  • [28] Emotion recognition from speech using source, system, and prosodic features
    Koolagudi, Shashidhar G.
    Rao, K. Sreenivasa
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2012, 15 (02) : 265 - 289
  • [29] Emotion recognition from speech signals using new harmony features
    Yang, B.
    Lugger, M.
    SIGNAL PROCESSING, 2010, 90 (05) : 1415 - 1423
  • [30] SPEECH EMOTION RECOGNITION USING CYCLOSTATIONARY SPECTRAL ANALYSIS
    Jalili, Amin
    Sahami, Sadid
    Chi, Chong-Yung
    Amirfattahi, Rassoul
    2018 IEEE 28TH INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING (MLSP), 2018,