Emotion recognition from speech using sub-syllabic and pitch synchronous spectral features

Authors
Koolagudi, Shashidhar [1]
Krothapalli, Sreenivasa [1]
Affiliations
[1] Indian Inst Technol Kharagpur, Sch Informat Technol, Kharagpur 721302, W Bengal, India
Keywords
Emotion recognition; Consonant region; CV transition region; Pitch synchronous analysis; Spectral features; Vowel onset point; Vowel region
DOI
10.1007/s10772-012-9150-8
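Chinese Library Classification number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809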
Abstract
In this work, spectral features extracted from sub-syllabic regions and from pitch synchronous analysis are proposed for speech emotion recognition. Linear prediction cepstral coefficients, mel frequency cepstral coefficients, and features extracted from high amplitude regions of the spectrum are used to represent emotion-specific spectral information. These features are extracted from the consonant, vowel, and transition regions of each syllable to study the contribution of each region to emotion recognition. The consonant, vowel, and transition regions are determined using vowel onset points. Spectral features extracted from each pitch cycle are also used to recognize the emotions present in speech. The emotions used in this study are anger, fear, happy, neutral, and sad. The emotion recognition performance obtained using sub-syllabic speech segments is compared with the results of the conventional block processing approach, where the entire speech signal is processed frame by frame. The proposed emotion-specific features are evaluated using the simulated emotion speech corpus IITKGP-SESC (Indian Institute of Technology, KharaGPur-Simulated Emotion Speech Corpus). The emotion recognition results obtained using IITKGP-SESC are compared with the results obtained on the Berlin emotion speech corpus. Emotion recognition systems are developed using Gaussian mixture models and auto-associative neural networks. The purpose of this study is to explore sub-syllabic regions for identifying the emotions embedded in a speech signal and, if possible, to avoid processing the entire speech signal for emotion recognition without seriously compromising performance.
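The segmentation step described in the abstract, in which a vowel onset point (VOP) divides a syllable into consonant, transition, and vowel regions, can be illustrated with a minimal sketch. This is not the authors' procedure: the function name `split_syllable`, the `Regions` type, and the fixed 20 ms half-width of the transition window are all assumptions made here for illustration, since the abstract does not state how wide each region is.

```python
from typing import NamedTuple, Tuple

Span = Tuple[int, int]  # (start_ms, end_ms), end exclusive


class Regions(NamedTuple):
    consonant: Span
    transition: Span
    vowel: Span


def split_syllable(start_ms: int, vop_ms: int, end_ms: int,
                   half_width_ms: int = 20) -> Regions:
    """Split one syllable into three regions around its vowel onset point.

    Assumption (not from the paper): the transition region is a window of
    +/- half_width_ms centred on the VOP, clipped to the syllable bounds.
    Everything before the window is treated as the consonant region and
    everything after it as the vowel region.
    """
    if not start_ms <= vop_ms <= end_ms:
        raise ValueError("VOP must lie inside the syllable")
    if half_width_ms < 0:
        raise ValueError("half_width_ms must be non-negative")
    trans_start = max(start_ms, vop_ms - half_width_ms)
    trans_end = min(end_ms, vop_ms + half_width_ms)
    return Regions(
        consonant=(start_ms, trans_start),
        transition=(trans_start, trans_end),
        vowel=(trans_end, end_ms),
    )


# Hypothetical syllable spanning 0-200 ms with a VOP at 60 ms.
regions = split_syllable(0, 60, 200)
```

With region boundaries of this kind, spectral features could be computed only on the frames that fall inside a chosen region instead of on the whole utterance, which is the kind of trade-off the abstract sets out to examine.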
Pages: 495-511
Page count: 17
Related papers
50 in total
  • [1] Emotion recognition from speech using sub-syllabic and pitch synchronous spectral features
    Shashidhar G. Koolagudi
    Sreenivasa Rao Krothapalli
    International Journal of Speech Technology, 2012, 15 (4) : 495 - 511
  • [2] Frameworks for recognition of Mandarin syllables with tones using sub-syllabic units
    Lin, CH
    Wu, CH
    Ting, PY
    Wang, HM
    SPEECH COMMUNICATION, 1996, 18 (02) : 175 - 190
  • [3] Emotion Recognition from Speech Signals using Excitation Source and Spectral Features
    Choudhury, Akash Roy
    Ghosh, Anik
    Pandey, Rahul
    Barman, Subhas
    PROCEEDINGS OF 2018 IEEE APPLIED SIGNAL PROCESSING CONFERENCE (ASPCON), 2018, : 257 - 261
  • [4] Automatic speech emotion recognition using modulation spectral features
    Wu, Siqing
    Falk, Tiago H.
    Chan, Wai-Yip
    SPEECH COMMUNICATION, 2011, 53 (05) : 768 - 785
  • [5] Hybrid Spectral Features for Speech Emotion Recognition
    Shah, Firoz A.
    Anto, Babu P.
    2017 INTERNATIONAL CONFERENCE ON INNOVATIONS IN INFORMATION, EMBEDDED AND COMMUNICATION SYSTEMS (ICIIECS), 2017,
  • [6] Tracking sub-syllabic features in zebra finch song during development
    Meagan E Woodford
    Matthew M Benavides
    Todd W Troyer
    BMC Neuroscience, 13 (Suppl 1)
  • [7] Hierarchical emotion recognition from speech using source, power spectral and prosodic features
    Arijul Haque
    K. Sreenivasa Rao
    Multimedia Tools and Applications, 2024, 83 : 19629 - 19661
  • [9] Processing of sub-syllabic speech units in the posterior temporal lobe: An fMRI study
    Rimol, LM
    Specht, K
    Weis, S
    Savoy, R
    Hugdahl, K
    NEUROIMAGE, 2005, 26 (04) : 1059 - 1067
  • [10] Improving Speech Emotion Recognition System Using Spectral and Prosodic Features
    Chakhtouna, Adil
    Sekkate, Sara
    Adib, Abdellah
    INTELLIGENT SYSTEMS DESIGN AND APPLICATIONS, ISDA 2021, 2022, 418 : 399 - 409