DESCU: Dyadic emotional speech corpus and recognition system for Urdu language

Cited by: 3
Authors
Qasim, Muhammad [1]
Habib, Tania [1]
Urooj, Saba [2]
Mumtaz, Benazir [2]
Affiliations
[1] Univ Engn & Technol, Dept Comp Engn, Lahore, Pakistan
[2] Univ Engn & Technol, Ctr Language Engn, Lahore, Pakistan
Keywords
Speech emotion recognition; Speech databases; Speech processing; Classification; FEATURES;
DOI
10.1016/j.specom.2023.02.002
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
A speech signal conveys the emotional state of the speaker along with the message. Recognizing that emotional state helps determine the true meaning of a message and allows for more natural communication between humans and machines. This paper presents the design and development of a dyadic emotional speech corpus for the Urdu language. The corpus is developed by recording dialog scenarios for anger, happy, neutral, and sad emotions. The performance of frame-level features, utterance-level features, and spectrograms has been evaluated in this work. Emotion recognition experiments have been conducted using classifiers including Support Vector Machines, Hidden Markov Models, and Convolutional Neural Networks. Experimental results show that the utterance-level features outperform the frame-level features and spectrograms. The combined feature set of cepstral, spectral, prosodic, and voice quality features performs better than the individual feature sets. Unweighted average recalls of 84.1%, 80.2%, and 84.7% have been achieved for speaker-dependent, speaker-independent, and text-independent emotion recognition, respectively.
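The abstract reports its results as unweighted average recall (UAR). As a minimal sketch of how that metric is commonly computed (the mean of the per-class recalls, so each emotion counts equally regardless of how many utterances it has), the following Python snippet may help. The function name and the toy labels are illustrative assumptions and are not taken from the paper or its corpus.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls over the classes present in y_true."""
    correct = defaultdict(int)  # correctly predicted utterances per true class
    total = defaultdict(int)    # number of utterances per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Toy labels for illustration only (hypothetical, not data from the paper).
y_true = ["anger", "anger", "happy", "neutral",
          "neutral", "neutral", "neutral", "sad"]
y_pred = ["anger", "happy", "happy", "neutral",
          "neutral", "neutral", "sad", "sad"]

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR = {uar:.4f}, accuracy = {accuracy:.4f}")
```

On this toy data the per-class recalls are 0.5, 1.0, 0.75, and 1.0, so UAR differs from plain accuracy because the classes are imbalanced; that is the usual reason to prefer UAR for emotion corpora with uneven class sizes.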
Pages: 40-52
Number of pages: 13
Related papers
50 in total
  • [21] Emotional Speech Recognition Using Rhythm Metrics and a New Arabic Corpus
    Meftah, Ali H.
    Qamhan, Mustafa
    Alotaibi, Yousef
    Selouani, Sid-Ahmed
    2020 16TH IEEE INTERNATIONAL COLLOQUIUM ON SIGNAL PROCESSING & ITS APPLICATIONS (CSPA 2020), 2020, : 57 - 62
  • [22] Speech Corpus of Ainu Folklore and End-to-end Speech Recognition for Ainu Language
    Matsuura, Kohei
    Ueno, Sei
    Mimura, Masato
    Sakai, Shinsuke
    Kawahara, Tatsuya
    PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020, : 2622 - 2628
  • [23] A Social Acquisition System for Minority Language Speech Corpus
    Liu, Jing-feng
    She, Yu-mei
    Hu, Wen-jun
    Pan, Wen-lin
    2015 INTERNATIONAL CONFERENCE ON APPLIED MECHANICS AND MECHATRONICS ENGINEERING (AMME 2015), 2015, : 600 - 604
  • [24] Speech emotion recognition for the Urdu language: Dataset and evaluation
    Nimra Zaheer
    Obaid Ullah Ahmad
    Mudassir Shabbir
    Agha Ali Raza
    Language Resources and Evaluation, 2023, 57 : 915 - 944
  • [25] Speech Recognition Techniques for a Sign Language Recognition System
    Dreuw, Philippe
    Rybach, David
    Deselaers, Thomas
    Zahedi, Morteza
    Ney, Hermann
    INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 705 - 708
  • [26] Exploring corpus-invariant emotional acoustic feature for cross-corpus speech emotion recognition
    Lian, Hailun
    Lu, Cheng
    Zhao, Yan
    Li, Sunan
    Qi, Tianhua
    Zong, Yuan
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 258
  • [27] SUST Bangla Emotional Speech Corpus (SUBESCO): An audio-only emotional speech corpus for Bangla
    Sultana, Sadia
    Rahman, M. Shahidur
    Selim, M. Reza
    Iqbal, M. Zafar
    PLOS ONE, 2021, 16 (04):
  • [28] Language Models for Tamil Speech Recognition System
    Saraswathi, S.
    Geetha, T. V.
    IETE TECHNICAL REVIEW, 2007, 24 (05) : 375 - 383
  • [29] A PROTOTYPE FOR A SPEECH RECOGNITION SYSTEM FOR THE ITALIAN LANGUAGE
    BRANDETTI, M
    DORTA, P
    FERRETTI, M
    SCARCI, S
    ELETTROTECNICA, 1989, 76 (09): : 773 - 778
  • [30] Building a Recognition System of Speech Emotion and Emotional States
    Feng, Xiaoyan
    Watada, Junzo
    2013 SECOND INTERNATIONAL CONFERENCE ON ROBOT, VISION AND SIGNAL PROCESSING (RVSP), 2013, : 253 - 258