DESCU: Dyadic emotional speech corpus and recognition system for Urdu language

Cited by: 3
Authors
Qasim, Muhammad [1]
Habib, Tania [1]
Urooj, Saba [2]
Mumtaz, Benazir [2]
Affiliations
[1] Univ Engn & Technol, Dept Comp Engn, Lahore, Pakistan
[2] Univ Engn & Technol, Ctr Language Engn, Lahore, Pakistan
Keywords
Speech emotion recognition; Speech databases; Speech processing; Classification; Features
DOI
10.1016/j.specom.2023.02.002
Chinese Library Classification number
O42 [Acoustics];
Subject classification code
070206 ; 082403 ;
Abstract
A speech signal carries the emotional state of the speaker along with the message. Recognizing that emotional state helps determine the true meaning of a message and allows more natural communication between humans and machines. This paper presents the design and development of a dyadic emotional speech corpus for the Urdu language. The corpus was developed by recording dialog scenarios for angry, happy, neutral, and sad emotions. The performance of frame-level features, utterance-level features, and spectrograms is evaluated in this work. Emotion recognition experiments were conducted using classifiers including Support Vector Machines, Hidden Markov Models, and Convolutional Neural Networks. Experimental results show that the utterance-level features outperform the frame-level features and spectrograms. The combined feature set of cepstral, spectral, prosodic, and voice quality features performs better than the individual feature sets. Unweighted average recalls of 84.1%, 80.2%, and 84.7% were achieved for speaker-dependent, speaker-independent, and text-independent emotion recognition, respectively.
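The abstract reports its results as unweighted average recall (UAR), which is the mean of the per-class recalls, so each emotion counts equally however many utterances it has. The following is a minimal sketch of that metric in plain Python. The function name `unweighted_average_recall` and the toy labels are assumptions made for illustration; they are not the authors' code or data.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls; every class counts equally regardless of size."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # number of correctly predicted samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    if not total:
        raise ValueError("no samples given")
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Toy labels invented for illustration only (not data from the paper).
y_true = ["anger", "anger", "anger", "anger", "happy", "happy", "neutral", "sad"]
y_pred = ["anger", "anger", "anger", "happy", "happy", "sad", "neutral", "sad"]

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR = {uar:.4f}, accuracy = {accuracy:.4f}")
```

On this toy set the per-class recalls are 0.75, 0.5, 1.0, and 1.0, so the UAR (0.8125) differs from plain accuracy (0.75) because the classes are unevenly sized.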
Pages: 40-52
Number of pages: 13
Related papers
50 in total
  • [31] Corpus for automatic speech recognition
    Adda-Decker, Martine
    REVUE FRANCAISE DE LINGUISTIQUE APPLIQUEE, 2007, 12 (01): : 71 - 84
  • [32] Towards building a Urdu Language Corpus using Common Crawl
    Shafiq, Hafiz Muhammad
    Tahir, Bilal
    Mehmood, Muhammad Amir
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2020, 39 (02) : 2445 - 2455
  • [33] DEVELOPING A THAI EMOTIONAL SPEECH CORPUS
    Kasuriya, Sawit
    Teeramunkong, Thanaruk
    Wutiwiwatchai, Chai
    2013 INTERNATIONAL CONFERENCE ORIENTAL COCOSDA HELD JOINTLY WITH 2013 CONFERENCE ON ASIAN SPOKEN LANGUAGE RESEARCH AND EVALUATION (O-COCOSDA/CASLRE), 2013,
  • [34] Development of Text and Speech Corpus for Designing the Multilingual Recognition System
    Bansal, Shweta
    Agrawal, Shyam S.
    2018 ORIENTAL COCOSDA - INTERNATIONAL CONFERENCE ON SPEECH DATABASE AND ASSESSMENTS, 2018, : 1 - 7
  • [35] Development of the CUHK Dysarthric Speech Recognition System for the UASpeech Corpus
    Yu, Jianwei
    Xie, Xurong
    Liu, Shansong
    Hu, Shoukang
    Lam, Max W. Y.
    Wu, Xixin
    Wong, Ka Ho
    Liu, Xunying
    Meng, Helen
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 2938 - 2942
  • [36] An open and free Speech Corpus for Speaker Recognition: The FSCSR Speech Corpus
    Bouziane, Ayoub
    Kadi, Houda
    Hourri, Soufiane
    Kharroubi, Jamal
    2016 11TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS: THEORIES AND APPLICATIONS (SITA), 2016,
  • [37] Language Model Adaptation for Emotional Speech Recognition using Tweet data
    Saeki, Kazuya
    Kato, Masaharu
    Kosaka, Tetsuo
    2020 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2020, : 371 - 375
  • [38] STRESS ANNOTATED URDU SPEECH CORPUS TO BUILD FEMALE VOICE FOR TTS
    Mumtaz, Benazir
    Urooj, Saba
    Hussain, Sarmad
    Habib, Wajiha
    2015 INTERNATIONAL CONFERENCE ORIENTAL COCOSDA HELD JOINTLY WITH 2015 CONFERENCE ON ASIAN SPOKEN LANGUAGE RESEARCH AND EVALUATION (O-COCOSDA/CASLRE), 2015, : 13 - 20
  • [39] A statistical based part of speech tagger for Urdu language
    Anwar, Waqas
    Wang, Xuan
    Li, Lu
    Wang, Xiao-Long
    PROCEEDINGS OF 2007 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2007, : 3418 - 3424
  • [40] Determining the Voiceprint Recognition on the Basis of Emotional Speech Signal: Indonesia Language
    Idananta, Kanvadian
    Oktriono, Kristianus
    2017 3RD INTERNATIONAL CONFERENCE ON INFORMATION MANAGEMENT (ICIM 2017), 2017, : 388 - 392