ShEMO: a large-scale validated database for Persian speech emotion detection

Cited by: 35
Authors
Nezami, Omid Mohamad [1]
Lou, Paria Jamshid [2]
Karami, Mansoureh [2]
Affiliations
[1] Islamic Azad Univ, Bijar Branch, Bijar, Iran
[2] Sharif Univ Technol, Tehran, Iran
Keywords
Emotional speech; Speech database; Emotion detection; Benchmark; Persian; RECOGNITION; MODEL; AGREEMENT; VALENCE; AROUSAL;
DOI
10.1007/s10579-018-9427-x
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
This paper introduces a large-scale, validated database for Persian called the Sharif Emotional Speech Database (ShEMO). The database includes 3000 semi-natural utterances, equivalent to 3 h 25 min of speech data extracted from online radio plays. ShEMO covers speech samples from 87 native Persian speakers for five basic emotions (anger, fear, happiness, sadness and surprise) as well as a neutral state. Twelve annotators label the underlying emotional state of each utterance, and majority voting is used to decide the final labels. According to the kappa measure, the inter-annotator agreement is 64%, which is interpreted as substantial agreement. We also present benchmark results based on common classification methods for the speech emotion detection task. In these experiments, a support vector machine achieves the best results for both the gender-independent model (58.2%) and the gender-dependent models (female = 59.4%, male = 57.6%). ShEMO will be available free of charge for academic purposes to provide a baseline for further research on Persian emotional speech.
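As a rough illustration of the annotation procedure summarized in the abstract, the sketch below applies majority voting to per-utterance labels and computes a chance-corrected agreement score. The abstract does not say which kappa variant was used; Fleiss' kappa is assumed here because it handles more than two annotators. The function names and the toy ratings (three annotators, three categories) are hypothetical and are not taken from ShEMO.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for a list of per-utterance label lists.

    Every utterance must be rated by the same number of annotators.
    Assumption: the source does not state that this variant was used.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    # counts[i][j] = number of annotators who gave utterance i category j
    counts = [[row.count(c) for c in categories] for row in ratings]
    # Observed agreement for each utterance, then averaged
    p_items = [
        (sum(k * k for k in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_items) / n_items
    # Agreement expected by chance from the overall category proportions
    p_cat = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(len(categories))
    ]
    p_exp = sum(p * p for p in p_cat)
    return (p_bar - p_exp) / (1 - p_exp)


# Hypothetical toy ratings, not ShEMO data
toy_categories = ["anger", "sadness", "neutral"]
toy_ratings = [
    ["anger", "anger", "anger"],
    ["sadness", "sadness", "neutral"],
    ["neutral", "neutral", "neutral"],
    ["anger", "sadness", "sadness"],
]

final_labels = [majority_vote(row) for row in toy_ratings]
kappa = fleiss_kappa(toy_ratings, toy_categories)
print(final_labels, round(kappa, 3))
```

On the commonly used Landis and Koch scale, kappa values between 0.61 and 0.80 are read as substantial agreement, which matches the interpretation given for the reported 64%.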
Pages: 1-16
Number of pages: 16
Related papers
50 in total
  • [1] ShEMO: a large-scale validated database for Persian speech emotion detection
    Omid Mohamad Nezami
    Paria Jamshid Lou
    Mansoureh Karami
    Language Resources and Evaluation, 2019, 53 : 1 - 16
  • [2] Recognizing emotional speech in Persian: A validated database of Persian emotional speech (Persian ESD)
    Keshtiari, Niloofar
    Kuhlmann, Michael
    Eslami, Moharram
    Klann-Delius, Gisela
    BEHAVIOR RESEARCH METHODS, 2015, 47 (01) : 275 - 294
  • [3] Recognizing emotional speech in Persian: A validated database of Persian emotional speech (Persian ESD)
    Niloofar Keshtiari
    Michael Kuhlmann
    Moharram Eslami
    Gisela Klann-Delius
    Behavior Research Methods, 2015, 47 : 275 - 294
  • [4] A Large-Scale Japanese Speech Database
    1600, (The International Society for Computers and Their Applications (ISCA)):
  • [5] Erratum to: Recognizing emotional speech in Persian: A validated database of Persian emotional speech (Persian ESD)
    Niloofar Keshtiari
    Michael Kuhlmann
    Moharram Eslami
    Gisela Klann-Delius
    Behavior Research Methods, 2015, 47 : 295 - 295
  • [6] HEU Emotion: a large-scale database for multimodal emotion recognition in the wild
    Jing Chen
    Chenhui Wang
    Kejun Wang
    Chaoqun Yin
    Cong Zhao
    Tao Xu
    Xinyi Zhang
    Ziqiang Huang
    Meichen Liu
    Tao Yang
    Neural Computing and Applications, 2021, 33 : 8669 - 8685
  • [7] HEU Emotion: a large-scale database for multimodal emotion recognition in the wild
    Chen, Jing
    Wang, Chenhui
    Wang, Kejun
    Yin, Chaoqun
    Zhao, Cong
    Xu, Tao
    Zhang, Xinyi
    Huang, Ziqiang
    Liu, Meichen
    Yang, Tao
    NEURAL COMPUTING & APPLICATIONS, 2021, 33 (14): : 8669 - 8685
  • [8] LSSED: A LARGE-SCALE DATASET AND BENCHMARK FOR SPEECH EMOTION RECOGNITION
    Fan, Weiquan
    Xu, Xiangmin
    Xing, Xiaofen
    Chen, Weidong
    Huang, Dongyan
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 641 - 645
  • [9] A MULTI PURPOSE AND LARGE SCALE SPEECH CORPUS IN PERSIAN AND ENGLISH FOR SPEAKER AND SPEECH RECOGNITION: THE DEEPMINE DATABASE
    Zeinali, Hossein
    Burget, Lukas
    Cernocky, Jan Honza
    2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019), 2019, : 397 - 402
  • [10] A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation
    Esmaileyan, Z.
    Marvi, H.
    INTERNATIONAL JOURNAL OF ENGINEERING, 2014, 27 (01): : 79 - 89