Gender-Aware Speech Emotion Recognition in Multiple Languages

Cited by: 0
Authors
Nicolini, Marco [1]
Ntalampiras, Stavros [1]
Affiliations
[1] Univ Milan, Dept Comp Sci, Milan, Italy
Keywords
Audio pattern recognition; Machine learning; Transfer learning; Convolutional neural network; YAMNet; Multilingual speech emotion recognition; CORPUS;
DOI
10.1007/978-3-031-54726-3_7
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This article presents a solution for Speech Emotion Recognition (SER) in a multilingual setting using a hierarchical approach. The approach involves two levels: the first identifies the gender of the speaker, while the second predicts their emotional state. We evaluate the performance of three classifiers of increasing complexity: k-NN, transfer learning based on YAMNet, and Bidirectional Long Short-Term Memory neural networks. The models were trained, validated, and tested on a dataset that includes the big-six emotions and was collected from well-known SER datasets representing six different languages. Across all classifiers, classification accuracy differs when considering all data versus only female or only male data. Interestingly, prior knowledge of the speaker's gender can improve the overall classification performance.
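The two-level hierarchy described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `HierarchicalSER`, the helper `knn_predict`, the two-dimensional feature vectors, and the toy samples are all hypothetical, and a plain k-NN stands in for both levels purely for illustration.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Majority vote among the k nearest (features, label) pairs."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


class HierarchicalSER:
    """Hypothetical two-level classifier: gender first, then emotion."""

    def __init__(self, k=3):
        self.k = k
        self.gender_train = []   # level 1: (features, gender)
        self.emotion_train = {}  # level 2: gender -> [(features, emotion)]

    def fit(self, samples):
        # samples: iterable of (features, gender, emotion)
        for features, gender, emotion in samples:
            self.gender_train.append((features, gender))
            self.emotion_train.setdefault(gender, []).append((features, emotion))
        return self

    def predict(self, features):
        # Level 1 picks the gender; level 2 uses only that gender's data.
        gender = knn_predict(self.gender_train, features, self.k)
        emotion = knn_predict(self.emotion_train[gender], features, self.k)
        return gender, emotion


# Toy data (assumption): two made-up, normalised acoustic features per sample.
TOY_SAMPLES = [
    ([0.90, 0.90], "female", "angry"), ([0.80, 0.85], "female", "angry"),
    ([0.85, 0.95], "female", "angry"), ([0.90, 0.10], "female", "sad"),
    ([0.80, 0.15], "female", "sad"), ([0.85, 0.05], "female", "sad"),
    ([0.10, 0.90], "male", "angry"), ([0.20, 0.85], "male", "angry"),
    ([0.15, 0.95], "male", "angry"), ([0.10, 0.10], "male", "sad"),
    ([0.20, 0.15], "male", "sad"), ([0.15, 0.05], "male", "sad"),
]

model = HierarchicalSER(k=3).fit(TOY_SAMPLES)
print(model.predict([0.82, 0.90]))
print(model.predict([0.12, 0.10]))
```

In this sketch the second level only ever sees training data from the gender chosen at the first level, which is one way to read "prior knowledge of the speaker's gender"; the record does not specify how the two levels are actually coupled.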
Pages: 111-123
Page count: 13
Related papers
50 in total
  • [21] Gender Specific Emotion Recognition Through Speech Signals
    Vinay
    Gupta, Shilpi
    Mehra, Anu
    2014 INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND INTEGRATED NETWORKS (SPIN), 2014, : 727 - 733
  • [22] CONTEXT-AWARE ATTENTION MECHANISM FOR SPEECH EMOTION RECOGNITION
    Ramet, Gaetan
    Garner, Philip N.
    Baeriswyl, Michael
    Lazaridis, Alexandros
    2018 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2018), 2018, : 126 - 131
  • [23] Towards gender-aware data systems - Indian experience
    Mukherjee, M
    ECONOMIC AND POLITICAL WEEKLY, 1996, 31 (43) : WS63 - WS71
  • [24] Building gender-aware ecosystems for learning, leadership, and growth
    Hughes, Karen D.
    Yang, Te
    GENDER IN MANAGEMENT, 2020, 35 (03): : 275 - 290
  • [25] THE GENERALIZATION EFFECT FOR MULTILINGUAL SPEECH EMOTION RECOGNITION ACROSS HETEROGENEOUS LANGUAGES
    Lee, Shi-wook
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 5881 - 5885
  • [26] The Generalization Effect for Multilingual Speech Emotion Recognition across Heterogeneous Languages
    Lee, Shi-Wook
    ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, 2019, 2019-May : 5881 - 5885
  • [27] SENTIMENT-AWARE AUTOMATIC SPEECH RECOGNITION PRE-TRAINING FOR ENHANCED SPEECH EMOTION RECOGNITION
    Ghriss, Ayoub
    Yang, Bo
    Rozgic, Viktor
    Shriberg, Elizabeth
    Wang, Chao
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 7347 - 7351
  • [28] The Acoustically Emotion-Aware Conversational Agent With Speech Emotion Recognition and Empathetic Responses
    Hu, Jiaxiong
    Huang, Yun
    Hu, Xiaozhu
    Xu, Yingqing
    IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2023, 14 (01) : 17 - 30
  • [29] Influences of Languages in Speech Emotion Recognition: A Comparative Study Using Malay, English and Mandarin languages
    Rajoo, Rajesvary
    Aun, Ching Chee
    2016 IEEE SYMPOSIUM ON COMPUTER APPLICATIONS & INDUSTRIAL ELECTRONICS (ISCAIE), 2016, : 35 - 39
  • [30] Speech Emotion Recognition based on Multiple Feature Fusion
    Jiang, Changjiang
    Mao, Rong
    Liu, Geng
    Wang, Mingyi
    2019 CHINESE AUTOMATION CONGRESS (CAC2019), 2019, : 907 - 912