Contribution of modulation spectral features for cross-lingual speech emotion recognition under noisy reverberant conditions

Authors
Guo, Taiyang [1]
Li, Sixia [1]
Kidani, Shunsuke [1]
Okada, Shogo [1]
Unoki, Masashi [1]
Affiliation
[1] Japan Adv Inst Sci & Technol, 1-1 Asahidai, Nomi, Ishikawa 9231292, Japan
Funding
Japan Society for the Promotion of Science
DOI
10.1109/APSIPAASC58517.2023.10317449
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Handling multiple languages under noisy reverberant conditions has become increasingly important for speech emotion recognition (SER). Previous studies found that modulation spectral features (MSFs) are robust to noisy reverberant conditions for SER. However, they mainly focused on specific languages, so the universality of MSFs across languages is still unclear. To address this issue, we compared four feature sets for SER on four languages under 12 noisy reverberant conditions: MSFs, hand-crafted features, Wav2Vec2.0-based features, and MSFs combined with hand-crafted features (MSFs+hand-crafted features). Intra-lingual results showed that MSFs+hand-crafted features performed best in most conditions for all languages. Inter-lingual results showed that MSFs performed best in most conditions for the test languages, except when training on a tonal language and testing on the others. The results demonstrate that MSFs are robust for multilingual SER under noisy reverberant conditions and suggest that MSFs are potentially language-independent features for nontonal languages.
Pages: 2221-2227
Page count: 7