A Cross-Corpus Speech-Based Analysis of Escalating Negative Interactions

Cited by: 3
Authors
Lefter, Iulia [1]
Baird, Alice [2]
Stappen, Lukas [2]
Schuller, Bjorn W. [2, 3]
Affiliations
[1] Delft Univ Technol, Dept Multiactor Syst, Delft, Netherlands
[2] Univ Augsburg, Chair Embedded Intelligence Hlth Care & Wellbeing, Augsburg, Germany
[3] Imperial Coll London, Grp Language Audio & Mus, London, England
Keywords
affective computing; negative interactions; cross-corpora analysis; conflict escalation; speech paralinguistics; emotion recognition; acoustic emotion recognition; conflict
DOI
10.3389/fcomp.2022.749804
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Monitoring an escalating negative interaction has several benefits, particularly in security, (mental) health, and group management. The speech signal is well suited to this task, as aspects of escalation, including emotional arousal, have been shown to be readily captured from audio. A challenge in applying trained systems to real-life settings is their strong dependence on the training material and their limited ability to generalize. For this reason, in this contribution, we perform an extensive analysis of three Dutch-language corpora. All three corpora are rich in escalation behavior and are annotated on different dimensions related to escalation. A label-mapping process yielded two possible ground-truth estimates for the three datasets, each expressed as low, medium, and high escalation levels. To observe class behavior and inter-corpus differences more closely, we perform an acoustic analysis of the audio samples, finding that the derived labels behave similarly across the corpora, with pitch (F0) and intensity (dB) increasing as interactions escalate. Through our experiments, we explore the suitability of different speech features, data augmentation, merging corpora for training, and testing on actor and non-actor speech. We find that the extent to which merging corpora is successful depends greatly on how similar the label definitions were before label mapping. Finally, we see that the escalation recognition task can be performed in a cross-corpus setup with hand-crafted speech features, obtaining at best 63.8% unweighted average recall (UAR) for a cross-corpus analysis, an increase from the inter-corpus results of 59.4% UAR.
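The abstract reports results as unweighted average recall (UAR), that is, the mean of the per-class recalls, which weights the low, medium, and high escalation classes equally regardless of how many samples each contains. The following is a minimal sketch of that metric only; the function name and the toy labels are illustrative assumptions and are not taken from the paper or its data.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls; each class counts equally, whatever its size."""
    total = defaultdict(int)    # samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical, imbalanced toy labels (NOT data from the paper).
y_true = ["low"] * 6 + ["medium"] * 3 + ["high"] * 1
y_pred = ["low"] * 6 + ["medium", "low", "low"] + ["low"]

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR = {uar:.3f}, accuracy = {accuracy:.3f}")
```

In this toy case a classifier that mostly predicts the majority class reaches a much higher plain accuracy than UAR, which is why UAR is a common choice when class sizes are unbalanced.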
Pages: 13