Probing Speech Emotion Recognition Transformers for Linguistic Knowledge

Cited by: 6
Authors
Triantafyllopoulos, Andreas [1]
Wagner, Johannes [2]
Wierstorf, Hagen [2]
Schmitt, Maximilian [2]
Reichel, Uwe [2]
Eyben, Florian [2]
Burkhardt, Felix [2]
Schuller, Bjoern W. [1,2,3]
Affiliations
[1] University of Augsburg, Chair of Embedded Intelligence for Health Care and Wellbeing, Augsburg, Germany
[2] audEERING GmbH, Gilching, Germany
[3] Imperial College London, Group on Language, Audio, & Music (GLAM), London, England
Source
INTERSPEECH 2022
Keywords
speech emotion recognition; transformers;
DOI
10.21437/Interspeech.2022-10371
Chinese Library Classification
O42 [Acoustics]
Subject Classification Codes
070206; 082403
Abstract
Large, pre-trained neural networks consisting of self-attention layers (transformers) have recently achieved state-of-the-art results on several speech emotion recognition (SER) datasets. These models are typically pre-trained in a self-supervised manner with the goal of improving automatic speech recognition performance - and thus of capturing linguistic information. In this work, we investigate the extent to which this information is exploited during SER fine-tuning. Using a reproducible methodology based on open-source tools, we synthesise prosodically neutral speech utterances while varying the sentiment of the text. The valence predictions of the transformer model are highly reactive to positive and negative sentiment content, as well as to negations, but not to intensifiers or reducers, while none of these linguistic features has an impact on arousal or dominance. These findings show that transformers can successfully leverage linguistic information to improve their valence predictions, and that linguistic analysis should be included in their testing.
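The probing setup described in the abstract lends itself to a short script. The following is a minimal sketch, not the authors' actual toolchain: espeak-ng stands in for an unspecified open-source TTS, the sentence variants are illustrative rather than the original stimuli, and predict_avd is a hypothetical stub where a transformer fine-tuned for dimensional SER (returning arousal, dominance, valence) would be plugged in.

# Minimal probing sketch (assumptions: espeak-ng is installed, the sentence
# variants are illustrative, and predict_avd is a stub to be wired to a
# dimensional SER transformer of the reader's choice).
import subprocess
import tempfile

import numpy as np
import soundfile as sf

# Text variants that differ only in sentiment-bearing words; a flat TTS voice
# keeps prosody (approximately) constant, so any change in the model output
# must come from the linguistic content.
TEMPLATES = {
    "neutral":     "The movie was okay.",
    "positive":    "The movie was wonderful.",
    "negative":    "The movie was terrible.",
    "negated":     "The movie was not wonderful.",
    "intensified": "The movie was extremely wonderful.",
    "reduced":     "The movie was slightly wonderful.",
}


def synthesise(text: str, path: str) -> None:
    """Render text to a wav file with espeak-ng (prosodically neutral TTS)."""
    subprocess.run(["espeak-ng", "-w", path, text], check=True)


def predict_avd(waveform: np.ndarray, sampling_rate: int):
    """Stub: attach a transformer fine-tuned for dimensional SER here and
    return (arousal, dominance, valence), e.g. scaled to [0, 1]."""
    raise NotImplementedError("plug in a dimensional SER model")


if __name__ == "__main__":
    for condition, text in TEMPLATES.items():
        with tempfile.NamedTemporaryFile(suffix=".wav") as tmp:
            synthesise(text, tmp.name)
            waveform, sr = sf.read(tmp.name)
            arousal, dominance, valence = predict_avd(waveform, sr)
            print(f"{condition:12s} A={arousal:.2f} D={dominance:.2f} V={valence:.2f}")

Under the findings reported above, one would expect valence to shift for the positive, negative, and negated variants but not for the intensified or reduced ones, with arousal and dominance remaining flat across all conditions.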
Pages: 146-150
Number of pages: 5