Probing Speech Emotion Recognition Transformers for Linguistic Knowledge

被引:6
|
作者
Triantafyllopoulos, Andreas [1 ]
Wagner, Johannes [2 ]
Wierstorf, Hagen [2 ]
Schmitt, Maximilian [2 ]
Reichel, Uwe [2 ]
Eyben, Florian [2 ]
Burkhardt, Felix [2 ]
Schuller, Bjoern W. [1 ,2 ,3 ]
机构
[1] Univ Augsburg, Chair Embedded Intelligence Hlth Care & Wellbeing, Augsburg, Germany
[2] audEERING GmbH, Gilching, Germany
[3] Imperial Coll, GLAM Grp Language Audio & Mus, London, England
来源
关键词
speech emotion recognition; transformers;
D O I
10.21437/Interspeech.2022-10371
中图分类号
O42 [声学];
学科分类号
070206 ; 082403 ;
摘要
Large, pre-trained neural networks consisting of self-attention layers (transformers) have recently achieved state-of-the-art results on several speech emotion recognition (SER) datasets. These models are typically pre-trained in self-supervised manner with the goal to improve automatic speech recognition performance - and thus, to understand linguistic information. In this work, we investigate the extent in which this information is exploited during SER fine-tuning. Using a reproducible methodology based on open-source tools, we synthesise prosodically neutral speech utterances while varying the sentiment of the text. Valence predictions of the transformer model are very reactive to positive and negative sentiment content, as well as negations, but not to intensifiers or reducers, while none of those linguistic features impact arousal or dominance. These findings show that transformers can successfully leverage linguistic information to improve their valence predictions, and that linguistic analysis should be included in their testing.
引用
收藏
页码:146 / 150
页数:5
相关论文
共 50 条
  • [31] Emotion Recognition in Arabic Speech
    Klaylat, Samira
    Hamandi, Lama
    Osman, Ziad
    Zantout, Rached
    2017 SENSORS NETWORKS SMART AND EMERGING TECHNOLOGIES (SENSET), 2017,
  • [32] Multiroom Speech Emotion Recognition
    Shalev, Erez
    Cohen, Israel
    2022 30TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2022), 2022, : 135 - 139
  • [33] Emotion recognition in Arabic speech
    Samira Klaylat
    Ziad Osman
    Lama Hamandi
    Rached Zantout
    Analog Integrated Circuits and Signal Processing, 2018, 96 : 337 - 351
  • [34] Persian Speech Emotion Recognition
    Savargiv, Mohammad
    Bastanfard, Azam
    2015 7TH CONFERENCE ON INFORMATION AND KNOWLEDGE TECHNOLOGY (IKT), 2015,
  • [35] Windowing for Speech Emotion Recognition
    Puterka, Boris
    Kacur, Juraj
    Pavlovicova, Jarmila
    2019 61ST INTERNATIONAL SYMPOSIUM ELMAR, 2019, : 147 - 150
  • [36] Mandarin emotion recognition in speech
    Pao, TL
    Chen, YT
    ASRU'03: 2003 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING ASRU '03, 2003, : 227 - 230
  • [37] Progress in speech emotion recognition
    Zhang, Xueying
    Sun, Ying
    Duan, Shufei
    TENCON 2015 - 2015 IEEE REGION 10 CONFERENCE, 2015,
  • [38] Review on speech emotion recognition
    Han, W.-J. (hanwenjing07@gmail.com), 1600, Chinese Academy of Sciences (25):
  • [39] Emotion recognition in Arabic speech
    Hadjadji, Imene
    Falek, Leila
    Demri, Lyes
    Teffahi, Hocine
    2019 INTERNATIONAL CONFERENCE ON ADVANCED ELECTRICAL ENGINEERING (ICAEE), 2019,
  • [40] Bengali Speech Emotion Recognition
    Mohanta, Abhijit
    Sharma, Uzzal
    PROCEEDINGS OF THE 10TH INDIACOM - 2016 3RD INTERNATIONAL CONFERENCE ON COMPUTING FOR SUSTAINABLE GLOBAL DEVELOPMENT, 2016, : 2812 - 2814