SAPBERT: Speaker-Aware Pretrained BERT for Emotion Recognition in Conversation

Cited by: 2
Authors
Lim, Seunguook [1]
Kim, Jihie [1]
Affiliations
[1] Dongguk Univ Seoul, Dept Artificial Intelligence, 30 Pildong Ro 1 Gil, Seoul 04620, South Korea
Keywords
natural language processing; emotion recognition in conversation; dialogue modeling; pre-training; hierarchical BERT
DOI
10.3390/a16010008
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Emotion recognition in conversation (ERC) is receiving growing attention as interactions between humans and machines increase across services such as chatbots and virtual assistants. Because emotional expressions within a conversation can depend heavily on the contextual information of the participating speakers, it is important to capture self-dependency and inter-speaker dynamics. In this study, we propose a new pre-trained model, SAPBERT, that learns to identify speakers in a conversation in order to capture speaker-dependent contexts and address the ERC task. SAPBERT is pre-trained with three training objectives: Speaker Classification (SC), Masked Utterance Regression (MUR), and Last Utterance Generation (LUG). We investigate whether our pre-trained speaker-aware model can be leveraged to capture speaker-dependent contexts for ERC tasks. Experiments show that the proposed approach outperforms baseline models, demonstrating the effectiveness and validity of our method.
Pages: 16