Cross-modal dynamic sentiment annotation for speech sentiment analysis

Times Cited: 0
Authors
Chen, Jincai [1 ]
Sun, Chao [1 ]
Zhang, Sheng [1 ]
Zeng, Jiangfeng [2 ]
Affiliations
[1] Huazhong Univ Sci & Technol, Wuhan Natl Lab Optoelect, Wuhan 430074, Peoples R China
[2] Cent China Normal Univ, Sch Informat Management, Wuhan 430079, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Speech sentiment analysis; Multi-modal video; Sentiment profiles; Cross-modal annotation;
DOI
10.1016/j.compeleceng.2023.108598
Chinese Library Classification
TP3 [Computing Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
In speech sentiment analysis, the sentiment of an entire utterance has traditionally been determined by a single hard label, which ignores the inherent dynamics and ambiguity of speech sentiment. Moreover, most existing sentiment corpora contain few segment-level ground-truth labels because of label ambiguity and annotation cost. In this work, to capture segment-level sentiment fluctuations within an utterance, we propose sentiment profiles (SPs) to express segment-level soft labels. To alleviate the data shortage, we introduce large amounts of multi-modal in-the-wild video data, and facial expression knowledge guides the generation of soft labels for audio segments through the Cross-modal Sentiment Annotation Module. We then design a Speech Encoder Module that encodes audio segments into SPs, and further exploit a sentiment profile purifier (SPP) to iteratively improve the accuracy of the SPs. Extensive experiments show that our model achieves state-of-the-art performance on the CH-SIMS and IEMOCAP datasets when augmented with unlabeled data.
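The sketch below is not the authors' code; it is a minimal illustration of the ideas named in the abstract: a sentiment profile as a per-segment soft-label distribution, cross-modal annotation that derives those soft labels from a facial-expression model, and an iterative purification step. The label set, segment count, blending rule, and all function names are assumptions made for illustration.

```python
# Minimal sketch of segment-level "sentiment profiles" and their refinement.
# NOT the paper's implementation: class set, shapes, and the convex-combination
# purification rule are illustrative assumptions.
import numpy as np

NUM_CLASSES = 4    # e.g. angry / happy / neutral / sad (assumed label set)
NUM_SEGMENTS = 8   # segments per utterance (assumed)

def annotate_with_faces(face_logits: np.ndarray) -> np.ndarray:
    """Cross-modal annotation: turn per-segment facial-expression logits
    into soft labels, i.e. one probability distribution per audio segment."""
    exp = np.exp(face_logits - face_logits.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)   # shape: (segments, classes)

def purify(profiles: np.ndarray, speech_probs: np.ndarray,
           alpha: float = 0.7) -> np.ndarray:
    """One purification step: blend the current soft labels with the speech
    model's own predictions and renormalize (simple assumed rule; the paper's
    SPP may differ)."""
    blended = alpha * profiles + (1.0 - alpha) * speech_probs
    return blended / blended.sum(axis=1, keepdims=True)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-ins for a facial-expression model and a speech encoder output.
    face_logits = rng.normal(size=(NUM_SEGMENTS, NUM_CLASSES))
    speech_probs = annotate_with_faces(rng.normal(size=(NUM_SEGMENTS, NUM_CLASSES)))

    profiles = annotate_with_faces(face_logits)   # initial soft labels (SPs)
    for _ in range(3):                            # iterative refinement
        profiles = purify(profiles, speech_probs)

    # Utterance-level sentiment can be read off by pooling the segment profile.
    print("utterance-level distribution:", profiles.mean(axis=0).round(3))
```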
Pages: 14
Related Papers (50 total)
  • [21] Deng, Yang; Li, Yonghong; Xian, Sidong; Li, Laquan; Qiu, Haiyang. Mual: enhancing multimodal sentiment analysis with cross-modal attention and difference loss. INTERNATIONAL JOURNAL OF MULTIMEDIA INFORMATION RETRIEVAL, 2024, 13 (03).
  • [22] Zeng, Ying; Mai, Sijie; Hu, Haifeng. Which is Making the Contribution: Modulating Unimodal and Cross-modal Dynamics for Multimodal Sentiment Analysis. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021, 2021: 1262-1274.
  • [23] Huang, Ju; Lu, Pengtao; Sun, Shuifa; Wang, Fangyi. Multimodal Sentiment Analysis in Realistic Environments Based on Cross-Modal Hierarchical Fusion Network. ELECTRONICS, 2023, 12 (16).
  • [24] Kyaw, Win Thuzar; Sagisaka, Yoshinori. Cross-modal Analysis between Phonation Differences and Texture Images based on Sentiment Correlations. 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017: 679-683.
  • [25] Yang, Kaicheng; Xu, Hua; Gao, Kai. CM-BERT: Cross-Modal BERT for Text-Audio Sentiment Analysis. MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020: 521-528.
  • [26] Zhang, Yuchen; Thong, Hong; Chen, Guilin; Alhusaini, Naji; Zhao, Shenghui; Wu, Cheng. Multimodal Sentiment Analysis Network Based on Distributional Transformation and Gated Cross-Modal Fusion. 2024 INTERNATIONAL CONFERENCE ON NETWORKING AND NETWORK APPLICATIONS, NANA 2024, 2024: 496-503.
  • [27] Hou, Yilin; Zhong, Xianjing; Cao, Hui; Zhu, Zheng; Zhou, Yunfeng; Zhang, Jie. A shared-private sentiment analysis approach based on cross-modal information interaction. PATTERN RECOGNITION LETTERS, 2024, 183: 140-146.
  • [28] Zhang, H.-B.; Shi, H.-W.; Xiong, Q.-P.; Hou, J.-Y. Image sentiment analysis via active sample refinement and cross-modal semantics mining. Kongzhi yu Juece/Control and Decision, 2022, 37 (11): 2949-2958.
  • [29] Lu, Xintao; Ni, Yonglong; Ding, Zuohua. Cross-Modal Sentiment Analysis Based on CLIP Image-Text Attention Interaction. INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2024, 15 (02): 895-903.
  • [30] Gan, Chenquan; Tang, Yu; Fu, Xiang; Zhu, Qingyi; Jain, Deepak Kumar; Garcia, Salvador. Video multimodal sentiment analysis using cross-modal feature translation and dynamical propagation. KNOWLEDGE-BASED SYSTEMS, 2024, 299.