Cross-modal dynamic sentiment annotation for speech sentiment analysis

Times Cited: 0
Authors
Chen, Jincai [1 ]
Sun, Chao [1 ]
Zhang, Sheng [1 ]
Zeng, Jiangfeng [2 ]
Affiliations
[1] Huazhong Univ Sci & Technol, Wuhan Natl Lab Optoelect, Wuhan 430074, Peoples R China
[2] Cent China Normal Univ, Sch Informat Management, Wuhan 430079, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Speech sentiment analysis; Multi-modal video; Sentiment profiles; Cross-modal annotation;
DOI
10.1016/j.compeleceng.2023.108598
Chinese Library Classification
TP3 [Computing Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
In speech sentiment analysis, the sentiment of an entire utterance has traditionally been determined by a single hard label, which ignores the inherent dynamics and ambiguity of speech sentiment. Moreover, most existing sentiment corpora contain few segment-level ground-truth labels because of label ambiguity and annotation cost. In this work, to capture segment-level sentiment fluctuations within an utterance, we propose sentiment profiles (SPs) to express segment-level soft labels. To alleviate the data shortage, we introduce large amounts of multi-modal in-the-wild video data, and facial expression knowledge guides the generation of soft labels for audio segments through the Cross-modal Sentiment Annotation Module. We then design a Speech Encoder Module that encodes audio segments into SPs, and further exploit a sentiment profile purifier (SPP) to iteratively improve the accuracy of the SPs. Extensive experiments show that our model achieves state-of-the-art performance on the CH-SIMS and IEMOCAP datasets when augmented with unlabeled data.
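The sketch below is not the authors' code; it is a minimal illustration of the ideas named in the abstract: a sentiment profile as a per-segment soft-label distribution, cross-modal annotation that derives those soft labels from a facial-expression model, and an iterative purification step. The label set, segment count, blending rule, and all function names are assumptions made for illustration.

```python
# Minimal sketch of segment-level "sentiment profiles" and their refinement.
# NOT the paper's implementation: class set, shapes, and the convex-combination
# purification rule are illustrative assumptions.
import numpy as np

NUM_CLASSES = 4    # e.g. angry / happy / neutral / sad (assumed label set)
NUM_SEGMENTS = 8   # segments per utterance (assumed)

def annotate_with_faces(face_logits: np.ndarray) -> np.ndarray:
    """Cross-modal annotation: turn per-segment facial-expression logits
    into soft labels, i.e. one probability distribution per audio segment."""
    exp = np.exp(face_logits - face_logits.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)   # shape: (segments, classes)

def purify(profiles: np.ndarray, speech_probs: np.ndarray,
           alpha: float = 0.7) -> np.ndarray:
    """One purification step: blend the current soft labels with the speech
    model's own predictions and renormalize (simple assumed rule; the paper's
    SPP may differ)."""
    blended = alpha * profiles + (1.0 - alpha) * speech_probs
    return blended / blended.sum(axis=1, keepdims=True)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-ins for a facial-expression model and a speech encoder output.
    face_logits = rng.normal(size=(NUM_SEGMENTS, NUM_CLASSES))
    speech_probs = annotate_with_faces(rng.normal(size=(NUM_SEGMENTS, NUM_CLASSES)))

    profiles = annotate_with_faces(face_logits)   # initial soft labels (SPs)
    for _ in range(3):                            # iterative refinement
        profiles = purify(profiles, speech_probs)

    # Utterance-level sentiment can be read off by pooling the segment profile.
    print("utterance-level distribution:", profiles.mean(axis=0).round(3))
```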
Pages: 14
Related Papers (50 total)
  • [21] Deng, Yang; Li, Yonghong; Xian, Sidong; Li, Laquan; Qiu, Haiyang. Mual: enhancing multimodal sentiment analysis with cross-modal attention and difference loss. INTERNATIONAL JOURNAL OF MULTIMEDIA INFORMATION RETRIEVAL, 2024, 13 (03).
  • [22] Zeng, Ying; Mai, Sijie; Hu, Haifeng. Which is Making the Contribution: Modulating Unimodal and Cross-modal Dynamics for Multimodal Sentiment Analysis. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021, 2021: 1262-1274.
  • [23] Huang, Ju; Lu, Pengtao; Sun, Shuifa; Wang, Fangyi. Multimodal Sentiment Analysis in Realistic Environments Based on Cross-Modal Hierarchical Fusion Network. ELECTRONICS, 2023, 12 (16).
  • [24] Kyaw, Win Thuzar; Sagisaka, Yoshinori. Cross-modal Analysis between Phonation Differences and Texture Images based on Sentiment Correlations. 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017: 679-683.
  • [25] Yang, Kaicheng; Xu, Hua; Gao, Kai. CM-BERT: Cross-Modal BERT for Text-Audio Sentiment Analysis. MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020: 521-528.
  • [26] Zhang, Yuchen; Thong, Hong; Chen, Guilin; Alhusaini, Naji; Zhao, Shenghui; Wu, Cheng. Multimodal Sentiment Analysis Network Based on Distributional Transformation and Gated Cross-Modal Fusion. 2024 INTERNATIONAL CONFERENCE ON NETWORKING AND NETWORK APPLICATIONS, NANA 2024, 2024: 496-503.
  • [27] Hou, Yilin; Zhong, Xianjing; Cao, Hui; Zhu, Zheng; Zhou, Yunfeng; Zhang, Jie. A shared-private sentiment analysis approach based on cross-modal information interaction. PATTERN RECOGNITION LETTERS, 2024, 183: 140-146.
  • [28] Zhang, H.-B.; Shi, H.-W.; Xiong, Q.-P.; Hou, J.-Y. Image sentiment analysis via active sample refinement and cross-modal semantics mining. Kongzhi yu Juece/Control and Decision, 2022, 37 (11): 2949-2958.
  • [29] Lu, Xintao; Ni, Yonglong; Ding, Zuohua. Cross-Modal Sentiment Analysis Based on CLIP Image-Text Attention Interaction. INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2024, 15 (02): 895-903.
  • [30] Gan, Chenquan; Tang, Yu; Fu, Xiang; Zhu, Qingyi; Jain, Deepak Kumar; Garcia, Salvador. Video multimodal sentiment analysis using cross-modal feature translation and dynamical propagation. KNOWLEDGE-BASED SYSTEMS, 2024, 299.