AFFECTIVE VIDEO CONTENT ANALYSES BY USING CROSS-MODAL EMBEDDING LEARNING FEATURES

Cited by: 0
Authors
Li, Benchao [1 ,4 ]
Chen, Zhenzhong [2 ,4 ]
Li, Shan [4 ]
Zheng, Wei-Shi [3 ,5 ]
Affiliations
[1] Sun Yat Sen Univ, Sch Elect & Informat Engn, Guangzhou, Guangdong, Peoples R China
[2] Wuhan Univ, Sch Remote Sensing & Informat Engn, Wuhan, Hubei, Peoples R China
[3] Sun Yat Sen Univ, Sch Data & Comp Sci, Guangzhou, Guangdong, Peoples R China
[4] Tencent, Palo Alto, CA 94306 USA
[5] Minist Educ, Key Lab Machine Intelligence & Adv Comp, Beijing, Peoples R China
Keywords
Affective Video Content Analyses; Cross-modal Embedding; Learning Features;
DOI
10.1109/ICME.2019.00150
CLC number
TP31 [Computer Software];
Subject classification codes
081202; 0835;
Abstract
Most existing methods for affective video content analysis focus on a single modality, either visual or audio content, and few attempts have been made to analyze the two signals jointly. In this paper, we employ a cross-modal embedding learning approach to learn compact feature representations of the different modalities that are discriminative for analyzing the emotional attributes of a video. Specifically, we introduce inter-modal and intra-modal similarity constraints that guide the joint embedding learning procedure toward robust features. To capture cues at different granularities, global and local features are extracted from both the visual and audio signals, and a unified framework consisting of global and local feature embedding networks is built for affective video content analysis. Experiments show that our proposed approach significantly outperforms state-of-the-art methods, demonstrating its effectiveness.
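The abstract describes joint embedding learning driven by inter-modal and intra-modal similarity constraints. Below is a minimal, illustrative sketch (not the authors' implementation) of how such constraints could be expressed, assuming a PyTorch setup, margin-based contrastive terms, cosine similarity on L2-normalized embeddings, and emotion-class labels to define similar pairs; the names EmbeddingNet, pairwise_loss, joint_embedding_loss and the margin/weighting values are hypothetical.

```python
# Minimal sketch (assumed formulation, not the paper's code): joint embedding
# with inter-modal and intra-modal similarity constraints, given paired
# visual/audio features and emotion labels for each video.
import torch
import torch.nn as nn
import torch.nn.functional as F

class EmbeddingNet(nn.Module):
    """Projects a modality-specific feature into a shared embedding space."""
    def __init__(self, in_dim, embed_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, embed_dim),
        )

    def forward(self, x):
        return F.normalize(self.net(x), dim=-1)  # unit-length embeddings

def pairwise_loss(emb_a, emb_b, same_emotion, margin=0.5):
    """Pull same-emotion pairs together, push different-emotion pairs apart."""
    dist = 1.0 - (emb_a * emb_b).sum(dim=-1)              # cosine distance matrix
    pos = same_emotion * dist.pow(2)                       # similar pairs: small distance
    neg = (1 - same_emotion) * F.relu(margin - dist).pow(2)  # dissimilar pairs: beyond margin
    return (pos + neg).mean()

def joint_embedding_loss(vis_emb, aud_emb, labels, margin=0.5):
    """Inter-modal (visual vs. audio) plus intra-modal similarity constraints."""
    same = (labels.unsqueeze(0) == labels.unsqueeze(1)).float()  # (B, B) same-emotion mask
    # inter-modal: compare every visual embedding with every audio embedding
    inter = pairwise_loss(vis_emb.unsqueeze(1), aud_emb.unsqueeze(0), same, margin)
    # intra-modal: compare embeddings within each modality
    intra_v = pairwise_loss(vis_emb.unsqueeze(1), vis_emb.unsqueeze(0), same, margin)
    intra_a = pairwise_loss(aud_emb.unsqueeze(1), aud_emb.unsqueeze(0), same, margin)
    return inter + 0.5 * (intra_v + intra_a)
```

In this sketch, the inter-modal term aligns visual and audio embeddings of videos sharing an emotion label, while the intra-modal terms keep each modality's own embedding space discriminative; the paper's actual constraint formulation, feature extractors (global vs. local), and loss weighting may differ.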
Pages: 844-849
Number of pages: 6
Related Papers
50 records in total
  • [41] Cross-Modal Retrieval via Similarity-Preserving Learning and Semantic Average Embedding
    Zhi, Tao
    Fan, Yingchun
    Han, Hong
    IEEE ACCESS, 2020, 8 : 223918 - 223930
  • [42] Learning Joint Embedding with Modality Alignments for Cross-Modal Retrieval of Recipes and Food Images
    Xie, Zhongwei
    Liu, Ling
    Li, Lin
    Zhong, Luo
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT, CIKM 2021, 2021, : 2221 - 2230
  • [43] A multimodal embedding transfer approach for consistent and selective learning processes in cross-modal retrieval
    Zeng, Zhixiong
    He, Shuyi
    Zhang, Yuhao
    Mao, Wenji
    INFORMATION SCIENCES, 2025, 704
  • [44] Cross-modal Manifold Cutmix for Self-supervised Video Representation Learning
    Das, Srijan
    Ryoo, Michael
    2023 18TH INTERNATIONAL CONFERENCE ON MACHINE VISION AND APPLICATIONS, MVA, 2023
  • [45] Self-Supervised Learning by Cross-Modal Audio-Video Clustering
    Alwassel, Humam
    Mahajan, Dhruv
    Korbar, Bruno
    Torresani, Lorenzo
    Ghanem, Bernard
    Tran, Du
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [46] XKD: Cross-Modal Knowledge Distillation with Domain Alignment for Video Representation Learning
    Sarkar, Pritam
    Etemad, Ali
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 13, 2024, : 14875 - 14885
  • [47] Hierarchical Cross-Modal Graph Consistency Learning for Video-Text Retrieval
    Jin, Weike
    Zhao, Zhou
    Zhang, Pengcheng
    Zhu, Jieming
    He, Xiuqiang
    Zhuang, Yueting
    SIGIR '21 - PROCEEDINGS OF THE 44TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2021, : 1114 - 1124
  • [48] Cross-Modal Learning with Adversarial Samples
    Li, Chao
    Deng, Cheng
    Gao, Shangqian
    Xie, De
    Liu, Wei
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [49] Auditory and cross-modal implicit learning
    Green, CD
    Groff, P
    INTERNATIONAL JOURNAL OF PSYCHOLOGY, 1996, 31 (3-4) : 15442 - 15442
  • [50] Continual learning in cross-modal retrieval
    Wang, Kai
    Herranz, Luis
    van de Weijer, Joost
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2021, 2021, : 3623 - 3633