Cross-modal Representation Learning with Nonlinear Dimensionality Reduction

Cited by: 0

Authors:

Kaya, Semih [1]

Vural, Elif [1]

Affiliation:

[1] Orta Dogu Tekn Univ, Elektr & Elekt Muhendisligi Bolumu, Ankara, Turkey

Source:

2019 27TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU) | 2019

Keywords:

Cross-modal learning; multi-view learning; nonlinear projections

DOI:

Not available

Chinese Library Classification (CLC) number:

TM [Electrical Engineering]; TN [Electronics and Communication Technology]

Discipline classification code:

0808; 0809

Abstract:

Many machine learning problems involve related data collections from different modalities. Multi-modal learning algorithms aim to use the information present in the different modalities efficiently when solving multi-modal retrieval problems. In this work, a multi-modal representation learning algorithm based on nonlinear dimensionality reduction is proposed. Compared to linear dimensionality reduction methods, nonlinear methods provide more flexible representations, especially when there is a high discrepancy between the structures of the different modalities. We propose to align the modalities by mapping same-class training data from different modalities to nearby coordinates, while also learning a Lipschitz-continuous interpolation function that generalizes the learnt representation to the whole data space. Experiments in image-text retrieval applications show that the proposed method yields high performance compared to multi-modal learning methods in the literature.
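The alignment idea in the abstract can be illustrated with a small toy sketch. This is not the authors' algorithm: the abstract does not give the objective or the interpolator, so the sketch assumes fixed one-hot target coordinates per class (the paper learns the coordinates) and uses a regularized Gaussian RBF fit as a stand-in for the Lipschitz-continuous interpolation function. All names (`gaussian_kernel`, `fit_interpolator`, `make_modality`), the synthetic cluster data, and the parameters `sigma` and `reg` are hypothetical choices made for illustration.

```python
import numpy as np

def gaussian_kernel(A, B, sigma):
    """Pairwise Gaussian kernel values between the rows of A and the rows of B."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def fit_interpolator(X, Y, sigma=1.0, reg=1e-2):
    """Fit a smooth map f with f(X[i]) close to Y[i].

    Assumption: a ridge-regularized Gaussian RBF fit stands in for the
    interpolation function mentioned in the abstract.
    """
    K = gaussian_kernel(X, X, sigma)
    coeffs = np.linalg.solve(K + reg * np.eye(len(X)), Y)
    # The returned function extends the embedding to points outside the training set.
    return lambda Z: gaussian_kernel(Z, X, sigma) @ coeffs

def make_modality(rng, centers, n_per_class, noise=0.3):
    """Synthetic toy data: one Gaussian cluster per class around the given centers."""
    X = np.vstack([c + noise * rng.standard_normal((n_per_class, len(c)))
                   for c in centers])
    labels = np.repeat(np.arange(len(centers)), n_per_class)
    return X, labels

rng = np.random.default_rng(0)
centers_a = np.array([[-2.0, -2.0], [2.0, 2.0]])          # hypothetical 2-D "image" features
centers_b = np.array([[3.0, 0.0, 0.0], [0.0, 3.0, 0.0]])  # hypothetical 3-D "text" features

Xa_train, ya_train = make_modality(rng, centers_a, 20)
Xb_train, yb_train = make_modality(rng, centers_b, 20)
Xa_test, ya_test = make_modality(rng, centers_a, 10)
Xb_test, yb_test = make_modality(rng, centers_b, 10)

# Same-class samples from both modalities share one target coordinate
# (assumed fixed here; one-hot vectors in a 2-D common space).
targets = np.eye(2)
f_a = fit_interpolator(Xa_train, targets[ya_train])
f_b = fit_interpolator(Xb_train, targets[yb_train])

# Cross-modal retrieval: embed unseen samples from both modalities, then
# return the nearest modality-B item for each modality-A query.
emb_a = f_a(Xa_test)
emb_b = f_b(Xb_test)
dists = np.linalg.norm(emb_a[:, None, :] - emb_b[None, :, :], axis=-1)
retrieved = yb_test[dists.argmin(axis=1)]
accuracy = float((retrieved == ya_test).mean())
print(f"cross-modal top-1 accuracy: {accuracy:.2f}")
```

In this sketch each modality gets its own map into the common space, so the two feature spaces may have different dimensions. The regularization term `reg` trades exact interpolation of the training targets for a smoother map.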

Pages: 4
