Cross-modal Representation Learning with Nonlinear Dimensionality Reduction

Cited: 0
Authors
Kaya, Semih [1 ]
Vural, Elif [1 ]
Affiliations
[1] Orta Dogu Tekn Univ, Elektr & Elekt Muhendisligi Bolumu, Ankara, Turkey
Keywords
Cross-modal learning; multi-view learning; nonlinear projections
DOI
Not available
Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronic and communication technology]
Discipline codes
0808; 0809
Abstract
In many machine learning problems, relations exist between data collections from different modalities. The purpose of multi-modal learning algorithms is to use the information present in the different modalities efficiently when solving multi-modal retrieval problems. In this work, a multi-modal representation learning algorithm based on nonlinear dimensionality reduction is proposed. Compared to linear dimensionality reduction methods, nonlinear methods provide more flexible representations, especially when there is high discrepancy between the structures of the different modalities. We propose to align the modalities by mapping same-class training data from different modalities to nearby coordinates, while also learning a Lipschitz-continuous interpolation function that generalizes the learned representation to the whole data space. Experiments in image-text retrieval applications show that the proposed method yields high performance compared to multi-modal learning methods in the literature.
Pages: 4
Related papers
50 items total
  • [1] Cross-Modal Discrete Representation Learning
    Liu, Alexander H.
    Jin, SouYoung
    Lai, Cheng-I Jeff
    Rouditchenko, Andrew
    Oliva, Aude
    Glass, James
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 3013 - 3035
  • [2] Quaternion Representation Learning for cross-modal matching
    Wang, Zheng
    Xu, Xing
    Wei, Jiwei
    Xie, Ning
    Shao, Jie
    Yang, Yang
    KNOWLEDGE-BASED SYSTEMS, 2023, 270
  • [3] Hybrid representation learning for cross-modal retrieval
    Cao, Wenming
    Lin, Qiubin
    He, Zhihai
    He, Zhiquan
    NEUROCOMPUTING, 2019, 345 : 45 - 57
  • [4] Disentangled Representation Learning for Cross-Modal Biometric Matching
    Ning, Hailong
    Zheng, Xiangtao
    Lu, Xiaoqiang
    Yuan, Yuan
    IEEE TRANSACTIONS ON MULTIMEDIA, 2022, 24 : 1763 - 1774
  • [5] Learning Cross-Modal Aligned Representation With Graph Embedding
    Zhang, Youcai
    Cao, Jiayan
    Gu, Xiaodong
    IEEE ACCESS, 2018, 6 : 77321 - 77333
  • [6] Cross-modal Representation Learning for Understanding Manufacturing Procedure
    Hashimoto, Atsushi
    Nishimura, Taichi
    Ushiku, Yoshitaka
    Kameko, Hirotaka
    Mori, Shinsuke
    CROSS-CULTURAL DESIGN-APPLICATIONS IN LEARNING, ARTS, CULTURAL HERITAGE, CREATIVE INDUSTRIES, AND VIRTUAL REALITY, CCD 2022, PT II, 2022, 13312 : 44 - 57
  • [7] Enhanced Multimodal Representation Learning with Cross-modal KD
    Chen, Mengxi
    Xing, Linyu
    Wang, Yu
    Zhang, Ya
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 11766 - 11775
  • [8] Towards Cross-Modal Causal Structure and Representation Learning
    Mao, Haiyi
    Liu, Hongfu
    Dou, Jason Xiaotian
    Benos, Panayiotis V.
    MACHINE LEARNING FOR HEALTH, VOL 193, 2022, 193 : 120 - 140
  • [9] Variational Deep Representation Learning for Cross-Modal Retrieval
    Yang, Chen
    Deng, Zongyong
    Li, Tianyu
    Liu, Hao
    Liu, Libo
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2021, PT II, 2021, 13020 : 498 - 510
  • [10] Multi-grained Representation Learning for Cross-modal Retrieval
    Zhao, Shengwei
    Xu, Linhai
    Liu, Yuying
    Du, Shaoyi
    PROCEEDINGS OF THE 46TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2023, 2023, : 2194 - 2198