Semi-supervised cross-modal representation learning with GAN-based Asymmetric Transfer Network

Cited by: 1
Authors
Zhang, Lei [1, 2]
Chen, Leiting [1, 2, 3]
Ou, Weihua [4]
Zhou, Chuan [1, 2]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu, Peoples R China
[2] Univ Elect Sci & Technol China, Digital Media Technol Key Lab Sichuan Prov, Chengdu, Peoples R China
[3] Inst Elect & Informat Engn UESTC Guangdong, Dongguan, Peoples R China
[4] Guizhou Normal Univ, Sch Big Data & Comp Sci, Guiyang, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Cross-modal retrieval; Modality gap; Generative adversarial network;
DOI
10.1016/j.jvcir.2020.102899
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In this paper, we propose a semi-supervised common representation learning method based on a GAN-based Asymmetric Transfer Network (GATN) for cross-modal retrieval. GATN uses an asymmetric pipeline to guarantee semantic consistency and adopts a Generative Adversarial Network (GAN) to fit the distributions of different modalities. Specifically, common representation learning across modalities proceeds in two stages: (1) in the first stage, GATN trains a source mapping network to learn the semantic representation of the text modality in a supervised manner; and (2) in the second stage, a GAN-based unsupervised modality transfer method guides the training of the target mapping network, using a generative network (the target mapping network itself) and a discriminative network. Experimental results on three widely used benchmarks show that GATN achieves better performance than several existing state-of-the-art methods.
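The abstract does not state the loss functions used in the second stage. As a rough illustration only, the sketch below assumes the standard GAN objective: a discriminator scores representations from the first-stage source (text) network as "real" and those from the target mapping network as "fake", while the target mapping network, acting as the generator, is trained to fool it. The function names, the real/fake labeling, and the non-saturating generator loss are all assumptions for illustration and may differ from what GATN actually uses.

```python
import math

EPS = 1e-12  # guards against log(0)

def discriminator_loss(d_source, d_target):
    """Binary cross-entropy for a hypothetical discriminator.

    d_source: discriminator outputs in (0, 1) for source (text) representations,
              which are assumed to carry the label 1 ("real").
    d_target: discriminator outputs in (0, 1) for target-network representations,
              which are assumed to carry the label 0 ("fake").
    """
    loss_src = -sum(math.log(p + EPS) for p in d_source) / len(d_source)
    loss_tgt = -sum(math.log(1.0 - p + EPS) for p in d_target) / len(d_target)
    return loss_src + loss_tgt

def generator_loss(d_target):
    """Non-saturating loss for the target mapping network (the generator).

    The loss is small when the discriminator assigns high "real" scores
    to the target-network representations.
    """
    return -sum(math.log(p + EPS) for p in d_target) / len(d_target)

if __name__ == "__main__":
    # Made-up discriminator scores, for illustration only.
    scores_text = [0.9, 0.8]    # source (text) representations
    scores_image = [0.2, 0.1]   # target-network representations
    print("discriminator loss:", discriminator_loss(scores_text, scores_image))
    print("generator loss:", generator_loss(scores_image))
```

In a setup like this, the two losses would be minimized alternately, with the first-stage source network typically held fixed so that the target representations move toward the source distribution rather than the other way around.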
Pages: 9