Cross-modal semantic autoencoder with embedding consensus

Times cited: 1
Authors
Sun, Shengzi [1 ,2 ,3 ,4 ,5 ]
Guo, Binghui [1 ,2 ,3 ,4 ,5 ]
Mi, Zhilong [1 ,2 ,3 ,4 ,5 ]
Zheng, Zhiming [1 ,2 ,3 ,4 ,5 ]
Affiliations
[1] Beihang Univ, Beijing Adv Innovat Ctr Big Data & Brain Comp, Beijing 100191, Peoples R China
[2] Beihang Univ, NLSDE, Beijing 100191, Peoples R China
[3] Peng Cheng Lab, Shenzhen 518055, Guangdong, Peoples R China
[4] Beihang Univ, LMIB, Beijing 100191, Peoples R China
[5] Beihang Univ, Sch Math Sci, Beijing 100191, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
DOI
10.1038/s41598-021-92750-7
Chinese Library Classification (CLC)
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Subject classification code
07; 0710; 09;
Abstract
Cross-modal retrieval has become a popular topic, since multimodal data are heterogeneous and the similarities between different forms of information deserve attention. Traditional single-modal methods reconstruct the original information but do not consider the semantic similarity between different kinds of data. In this work, a cross-modal semantic autoencoder with embedding consensus (CSAEC) is proposed, which maps the original data to a low-dimensional shared space that retains semantic information. To account for the similarity between modalities, an autoencoder is used to associate each feature projection with a semantic code vector. In addition, regularization and sparsity constraints are applied to the low-dimensional matrices to balance the reconstruction errors. The high-dimensional data are thus transformed into semantic code vectors, and the parameter constraints shared across the modality-specific models provide denoising. Experiments on four multimodal data sets show that query results are improved and effective cross-modal retrieval is achieved. CSAEC can also be applied to related fields such as deep learning and subspace learning. The model overcomes obstacles of traditional methods by using deep learning to convert multimodal data into abstract representations, yielding better accuracy and recognition results.
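The abstract describes the method only at a high level: per-modality features are projected into a shared low-dimensional semantic space, an autoencoder ties the projections to semantic code vectors, and regularization and sparsity constraints balance the reconstruction errors. The listing below is a minimal illustrative sketch of that idea in Python/NumPy, assuming a linear formulation with ridge-regularized encoders and decoders, a shared ("consensus") code matrix, and a soft-thresholding step for sparsity; the function names, the alternating solver, and all hyperparameters are assumptions made for illustration and are not taken from the paper.

# Rough sketch of a linear cross-modal autoencoder with a shared ("consensus")
# semantic code, in the spirit of CSAEC as summarized in the abstract.
# The paper's exact objective and solver are not reproduced here; everything
# below is an illustrative assumption.
import numpy as np

def soft_threshold(a, t):
    """Element-wise soft thresholding, used as a simple sparsity step."""
    return np.sign(a) * np.maximum(np.abs(a) - t, 0.0)

def csaec_fit(Xs, k, alpha=1.0, beta=0.1, gamma=0.01, n_iters=30, seed=0):
    """Alternately learn per-modality encoders W_m, decoders D_m, and a shared
    low-dimensional semantic code S.

    Xs    : list of arrays X_m with shape (d_m, n) -- same n paired samples per modality
    k     : dimension of the shared semantic space
    alpha : weight tying each encoding W_m X_m to the consensus code S
    beta  : ridge regularization on W_m and D_m
    gamma : sparsity level applied to S
    """
    rng = np.random.default_rng(seed)
    n = Xs[0].shape[1]
    M = len(Xs)
    S = rng.standard_normal((k, n)) * 0.01           # shared semantic code
    Ws = [np.zeros((k, X.shape[0])) for X in Xs]     # encoders
    Ds = [np.zeros((X.shape[0], k)) for X in Xs]     # decoders

    for _ in range(n_iters):
        for m, X in enumerate(Xs):
            d = X.shape[0]
            # Encoder: ridge regression of S onto X_m (projection to semantic space)
            Ws[m] = S @ X.T @ np.linalg.inv(X @ X.T + (beta / alpha) * np.eye(d))
            # Decoder: ridge regression of X_m onto S (reconstruction term)
            Ds[m] = X @ S.T @ np.linalg.inv(S @ S.T + beta * np.eye(k))
        # Consensus code: least-squares balance of reconstruction and encoding
        # terms across modalities, followed by soft thresholding for sparsity.
        A = M * alpha * np.eye(k)
        B = np.zeros((k, n))
        for m, X in enumerate(Xs):
            A += Ds[m].T @ Ds[m]
            B += Ds[m].T @ X + alpha * Ws[m] @ X
        S = soft_threshold(np.linalg.solve(A, B), gamma)
    return Ws, Ds, S

def cross_modal_retrieval(Wq, Xq, Wg, Xg):
    """Rank gallery items of one modality for queries from another by cosine
    similarity in the shared semantic space."""
    q = Wq @ Xq
    g = Wg @ Xg
    q /= np.linalg.norm(q, axis=0, keepdims=True) + 1e-12
    g /= np.linalg.norm(g, axis=0, keepdims=True) + 1e-12
    return np.argsort(-(q.T @ g), axis=1)            # best-matching gallery indices first

if __name__ == "__main__":
    # Toy paired data: 200 samples seen as 512-d "image" and 300-d "text" features.
    rng = np.random.default_rng(1)
    Z = rng.standard_normal((20, 200))               # hidden common factors
    X_img = rng.standard_normal((512, 20)) @ Z + 0.1 * rng.standard_normal((512, 200))
    X_txt = rng.standard_normal((300, 20)) @ Z + 0.1 * rng.standard_normal((300, 200))
    Ws, Ds, S = csaec_fit([X_img, X_txt], k=20)
    ranks = cross_modal_retrieval(Ws[0], X_img[:, :5], Ws[1], X_txt)
    print(ranks[:, :3])                              # top-3 text matches for 5 image queries

In this toy setup, retrieval mirrors the use described in the abstract: queries and gallery items from different modalities are projected into the shared code space with their respective encoders and ranked by cosine similarity.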
Pages: 11