Cross-modal semantic autoencoder with embedding consensus

Cited by: 1
Authors
Sun, Shengzi [1 ,2 ,3 ,4 ,5 ]
Guo, Binghui [1 ,2 ,3 ,4 ,5 ]
Mi, Zhilong [1 ,2 ,3 ,4 ,5 ]
Zheng, Zhiming [1 ,2 ,3 ,4 ,5 ]
Affiliations
[1] Beihang Univ, Beijing Adv Innovat Ctr Big Data & Brain Comp, Beijing 100191, Peoples R China
[2] Beihang Univ, NLSDE, Beijing 100191, Peoples R China
[3] Peng Cheng Lab, Shenzhen 518055, Guangdong, Peoples R China
[4] Beihang Univ, LMIB, Beijing 100191, Peoples R China
[5] Beihang Univ, Sch Math Sci, Beijing 100191, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
DOI
10.1038/s41598-021-92750-7
Chinese Library Classification (CLC)
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Subject Classification Codes
07 ; 0710 ; 09 ;
Abstract
Cross-modal retrieval has become a popular research topic, since multi-modal data are heterogeneous and the similarities between different forms of information deserve attention. Traditional single-modal methods reconstruct the original information but fail to consider the semantic similarity between different kinds of data. In this work, a cross-modal semantic autoencoder with embedding consensus (CSAEC) is proposed, which maps the original data to a low-dimensional shared space that retains semantic information. Considering the similarity between modalities, an autoencoder is used to associate the feature projection with the semantic code vector. In addition, regularization and sparsity constraints are applied to the low-dimensional matrices to balance the reconstruction errors. The high-dimensional data are transformed into semantic code vectors, and the modality-specific models are constrained by shared parameters to achieve denoising. Experiments on four multi-modal data sets show that query results are improved and effective cross-modal retrieval is achieved. Furthermore, CSAEC can also be applied to related fields such as deep learning and subspace learning. The model overcomes the limitations of traditional methods by using deep learning to convert multi-modal data into abstract representations, yielding better accuracy and better recognition results.
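To make the idea described in the abstract concrete, the following is a minimal sketch of a cross-modal semantic autoencoder with an embedding-consensus term. It is not the authors' implementation: the two-modality setup, layer sizes, loss weights (`lam_rec`, `lam_cons`, `lam_sparse`), and the exact consensus formulation are assumptions made purely for illustration.

```python
# Hypothetical sketch of CSAEC-style training objectives (not the paper's released code).
# Each modality gets a small autoencoder whose encoder maps features to a shared
# low-dimensional semantic code; a consensus term pulls paired codes together,
# and L1 sparsity plus reconstruction losses regularize the codes.
import torch
import torch.nn as nn


class ModalityAutoencoder(nn.Module):
    """Projects one modality's features to a semantic code and reconstructs them."""

    def __init__(self, feat_dim: int, code_dim: int):
        super().__init__()
        self.encoder = nn.Linear(feat_dim, code_dim)  # feature projection -> semantic code
        self.decoder = nn.Linear(code_dim, feat_dim)  # semantic code -> reconstruction

    def forward(self, x):
        code = self.encoder(x)
        recon = self.decoder(code)
        return code, recon


def csaec_loss(x_img, x_txt, ae_img, ae_txt,
               lam_rec=1.0, lam_cons=1.0, lam_sparse=1e-3):
    """Reconstruction + embedding consensus + sparsity; all weights are illustrative guesses."""
    code_i, recon_i = ae_img(x_img)
    code_t, recon_t = ae_txt(x_txt)
    rec = ((recon_i - x_img) ** 2).mean() + ((recon_t - x_txt) ** 2).mean()
    consensus = ((code_i - code_t) ** 2).mean()          # paired samples should agree in code space
    sparse = code_i.abs().mean() + code_t.abs().mean()   # L1 sparsity on the semantic codes
    return lam_rec * rec + lam_cons * consensus + lam_sparse * sparse


# Toy usage with random "image" (4096-d) and "text" (300-d) features.
img_dim, txt_dim, code_dim = 4096, 300, 64
ae_img = ModalityAutoencoder(img_dim, code_dim)
ae_txt = ModalityAutoencoder(txt_dim, code_dim)
x_img, x_txt = torch.randn(8, img_dim), torch.randn(8, txt_dim)
loss = csaec_loss(x_img, x_txt, ae_img, ae_txt)
loss.backward()
```

At retrieval time, under the same assumptions, one would encode a query from one modality and rank items of the other modality by distance between their semantic codes.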
Pages: 11