Cross-modal semantic autoencoder with embedding consensus

Cited by: 1
Authors
Sun, Shengzi [1 ,2 ,3 ,4 ,5 ]
Guo, Binghui [1 ,2 ,3 ,4 ,5 ]
Mi, Zhilong [1 ,2 ,3 ,4 ,5 ]
Zheng, Zhiming [1 ,2 ,3 ,4 ,5 ]
Affiliations
[1] Beihang Univ, Beijing Adv Innovat Ctr Big Data & Brain Comp, Beijing 100191, Peoples R China
[2] Beihang Univ, NLSDE, Beijing 100191, Peoples R China
[3] Peng Cheng Lab, Shenzhen 518055, Guangdong, Peoples R China
[4] Beihang Univ, LMIB, Beijing 100191, Peoples R China
[5] Beihang Univ, Sch Math Sci, Beijing 100191, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
DOI
10.1038/s41598-021-92750-7
Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Subject classification code
07 ; 0710 ; 09 ;
Abstract
Cross-modal retrieval has become a popular topic, since multi-modal data is heterogeneous and the similarities between different forms of information deserve attention. Traditional single-modal methods reconstruct the original information but fail to consider the semantic similarity between different data. In this work, a cross-modal semantic autoencoder with embedding consensus (CSAEC) is proposed, which maps the original data to a low-dimensional shared space so as to retain semantic information. To account for the similarity between the modalities, an autoencoder is used to associate the feature projection with the semantic code vector. In addition, regularization and sparsity constraints are applied to the low-dimensional matrices to balance the reconstruction errors. The high-dimensional data is transformed into semantic code vectors, and the different modalities are constrained by shared parameters to achieve denoising. Experiments on four multi-modal data sets show that the query results are improved and effective cross-modal retrieval is achieved. Further, CSAEC can also be applied to related fields such as deep learning and subspace learning. The model overcomes the limitations of traditional methods by using deep learning to convert multi-modal data into abstract representations, which yields better accuracy and better recognition results.
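The abstract's exact formulation of CSAEC is not given on this page. As a rough illustration of the general idea it describes, a linear semantic autoencoder that ties a feature projection to a semantic code vector can be fitted in closed form via a Sylvester equation, in the spirit of the classic semantic autoencoder. The following is a minimal sketch, not the authors' method; the function name and the weight `lam` are illustrative.

```python
import numpy as np
from scipy.linalg import solve_sylvester

def semantic_autoencoder(X, S, lam=0.5):
    """Fit a linear semantic autoencoder with tied encoder/decoder.

    Minimizes ||X - W.T @ S||^2 + lam * ||W @ X - S||^2 over the
    projection W, which reduces to the Sylvester equation
        A @ W + W @ B = C
    with A = S S^T, B = lam * X X^T, C = (1 + lam) * S X^T.

    X : (d, n) feature matrix (one modality, n samples)
    S : (k, n) semantic code vectors shared across modalities
    Returns W : (k, d) encoder; W.T acts as the decoder.
    """
    A = S @ S.T
    B = lam * (X @ X.T)
    C = (1.0 + lam) * (S @ X.T)
    return solve_sylvester(A, B, C)

# Example: project 20-dim features of 50 samples onto 5-dim codes.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 50))
S = rng.normal(size=(5, 50))
W = semantic_autoencoder(X, S)          # shape (5, 20)
codes = W @ X                           # semantic codes for retrieval
```

Fitting one such encoder per modality, with `S` shared, gives a common semantic space in which nearest-neighbor search over `codes` implements cross-modal retrieval.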
Pages: 11
Related papers
50 records
  • [31] Correlation Autoencoder Hashing for Supervised Cross-Modal Search
    Cao, Yue
    Long, Mingsheng
    Wang, Jianmin
    Zhu, Han
    ICMR'16: PROCEEDINGS OF THE 2016 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, 2016, : 197 - 204
  • [32] Hierarchical Consensus Hashing for Cross-Modal Retrieval
    Sun, Yuan
    Ren, Zhenwen
    Hu, Peng
    Peng, Dezhong
    Wang, Xu
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 824 - 836
  • [33] Multi-view visual semantic embedding for cross-modal image-text retrieval
    Li, Zheng
    Guo, Caili
    Wang, Xin
    Zhang, Hao
    Hu, Lin
    PATTERN RECOGNITION, 2025, 159
  • [34] Deep Semantic Mapping for Cross-Modal Retrieval
    Wang, Cheng
    Yang, Haojin
    Meinel, Christoph
    2015 IEEE 27TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2015), 2015, : 234 - 241
  • [35] Semantic consistency hashing for cross-modal retrieval
    Yao, Tao
    Kong, Xiangwei
    Fu, Haiyan
    Tian, Qi
    NEUROCOMPUTING, 2016, 193 : 250 - 259
  • [36] CROSS-MODAL SEMANTIC INTEGRATION IN CHILDRENS MEMORY
    MURRAY, S
    BULLETIN OF THE BRITISH PSYCHOLOGICAL SOCIETY, 1982, 35 (SEP): : A66 - A66
  • [37] Analyzing semantic correlation for cross-modal retrieval
    Liang Xie
    Peng Pan
    Yansheng Lu
    Multimedia Systems, 2015, 21 : 525 - 539
  • [38] Analyzing semantic correlation for cross-modal retrieval
    Xie, Liang
    Pan, Peng
    Lu, Yansheng
    MULTIMEDIA SYSTEMS, 2015, 21 (06) : 525 - 539
  • [39] Semantic Guidance Fusion Network for Cross-Modal Semantic Segmentation
    Zhang, Pan
    Chen, Ming
    Gao, Meng
    SENSORS, 2024, 24 (08)
  • [40] Cross-modal semantic transfer for point cloud semantic segmentation
    Cao, Zhen
    Mi, Xiaoxin
    Qiu, Bo
    Cao, Zhipeng
    Long, Chen
    Yan, Xinrui
    Zheng, Chao
    Dong, Zhen
    Yang, Bisheng
    ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING, 2025, 221 : 265 - 279