Deep Cross-Modal Retrieval for Remote Sensing Image and Audio

被引:0
|
作者
Guo Mao [1 ,2 ]
Yuan Yuan [1 ]
Lu Xiaoqiang [1 ]
机构
[1] Chinese Acad Sci, Xian Inst Opt & Precis Mech, Ctr Opt IMagery Anal & Learning OPTIMAL, Xian 710119, Shaanxi, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
关键词
cross-modal retrieval; remote sensing image; spoken audio; convolutional neural network; CONVOLUTIONAL NEURAL-NETWORKS;
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Remote sensing image retrieval has many important applications in civilian and military fields, such as disaster monitoring and target detecting. However, the existing research on image retrieval, mainly including to two directions, text based and content based, cannot meet the rapid and convenient needs of some special applications and emergency scenes. Based on text, the retrieval is limited by keyboard inputting because of its lower efficiency for some urgent situations and based on content, it needs an example image as reference, which usually does not exist. Yet speech, as a direct, natural and efficient human-machine interactive way, can make up these shortcomings. Hence, a novel cross-modal retrieval method for remote sensing image and spoken audio is proposed in this paper. We first build a large-scale remote sensing image dataset with plenty of manual annotated spoken audio captions for the cross-modal retrieval task. Then a Deep Visual-Audio Network is designed to directly learn the correspondence of image and audio. And this model integrates feature extracting and multi-modal learning into the same network. Experiments on the proposed dataset verify the effectiveness of our approach and prove that it is feasible for speech-to-image retrieval.
引用
收藏
页数:6
相关论文
共 50 条
  • [31] A TEXTURE AND SALIENCY ENHANCED IMAGE LEARNING METHOD FOR CROSS-MODAL REMOTE SENSING IMAGE-TEXT RETRIEVAL
    Yang, Rui
    Zhang, Di
    Guo, YanHe
    Wang, Shuang
    IGARSS 2023 - 2023 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM, 2023, : 4895 - 4898
  • [32] Deep Supervised Cross-modal Retrieval
    Zhen, Liangli
    Hu, Peng
    Wang, Xu
    Peng, Dezhong
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 10386 - 10395
  • [33] Scale-Aware Adaptive Refinement and Cross-Interaction for Remote Sensing Audio-Visual Cross-Modal Retrieval
    Chen, Yaxiong
    Du, Chuang
    Zi, Yunfei
    Xiong, Shengwu
    Lu, Xiaoqiang
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62
  • [34] Interacting-Enhancing Feature Transformer for Cross-Modal Remote-Sensing Image and Text Retrieval
    Tang, Xu
    Wang, Yijing
    Ma, Jingjing
    Zhang, Xiangrong
    Liu, Fang
    Jiao, Licheng
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61
  • [35] MULTI-SCALE INTERACTIVE TRANSFORMER FOR REMOTE SENSING CROSS-MODAL IMAGE-TEXT RETRIEVAL
    Wang, Yijing
    Ma, Jingjing
    Li, Mingteng
    Tang, Xu
    Han, Xiao
    Jiao, Licheng
    2022 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS 2022), 2022, : 839 - 842
  • [36] Audio-to-Image Cross-Modal Generation
    Zelaszczyk, Maciej
    Mandziuk, Jacek
    2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022,
  • [37] Deep Cross-Modal Retrieval Between Spatial Image and Acoustic Speech
    Qian, Xinyuan
    Xue, Wei
    Zhang, Qiquan
    Tao, Ruijie
    Li, Haizhou
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 (4480-4489) : 4480 - 4489
  • [38] Deep Adversarial Cascaded Hashing for Cross-Modal Vessel Image Retrieval
    Guo, Jiaen
    Guan, Xin
    IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2023, 16 : 2205 - 2220
  • [39] Cross-Modal Compositional Learning for Multilabel Remote Sensing Image Classification
    Guo, Jie
    Jiao, Shuchang
    Sun, Hao
    Song, Bin
    Chi, Yuhao
    IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2025, 18 : 5810 - 5823
  • [40] MCRN: A Multi-source Cross-modal Retrieval Network for remote sensing
    Yuan, Zhiqiang
    Zhang, Wenkai
    Tian, Changyuan
    Mao, Yongqiang
    Zhou, Ruixue
    Wang, Hongqi
    Fu, Kun
    Sun, Xian
    INTERNATIONAL JOURNAL OF APPLIED EARTH OBSERVATION AND GEOINFORMATION, 2022, 115