Weak-supervised Visual Geo-localization via Attention-based Knowledge Distillation

Authors
Xu, Yifan [1]
Shamsolmoali, Pourya [1]
Yang, Jie [1]
Affiliations
[1] Shanghai Jiao Tong Univ, Inst Image Proc & Pattern Recognit, Shanghai, Peoples R China
DOI
10.1109/ICPR56361.2022.9956641
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Visual geo-localization aims to estimate the geographical location of a query image by identifying the best-matched reference image in a GPS-tagged database. The task remains challenging because image appearance varies with lighting, scale, and pose. Current approaches perform unsatisfactorily in large-scale environments because they fail to learn discriminative features for image matching. To address this problem, we introduce a practical method that exploits a weakly supervised model with selective transfer for feature distillation. We propose an image matching method that uses image sub-regions to fully exploit the potential of difficult positive images. To improve the network's generalization and performance, the model estimates image-to-region similarity labels through a soft-labeled loss, without additional parameters or manual annotations. Moreover, for effective training, we propose a novel knowledge distillation (KD) method that captures and transfers the knowledge of a teacher network to a student network. More specifically, our method uses an attention network to learn relative similarities within features and uses these similarities to strengthen the distillation intensities, further exploring the potential of difficult positive images. Our model achieves significant localization performance under large appearance variations on three challenging datasets with satisfactory efficiency. Our code is available at https://github.com/XuYifan98/WAKD.
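The two ideas named in the abstract, retrieval against a GPS-tagged database and similarity-weighted distillation from a teacher to a student, can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository linked above for that). The function names, the toy descriptors, the GPS values, and the use of a softmax over teacher similarities as a stand-in for the paper's attention network are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two descriptor vectors (plain lists)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def localize(query_desc, database):
    """Return the GPS tag of the best-matched reference descriptor.

    `database` is a hypothetical list of {"desc": [...], "gps": (lat, lon)}.
    """
    best = max(database, key=lambda e: cosine_similarity(query_desc, e["desc"]))
    return best["gps"]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_kd_loss(teacher_sims, student_sims):
    """Squared-error distillation loss with per-item weights.

    Assumption: the weights are a softmax over the teacher's similarities,
    standing in for the learned attention network described in the abstract.
    """
    weights = softmax(teacher_sims)
    return sum(
        w * (t - s) ** 2
        for w, t, s in zip(weights, teacher_sims, student_sims)
    )


# Toy data: two reference images with made-up descriptors and GPS tags.
toy_database = [
    {"desc": [1.0, 0.0], "gps": (31.0252, 121.4337)},
    {"desc": [0.0, 1.0], "gps": (48.8584, 2.2945)},
]
predicted_gps = localize([0.9, 0.1], toy_database)

# The loss is zero when the student reproduces the teacher's similarities.
loss_matched = weighted_kd_loss([0.9, 0.2], [0.9, 0.2])
loss_mismatched = weighted_kd_loss([0.9, 0.2], [0.5, 0.2])
```

In this sketch, items that the teacher rates as more similar receive larger weights, so student errors on them are penalized more strongly. How the paper actually computes and applies its distillation intensities is not specified in the abstract.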
Pages: 1815-1821
Number of pages: 7