DSAP: Dynamic Sparse Attention Perception Matcher for Accurate Local Feature Matching

Authors
Dai, Kun [1]
Wang, Ke [2, 3]
Xie, Tao [1, 4]
Sun, Tao [4]
Zhang, Jinhang [1]
Kong, Qingjia [1]
Jiang, Zhiqiang [1]
Li, Ruifeng [1]
Zhao, Lijun [2, 3]
Omar, Mohamed [1]
Affiliations
[1] Harbin Inst Technol, State Key Lab Robot & Syst, Harbin 150006, Peoples R China
[2] Harbin Inst Technol, State Key Lab Robot & Syst, Harbin 150006, Peoples R China
[3] Harbin Inst Technol, Zhengzhou Res Inst, Harbin 150006, Peoples R China
[4] Yangtze River Delta HIT Robot Technol Res Inst, Wuhu 241000, Peoples R China
Keywords
Deep learning; dynamic attention perception; local feature matching; relative pose estimation; sparse attention; visual localization
DOI
10.1109/TIM.2024.3370781
Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract
Local feature matching, which establishes correspondences between image pairs, is a pivotal component of many visual applications. Current transformer-based methods perform remarkably well, but they alternate self- and cross-attention in a fixed, predetermined order without considering which should take priority, so visual descriptors are not enhanced adequately. Moreover, when computing attention matrices to integrate global context, current methods explicitly model only the correlation among feature channels and ignore channel importance, which leaves message propagation insufficient. In this work, we develop a dynamic sparse attention perception (DSAP) matcher to address both issues. For the first issue, DSAP presents a dynamic perception strategy (DPS) that lets the network perform feature enhancement dynamically by modifying both forward and backward propagation. During forward propagation, DPS assigns a learnable perception score to each transformer layer and uses an exponential moving average (EMA) algorithm to calculate the current score. DPS then binarizes the score with an indicator function, allowing DSAP to decide adaptively whether to use self- or cross-attention at the current iteration. During backward propagation, DPS employs a gradient estimator that adjusts the gradient of the perception scores, thus rendering them differentiable. For the second issue, DSAP introduces a weighted sparse transformer (WSFormer) that recalibrates attention matrices by considering channel importance and channel correlation together. WSFormer predicts attention vectors to weight the attention matrices while constructing multiple sparse attention matrices to integrate various global messages, thus highlighting informative channels and inhibiting redundant message propagation.
Extensive experiments on public datasets and in real environments demonstrate that DSAP achieves exceptional performance across various downstream tasks, including relative pose estimation and visual localization. The code is available at https://github.com/mooncake199809/DSAP.
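
The DPS steps described in the abstract (a learnable score per layer, an EMA update, an indicator function, and a gradient estimator) can be illustrated with a minimal, dependency-free Python sketch. This is not the authors' implementation. The class name `PerceptionGate`, the helper `choose_attention`, the momentum of 0.9, the threshold of 0, and the convention that 1 selects self-attention are all assumptions made for illustration. The abstract does not say which gradient estimator is used; a straight-through estimator is swapped in here as one common choice.

```python
from dataclasses import dataclass


@dataclass
class PerceptionGate:
    """Hypothetical per-layer gate; all names and defaults are illustrative."""

    score: float = 0.0      # learnable perception score for this layer
    ema: float = 0.0        # running exponential moving average of the score
    momentum: float = 0.9   # assumed EMA decay factor
    threshold: float = 0.0  # assumed threshold of the indicator function

    def forward(self) -> int:
        # Forward pass, step 1: fold the current score into the EMA.
        self.ema = self.momentum * self.ema + (1.0 - self.momentum) * self.score
        # Forward pass, step 2: binarize with an indicator function.
        # Assumed convention: 1 selects self-attention, 0 selects cross-attention.
        return 1 if self.ema >= self.threshold else 0

    def backward(self, grad_gate: float) -> float:
        # Backward pass: the indicator has zero gradient almost everywhere,
        # so a straight-through estimator (an assumption) treats it as the
        # identity. The remaining factor is d(ema)/d(score) = 1 - momentum.
        return grad_gate * (1.0 - self.momentum)


def choose_attention(gate: PerceptionGate) -> str:
    """Map the binary gate output to the attention type used by a layer."""
    return "self" if gate.forward() == 1 else "cross"


if __name__ == "__main__":
    gate = PerceptionGate(score=1.0)
    print(choose_attention(gate))        # attention type chosen for this pass
    print(gate.backward(grad_gate=2.0))  # estimated gradient w.r.t. the score
```

In a real network the gate output would select which attention module a layer runs, and the estimated gradient would be passed to the optimizer that updates `score`. An autograd framework would normally handle the backward step through a custom function rather than a hand-written method.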
Pages: 1-16
Number of pages: 16