Visual SLAM method for dynamic environment based on deep learning image features

Cited by: 0
Authors
Liu D. [1 ]
Yu T. [1 ]
Cong M. [1 ]
Du Y. [2 ]
Affiliations
[1] School of Mechanical Engineering, Dalian University of Technology, Dalian, Liaoning
[2] School of Mechanical Engineering, Dalian Jiaotong University, Dalian, Liaoning
Keywords
attention; deep learning; feature extraction; multi-task distillation; visual SLAM
DOI
10.13245/j.hust.240658
Abstract
Aiming at the problem that traditional visual simultaneous localization and mapping (SLAM) algorithms rely on handcrafted features, which are not stable under dynamic objects and changing illumination conditions and are prone to losing tracking, a stable and real-time method for image feature extraction and matching based on deep learning was presented. A neural network model with an attention mechanism was trained through multi-task distillation to realize feature extraction and matching in scenes with dramatic illumination changes. Building on the global and local features, a relocalization method based on hierarchical features was proposed to improve the overall accuracy and stability of the system while remaining real-time. Feature extraction and matching tests were performed against Superpoint on images of the same scene under different illumination and viewing angles, and localization accuracy tests were performed on the TUM datasets against ORB SLAM2 and GCN SLAM. Results show that the proposed method extracts sufficient stable features under drastic illumination changes and outperforms the other two methods on fr3/sitting_static and fr3/walking_static, with trajectory root mean square errors of 6.131 mm and 124.493 mm, respectively. Finally, sparse mapping was carried out in real indoor environments, and the effectiveness of the improved relocalization method was verified. © 2024 Huazhong University of Science and Technology. All rights reserved.
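The hierarchical (coarse-to-fine) relocalization described in the abstract can be illustrated with a short sketch. The following is a minimal, hypothetical Python sketch, not the paper's implementation: the keyframe fields (global_desc, local_desc, points3d, K), the extractor interface, and all thresholds are assumptions introduced here for illustration.

```python
# Hypothetical sketch of coarse-to-fine relocalization: rank keyframes by
# global-descriptor similarity, then verify the best candidates with local
# feature matching and PnP + RANSAC. Field and function names are assumed.
import numpy as np
import cv2

def relocalize(query_img, keyframes, extractor, k=5):
    """keyframes: objects with assumed fields global_desc (D,),
    local_desc (M, 256), points3d (M, 3), and camera matrix K (3, 3).
    extractor: assumed to return (global_desc, keypoints_xy, descriptors)."""
    q_global, q_kpts, q_desc = extractor(query_img)

    # Coarse stage: cosine similarity of global image descriptors.
    sims = np.array([
        np.dot(q_global, kf.global_desc)
        / (np.linalg.norm(q_global) * np.linalg.norm(kf.global_desc))
        for kf in keyframes
    ])
    candidates = [keyframes[i] for i in np.argsort(sims)[::-1][:k]]

    # Fine stage: local descriptor matching, then PnP + RANSAC.
    matcher = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True)
    for kf in candidates:
        matches = matcher.match(q_desc.astype(np.float32),
                                kf.local_desc.astype(np.float32))
        if len(matches) < 20:
            continue
        pts3d = np.float32([kf.points3d[m.trainIdx] for m in matches])
        pts2d = np.float32([q_kpts[m.queryIdx] for m in matches])
        ok, rvec, tvec, inliers = cv2.solvePnPRansac(pts3d, pts2d, kf.K, None)
        if ok and inliers is not None and len(inliers) > 15:
            return rvec, tvec  # pose of the query frame w.r.t. the map
    return None  # relocalization failed
```

The two-stage design keeps the method real-time: the cheap global-descriptor ranking prunes the keyframe database before the expensive local matching and geometric verification run on only k candidates.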
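The trajectory root mean square errors reported above follow the TUM benchmark's absolute trajectory error (ATE) protocol. A minimal sketch of that computation, assuming time-associated (N, 3) position arrays and a rigid (Kabsch, no-scale) alignment:

```python
# Minimal sketch of TUM-style ATE RMSE (assumed evaluation protocol):
# rigidly align the estimated trajectory to ground truth, then take the
# RMSE of the per-pose position residuals.
import numpy as np

def ate_rmse(est, gt):
    """est, gt: (N, 3) arrays of time-associated camera positions."""
    mu_e, mu_g = est.mean(axis=0), gt.mean(axis=0)
    H = (est - mu_e).T @ (gt - mu_g)          # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))    # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T   # optimal rotation (Kabsch)
    t = mu_g - R @ mu_e                       # optimal translation
    residuals = gt - (est @ R.T + t)          # alignment error per pose
    return np.sqrt((residuals ** 2).sum(axis=1).mean())
```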
Pages: 156-163
Number of pages: 7
References
17 items in total
  • [1] 48, 9, pp. 25-30, (2020)
  • [2] CHEUNG W, HAMARNEH G. n-SIFT: n-dimensional scale invariant feature transform[J]. IEEE Transactions on Image Processing, 18, 9, pp. 2012-2021, (2009)
  • [3] RUBLEE E, RABAUD V, KONOLIGE K, et al. ORB: an efficient alternative to SIFT or SURF[C]. Proc of 2011 International Conference on Computer Vision, pp. 2564-2571, (2011)
  • [4] SARLIN P E, CADENA C, SIEGWART R, et al. From coarse to fine: robust hierarchical localization at large scale[C]. Proc of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12716-12725, (2019)
  • [5] DETONE D, MALISIEWICZ T, RABINOVICH A. Superpoint: self-supervised interest point detection and description[C]. Proc of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 224-236, (2018)
  • [6] TANG J, ERICSON L, FOLKESSON J, et al. GCNv2: efficient correspondence prediction for real-time SLAM[J]. IEEE Robotics and Automation Letters, 4, 4, pp. 3505-3512, (2019)
  • [7] TEED Z, DENG J. DROID-SLAM: deep visual SLAM for monocular, stereo, and RGB-D cameras[J]. Advances in Neural Information Processing Systems, 34, pp. 16558-16569, (2021)
  • [8] SU P, LUO S, HUANG X. Real-time dynamic SLAM algorithm based on deep learning[J]. IEEE Access, 10, pp. 87754-87766, (2022)
  • [9] 48, 1, pp. 16-23
  • [10] GÁLVEZ-LÓPEZ D, TARDÓS J D. Bags of binary words for fast place recognition in image sequences[J]. IEEE Transactions on Robotics, 28, 5, pp. 1188-1197, (2012)