SWformer-VO: A Monocular Visual Odometry Model Based on Swin Transformer

Cited by: 3
Authors
Wu, Zhigang [1 ]
Zhu, Yaohui [1 ]
Affiliations
[1] Jiangxi Univ Sci & Technol, Sch Energy & Mech Engn, Nanchang 330013, People's Republic of China
Keywords
Deep learning; monocular visual odometry; transformer; depth
DOI
10.1109/LRA.2024.3384911
Chinese Library Classification (CLC)
TP24 [Robotics]
Discipline Codes
080202; 1405
Abstract
This letter introduces a novel monocular visual odometry network, named SWformer-VO, that uses the Swin Transformer as its backbone. It directly estimates the six-degrees-of-freedom camera pose from a monocular camera, trained end to end on a modest amount of image sequence data. SWformer-VO introduces an embedding module called "Mixture Embed", which fuses each pair of consecutive images into a single frame and converts it into tokens passed to the backbone network; this replaces traditional temporal-sequence schemes by addressing the problem at the image level. Building on this foundation, the backbone network's parameters are further tuned, and experiments explore how the number of layers and the depth of the backbone affect accuracy. On the KITTI dataset, SWformer-VO achieves higher accuracy than common deep-learning-based methods introduced in recent years, such as SfMLearner, DeepVO, TSformer-VO, Depth-VO-Feat, GeoNet, and Masked GANs. The effectiveness of SWformer-VO is also validated on a self-collected dataset of nine indoor corridor routes for visual odometry.
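The letter does not publish code, but the "Mixture Embed" idea it describes — fusing two consecutive RGB frames into one input and splitting it into flattened patch tokens for the backbone — can be illustrated with a minimal NumPy sketch. The function name, patch size, and channel-concatenation fusion below are assumptions for illustration, not the authors' implementation:

```python
import numpy as np

def mixture_embed(frame_t, frame_t1, patch=4):
    """Sketch of a Mixture-Embed-style step (hypothetical implementation):
    fuse two consecutive (H, W, 3) RGB frames into one 6-channel image,
    then split it into non-overlapping patches flattened into tokens,
    as would be fed to a Swin Transformer backbone."""
    fused = np.concatenate([frame_t, frame_t1], axis=-1)  # (H, W, 6)
    h, w, c = fused.shape
    # carve into (patch x patch) tiles, one flattened token per tile
    tokens = (fused.reshape(h // patch, patch, w // patch, patch, c)
                   .transpose(0, 2, 1, 3, 4)
                   .reshape(-1, patch * patch * c))
    return tokens

# two dummy 8x8 RGB frames: all-zeros at time t, all-ones at time t+1
a = np.zeros((8, 8, 3))
b = np.ones((8, 8, 3))
tok = mixture_embed(a, b)  # 4 tokens, each 4*4*6 = 96 features
```

In a real model these tokens would pass through a learned linear projection before entering the transformer; the point here is only that temporal context is carried by the fused channels, so the backbone sees a single image-level input rather than a sequence.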
Pages: 4766-4773 (8 pages)
Related Papers (50 total)
  • [21] A Monocular Visual-Inertial Odometry Based on Hybrid Residuals
    Lai, Zhenghong
    Gui, Jianjun
    Xu, Dengke
    Dong, Hongbin
    Deng, Baosong
    2020 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2020, : 3304 - 3311
  • [22] Appearance-Based Monocular Visual Odometry for Ground Vehicles
    Yu, Yang
    Pradalier, Cedric
    Zong, Guanghua
    2011 IEEE/ASME INTERNATIONAL CONFERENCE ON ADVANCED INTELLIGENT MECHATRONICS (AIM), 2011, : 862 - 867
  • [23] Monocular Visual Odometry Based on Optical Flow and Feature Matching
    Cheng Chuanqi
    Hao Xiangyang
    Zhang Zhenjie
    Zhao Mandan
    2017 29TH CHINESE CONTROL AND DECISION CONFERENCE (CCDC), 2017, : 4554 - 4558
  • [24] A Visible-Thermal Fusion Based Monocular Visual Odometry
    Poujol, Julien
    Aguilera, Cristhian A.
    Danos, Etienne
    Vintimilla, Boris X.
    Toledo, Ricardo
    Sappa, Angel D.
    ROBOT 2015: SECOND IBERIAN ROBOTICS CONFERENCE: ADVANCES IN ROBOTICS, VOL 1, 2016, 417 : 517 - 528
  • [25] Realtime edge-based visual odometry for a monocular camera
    Tarrio, Juan Jose
    Pedre, Sol
    2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, : 702 - 710
  • [26] Monocular Non-linear Photometric Transformation Visual Odometry Based on Direct Sparse Odometry
    Yuan, Junyi
    Hirota, Kaoru
    Zhang, Zelong
    Dai, Yaping
    2023 35TH CHINESE CONTROL AND DECISION CONFERENCE, CCDC, 2023, : 2682 - 2687
  • [27] D3VO: Deep Depth, Deep Pose and Deep Uncertainty for Monocular Visual Odometry
    Yang, Nan
    von Stumberg, Lukas
    Wang, Rui
    Cremers, Daniel
    2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020, : 1278 - 1289
  • [28] SuperVO: A Monocular Visual Odometry based on Learned Feature Matching with GNN
    Rao, Shi
    2021 IEEE INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS AND COMPUTER ENGINEERING (ICCECE), 2021, : 18 - 26
  • [29] An Unsupervised Monocular Visual Odometry Based on Multi-Scale Modeling
    Zhi, Henghui
    Yin, Chenyang
    Li, Huibin
    Pang, Shanmin
    SENSORS, 2022, 22 (14)
  • [30] Experimental Evaluation of Direct Monocular Visual Odometry Based on Nonlinear Optimization
    Liang, Jian
    Cheng, Xin
    He, Yezhou
    Li, Xiaoli
    Liu, Huashan
    2019 WORLD ROBOT CONFERENCE SYMPOSIUM ON ADVANCED ROBOTICS AND AUTOMATION (WRC SARA 2019), 2019, : 291 - 295