SWformer-VO: A Monocular Visual Odometry Model Based on Swin Transformer

Cited by: 3
Authors
Wu, Zhigang [1 ]
Zhu, Yaohui [1 ]
Affiliations
[1] Jiangxi Univ Sci & Technol, Sch Energy & Mech Engn, Nanchang 330013, People's Republic of China
Keywords
Deep learning; monocular visual odometry; transformer; depth
DOI
10.1109/LRA.2024.3384911
Chinese Library Classification (CLC)
TP24 [Robotics]
Discipline Codes
080202; 1405
Abstract
This letter introduces a novel monocular visual odometry network, named SWformer-VO, that uses the Swin Transformer as its backbone. It directly estimates the six-degrees-of-freedom camera pose from a monocular camera, trained end to end on a modest amount of image sequence data. SWformer-VO introduces an embedding module called "Mixture Embed", which fuses each pair of consecutive images into a single frame and converts it into tokens passed to the backbone network; this replaces traditional temporal-sequence schemes by addressing the problem at the image level. Building on this foundation, the backbone network's parameters are further tuned, and experiments explore how the number of layers and the depth of the backbone affect accuracy. On the KITTI dataset, SWformer-VO achieves higher accuracy than common deep-learning-based methods introduced in recent years, such as SfMLearner, DeepVO, TSformer-VO, Depth-VO-Feat, GeoNet, and Masked GANs. The effectiveness of SWformer-VO is also validated on a self-collected dataset of nine indoor corridor routes for visual odometry.
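The letter does not publish code, but the "Mixture Embed" idea it describes — fusing two consecutive RGB frames into one input and splitting it into flattened patch tokens for the backbone — can be illustrated with a minimal NumPy sketch. The function name, patch size, and channel-concatenation fusion below are assumptions for illustration, not the authors' implementation:

```python
import numpy as np

def mixture_embed(frame_t, frame_t1, patch=4):
    """Sketch of a Mixture-Embed-style step (hypothetical implementation):
    fuse two consecutive (H, W, 3) RGB frames into one 6-channel image,
    then split it into non-overlapping patches flattened into tokens,
    as would be fed to a Swin Transformer backbone."""
    fused = np.concatenate([frame_t, frame_t1], axis=-1)  # (H, W, 6)
    h, w, c = fused.shape
    # carve into (patch x patch) tiles, one flattened token per tile
    tokens = (fused.reshape(h // patch, patch, w // patch, patch, c)
                   .transpose(0, 2, 1, 3, 4)
                   .reshape(-1, patch * patch * c))
    return tokens

# two dummy 8x8 RGB frames: all-zeros at time t, all-ones at time t+1
a = np.zeros((8, 8, 3))
b = np.ones((8, 8, 3))
tok = mixture_embed(a, b)  # 4 tokens, each 4*4*6 = 96 features
```

In a real model these tokens would pass through a learned linear projection before entering the transformer; the point here is only that temporal context is carried by the fused channels, so the backbone sees a single image-level input rather than a sequence.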
Pages: 4766-4773 (8 pages)
Related Papers (50 total)
  • [21] A Monocular Visual-Inertial Odometry Based on Hybrid Residuals
    Lai, Zhenghong
    Gui, Jianjun
    Xu, Dengke
    Dong, Hongbin
    Deng, Baosong
    2020 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2020, : 3304 - 3311
  • [22] Appearance-Based Monocular Visual Odometry for Ground Vehicles
    Yu, Yang
    Pradalier, Cedric
    Zong, Guanghua
    2011 IEEE/ASME INTERNATIONAL CONFERENCE ON ADVANCED INTELLIGENT MECHATRONICS (AIM), 2011, : 862 - 867
  • [23] Monocular Visual Odometry Based on Optical Flow and Feature Matching
    Cheng Chuanqi
    Hao Xiangyang
    Zhang Zhenjie
    Zhao Mandan
    2017 29TH CHINESE CONTROL AND DECISION CONFERENCE (CCDC), 2017, : 4554 - 4558
  • [24] A Visible-Thermal Fusion Based Monocular Visual Odometry
    Poujol, Julien
    Aguilera, Cristhian A.
    Danos, Etienne
    Vintimilla, Boris X.
    Toledo, Ricardo
    Sappa, Angel D.
    ROBOT 2015: SECOND IBERIAN ROBOTICS CONFERENCE: ADVANCES IN ROBOTICS, VOL 1, 2016, 417 : 517 - 528
  • [25] Realtime edge-based visual odometry for a monocular camera
    Tarrio, Juan Jose
    Pedre, Sol
    2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, : 702 - 710
  • [26] Monocular Non-linear Photometric Transformation Visual Odometry Based on Direct Sparse Odometry
    Yuan, Junyi
    Hirota, Kaoru
    Zhang, Zelong
    Dai, Yaping
    2023 35TH CHINESE CONTROL AND DECISION CONFERENCE, CCDC, 2023, : 2682 - 2687
  • [27] D3VO: Deep Depth, Deep Pose and Deep Uncertainty for Monocular Visual Odometry
    Yang, Nan
    von Stumberg, Lukas
    Wang, Rui
    Cremers, Daniel
    2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020, : 1278 - 1289
  • [28] SuperVO: A Monocular Visual Odometry based on Learned Feature Matching with GNN
    Rao, Shi
    2021 IEEE INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS AND COMPUTER ENGINEERING (ICCECE), 2021, : 18 - 26
  • [29] An Unsupervised Monocular Visual Odometry Based on Multi-Scale Modeling
    Zhi, Henghui
    Yin, Chenyang
    Li, Huibin
    Pang, Shanmin
    SENSORS, 2022, 22 (14)
  • [30] Experimental Evaluation of Direct Monocular Visual Odometry Based on Nonlinear Optimization
    Liang, Jian
    Cheng, Xin
    He, Yezhou
    Li, Xiaoli
    Liu, Huashan
    2019 WORLD ROBOT CONFERENCE SYMPOSIUM ON ADVANCED ROBOTICS AND AUTOMATION (WRC SARA 2019), 2019, : 291 - 295