SWformer-VO: A Monocular Visual Odometry Model Based on Swin Transformer

Cited by: 3
|
Authors
Wu, Zhigang [1 ]
Zhu, Yaohui [1 ]
Affiliations
[1] Jiangxi Univ Sci & Technol, Sch Energy & Mech Engn, Nanchang 330013, Peoples R China
Keywords
Deep learning; monocular visual odometry; transformer; DEPTH;
DOI
10.1109/LRA.2024.3384911
Chinese Library Classification (CLC)
TP24 [Robotics];
Subject classification codes
080202 ; 1405 ;
Abstract
This letter introduces SWformer-VO, a novel monocular visual odometry network that uses the Swin Transformer as its backbone. It directly estimates the six-degree-of-freedom camera pose from monocular image sequences in an end-to-end manner, using only a modest volume of training data. SWformer-VO introduces an embedding module called "Mixture Embed," which fuses consecutive pairs of images into a single frame and converts it into tokens passed to the backbone network; this replaces traditional temporal-sequence schemes by addressing the problem at the image level. Building on this foundation, the parameters of the backbone network are progressively refined and optimized, and experiments examine how different layers and depths of the backbone affect accuracy. On the KITTI dataset, SWformer-VO demonstrates superior accuracy compared with common deep learning-based methods introduced in recent years, such as SFMlearner, Deep-VO, TSformer-VO, Depth-VO-Feat, GeoNet, and Masked Gans. The effectiveness of SWformer-VO is also validated on a self-collected dataset of nine indoor corridor routes for visual odometry.
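The abstract's "Mixture Embed" idea — fusing two consecutive RGB frames into one input and splitting it into tokens for the Swin-style backbone — can be sketched as follows. This is a minimal illustration based only on the abstract's description; the function name, channel-concatenation scheme, and patch size are assumptions, not the authors' published code.

```python
import numpy as np

def mixture_embed(frame_t, frame_t1, patch=4):
    """Hypothetical sketch of a 'Mixture Embed' step: fuse two consecutive
    RGB frames (H, W, 3) into a single 6-channel input, then split it into
    non-overlapping patch tokens for a Swin-style backbone.

    Patch size and channel concatenation are assumptions for illustration.
    """
    fused = np.concatenate([frame_t, frame_t1], axis=-1)      # (H, W, 6)
    H, W, C = fused.shape
    # Split into (H/patch) x (W/patch) non-overlapping patches and
    # flatten each patch into one token vector of length patch*patch*C.
    tokens = fused.reshape(H // patch, patch, W // patch, patch, C)
    tokens = tokens.transpose(0, 2, 1, 3, 4).reshape(-1, patch * patch * C)
    return tokens                                             # (N, patch*patch*6)

# Example: a 32x32 frame pair yields (32/4)*(32/4) = 64 tokens of length 96.
a = np.zeros((32, 32, 3), dtype=np.float32)
b = np.ones((32, 32, 3), dtype=np.float32)
toks = mixture_embed(a, b)
```

In a full model, these tokens would pass through a linear projection and the Swin Transformer stages, with a regression head producing the 6-DoF pose; those components are omitted here.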
Pages: 4766-4773
Page count: 8
Related papers (50 total)
  • [31] Semi-Direct Monocular Visual Odometry Based on Visual-Inertial Fusion
    Gong Z.
    Zhang X.
    Peng X.
    Li X.
    Zhang, Xiaoli (zhxl@xmu.edu.cn), 1600, Chinese Academy of Sciences (42): 595 - 605
  • [32] A Comparison of Deep Learning-Based Monocular Visual Odometry Algorithms
    Jeong, Eunju
    Lee, Jaun
    Kim, Pyojin
    PROCEEDINGS OF THE 2021 ASIA-PACIFIC INTERNATIONAL SYMPOSIUM ON AEROSPACE TECHNOLOGY (APISAT 2021), VOL 2, 2023, 913 : 923 - 934
  • [33] Bundle Adjustment for Monocular Visual Odometry Based on Detections of Traffic Signs
    Zhang, Yanting
    Zhang, Haotian
    Wang, Gaoang
    Yang, Jie
    Hwang, Jenq-Neng
    IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2020, 69 (01) : 151 - 162
  • [34] ViTVO: Vision Transformer based Visual Odometry with Attention Supervision
    Chiu, Chu-Chi
    Yang, Hsuan-Kung
    Chen, Hao-Wei
    Chen, Yu-Wen
    Lee, Chun-Yi
    2023 18TH INTERNATIONAL CONFERENCE ON MACHINE VISION AND APPLICATIONS, MVA, 2023
  • [35] Seismic facies identification model based on Swin Transformer
    Shuo, Liangxun
    Li, Zhixuan
    Chai, Bianfang
    Wang, Tianyi
    Zheng, Xiaodong
    Natural Gas Industry, 2024, 44 (12) : 63 - 72
  • [36] MAR-VO: A Match-and-Refine Framework for UAV's Monocular Visual Odometry in Planetary Environments
    Liu, Jiayuan
    Zhou, Bo
    Wan, Xue
    Pan, Yan
    Li, Zicong
    Shao, Yuanbin
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2025, 63
  • [37] Tight Integration of Feature-based Relocalization in Monocular Direct Visual Odometry
    Gladkova, Mariia
    Wang, Rui
    Zeller, Niclas
    Cremers, Daniel
    2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021), 2021, : 9608 - 9614
  • [38] ORBDeepOdometry - A Feature-Based Deep Learning Approach to Monocular Visual Odometry
    Krishnan, Karthik Sivarama
    Sahin, Ferat
    2019 14TH ANNUAL CONFERENCE SYSTEM OF SYSTEMS ENGINEERING (SOSE), 2019, : 296 - 301
  • [39] Towards Dynamic Monocular Visual Odometry Based on an Event Camera and IMU Sensor
    Mohamed, Sherif A. S.
    Haghbayan, Mohammad-Hashem
    Rabah, Mohammed
    Heikkonen, Jukka
    Tenhunen, Hannu
    Plosila, Juha
    INTELLIGENT TRANSPORT SYSTEMS, 2020, 310 : 249 - 263
  • [40] Monocular Visual Odometry Based Navigation for a Differential Mobile Robot with Android OS
    Villanueva-Escudero, Carla
    Villegas-Cortez, Juan
    Zuniga-Lopez, Arturo
    Aviles-Cruz, Carlos
    HUMAN-INSPIRED COMPUTING AND ITS APPLICATIONS, PT I, 2014, 8856 : 281 - 292