TFDEPTH: SELF-SUPERVISED MONOCULAR DEPTH ESTIMATION WITH MULTI-SCALE SELECTIVE TRANSFORMER FEATURE FUSION

Cited by: 0
Authors
Hu, Hongli [1]
Miao, Jun [1,2]
Zhu, Guanghu [1]
Yan, Je [2]
Chu, Jun [3]
Affiliations
[1] Nanchang Hangkong Univ, Sch Aeronaut Mfg Engn, Nanchang, Peoples R China
[2] Chinese Acad Sci, Key Lab Lunar & Deep Space Explorat, Beijing, Peoples R China
[3] Nanchang Hangkong Univ, Key Lab Jiangxi Prov Image Proc & Pattern Recognit, Nanchang 330063, Peoples R China
Source
IMAGE ANALYSIS & STEREOLOGY | 2024 / Vol. 43 / Issue 02
Keywords
monocular depth estimation; multi-scale fusion; self-supervised learning; transformer
DOI
10.5566/ias.2987
Chinese Library Classification (CLC) number
T [Industrial Technology]
Discipline classification code
08
Abstract
Existing self-supervised models for monocular depth estimation suffer from depth discontinuities, blurred edges, and unclear contours, particularly for small objects. We propose a self-supervised monocular depth estimation network with multi-scale selective Transformer feature fusion. To preserve detailed features, a multi-scale encoder extracts features and uses the Transformer self-attention mechanism to capture global contextual information, which improves depth prediction for small objects. We also propose a multi-scale selective fusion (MSSF) module, which uses the multi-scale features in the decoder and fuses them selectively, step by step; this suppresses noise and retains local detail, yielding a depth map with clear edges. Experiments on the KITTI dataset show that the proposed algorithm achieves an absolute relative error (Abs Rel) of 0.098 and an accuracy (delta) of 0.983. The results indicate that the algorithm not only estimates depth values accurately but also predicts continuous depth maps with clear edges.
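The two figures reported in the abstract, Abs Rel and delta accuracy, are standard depth-evaluation metrics. The following is a minimal sketch of how they are commonly computed, not the authors' evaluation code. The function names, the flat-list input format, and the default threshold of 1.25 are assumptions made for illustration; the abstract does not state which threshold its delta value refers to.

```python
from typing import List, Sequence, Tuple


def _valid_pairs(pred: Sequence[float], gt: Sequence[float]) -> List[Tuple[float, float]]:
    """Keep only pixels where both predicted and ground-truth depth are positive."""
    pairs = [(p, g) for p, g in zip(pred, gt) if p > 0 and g > 0]
    if not pairs:
        raise ValueError("no valid depth pairs to evaluate")
    return pairs


def abs_rel(pred: Sequence[float], gt: Sequence[float]) -> float:
    """Absolute relative error: mean of |pred - gt| / gt over valid pixels."""
    pairs = _valid_pairs(pred, gt)
    return sum(abs(p - g) / g for p, g in pairs) / len(pairs)


def delta_accuracy(pred: Sequence[float], gt: Sequence[float],
                   threshold: float = 1.25) -> float:
    """Fraction of valid pixels with max(pred/gt, gt/pred) < threshold.

    The threshold default (1.25) is an assumption; 1.25**2 and 1.25**3
    are also commonly reported.
    """
    pairs = _valid_pairs(pred, gt)
    hits = sum(1 for p, g in pairs if max(p / g, g / p) < threshold)
    return hits / len(pairs)


if __name__ == "__main__":
    # Hypothetical depths in metres, flattened to 1-D lists.
    predicted = [10.0, 22.0, 27.0, 80.0]
    ground_truth = [10.0, 20.0, 30.0, 40.0]
    print("Abs Rel:", abs_rel(predicted, ground_truth))
    print("delta < 1.25:", delta_accuracy(predicted, ground_truth))
```

Lower Abs Rel and higher delta accuracy both indicate better agreement with the ground truth, which is how the reported 0.098 and 0.983 should be read.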
Pages: 139-149
Page count: 11
Related papers
50 in total
  • [31] Self-distilled Feature Aggregation for Self-supervised Monocular Depth Estimation
    Zhou, Zhengming
    Dong, Qiulei
    COMPUTER VISION - ECCV 2022, PT I, 2022, 13661 : 709 - 726
  • [32] Self-Supervised Scale Recovery for Monocular Depth and Egomotion Estimation
    Wagstaff, Brandon
    Kelly, Jonathan
    2021 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2021, : 2620 - 2627
  • [33] Transformer-based Self-supervised Representation Learning for Emotion Recognition Using Bio-signal Feature Fusion
    Sawant, Shrutika S.
    Erick, F. X.
    Arora, Pulkit
    Pahl, Jaspar
    Foltyn, Andreas
    Holzer, Nina
    Gotz, Theresa
    2023 11TH INTERNATIONAL CONFERENCE ON AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION WORKSHOPS AND DEMOS, ACIIW, 2023,
  • [34] Self-supervised Depth Estimation based on Feature Sharing and Consistency Constraints
    Mendoza, Julio
    Pedrini, Helio
    PROCEEDINGS OF THE 15TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS, VOL 5: VISAPP, 2020, : 134 - 141
  • [35] LEAD: Self-Supervised Landmark Estimation by Aligning Distributions of Feature Similarity
    Karmali, Tejan
    Atrishi, Abhinav
    Harsha, Sai Sree
    Agrawal, Susmit
    Jampani, Varun
    Babu, R. Venkatesh
    2022 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2022), 2022, : 3046 - 3055
  • [36] Self-Supervised Monocular Depth Estimation Using HOG Feature Prediction
    He, Xin
    Zhao, Xiao
    PROCEEDINGS OF 2024 INTERNATIONAL CONFERENCE ON COMPUTER AND MULTIMEDIA TECHNOLOGY, ICCMT 2024, 2024, : 382 - 387
  • [37] MSDFNet: multi-scale detail feature fusion encoder-decoder network for self-supervised monocular thermal image depth estimation
    Kong, Lingjun
    Zheng, Qianhui
    Wang, Wenju
    MEASUREMENT SCIENCE AND TECHNOLOGY, 2025, 36 (01)
  • [38] Depth Estimation Using a Self-Supervised Network Based on Cross-Layer Feature Fusion and the Quadtree Constraint
    Tian, Fangzheng
    Gao, Yongbin
    Fang, Zhijun
    Fang, Yuming
    Gu, Jia
    Fujita, Hamido
    Hwang, Jenq-Neng
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (04) : 1751 - 1766
  • [39] Decoupled spatiotemporal adaptive fusion network for self-supervised motion estimation
    Sun, Zitang
    Luo, Zhengbo
    Nishida, Shin'ya
    NEUROCOMPUTING, 2023, 534 : 133 - 146
  • [40] Lightweight Self-Supervised Monocular Depth Estimation Through CNN and Transformer Integration
    Wang, Zhe
    Zou, Yongjia
    Lv, Jin
    Cao, Yang
    Yu, Hongfei
    IEEE ACCESS, 2024, 12 : 167934 - 167943