Multi-Scale Monocular Depth Estimation Based on Global Understanding

Times Cited: 1
Authors
Xiao, Jiejie [1 ]
Li, Lihong [2 ]
Su, Xu [1 ]
Tan, Guopeng [1 ]
Affiliations
[1] Hebei Univ Engn, Sch Informat & Elect Engn, Handan 056038, Peoples R China
[2] Hebei Univ Engn, Hebei Key Lab Secur & Protect Informat Sensing & P, Handan 056038, Peoples R China
Keywords
Convolutional neural networks; Network architecture; Transformers; Spatial resolution; depth estimation; global understanding module; difference module; cascade module
DOI
10.1109/ACCESS.2024.3382572
CLC Number
TP [Automation & Computer Technology];
Discipline Code
0812;
Abstract
With the advancement of convolutional neural networks (CNNs), numerous CNN-based methods have been proposed for depth estimation and have achieved notable success. However, the repeated convolutional and spatial pooling layers in these networks often reduce spatial resolution and lose local information such as edge contours. To address this issue, this study presents a multi-scale monocular depth estimation model. Specifically, a Global Understanding Module is introduced on top of a generic encoder to enlarge the receptive field and capture contextual information. Additionally, the decoding process incorporates a Difference Module and a Multi-Scale Cascade Module to guide the decoded features and refine edge-contour details. Finally, extensive experiments were conducted on the KITTI and NYUv2 datasets. On KITTI, the Absolute Relative Error (Abs Rel) was 0.057 and the Root Mean Squared Error (RMSE) was 2.415; on NYUv2, Abs Rel was 0.104 and RMSE was 0.380. These results indicate that the model estimates depth information accurately.
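The Abs Rel and RMSE figures quoted in the abstract are the standard monocular-depth evaluation metrics. A minimal sketch of how they are computed over valid ground-truth pixels (function names and the flat-list input format are illustrative, not from the paper):

```python
import math

def abs_rel(pred, gt):
    """Absolute Relative Error: mean of |pred - gt| / gt over valid pixels."""
    return sum(abs(p - g) / g for p, g in zip(pred, gt)) / len(gt)

def rmse(pred, gt):
    """Root Mean Squared Error between predicted and ground-truth depths."""
    return math.sqrt(sum((p - g) ** 2 for p, g in zip(pred, gt)) / len(gt))

# Toy example: two pixels with predicted vs. ground-truth depth in metres.
pred = [2.0, 6.0]
gt = [4.0, 4.0]
print(abs_rel(pred, gt))  # (0.5 + 0.5) / 2 = 0.5
print(rmse(pred, gt))     # sqrt((4 + 4) / 2) = 2.0
```

In practice these metrics are averaged only over pixels with valid (non-zero) ground truth, which is why `gt` can safely appear in the denominator.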
Pages: 46930-46939
Page count: 10
Related Papers
50 records
  • [21] Dense monocular depth estimation for stereoscopic vision based on pyramid transformer and multi-scale feature fusion
    Zhongyi Xia
    Tianzhao Wu
    Zhuoyan Wang
    Man Zhou
    Boqi Wu
    C. Y. Chan
    Ling Bing Kong
    Scientific Reports, 14
  • [22] Multi-Scale Spatial Attention-Guided Monocular Depth Estimation With Semantic Enhancement
    Xu, Xianfa
    Chen, Zhe
    Yin, Fuliang
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 8811 - 8822
  • [23] Monocular Depth Estimation Using Multi-Scale Continuous CRFs as Sequential Deep Networks
    Xu, Dan
    Ricci, Elisa
    Ouyang, Wanli
    Wang, Xiaogang
    Sebe, Nicu
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2019, 41 (06) : 1426 - 1440
  • [24] Hierarchical Multi-scale Architecture Search for Self-supervised Monocular Depth Estimation
    Ren, Jian
    Xie, Jin
    Jin, Zhong
    PATTERN RECOGNITION, ACPR 2021, PT II, 2022, 13189 : 447 - 461
  • [25] Self-supervised monocular Depth estimation with multi-scale structure similarity loss
    Han, Chenggong
    Cheng, Deqiang
    Kou, Qiqi
    Wang, Xiaoyi
    Chen, Liangliang
    Zhao, Jiamin
    MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 82 (24) : 38035 - 38050
  • [26] Self-supervised monocular Depth estimation with multi-scale structure similarity loss
    Chenggong Han
    Deqiang Cheng
    Qiqi Kou
    Xiaoyi Wang
    Liangliang Chen
    Jiamin Zhao
    Multimedia Tools and Applications, 2023, 82 : 38035 - 38050
  • [27] HMA-Depth: A New Monocular Depth Estimation Model Using Hierarchical Multi-Scale Attention
    Niu, Zhaofeng
    Fujimoto, Yuichiro
    Kanbara, Masayuki
    Kato, Hirokazu
    PROCEEDINGS OF 17TH INTERNATIONAL CONFERENCE ON MACHINE VISION APPLICATIONS (MVA 2021), 2021,
  • [28] Joint Attention Mechanisms for Monocular Depth Estimation With Multi-Scale Convolutions and Adaptive Weight Adjustment
    Liu, Peng
    Zhang, Zonghua
    Meng, Zhaozong
    Gao, Nan
    IEEE ACCESS, 2020, 8 : 184437 - 184450
  • [29] Efficient and High-Quality Monocular Depth Estimation via Gated Multi-Scale Network
    Lin, Lixiong
    Huang, Guohui
    Chen, Yanjie
    Zhang, Liwei
    He, Bingwei
    IEEE ACCESS, 2020, 8 : 7709 - 7718
  • [30] MS360: A Multi-Scale Feature Fusion Framework for 360 Monocular Depth Estimation
    Mohadikar, Payal
    Fan, Chuanmao
    Duan, Ye
    PROCEEDINGS OF THE 50TH GRAPHICS INTERFACE CONFERENCE, GI 2024, 2024,