Integrating convolutional guidance and Transformer fusion with Markov Random Fields smoothing for monocular depth estimation

Times cited: 0
Authors
Peng, Xiaorui [1 ]
Meng, Yu [1 ]
Shi, Boqiang [1 ]
Zheng, Chao [1 ]
Wang, Meijun [1 ]
Affiliations
[1] University of Science and Technology Beijing, Xueyuan Road 30, Beijing 100083, People's Republic of China
Keywords
Monocular depth estimation; Intelligent transportation; Environment perception
DOI
10.1016/j.engappai.2025.110011
Chinese Library Classification (CLC)
TP [Automation technology; computer technology]
Discipline code
0812
Abstract
Monocular depth estimation is a challenging and prominent problem in computer vision and is widely used in intelligent transportation tasks such as environment perception, navigation, and localization. Accurately delineating object boundaries and ensuring smooth depth transitions when estimating depth from a single image remain significant challenges, and they place high demands on a network's global and local feature extraction capabilities. In response, we propose a depth estimation framework designed to address estimation accuracy and the global smoothness of the predicted depth maps. Our method introduces a novel feature decoding structure named Convolutional Guided Fusion (CoGF), which uses local features extracted by a convolutional neural network as a guide and fuses them with the long-range dependency features extracted by a Transformer, allowing the model to retain both local detail and global contextual information during decoding. To ensure global smoothness of the estimated depth, we incorporate a smoothing strategy based on Markov Random Fields (MRF), which enhances pixel-to-pixel continuity and enforces spatial consistency in the generated depth maps. The proposed method is evaluated on current mainstream benchmarks, and the experimental results demonstrate that it outperforms previous approaches. The code is available at https://github.com/pxrw/CGTF-Depth.git.
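The abstract describes two components that lend themselves to a short illustration: a decoder block in which CNN features guide Transformer features, and an MRF-based smoothing pass over the predicted depth. The PyTorch sketch below is a minimal, hypothetical reading of those two ideas; the class name ConvGuidedFusion, the sigmoid gating scheme, the Gaussian-MRF Jacobi update in mrf_smooth, and all shapes and hyperparameters are illustrative assumptions and are not taken from the authors' released code.

```python
# Hypothetical sketch of the two ideas in the abstract (not the authors' implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConvGuidedFusion(nn.Module):
    """One plausible reading of "Convolutional Guided Fusion": the local CNN
    branch produces a spatial gate that modulates the (upsampled) Transformer
    features, and both streams are then concatenated and re-projected."""

    def __init__(self, cnn_ch: int, trans_ch: int, out_ch: int):
        super().__init__()
        self.proj_cnn = nn.Conv2d(cnn_ch, out_ch, 3, padding=1)    # local-detail stream
        self.proj_trans = nn.Conv2d(trans_ch, out_ch, 1)           # align channel counts
        self.gate = nn.Sequential(nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.Sigmoid())
        self.fuse = nn.Sequential(nn.Conv2d(2 * out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True))

    def forward(self, f_cnn: torch.Tensor, f_trans: torch.Tensor) -> torch.Tensor:
        # Bring the coarser Transformer features to the CNN resolution.
        f_trans = F.interpolate(f_trans, size=f_cnn.shape[-2:], mode="bilinear", align_corners=False)
        local = self.proj_cnn(f_cnn)
        glob = self.proj_trans(f_trans)
        guided = self.gate(local) * glob          # CNN features act as a guide for the global stream
        return self.fuse(torch.cat([guided, local], dim=1))


def mrf_smooth(depth: torch.Tensor, iters: int = 10, lam: float = 0.2) -> torch.Tensor:
    """Toy pairwise-MRF smoothing for a depth map of shape (N, 1, H, W).

    Runs Jacobi iterations that minimize a unary term (stay near the predicted
    depth) plus a 4-neighbour Gaussian smoothness term weighted by `lam`. The
    paper's MRF strategy is likely more elaborate; this only shows the idea.
    """
    kernel = torch.tensor([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]],
                          device=depth.device, dtype=depth.dtype).view(1, 1, 3, 3) / 4.0
    d = depth
    for _ in range(iters):
        neighbour_mean = F.conv2d(F.pad(d, (1, 1, 1, 1), mode="replicate"), kernel)
        d = (depth + 4.0 * lam * neighbour_mean) / (1.0 + 4.0 * lam)  # unary vs. pairwise trade-off
    return d


if __name__ == "__main__":
    cogf = ConvGuidedFusion(cnn_ch=64, trans_ch=256, out_ch=128)
    f_cnn = torch.randn(1, 64, 120, 160)      # fine-resolution CNN features (assumed shapes)
    f_trans = torch.randn(1, 256, 30, 40)     # coarser Transformer features
    fused = cogf(f_cnn, f_trans)              # -> (1, 128, 120, 160)
    depth = torch.rand(1, 1, 120, 160)
    smoothed = mrf_smooth(depth)              # pixel-to-pixel continuity improves with iterations
```

The design choice illustrated here is that the convolutional stream only gates, rather than replaces, the Transformer stream, so local edges can sharpen boundaries while the long-range features still set the overall depth layout; the MRF pass then trades fidelity to the raw prediction against neighbourhood agreement via `lam`.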
Pages: 10
Related papers
50 records in total
  • [21] MDEConvFormer: estimating monocular depth as soft regression based on convolutional transformer
    Su, Wen
    He, Ye
    Zhang, Haifeng
    Yang, Wenzhen
    MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (26) : 68793 - 68811
  • [22] Estimation for hidden Markov random fields
    Elliott, RJ
    Aggoun, L
    JOURNAL OF STATISTICAL PLANNING AND INFERENCE, 1996, 50 (03) : 343 - 351
  • [23] Optimal smoothing for spherical Gauss-Markov Random Fields with application to weather data estimation
    Borri, Alessandro
    Carravetta, Francesco
    White, Langford B.
    EUROPEAN JOURNAL OF CONTROL, 2017, 33 : 43 - 51
  • [24] A Contour-Aware Monocular Depth Estimation Network Using Swin Transformer and Cascaded Multiscale Fusion
    Li, Tao
    Zhang, Yi
    IEEE SENSORS JOURNAL, 2024, 24 (08) : 13620 - 13628
  • [25] MobileDepth: Monocular Depth Estimation Based on Lightweight Vision Transformer
    Li, Yundong
    Wei, Xiaokun
    APPLIED ARTIFICIAL INTELLIGENCE, 2024, 38 (01)
  • [26] METER: A Mobile Vision Transformer Architecture for Monocular Depth Estimation
    Papa, Lorenzo
    Russo, Paolo
    Amerini, Irene
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (10) : 5882 - 5893
  • [27] STRUCTURE GENERATION AND GUIDANCE NETWORK FOR UNSUPERVISED MONOCULAR DEPTH ESTIMATION
    Wang, Chaoqun
    Chen, Xuejin
    Min, Shaobo
    Wu, Feng
    2019 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2019, : 1264 - 1269
  • [28] MODE: Monocular omnidirectional depth estimation via consistent depth fusion
    Liu, Yunbiao
    Chen, Chunyi
    IMAGE AND VISION COMPUTING, 2023, 136
  • [29] Depth estimation for monocular image based on convolutional neural networks
    Niu B.
    Tang M.
    Chen X.
    International Journal of Circuits, Systems and Signal Processing, 2021, 15 : 533 - 540
  • [30] MobileXNet: An Efficient Convolutional Neural Network for Monocular Depth Estimation
    Dong, Xingshuai
    Garratt, Matthew A.
    Anavatti, Sreenatha G.
    Abbass, Hussein A.
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2022, 23 (11) : 20134 - 20147