Integrating convolutional guidance and Transformer fusion with Markov Random Fields smoothing for monocular depth estimation

Times cited: 0
Authors
Peng, Xiaorui [1 ]
Meng, Yu [1 ]
Shi, Boqiang [1 ]
Zheng, Chao [1 ]
Wang, Meijun [1 ]
Affiliations
[1] University of Science and Technology Beijing, Xueyuan Road 30, Beijing 100083, People's Republic of China
Keywords
Monocular depth estimation; Intelligent transportation; Environment perception
DOI
10.1016/j.engappai.2025.110011
Chinese Library Classification (CLC)
TP [Automation technology; computer technology]
Discipline code
0812
Abstract
Monocular depth estimation is a challenging and prominent problem in computer vision and is widely used in intelligent transportation tasks such as environment perception, navigation, and localization. Accurately delineating object boundaries and ensuring smooth depth transitions when estimating depth from a single image remain significant challenges, and they place high demands on a network's global and local feature extraction capabilities. In response, we propose a depth estimation framework designed to address estimation accuracy and the global smoothness of the predicted depth maps. Our method introduces a novel feature decoding structure named Convolutional Guided Fusion (CoGF), which uses local features extracted by a convolutional neural network as a guide and fuses them with the long-range dependency features extracted by a Transformer, allowing the model to retain both local detail and global contextual information during decoding. To ensure global smoothness of the estimated depth, we incorporate a smoothing strategy based on Markov Random Fields (MRF), which enhances pixel-to-pixel continuity and enforces spatial consistency in the generated depth maps. The proposed method is evaluated on current mainstream benchmarks, and the experimental results demonstrate that it outperforms previous approaches. The code is available at https://github.com/pxrw/CGTF-Depth.git.
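The abstract describes two components that lend themselves to a short illustration: a decoder block in which CNN features guide Transformer features, and an MRF-based smoothing pass over the predicted depth. The PyTorch sketch below is a minimal, hypothetical reading of those two ideas; the class name ConvGuidedFusion, the sigmoid gating scheme, the Gaussian-MRF Jacobi update in mrf_smooth, and all shapes and hyperparameters are illustrative assumptions and are not taken from the authors' released code.

```python
# Hypothetical sketch of the two ideas in the abstract (not the authors' implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConvGuidedFusion(nn.Module):
    """One plausible reading of "Convolutional Guided Fusion": the local CNN
    branch produces a spatial gate that modulates the (upsampled) Transformer
    features, and both streams are then concatenated and re-projected."""

    def __init__(self, cnn_ch: int, trans_ch: int, out_ch: int):
        super().__init__()
        self.proj_cnn = nn.Conv2d(cnn_ch, out_ch, 3, padding=1)    # local-detail stream
        self.proj_trans = nn.Conv2d(trans_ch, out_ch, 1)           # align channel counts
        self.gate = nn.Sequential(nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.Sigmoid())
        self.fuse = nn.Sequential(nn.Conv2d(2 * out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True))

    def forward(self, f_cnn: torch.Tensor, f_trans: torch.Tensor) -> torch.Tensor:
        # Bring the coarser Transformer features to the CNN resolution.
        f_trans = F.interpolate(f_trans, size=f_cnn.shape[-2:], mode="bilinear", align_corners=False)
        local = self.proj_cnn(f_cnn)
        glob = self.proj_trans(f_trans)
        guided = self.gate(local) * glob          # CNN features act as a guide for the global stream
        return self.fuse(torch.cat([guided, local], dim=1))


def mrf_smooth(depth: torch.Tensor, iters: int = 10, lam: float = 0.2) -> torch.Tensor:
    """Toy pairwise-MRF smoothing for a depth map of shape (N, 1, H, W).

    Runs Jacobi iterations that minimize a unary term (stay near the predicted
    depth) plus a 4-neighbour Gaussian smoothness term weighted by `lam`. The
    paper's MRF strategy is likely more elaborate; this only shows the idea.
    """
    kernel = torch.tensor([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]],
                          device=depth.device, dtype=depth.dtype).view(1, 1, 3, 3) / 4.0
    d = depth
    for _ in range(iters):
        neighbour_mean = F.conv2d(F.pad(d, (1, 1, 1, 1), mode="replicate"), kernel)
        d = (depth + 4.0 * lam * neighbour_mean) / (1.0 + 4.0 * lam)  # unary vs. pairwise trade-off
    return d


if __name__ == "__main__":
    cogf = ConvGuidedFusion(cnn_ch=64, trans_ch=256, out_ch=128)
    f_cnn = torch.randn(1, 64, 120, 160)      # fine-resolution CNN features (assumed shapes)
    f_trans = torch.randn(1, 256, 30, 40)     # coarser Transformer features
    fused = cogf(f_cnn, f_trans)              # -> (1, 128, 120, 160)
    depth = torch.rand(1, 1, 120, 160)
    smoothed = mrf_smooth(depth)              # pixel-to-pixel continuity improves with iterations
```

The design choice illustrated here is that the convolutional stream only gates, rather than replaces, the Transformer stream, so local edges can sharpen boundaries while the long-range features still set the overall depth layout; the MRF pass then trades fidelity to the raw prediction against neighbourhood agreement via `lam`.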
Pages: 10
Related papers
50 records in total
  • [21] MDEConvFormer: estimating monocular depth as soft regression based on convolutional transformer
    Su, Wen
    He, Ye
    Zhang, Haifeng
    Yang, Wenzhen
    MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (26) : 68793 - 68811
  • [22] Estimation for hidden Markov random fields
    Elliott, RJ
    Aggoun, L
    JOURNAL OF STATISTICAL PLANNING AND INFERENCE, 1996, 50 (03) : 343 - 351
  • [23] Optimal smoothing for spherical Gauss-Markov Random Fields with application to weather data estimation
    Borri, Alessandro
    Carravetta, Francesco
    White, Langford B.
    EUROPEAN JOURNAL OF CONTROL, 2017, 33 : 43 - 51
  • [24] A Contour-Aware Monocular Depth Estimation Network Using Swin Transformer and Cascaded Multiscale Fusion
    Li, Tao
    Zhang, Yi
    IEEE SENSORS JOURNAL, 2024, 24 (08) : 13620 - 13628
  • [25] MobileDepth: Monocular Depth Estimation Based on Lightweight Vision Transformer
    Li, Yundong
    Wei, Xiaokun
    APPLIED ARTIFICIAL INTELLIGENCE, 2024, 38 (01)
  • [26] METER: A Mobile Vision Transformer Architecture for Monocular Depth Estimation
    Papa, Lorenzo
    Russo, Paolo
    Amerini, Irene
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (10) : 5882 - 5893
  • [27] STRUCTURE GENERATION AND GUIDANCE NETWORK FOR UNSUPERVISED MONOCULAR DEPTH ESTIMATION
    Wang, Chaoqun
    Chen, Xuejin
    Min, Shaobo
    Wu, Feng
    2019 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2019, : 1264 - 1269
  • [28] MODE: Monocular omnidirectional depth estimation via consistent depth fusion
    Liu, Yunbiao
    Chen, Chunyi
    IMAGE AND VISION COMPUTING, 2023, 136
  • [29] Depth estimation for monocular image based on convolutional neural networks
    Niu B.
    Tang M.
    Chen X.
    International Journal of Circuits, Systems and Signal Processing, 2021, 15 : 533 - 540
  • [30] MobileXNet: An Efficient Convolutional Neural Network for Monocular Depth Estimation
    Dong, Xingshuai
    Garratt, Matthew A.
    Anavatti, Sreenatha G.
    Abbass, Hussein A.
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2022, 23 (11) : 20134 - 20147