Integrating convolutional guidance and Transformer fusion with Markov Random Fields smoothing for monocular depth estimation

Cited by: 0
Authors
Peng, Xiaorui [1 ]
Meng, Yu [1 ]
Shi, Boqiang [1 ]
Zheng, Chao [1 ]
Wang, Meijun [1 ]
Affiliations
[1] Univ Sci & Technol Beijing, XueYuan Rd 30, Beijing 100083, Peoples R China
Keywords
Monocular depth estimation; Intelligent transportation; Environment perception
DOI
10.1016/j.engappai.2025.110011
CLC Number
TP [Automation Technology, Computer Technology]
Subject Classification Code
0812
Abstract
Monocular depth estimation is a challenging and prominent problem in computer vision research and is widely used in intelligent transportation tasks such as environment perception, navigation, and localization. Accurately delineating object boundaries and ensuring smooth transitions in depth maps estimated from a single image remain significant challenges, placing high demands on a network's global and local feature extraction capabilities. In response, we propose a depth estimation framework designed to improve both estimation accuracy and the global smoothness of predicted depth maps. Our method introduces a novel feature decoding structure, Convolutional Guided Fusion (CoGF), which uses the local features extracted by a convolutional neural network as a guide and fuses them with the long-range dependencies captured by a Transformer. This design enables the model to retain both local detail and global contextual information during decoding. To ensure global smoothness in the depth estimation results, we incorporate a smoothing strategy based on Markov Random Fields (MRF), enhancing pixel-to-pixel continuity and ensuring robust spatial consistency in the generated depth maps. The proposed method is evaluated on current mainstream benchmarks, and experimental results demonstrate that it outperforms previous approaches. The code is available at https://github.com/pxrw/CGTF-Depth.git.
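To make the CoGF idea concrete, below is a minimal PyTorch-style sketch of a convolution-guided fusion block, assuming the common pattern of using CNN features as a spatial gate over Transformer features; the names CoGFBlock, c_cnn, c_tr, and c_out are hypothetical and are not taken from the authors' released code.

    # A minimal sketch of convolution-guided feature fusion (hypothetical
    # names; not the authors' released implementation).
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class CoGFBlock(nn.Module):
        """Fuse local CNN features with global Transformer features,
        using the CNN branch as a spatial guidance signal."""
        def __init__(self, c_cnn: int, c_tr: int, c_out: int):
            super().__init__()
            self.gate = nn.Sequential(          # guidance map from local features
                nn.Conv2d(c_cnn, c_out, 3, padding=1),
                nn.Sigmoid(),
            )
            self.proj_tr = nn.Conv2d(c_tr, c_out, 1)    # align Transformer channels
            self.proj_cnn = nn.Conv2d(c_cnn, c_out, 1)  # align CNN channels
            self.fuse = nn.Conv2d(2 * c_out, c_out, 3, padding=1)

        def forward(self, f_cnn: torch.Tensor, f_tr: torch.Tensor) -> torch.Tensor:
            # Resize Transformer features to the CNN resolution if needed.
            if f_tr.shape[-2:] != f_cnn.shape[-2:]:
                f_tr = F.interpolate(f_tr, size=f_cnn.shape[-2:],
                                     mode="bilinear", align_corners=False)
            g = self.gate(f_cnn)                 # local-detail guidance in [0, 1]
            f_tr = self.proj_tr(f_tr) * g        # modulate global context locally
            f_cnn = self.proj_cnn(f_cnn)
            return self.fuse(torch.cat([f_cnn, f_tr], dim=1))

In the same spirit, the MRF smoothing strategy can be approximated by a generic first-order pairwise smoothness term whose potentials are down-weighted at strong image edges; the sketch below is an illustrative stand-in, not the paper's exact MRF energy.

    # A generic pairwise MRF-style smoothness term for a predicted depth map
    # (illustrative; the paper's exact energy is not reproduced here).
    import torch

    def mrf_smoothness(depth: torch.Tensor, image: torch.Tensor) -> torch.Tensor:
        """depth: (B, 1, H, W); image: (B, 3, H, W). Penalizes depth gradients
        except across strong image edges, which likely mark object boundaries."""
        dD_x = (depth[..., :, 1:] - depth[..., :, :-1]).abs()
        dD_y = (depth[..., 1:, :] - depth[..., :-1, :]).abs()
        dI_x = (image[..., :, 1:] - image[..., :, :-1]).abs().mean(1, keepdim=True)
        dI_y = (image[..., 1:, :] - image[..., :-1, :]).abs().mean(1, keepdim=True)
        # Down-weight the pairwise potential where the image has strong edges.
        return (dD_x * torch.exp(-dI_x)).mean() + (dD_y * torch.exp(-dI_y)).mean()

In training, such a term would typically be added to the main depth loss with a small weight, e.g. loss = depth_loss + 0.1 * mrf_smoothness(pred_depth, image).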
Pages: 10
Related Papers
50 records in total
  • [41] A contextual conditional random field network for monocular depth estimation. Liu, Jun; Li, Qing; Cao, Rui; Tang, Wenming; Qiu, Guoping. IMAGE AND VISION COMPUTING, 2020, 98.
  • [42] MonoViT: Self-Supervised Monocular Depth Estimation with a Vision Transformer. Zhao, Chaoqiang; Zhang, Youmin; Poggi, Matteo; Tosi, Fabio; Guo, Xianda; Zhu, Zheng; Huang, Guan; Tang, Yang; Mattoccia, Stefano. 2022 INTERNATIONAL CONFERENCE ON 3D VISION, 3DV, 2022: 668-678.
  • [43] Underwater Monocular Depth Estimation Based on Physical-Guided Transformer. Wang, Chen; Xu, Haiyong; Jiang, Gangyi; Yu, Mei; Luo, Ting; Chen, Yeyao. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62: 18-18.
  • [44] Monocular Depth Estimation With Multi-Scale Feature Fusion. Xu, Xianfa; Chen, Zhe; Yin, Fuliang. IEEE SIGNAL PROCESSING LETTERS, 2021, 28: 678-682.
  • [45] AFNet: Asymmetric fusion network for monocular panorama depth estimation. Huang, Chengchao; Shao, Feng; Chen, Hangwei; Mu, Baoyang; Jiang, Qiuping. DISPLAYS, 2024, 84.
  • [46] Unsupervised Monocular Depth Estimation Based on Dense Feature Fusion. Chen, Ying; Wang, Yiliang. JOURNAL OF ELECTRONICS & INFORMATION TECHNOLOGY, 2021, 43 (10): 2976-2984.
  • [47] CNNapsule: A Lightweight Network with Fusion Features for Monocular Depth Estimation. Wang, Yinchu; Zhu, Haijiang; Liu, Mengze. ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2021, PT I, 2021, 12891: 507-518.
  • [48] Monocular depth estimation with multi-scale feature fusion. Wang, Q.; Zhang, S. Huazhong Keji Daxue Xuebao (Ziran Kexue Ban)/Journal of Huazhong University of Science and Technology (Natural Science Edition), 2020, 48 (05): 7-12.
  • [49] Radar Fusion Monocular Depth Estimation Based on Dual Attention. Long, JianYu; Huang, JinGui; Wang, ShengChun. ARTIFICIAL INTELLIGENCE AND SECURITY, ICAIS 2022, PT I, 2022, 13338: 166-179.
  • [50] Monocular Depth Estimation Based on Dilated Convolutions and Feature Fusion. Li, Hang; Liu, Shuai; Wang, Bin; Wu, Yuanhao. APPLIED SCIENCES-BASEL, 2024, 14 (13).