Deep Learning-Based Monocular 3D Object Detection with Refinement of Depth Information

Cited by: 7

Authors:

Hu, Henan [1,2,3]

Zhu, Ming [1]

Li, Muyu [4]

Chan, Kwok-Leung [3]

Affiliations:

[1] Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, People's Republic of China

[2] University of Chinese Academy of Sciences, Beijing 100049, People's Republic of China

[3] Department of Electrical Engineering, City University of Hong Kong, Hong Kong, People's Republic of China

[4] Centre for Intelligent Multidimensional Data Analysis Ltd, Hong Kong, People's Republic of China

Source:

Sensors | 2022 / Vol. 22 / Issue 7

Keywords:

3D object detection; monocular image; point cloud; deep learning; depth estimation; autonomous driving; NETWORK;

DOI:

10.3390/s22072576

Chinese Library Classification (CLC) number:

O65 [Analytical Chemistry];

Discipline classification codes:

070302 ; 081704 ;

Abstract:

Recently, the research on monocular 3D target detection based on pseudo-LiDAR data has made some progress. In contrast to LiDAR-based algorithms, the robustness of pseudo-LiDAR methods is still inferior. After conducting in-depth experiments, we realized that the main limitations are due to the inaccuracy of the target position and the uncertainty in the depth distribution of the foreground target. These two problems arise from the inaccurate depth estimation. To deal with the aforementioned problems, we propose two innovative solutions. The first is a novel method based on joint image segmentation and geometric constraints, used to predict the target depth and provide the depth prediction confidence measure. The predicted target depth is fused with the overall depth of the scene and results in the optimal target position. For the second, we utilize the target scale, normalized with the Gaussian function, as a priori information. The uncertainty of depth distribution, which can be visualized as long-tail noise, is reduced. With the refined depth information, we convert the optimized depth map into the point cloud representation, called a pseudo-LiDAR point cloud. Finally, we input the pseudo-LiDAR point cloud to the LiDAR-based algorithm to detect the 3D target. We conducted extensive experiments on the challenging KITTI dataset. The results demonstrate that our proposed framework outperforms various state-of-the-art methods by more than 12.37% and 5.34% on the easy and hard settings of the KITTI validation subset, respectively. On the KITTI test set, our framework also outperformed state-of-the-art methods by 5.1% and 1.76% on the easy and hard settings, respectively.
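The abstract's step of converting the refined depth map into a pseudo-LiDAR point cloud can be illustrated with a standard pinhole back-projection. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function name `depth_to_pseudo_lidar`, the intrinsics `fx`, `fy`, `cx`, `cy`, and the `max_depth` cutoff are all hypothetical, and the points are left in the camera frame (x right, y down, z forward) rather than being transformed into a LiDAR frame.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def depth_to_pseudo_lidar(
    depth: Sequence[Sequence[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
    max_depth: float = 80.0,  # assumed cutoff, not taken from the paper
) -> List[Point]:
    """Back-project a dense depth map into 3D points in the camera frame.

    depth[v][u] is the depth (in metres) of the pixel at column u, row v.
    Pixels with non-positive depth or depth beyond max_depth are skipped.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0.0 or z > max_depth:
                continue  # invalid or too-distant pixel
            # Pinhole model: u = fx * x / z + cx, v = fy * y / z + cy
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# Toy 2x3 depth map with made-up intrinsics (illustrative values only).
depth = [
    [10.0, 10.0, 0.0],
    [10.0, 20.0, 100.0],
]
cloud = depth_to_pseudo_lidar(depth, fx=2.0, fy=2.0, cx=1.0, cy=0.5)
for point in cloud:
    print(point)
```

In a real pipeline the intrinsics would come from the dataset's calibration files, and the resulting points would typically be transformed into the LiDAR coordinate frame before being passed to a LiDAR-based detector.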

Pages: 20

50 items in total

[21] Depth-Assisted Joint Detection Network for Monocular 3D Object Detection
Lei, Jianjun
Guo, Tingyi
Peng, Bo
Yu, Chuanbo
2021 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2021, : 2204 - 2208
[22] Monocular 3D object detection with thermodynamic loss and decoupled instance depth
Liu, Gang
Xie, Xiaoxiao
Yu, Qingchen
CONNECTION SCIENCE, 2024, 36 (01)
[23] MonoDTR: Monocular 3D Object Detection with Depth-Aware Transformer
Huang, Kuan-Chih
Wu, Tsung-Han
Su, Hung-Ting
Hsu, Winston H.
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 4002 - 4011
[24] MonoDFNet: Monocular 3D Object Detection with Depth Fusion and Adaptive Optimization
Gao, Yuhan
Wang, Peng
Li, Xiaoyan
Sun, Mengyu
Di, Ruohai
Li, Liangliang
Hong, Wei
SENSORS, 2025, 25 (03)
[25] Exploiting Ground Depth Estimation for Mobile Monocular 3D Object Detection
Zhou, Yunsong
Liu, Quan
Zhu, Hongzi
Li, Yunzhe
Chang, Shan
Guo, Minyi
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2025, 47 (04) : 3079 - 3093
[26] Depth dynamic center difference convolutions for monocular 3D object detection
Wu, Xinyu
Ma, Dongliang
Qu, Xin
Jiang, Xin
Zeng, Dan
NEUROCOMPUTING, 2023, 520 : 73 - 81
[27] MonoDETR: Depth-guided Transformer for Monocular 3D Object Detection
Zhang, Renrui
Qiu, Han
Wang, Tai
Guo, Ziyu
Cui, Ziteng
Qiao, Yu
Li, Hongsheng
Gao, Peng
2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 9121 - 9132
[28] Task-Aware Monocular Depth Estimation for 3D Object Detection
Wang, Xinlong
Yin, Wei
Kong, Tao
Jiang, Yuning
Li, Lei
Shen, Chunhua
THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 12257 - 12264
[29] Boosting Monocular 3D Object Detection With Object-Centric Auxiliary Depth Supervision
Kim, Youngseok
Kim, Sanmin
Sim, Sangmin
Choi, Jun Won
Kum, Dongsuk
IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2023, 24 (02) : 1801 - 1813
[30] Efficient Active Learning Strategies for Monocular 3D Object Detection
Hekimoglu, Aral
Schmidt, Michael
Marcos-Ramiro, Alvaro
Rigoll, Gerhard
2022 IEEE INTELLIGENT VEHICLES SYMPOSIUM (IV), 2022, : 295 - 302
