Deep Learning-Based Monocular 3D Object Detection with Refinement of Depth Information

Cited by: 7
Authors
Hu, Henan [1, 2, 3]
Zhu, Ming [1 ]
Li, Muyu [4 ]
Chan, Kwok-Leung [3 ]
Affiliations
[1] Chinese Acad Sci, Changchun Inst Opt Fine Mech & Phys, Changchun 130033, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
[3] City Univ Hong Kong, Dept Elect Engn, Hong Kong, Peoples R China
[4] Ctr Intelligent Multidimens Data Anal Ltd, Hong Kong, Peoples R China
Keywords
3D object detection; monocular image; point cloud; deep learning; depth estimation; autonomous driving; NETWORK;
DOI
10.3390/s22072576
Chinese Library Classification number
O65 [Analytical Chemistry];
Discipline classification code
070302 ; 081704 ;
Abstract
Research on monocular 3D target detection based on pseudo-LiDAR data has recently made progress, yet pseudo-LiDAR methods remain less robust than LiDAR-based algorithms. Our in-depth experiments indicate that the main limitations are the inaccuracy of the target position and the uncertainty in the depth distribution of the foreground target. Both problems arise from inaccurate depth estimation. To address them, we propose two solutions. The first is a novel method based on joint image segmentation and geometric constraints, which predicts the target depth and provides a confidence measure for that prediction. The predicted target depth is fused with the overall depth of the scene to obtain the optimal target position. The second uses the target scale, normalized with a Gaussian function, as a priori information, which reduces the uncertainty of the depth distribution that can be visualized as long-tail noise. We then convert the refined depth map into a point cloud representation, called a pseudo-LiDAR point cloud, and input it to a LiDAR-based algorithm to detect the 3D target. We conducted extensive experiments on the challenging KITTI dataset. The results demonstrate that our proposed framework outperforms various state-of-the-art methods by more than 12.37% and 5.34% on the easy and hard settings of the KITTI validation subset, respectively. On the KITTI test set, our framework also outperforms state-of-the-art methods by 5.1% and 1.76% on the easy and hard settings, respectively.
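The conversion from a depth map to a pseudo-LiDAR point cloud mentioned in the abstract is commonly done by back-projecting each pixel through a pinhole camera model. The following is a minimal sketch under that assumption, not the authors' implementation: the function name `depth_to_pseudo_lidar`, the intrinsics `fx`, `fy`, `cx`, `cy`, and the `max_depth` cutoff are all hypothetical names chosen for illustration, and the points stay in the camera frame (x right, y down, z forward).

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def depth_to_pseudo_lidar(
    depth: Sequence[Sequence[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
    max_depth: float = 80.0,  # assumed cutoff in metres, not from the paper
) -> List[Point]:
    """Back-project a dense depth map into camera-frame 3D points.

    depth[v][u] is the depth (metres) at pixel row v, column u.
    Pixels with non-positive depth or depth beyond max_depth are skipped.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0.0 or z > max_depth:
                continue  # invalid or too-distant depth value
            # Pinhole model: u = fx * x / z + cx, v = fy * y / z + cy
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, float(z)))
    return points
```

A LiDAR-based detector would typically also need these points transformed into its own coordinate frame using the dataset's calibration; that step is omitted here.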
Pages: 20
Related papers
50 in total
  • [1] 3D Street Object Detection from Monocular Images Using Deep Learning and Depth Information
    Liu, Wei
    Zhang, Tao
    Ma, Yun
    Wei, Longsheng
    JOURNAL OF ADVANCED COMPUTATIONAL INTELLIGENCE AND INTELLIGENT INFORMATICS, 2023, 27 (02) : 198 - 206
  • [2] Depth-Enhanced Deep Learning Approach For Monocular Camera Based 3D Object Detection
    Wang, Chuyao
    Aouf, Nabil
    JOURNAL OF INTELLIGENT & ROBOTIC SYSTEMS, 2024, 110 (03)
  • [3] Deep Optics for Monocular Depth Estimation and 3D Object Detection
    Chang, Julie
    Wetzstein, Gordon
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 10192 - 10201
  • [4] A Survey on Monocular 3D Object Detection Algorithms Based on Deep Learning
    Wu, Junhui
    Yin, Dong
    Chen, Jie
    Wu, Yusheng
    Si, Huiping
    Lin, Kaiyan
    2020 4TH INTERNATIONAL CONFERENCE ON MACHINE VISION AND INFORMATION TECHNOLOGY (CMVIT 2020), 2020, 1518
  • [5] A Survey on Deep Learning Based Methods and Datasets for Monocular 3D Object Detection
    Kim, Seong-heum
    Hwang, Youngbae
    ELECTRONICS, 2021, 10 (04) : 1 - 22
  • [6] Learning Depth-Guided Convolutions for Monocular 3D Object Detection
    Ding, Mingyu
    Huo, Yuqi
    Yi, Hongwei
    Wang, Zhe
    Shi, Jianping
    Lu, Zhiwu
    Luo, Ping
    2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2020), 2020, : 4306 - 4315
  • [7] Depth-discriminative Metric Learning for Monocular 3D Object Detection
    Choi, Wonhyeok
    Shin, Mingyu
    Im, Sunghoon
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [8] Survey on deep learning-based 3D object detection in autonomous driving
    Liang, Zhenming
    Huang, Yingping
    TRANSACTIONS OF THE INSTITUTE OF MEASUREMENT AND CONTROL, 2023, 45 (04) : 761 - 776
  • [9] Monocular 3D Object Detection with Depth from Motion
    Wang, Tai
    Pang, Jiangmiao
    Lin, Dahua
    COMPUTER VISION, ECCV 2022, PT IX, 2022, 13669 : 386 - 403
  • [10] Mono-DCNet: Monocular 3D Object Detection via Depth-based Centroid Refinement and Pose Estimation
    Astudillo, Armando
    Al-Kaff, Abdulla
    Garcia, Fernando
    2022 IEEE INTELLIGENT VEHICLES SYMPOSIUM (IV), 2022, : 664 - 669