Deep Learning-Based Monocular 3D Object Detection with Refinement of Depth Information

Cited by: 7
Authors
Hu, Henan [1, 2, 3]
Zhu, Ming [1 ]
Li, Muyu [4 ]
Chan, Kwok-Leung [3 ]
Affiliations
[1] Chinese Acad Sci, Changchun Inst Opt Fine Mech & Phys, Changchun 130033, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
[3] City Univ Hong Kong, Dept Elect Engn, Hong Kong, Peoples R China
[4] Ctr Intelligent Multidimens Data Anal Ltd, Hong Kong, Peoples R China
Keywords
3D object detection; monocular image; point cloud; deep learning; depth estimation; autonomous driving; NETWORK;
DOI
10.3390/s22072576
Chinese Library Classification number
O65 [Analytical Chemistry];
Discipline classification codes
070302; 081704;
Abstract
Research on monocular 3D object detection based on pseudo-LiDAR data has recently made progress, but pseudo-LiDAR methods remain less robust than LiDAR-based algorithms. Through in-depth experiments, we found that the main limitations are the inaccuracy of the target position and the uncertainty in the depth distribution of the foreground target, both of which arise from inaccurate depth estimation. To address these problems, we propose two solutions. The first is a novel method based on joint image segmentation and geometric constraints that predicts the target depth and provides a confidence measure for the depth prediction. The predicted target depth is fused with the overall depth of the scene to obtain the optimal target position. The second uses the target scale, normalized with a Gaussian function, as a priori information to reduce the uncertainty of the depth distribution, which can be visualized as long-tail noise. With the refined depth information, we convert the optimized depth map into a point cloud representation, called a pseudo-LiDAR point cloud. Finally, we input the pseudo-LiDAR point cloud to a LiDAR-based algorithm to detect the 3D target. We conducted extensive experiments on the challenging KITTI dataset. The results demonstrate that our proposed framework outperforms various state-of-the-art methods by more than 12.37% and 5.34% on the easy and hard settings of the KITTI validation subset, respectively. On the KITTI test set, our framework also outperforms state-of-the-art methods by 5.1% and 1.76% on the easy and hard settings, respectively.
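The depth-map-to-point-cloud step mentioned in the abstract can be illustrated with a minimal sketch of standard pinhole back-projection. This is not the authors' implementation: the function name `depth_to_pseudo_lidar`, the intrinsics `fx`, `fy`, `cx`, `cy`, and the `max_depth` cutoff are assumptions chosen for illustration, and a real pipeline would likely vectorize this with an array library and transform the points into the LiDAR frame.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def depth_to_pseudo_lidar(
    depth: Sequence[Sequence[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
    max_depth: float = 80.0,  # assumed cutoff in metres, not from the paper
) -> List[Point]:
    """Back-project a depth map (rows of depths in metres) to 3D points.

    Uses the pinhole camera model; the returned points are in camera
    coordinates (x right, y down, z forward).
    """
    points: List[Point] = []
    for v, row in enumerate(depth):        # v: pixel row index
        for u, z in enumerate(row):        # u: pixel column index
            if not (0.0 < z < max_depth):  # skip invalid or far-away depths
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points
```

Each valid pixel contributes one 3D point, so errors in the estimated depth move points directly along the viewing ray, which is consistent with the abstract's observation that inaccurate depth estimation degrades the target position.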
Pages: 20
Related papers
50 in total
  • [41] Depth-conditioned Dynamic Message Propagation for Monocular 3D Object Detection
    Wang, Li
    Du, Liang
    Ye, Xiaoqing
    Fu, Yanwei
    Guo, Guodong
    Xue, Xiangyang
    Feng, Jianfeng
    Zhang, Li
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 454 - 463
  • [42] Revisiting Depth-guided Methods for Monocular 3D Object Detection by Hierarchical Balanced Depth
    Chen, Yi-Rong
    Tseng, Ching-Yu
    Liou, Yi-Syuan
    Wu, Tsung-Han
    Hsu, Winston H.
    CONFERENCE ON ROBOT LEARNING, VOL 229, 2023, 229
  • [43] Absolute Distance Prediction Based on Deep Learning Object Detection and Monocular Depth Estimation Models
    Masoumian, Armin
    Marei, David G. F.
    Abdulwahab, Saddam
    Cristiano, Julian
    Puig, Domenec
    Rashwan, Hatem A.
    ARTIFICIAL INTELLIGENCE RESEARCH AND DEVELOPMENT, 2021, 339 : 325 - 334
  • [44] 3D Hierarchical Refinement and Augmentation for Unsupervised Learning of Depth and Pose From Monocular Video
    Wang, Guangming
    Zhong, Jiquan
    Zhao, Shijie
    Wu, Wenhua
    Liu, Zhe
    Wang, Hesheng
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (04) : 1776 - 1786
  • [45] Attention-Based Depth Distillation with 3D-Aware Positional Encoding for Monocular 3D Object Detection
    Wu, Zizhang
    Wu, Yunzhe
    Pu, Jian
    Li, Xianzhi
    Wang, Xiaoquan
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 3, 2023, : 2892 - 2900
  • [46] Monocular 3D Object Detection Based on Uncertainty Prediction of Keypoints
    Chen, Mu
    Zhao, Huaici
    Liu, Pengfei
    MACHINES, 2022, 10 (01)
  • [47] MonoDCN: Monocular 3D object detection based on dynamic convolution
    Qu, Shenming
    Yang, Xinyu
    Gao, Yiming
    Liang, Shengbin
    PLOS ONE, 2022, 17 (10)
  • [48] eGAC3D: enhancing depth adaptive convolution and depth estimation for monocular 3D object pose detection
    Ngo, Duc Tuan
    Bui, Minh-Quan Viet
    Nguyen, Duc Dung
    Pham, Hoang-Anh
    PEERJ COMPUTER SCIENCE, 2022, 8
  • [49] Triangulation Learning Network: from Monocular to Stereo 3D Object Detection
    Qin, Zengyi
    Wang, Jinglu
    Lu, Yan
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 7607 - 7615
  • [50] Monocular 3D Object Detection Utilizing Auxiliary Learning With Deformable Convolution
    Chen, Jiun-Han
    Shieh, Jeng-Lun
    Haq, Muhamad Amirul
    Ruan, Shanq-Jang
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2024, 25 (03) : 2424 - 2436